This class allows users to define statistical tables and to perturb both count and numerical variables.

ck_setup(x, rkey, dims, w = NULL, countvars = NULL, numvars = NULL)

Arguments

x

an object coercible to a data.frame

rkey

either a column name within x referring to a variable containing record keys, or a single integer(ish) number > 5 that refers to the number of digits of the record keys that will be generated internally.

dims

a list containing slots for each variable that should be tabulated. Each slot should be created/modified using sdcHierarchies::hier_create(), sdcHierarchies::hier_add() and other functionality from package sdcHierarchies.

w

(character) a scalar character referring to a variable in x holding sampling weights. If w is NULL (the default), all weights are assumed to be 1.

countvars

(character) an optional vector containing names of binary (0/1 coded) variables within x that should be included in the problem instance. These variables can later be perturbed.

numvars

(character) an optional vector of numerical variables that can later be tabulated.

Value

A new cellkey_obj object. Such objects (internally) contain the fully computed statistical tables given the input microdata (x), the hierarchical definitions (dims) and the remaining inputs. Intermediate results are stored internally and can only be modified or accessed via the exported public methods described below.

Details

Such objects are typically generated using ck_setup().

Methods


Method new()

Create a new table instance

Usage

ck_class$new(x, rkey, dims, w = NULL, countvars = NULL, numvars = NULL)

Arguments

x

an object coercible to a data.frame

rkey

either a column name within x referring to a variable containing record keys, or a single integer(ish) number > 5 that refers to the number of digits of the record keys that will be generated internally.

dims

a list containing slots for each variable that should be tabulated. Each slot should be created/modified using sdcHierarchies::hier_create(), sdcHierarchies::hier_add() and other functionality from package sdcHierarchies.

w

(character) a scalar character referring to a variable in x holding sampling weights. If w is NULL (the default), all weights are assumed to be 1.

countvars

(character) an optional vector containing names of binary (0/1 coded) variables within x that should be included in the problem instance. These variables can later be perturbed.

numvars

(character) an optional vector of numerical variables that can later be tabulated.

Returns

A new cellkey_obj object. Such objects (internally) contain the fully computed statistical tables given the input microdata (x), the hierarchical definitions (dims) and the remaining inputs. Intermediate results are stored internally and can only be modified or accessed via the exported public methods described below.


Method perturb()

Perturb a count or magnitude variable

Usage

ck_class$perturb(v)

Arguments

v

name(s) of count or magnitude variables that should be perturbed.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. Updated data can be accessed using other exported methods like $freqtab() or $numtab().


Method freqtab()

Extract results from already perturbed count variables as a data.table

Usage

ck_class$freqtab(v = NULL, path = NULL)

Arguments

v

a vector of variable names for count variables. If NULL (the default), the results are returned for all available count variables. For variables that have not yet been perturbed, columns puwc and pwc are filled with NA.

path

if not NULL, a scalar character defining a (relative or absolute) path to which the result table should be written. A csv file will be generated; if specified, path must end in ".csv".

Returns

This method returns a data.table containing all combinations of the dimensional variables in the first n columns. Additionally, the following columns are shown:

  • vname: name of the perturbed variable

  • uwc: unweighted counts

  • wc: weighted counts

  • puwc: perturbed unweighted counts or NA if vname was not yet perturbed

  • pwc: perturbed weighted counts or NA if vname was not yet perturbed
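In the freqtab() output printed in the Examples section, the perturbed weighted count appears to equal the perturbed unweighted count scaled by the average weight of the cell (wc / uwc). The base-R sketch below checks this on three rows copied from that output. It is an observation about the printed numbers, not a statement about the package internals.

```r
# three rows copied from the freqtab() output shown in the Examples section
res <- data.frame(
  uwc  = c(4580, 1969, 13),
  wc   = c(276163, 119337, 854),
  puwc = c(4582, 1970, 14),
  pwc  = c(276283.5952, 119397.6079, 919.6923)
)

# assumption: pwc = puwc * (wc / uwc), i.e. the average cell weight is kept
res$pwc_check <- res$puwc * res$wc / res$uwc

# compare with the printed values (which are rounded to four decimals)
agrees <- all(abs(res$pwc_check - res$pwc) < 1e-3)
print(agrees)
```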


Method numtab()

Extract results from already perturbed continuous variables as a data.table.

Usage

ck_class$numtab(v = NULL, mean_before_sum = FALSE, path = NULL)

Arguments

v

a vector of variable names of continuous variables. If NULL (the default), the results are returned for all available numeric variables.

mean_before_sum

(logical); if TRUE, the perturbed values are adjusted by a factor (n + p) / n with

  • n: the original weighted cell value

  • p: the perturbed cell value

This makes sense if the accuracy of the variable mean is considered to be more important than the accuracy of sums of the variable. The default value is FALSE (no adjustment is done).

path

if not NULL, a scalar character defining a (relative or absolute) path to which the result table should be written. A csv file will be generated; if specified, path must end in ".csv".

Returns

This method returns a data.table containing all combinations of the dimensional variables in the first n columns. Additionally, the following columns are shown:

  • vname: name of the perturbed variable

  • uws: unweighted sum of the given variable

  • ws: weighted cellsum

  • pws: perturbed weighted sum of the given cell or NA if vname has not been perturbed


Method measures_cnts()

Utility measures for perturbed count variables

Usage

ck_class$measures_cnts(v, exclude_zeros = TRUE)

Arguments

v

name of a count variable for which utility measures should be computed.

exclude_zeros

should empty (zero) cells in the original values be excluded when computing distance measures

Returns

This method returns a list containing a set of utility measures based on some distance functions. For a detailed description of the computed measures, see ck_cnt_measures()
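To illustrate what distance-based utility measures and the exclude_zeros switch can look like, here is a minimal base-R sketch. The function dist_measures() and its output names are hypothetical and are not the measures computed by ck_cnt_measures(). The counts are taken from the freqtab() output in the Examples section, plus one empty cell.

```r
# original and perturbed unweighted counts (last entry is an empty cell)
orig <- c(4580, 1969, 1143, 864, 423, 168, 13, 0)
pert <- c(4582, 1970, 1142, 865, 424, 168, 14, 0)

# hypothetical helper: absolute and relative distances between two count vectors
dist_measures <- function(orig, pert, exclude_zeros = TRUE) {
  if (exclude_zeros) {
    # drop cells that are empty in the original table
    keep <- orig != 0
    orig <- orig[keep]
    pert <- pert[keep]
  }
  d_abs <- abs(pert - orig)
  list(
    mean_abs = mean(d_abs),
    max_abs  = max(d_abs),
    mean_rel = mean(d_abs / orig)
  )
}

m <- dist_measures(orig, pert, exclude_zeros = TRUE)
str(m)
```

Keeping empty cells (exclude_zeros = FALSE) would make the relative distance in this sketch undefined for the last cell (0 / 0), which is one reason to exclude them.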


Method measures_nums()

Utility measures for continuous variables (not yet implemented)

Usage

ck_class$measures_nums(v)

Arguments

v

name of a continuous variable for which utility measures should be computed.

Returns

for now, an empty list; in future versions of the package, the method will return utility measures for perturbed magnitude tables.


Method allvars()

Names of variables that can be perturbed / tabulated

Usage

ck_class$allvars()

Returns

returns a list with the following two elements:

  • cntvars: character vector with names of available count variables for perturbation

  • numvars: character vector with names of available numerical variables for perturbation


Method cntvars()

Names of count variables that can be perturbed

Usage

ck_class$cntvars()

Returns

a character vector containing variable names


Method numvars()

Names of continuous variables that can be perturbed

Usage

ck_class$numvars()

Returns

a character vector containing variable names


Method hierarchy_info()

Information about hierarchies

Usage

ck_class$hierarchy_info()

Returns

a list (for each dimensional variable) with information on the hierarchies. This may be used to restrict output tables to specific levels or codes. Each list element is a data.table containing the following variables:

  • code: the name of a code within the hierarchy

  • level: number defining the level of the code; the higher the number, the lower the code sits in the hierarchy, with 1 being the overall total

  • is_leaf: if TRUE, this code is a leaf node which means no other codes contribute to it

  • parent: name of the parent code
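As a sketch of how this information might be used to restrict output to specific levels or codes, the following base-R snippet rebuilds the age hierarchy table printed in the Examples section as a plain data.frame (the method itself returns data.table objects) and selects codes by level and by leaf status.

```r
# hierarchy information for "age", copied from the Examples output
h_age <- data.frame(
  code = c("Total", "age_group1", "ag1a", "age_group2", "ag2a",
           "age_group3", "age_group4", "age_group5", "age_group6"),
  level = c(1, 2, 3, 2, 3, 2, 2, 2, 2),
  is_leaf = c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  parent = c("Total", "Total", "age_group1", "Total", "age_group2",
             "Total", "Total", "Total", "Total"),
  stringsAsFactors = FALSE
)

# codes on the second level (direct children of the overall total)
lev2 <- h_age$code[h_age$level == 2]

# leaf codes only (no other codes contribute to them)
leaves <- h_age$code[h_age$is_leaf]

print(lev2)
print(leaves)
```

Assuming res is a table returned by $freqtab() or $numtab(), a subset such as res[age %in% lev2] would then keep only the rows for those codes.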


Method mod_cnts()

Modifications applied to count variables

Usage

ck_class$mod_cnts()

Returns

a data.table containing modifications applied to count variables


Method mod_nums()

Modifications applied to numerical variables

Usage

ck_class$mod_nums()

Returns

a data.table containing modifications applied to numerical variables


Method supp_freq()

Identify sensitive cells based on minimum frequency rule

Usage

ck_class$supp_freq(v, n, weighted = TRUE)

Arguments

v

a single variable name of a continuous variable (see method numvars())

n

a number defining the threshold. All cells <= n are considered as unsafe.

weighted

if TRUE, the weighted number of contributors to a cell is compared to the threshold specified in n (default); else the unweighted number of contributors is used.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).
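The threshold logic described for n and weighted can be sketched in base R as follows. The cell data and the helper flag_freq() are hypothetical and only mirror the rule as documented (cells <= n are unsafe); the internal implementation may differ.

```r
# hypothetical contributor counts per cell (not taken from the package)
cells <- data.frame(
  cell = c("a", "b", "c", "d"),
  n_unweighted = c(2, 5, 12, 40),
  n_weighted = c(35, 90, 260, 810)
)

# flag cells whose count is less than or equal to the threshold
flag_freq <- function(counts, n) counts <= n

# weighted = TRUE: compare the weighted number of contributors
cells$unsafe_w <- flag_freq(cells$n_weighted, n = 100)

# weighted = FALSE: compare the unweighted number of contributors
cells$unsafe_uw <- flag_freq(cells$n_unweighted, n = 5)

print(cells)
```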


Method supp_val()

Identify sensitive cells based on weighted or unweighted cell value

Usage

ck_class$supp_val(v, n, weighted = TRUE)

Arguments

v

a single variable name of a continuous variable (see method numvars())

n

a number defining the threshold. All cells <= n are considered as unsafe.

weighted

if TRUE, the weighted cell value of variable v is compared to the threshold specified in n (default); else the unweighted number is used.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).


Method supp_cells()

Identify sensitive cells based on their names

Usage

ck_class$supp_cells(v, inp)

Arguments

v

a single variable name of a continuous variable (see method numvars())

inp

a data.frame where each column represents a dimensional variable. Each row of this input is used to identify the cells that should be marked as sensitive; NA values are allowed and match all codes of the respective dimensional variable.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).
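The following base-R sketch shows one way to read the inp argument, treating NA as a wildcard for the whole dimension. The helper matches_row() is hypothetical and only illustrates the matching described above on a small grid of codes from the Examples section; it is not the package's own matching code.

```r
# all cells of a small table (subset of the codes used in the Examples)
cells <- expand.grid(
  sex = c("Total", "male", "female"),
  age = c("Total", "age_group1", "age_group2"),
  stringsAsFactors = FALSE
)

# each row identifies cells; NA matches every code of that dimension
inp <- data.frame(
  sex = c("female", NA),
  age = c("age_group1", "Total"),
  stringsAsFactors = FALSE
)

# hypothetical helper: which cells match a single row of inp?
matches_row <- function(cells, row) {
  ok <- rep(TRUE, nrow(cells))
  for (nm in names(row)) {
    if (!is.na(row[[nm]])) ok <- ok & cells[[nm]] == row[[nm]]
  }
  ok
}

# a cell is selected if it matches at least one row of inp
hits <- lapply(seq_len(nrow(inp)), function(i) matches_row(cells, inp[i, ]))
sensitive <- Reduce(`|`, hits)
print(cells[sensitive, ])
```

Here the first row selects the single cell female x age_group1, while the second row selects every sex code combined with age equal to Total.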


Method supp_p()

Identify sensitive cells based on the p%-rule. Please note that this rule can only be applied to positive-only variables.

Usage

ck_class$supp_p(v, p)

Arguments

v

a single variable name of a continuous variable (see method numvars())

p

a number defining a percentage between 1 and 99.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).
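For readers unfamiliar with the p%-rule, here is a minimal base-R sketch of its common textbook formulation: a cell is considered sensitive if the contributions that remain after removing the two largest contributors are smaller than p% of the largest contribution. The function p_rule() is hypothetical, and the package may implement the rule differently (for example with respect to weights).

```r
# hypothetical helper: textbook p%-rule for the contributions to one cell
p_rule <- function(contrib, p) {
  stopifnot(all(contrib >= 0), p >= 1, p <= 99)
  x <- sort(contrib, decreasing = TRUE)
  x1 <- x[1]
  x2 <- if (length(x) > 1) x[2] else 0
  # what is left once the two largest contributors are removed
  remainder <- sum(x) - x1 - x2
  remainder < p / 100 * x1
}

dominated <- p_rule(c(900, 50, 30, 20), p = 10)    # remainder 50 vs. 90
balanced  <- p_rule(c(400, 300, 200, 100), p = 10) # remainder 300 vs. 40
print(c(dominated = dominated, balanced = balanced))
```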


Method supp_pq()

Identify sensitive cells based on the pq-rule. Please note that this rule can only be applied to positive-only variables.

Usage

ck_class$supp_pq(v, p, q)

Arguments

v

a single variable name of a continuous variable (see method numvars())

p

a number defining a percentage between 1 and 99.

q

a number defining a percentage between 1 and 99. This value must be larger than p.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).
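A minimal base-R sketch of the pq-rule in its usual textbook form: with prior precision q% and required protection p%, a cell is sensitive if q% of the contributions remaining after removing the two largest contributors is smaller than p% of the largest contribution. The function pq_rule() is hypothetical and the package internals may differ.

```r
# hypothetical helper: textbook pq-rule for the contributions to one cell
pq_rule <- function(contrib, p, q) {
  stopifnot(all(contrib >= 0), p >= 1, q <= 99, p < q)
  x <- sort(contrib, decreasing = TRUE)
  x2 <- if (length(x) > 1) x[2] else 0
  # what is left once the two largest contributors are removed
  remainder <- sum(x) - x[1] - x2
  q / 100 * remainder < p / 100 * x[1]
}

dominated <- pq_rule(c(900, 50, 30, 20), p = 10, q = 50)    # 25 vs. 90
balanced  <- pq_rule(c(400, 300, 200, 100), p = 10, q = 50) # 150 vs. 40
print(c(dominated = dominated, balanced = balanced))
```

In this formulation the rule behaves like a p%-rule with an effective percentage of 100 * p / q.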


Method supp_nk()

Identify sensitive cells based on the nk-dominance rule. Please note that this rule can only be applied to positive-only variables.

Usage

ck_class$supp_nk(v, n, k)

Arguments

v

a single variable name of a continuous variable (see method numvars())

n

an integerish number >= 2

k

a number defining a percentage between 1 and 99. All cells to which the top n contributors contribute more than k% are considered unsafe.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).
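The dominance rule as described for n and k can be sketched in base R like this. The function nk_rule() is hypothetical; it flags a cell if its n largest contributors account for more than k% of the cell total and ignores details such as sampling weights that the package may handle.

```r
# hypothetical helper: nk-dominance rule for the contributions to one cell
nk_rule <- function(contrib, n, k) {
  stopifnot(all(contrib >= 0), n >= 2, k >= 1, k <= 99)
  total <- sum(contrib)
  if (total == 0) return(FALSE)
  # sum of the n largest contributions
  top_n <- sum(head(sort(contrib, decreasing = TRUE), n))
  top_n / total > k / 100
}

dominated <- nk_rule(c(900, 50, 30, 20), n = 2, k = 85)    # share 0.95
balanced  <- nk_rule(c(400, 300, 200, 100), n = 2, k = 85) # share 0.70
print(c(dominated = dominated, balanced = balanced))
```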


Method params_cnts_get()

Return perturbation parameters of count variables

Usage

ck_class$params_cnts_get()

Returns

a named list in which each list-element contains the active perturbation parameters for the specific count variable defined by the list-name.


Method params_cnts_set()

Set perturbation parameters for count variables

Usage

ck_class$params_cnts_set(val, v = NULL)

Arguments

val

a perturbation object created with ck_params_cnts()

v

a character vector (or NULL). If NULL (the default), the perturbation parameters provided in val are set for all count variables; otherwise one may specify the names of the count variables for which the parameters should be set.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).


Method reset_cntvars()

reset results and parameters for already perturbed count variables

Usage

ck_class$reset_cntvars(v = NULL)

Arguments

v

if v equals NULL (the default), the results are reset for all perturbed count variables; otherwise it is possible to specify the names of already perturbed count variables.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb() or $freqtab()).


Method reset_numvars()

reset results and parameters for already perturbed numerical variables

Usage

ck_class$reset_numvars(v = NULL)

Arguments

v

if v equals NULL (the default), the results are reset for all perturbed numerical variables; otherwise it is possible to specify the names of already perturbed continuous variables.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb() or $numtab()).


Method reset_allvars()

reset results and parameters for all already perturbed variables.

Usage

ck_class$reset_allvars()

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb(), $freqtab() or $numtab()).


Method params_nums_get()

Return perturbation parameters of continuous variables

Usage

ck_class$params_nums_get()

Returns

a named list in which each list-element contains the active perturbation parameters for the specific continuous variable defined by the list-name.


Method params_nums_set()

set perturbation parameters for continuous variables.

Usage

ck_class$params_nums_set(val, v = NULL)

Arguments

val

a perturbation object created with ck_params_nums()

v

a character vector (or NULL); if NULL (the default), the perturbation parameters provided in val are set for all continuous variables; otherwise one may specify the names of the numeric variables for which the parameters should be set.

Returns

A modified cellkey_obj object in which private slots were updated for side-effects. These updated values are used by other methods (e.g. $perturb()).


Method summary()

some aggregated summary statistics about perturbed variables

Usage

ck_class$summary()

Returns

invisible NULL


Method print()

prints information about the current table

Usage

ck_class$print()

Returns

invisible NULL

Examples

# \donttest{
x <- ck_create_testdata()

# create some 0/1 variables that should be perturbed later
x[, cnt_females := ifelse(sex == "male", 0, 1)]
x[, cnt_males := ifelse(sex == "male", 1, 0)]
x[, cnt_highincome := ifelse(income >= 9000, 1, 0)]
# a variable with positive and negative contributions
x[, mixed := sample(-10:10, nrow(x), replace = TRUE)]
#>       urbrur roof walls water electcon relat    sex        age hhcivil expend
#>    1:      2    4     3     3        1     1   male age_group3       2   9093
#>    2:      2    4     3     3        1     2 female age_group3       2   2734
#>    3:      2    4     3     3        1     3   male age_group1       1   2652
#>    4:      2    4     3     3        1     3   male age_group1       1   1807
#>    5:      2    4     2     3        1     1   male age_group4       2    671
#>   ---                                                                        
#> 4576:      2    4     3     4        1     2 female age_group3       2   3696
#> 4577:      2    4     3     4        1     3   male age_group1       1    282
#> 4578:      2    4     3     4        1     3   male age_group1       1    840
#> 4579:      2    4     3     4        1     3 female age_group1       1   6258
#> 4580:      2    4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>    1:   5780      12       1              29          25.00000           0
#>    2:   2530      28       1              47          25.00000           1
#>    3:   6920     550       1              88          25.00000           0
#>    4:   7960     870       1              65          25.00000           0
#>    5:   9030      20       2              64          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              80          16.66667           1
#> 4577:   1420     987    1000              39          16.66667           0
#> 4578:   8900     684    1000              80          16.66667           0
#> 4579:   3880     294    1000              33          16.66667           1
#> 4580:   4830     911    1000              25          16.66667           0
#>       cnt_males cnt_highincome mixed
#>    1:         1              0     7
#>    2:         0              0     2
#>    3:         1              0     8
#>    4:         1              0    -9
#>    5:         1              1     0
#>   ---                               
#> 4576:         0              0    -6
#> 4577:         1              0     5
#> 4578:         1              0    -6
#> 4579:         0              0    -4
#> 4580:         1              0    -1

# create record keys
x$rkey <- ck_generate_rkeys(dat = x)

# define required inputs

# hierarchy with some bogus codes
d_sex <- hier_create(root = "Total", nodes = c("male", "female"))
d_sex <- hier_add(d_sex, root = "female", "f")
d_sex <- hier_add(d_sex, root = "male", "m")

d_age <- hier_create(root = "Total", nodes = paste0("age_group", 1:6))
d_age <- hier_add(d_age, root = "age_group1", "ag1a")
d_age <- hier_add(d_age, root = "age_group2", "ag2a")

# define the cell key object
countvars <- c("cnt_females", "cnt_males", "cnt_highincome")
numvars <- c("expend", "income", "savings", "mixed")
tab <- ck_setup(
  x = x,
  rkey = "rkey",
  dims = list(sex = d_sex, age = d_age),
  w = "sampling_weight",
  countvars = countvars,
  numvars = numvars)
#> computing contributing indices | rawdata <--> table; this might take a while

# show some information about this table instance
tab$print() # identical with print(tab)
#> ── Table Information ───────────────────────────────────────────────────────────
#> ✔ 45 cells in 2 dimensions ('sex', 'age')
#> ✔ weights: yes
#> ── Tabulated / Perturbed countvars ─────────────────────────────────────────────
#> ☐ 'total'
#> ☐ 'cnt_females'
#> ☐ 'cnt_males'
#> ☐ 'cnt_highincome'
#> ── Tabulated / Perturbed numvars ───────────────────────────────────────────────
#> ☐ 'expend'
#> ☐ 'income'
#> ☐ 'savings'
#> ☐ 'mixed'

# information about the hierarchies
tab$hierarchy_info()
#> $sex
#>      code level is_leaf parent
#> 1:  Total     1   FALSE  Total
#> 2:   male     2   FALSE  Total
#> 3:      m     3    TRUE   male
#> 4: female     2   FALSE  Total
#> 5:      f     3    TRUE female
#> 
#> $age
#>          code level is_leaf     parent
#> 1:      Total     1   FALSE      Total
#> 2: age_group1     2   FALSE      Total
#> 3:       ag1a     3    TRUE age_group1
#> 4: age_group2     2   FALSE      Total
#> 5:       ag2a     3    TRUE age_group2
#> 6: age_group3     2    TRUE      Total
#> 7: age_group4     2    TRUE      Total
#> 8: age_group5     2    TRUE      Total
#> 9: age_group6     2    TRUE      Total
#> 

# which variables have been defined?
tab$allvars()
#> $cntvars
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"
#> 
#> $numvars
#> [1] "expend"  "income"  "savings" "mixed"  
#> 

# count variables
tab$cntvars()
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"

# continuous variables
tab$numvars()
#> [1] "expend"  "income"  "savings" "mixed"  

# create perturbation parameters for "total" variable and
# write to yaml-file

# create a ptable using functionality from the ptable-pkg
f_yaml <- tempfile(fileext = ".yaml")
p_cnts1 <- ck_params_cnts(
  ptab = ptable::pt_ex_cnts(),
  path = f_yaml)
#> yaml configuration '/tmp/RtmpV7PINT/file19622afd0bce.yaml' successfully written.

# read parameters from yaml-file and set them for variable `"total"`
p_cnts1 <- ck_read_yaml(path = f_yaml)

tab$params_cnts_set(val = p_cnts1, v = "total")
#> --> setting perturbation parameters for variable 'total'

# create alternative perturbation parameters by specifying parameters
para2 <- ptable::create_cnt_ptable(
  D = 8, V = 3, js = 2, create = FALSE)

p_cnts2 <- ck_params_cnts(ptab = para2)

# use this ptable for the remaining variables
tab$params_cnts_set(val = p_cnts2, v = countvars)
#> --> setting perturbation parameters for variable 'cnt_females'
#> --> setting perturbation parameters for variable 'cnt_males'
#> --> setting perturbation parameters for variable 'cnt_highincome'

# perturb a variable
tab$perturb(v = "total")
#> Count variable 'total' was perturbed.

# multiple variables can be perturbed as well
tab$perturb(v = c("cnt_males", "cnt_highincome"))
#> Count variable 'cnt_males' was perturbed.
#> Count variable 'cnt_highincome' was perturbed.

# return weighted and unweighted results
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname  uwc     wc puwc         pwc
#>  1:  Total      Total     total 4580 276163 4582 276283.5952
#>  2:  Total age_group1     total 1969 119337 1970 119397.6079
#>  3:  Total       ag1a     total 1969 119337 1970 119397.6079
#>  4:  Total age_group2     total 1143  68009 1142  67949.4996
#>  5:  Total       ag2a     total 1143  68009 1142  67949.4996
#>  6:  Total age_group3     total  864  52509  865  52569.7743
#>  7:  Total age_group4     total  423  25225  424  25284.6336
#>  8:  Total age_group5     total  168  10229  168  10229.0000
#>  9:  Total age_group6     total   13    854   14    919.6923
#> 10:   male      Total     total 2296 139992 2295 139931.0279
#> 11:      m      Total     total 2296 139992 2295 139931.0279
#> 12:   male age_group1     total 1015  61607 1016  61667.6966
#> 13:      m age_group1     total 1015  61607 1016  61667.6966
#> 14:   male       ag1a     total 1015  61607 1016  61667.6966
#> 15:      m       ag1a     total 1015  61607 1016  61667.6966
#> 16:   male age_group2     total  571  34328  571  34328.0000
#> 17:      m age_group2     total  571  34328  571  34328.0000
#> 18:   male       ag2a     total  571  34328  571  34328.0000
#> 19:      m       ag2a     total  571  34328  571  34328.0000
#> 20:   male age_group3     total  424  25877  424  25877.0000
#> 21:      m age_group3     total  424  25877  424  25877.0000
#> 22:   male age_group4     total  195  12397  196  12460.5744
#> 23:      m age_group4     total  195  12397  196  12460.5744
#> 24:   male age_group5     total   84   5296   84   5296.0000
#> 25:      m age_group5     total   84   5296   84   5296.0000
#> 26:   male age_group6     total    7    487    6    417.4286
#> 27:      m age_group6     total    7    487    6    417.4286
#> 28: female      Total     total 2284 136171 2285 136230.6195
#> 29:      f      Total     total 2284 136171 2285 136230.6195
#> 30: female age_group1     total  954  57730  955  57790.5136
#> 31:      f age_group1     total  954  57730  955  57790.5136
#> 32: female       ag1a     total  954  57730  955  57790.5136
#> 33:      f       ag1a     total  954  57730  955  57790.5136
#> 34: female age_group2     total  572  33681  574  33798.7657
#> 35:      f age_group2     total  572  33681  574  33798.7657
#> 36: female       ag2a     total  572  33681  574  33798.7657
#> 37:      f       ag2a     total  572  33681  574  33798.7657
#> 38: female age_group3     total  440  26632  440  26632.0000
#> 39:      f age_group3     total  440  26632  440  26632.0000
#> 40: female age_group4     total  228  12828  226  12715.4737
#> 41:      f age_group4     total  228  12828  226  12715.4737
#> 42: female age_group5     total   84   4933   86   5050.4524
#> 43:      f age_group5     total   84   4933   86   5050.4524
#> 44: female age_group6     total    6    367    6    367.0000
#> 45:      f age_group6     total    6    367    6    367.0000
#> 46:  Total      Total cnt_males 2296 139992 2294 139870.0557
#> 47:  Total age_group1 cnt_males 1015  61607 1017  61728.3931
#> 48:  Total       ag1a cnt_males 1015  61607 1017  61728.3931
#> 49:  Total age_group2 cnt_males  571  34328  570  34267.8809
#> 50:  Total       ag2a cnt_males  571  34328  570  34267.8809
#> 51:  Total age_group3 cnt_males  424  25877  423  25815.9693
#> 52:  Total age_group4 cnt_males  195  12397  197  12524.1487
#> 53:  Total age_group5 cnt_males   84   5296   85   5359.0476
#> 54:  Total age_group6 cnt_males    7    487    5    347.8571
#> 55:   male      Total cnt_males 2296 139992 2294 139870.0557
#> 56:      m      Total cnt_males 2296 139992 2294 139870.0557
#> 57:   male age_group1 cnt_males 1015  61607 1017  61728.3931
#> 58:      m age_group1 cnt_males 1015  61607 1017  61728.3931
#> 59:   male       ag1a cnt_males 1015  61607 1017  61728.3931
#> 60:      m       ag1a cnt_males 1015  61607 1017  61728.3931
#> 61:   male age_group2 cnt_males  571  34328  570  34267.8809
#> 62:      m age_group2 cnt_males  571  34328  570  34267.8809
#> 63:   male       ag2a cnt_males  571  34328  570  34267.8809
#> 64:      m       ag2a cnt_males  571  34328  570  34267.8809
#> 65:   male age_group3 cnt_males  424  25877  423  25815.9693
#> 66:      m age_group3 cnt_males  424  25877  423  25815.9693
#> 67:   male age_group4 cnt_males  195  12397  197  12524.1487
#> 68:      m age_group4 cnt_males  195  12397  197  12524.1487
#> 69:   male age_group5 cnt_males   84   5296   85   5359.0476
#> 70:      m age_group5 cnt_males   84   5296   85   5359.0476
#> 71:   male age_group6 cnt_males    7    487    5    347.8571
#> 72:      m age_group6 cnt_males    7    487    5    347.8571
#> 73: female      Total cnt_males    0      0    0      0.0000
#> 74:      f      Total cnt_males    0      0    0      0.0000
#> 75: female age_group1 cnt_males    0      0    0      0.0000
#> 76:      f age_group1 cnt_males    0      0    0      0.0000
#> 77: female       ag1a cnt_males    0      0    0      0.0000
#> 78:      f       ag1a cnt_males    0      0    0      0.0000
#> 79: female age_group2 cnt_males    0      0    0      0.0000
#> 80:      f age_group2 cnt_males    0      0    0      0.0000
#> 81: female       ag2a cnt_males    0      0    0      0.0000
#> 82:      f       ag2a cnt_males    0      0    0      0.0000
#> 83: female age_group3 cnt_males    0      0    0      0.0000
#> 84:      f age_group3 cnt_males    0      0    0      0.0000
#> 85: female age_group4 cnt_males    0      0    0      0.0000
#> 86:      f age_group4 cnt_males    0      0    0      0.0000
#> 87: female age_group5 cnt_males    0      0    0      0.0000
#> 88:      f age_group5 cnt_males    0      0    0      0.0000
#> 89: female age_group6 cnt_males    0      0    0      0.0000
#> 90:      f age_group6 cnt_males    0      0    0      0.0000
#>        sex        age     vname  uwc     wc puwc         pwc

# numerical variables (positive variables using flex-function)
# we also write the config to a yaml file
f_yaml <- tempfile(fileext = ".yaml")

# create ptables using functionality from the ptable package
# a single ptable for all cells
ptab1 <- ptable::pt_ex_nums(parity = TRUE, separation = FALSE)

# a single ptable for all cells except for very small ones
ptab2 <- ptable::pt_ex_nums(parity = TRUE, separation = TRUE)

# different ptables for cells with even/odd number of contributors
# and very small cells
ptab3 <- ptable::pt_ex_nums(parity = FALSE, separation = TRUE)

p_nums1 <- ck_params_nums(
  ptab = ptab1,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.30, 0.03),
    epsilon = c(1, 0.5, 0.2),
    q = 3),
  mu_c = 2,
  same_key = FALSE,
  use_zero_rkeys = FALSE,
  path = f_yaml)
#> yaml configuration '/tmp/RtmpV7PINT/file19622b0b57c9.yaml' successfully written.

# read the parameters back from the yaml file
p_nums1 <- ck_read_yaml(path = f_yaml)

# for variables with positive and negative values
p_nums2 <- ck_params_nums(
  ptab = ptab2,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.15, 0.02),
    epsilon = c(1, 0.4, 0.15),
    q = 3),
  mu_c = 2,
  same_key = FALSE)

# simple perturbation parameters (not using the flex-function approach)
p_nums3 <- ck_params_nums(
  ptab = ptab3,
  type = "mean",
  mult_params = ck_simpleparams(p = 0.25),
  mu_c = 2,
  same_key = FALSE)

# use `p_nums1` for variables `savings`, `income` and `expend`
tab$params_nums_set(p_nums1, c("savings", "income", "expend"))
#> --> setting perturbation parameters for variable 'savings'
#> --> setting perturbation parameters for variable 'income'
#> --> setting perturbation parameters for variable 'expend'

# use different parameters for variable `mixed`
tab$params_nums_set(p_nums2, v = "mixed")
#> --> setting perturbation parameters for variable 'mixed'

# identify sensitive cells to which extra protection (`mu_c`) is added.
tab$supp_p(v = "income", p = 85)
#> computing contributing indices | rawdata <--> table; this might take a while
#> p%-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_pq(v = "income", p = 85, q = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> pq-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_nk(v = "income", n = 2, k = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> nk-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_freq(v = "income", n = 14, weighted = FALSE)
#> freq-rule: 5 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_val(v = "income", n = 10000, weighted = TRUE)
#> val-rule: 0 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_cells(
  v = "income",
  inp = data.frame(
    sex = c("female", "female"),
    age = c("age_group1", "age_group3")
  )
)
#> cell-rule: 2 new sensitive cells (incl. duplicates) found (total: 7)

# perturb variables
tab$perturb(v = c("income", "savings"))
#> Numeric variable 'income' was perturbed.
#> Numeric variable 'savings' was perturbed.

# extract results
tab$numtab("income", mean_before_sum = TRUE)
#>        sex        age  vname      uws         ws        pws
#>  1:  Total      Total income 22952978 1385628863 1385804502
#>  2:  Total age_group1 income  9810547  600332303  600217527
#>  3:  Total       ag1a income  9810547  600332303  600217527
#>  4:  Total age_group2 income  5692119  335570325  335598531
#>  5:  Total       ag2a income  5692119  335570325  335598531
#>  6:  Total age_group3 income  4406946  266236092  266168608
#>  7:  Total age_group4 income  2133543  127826106  127811509
#>  8:  Total age_group5 income   848151   51374174   51340115
#>  9:  Total age_group6 income    61672    4289863    4390404
#> 10:   male      Total income 11262049  682679697  682718158
#> 11:      m      Total income 11262049  682679697  682718158
#> 12:   male age_group1 income  4877164  297605562  297596664
#> 13:      m age_group1 income  4877164  297605562  297596664
#> 14:   male       ag1a income  4877164  297605562  297596664
#> 15:      m       ag1a income  4877164  297605562  297596664
#> 16:   male age_group2 income  2811379  166311378  166301547
#> 17:      m age_group2 income  2811379  166311378  166301547
#> 18:   male       ag2a income  2811379  166311378  166301547
#> 19:      m       ag2a income  2811379  166311378  166301547
#> 20:   male age_group3 income  2168169  129282593  129349250
#> 21:      m age_group3 income  2168169  129282593  129349250
#> 22:   male age_group4 income   978510   62146536   62122308
#> 23:      m age_group4 income   978510   62146536   62122308
#> 24:   male age_group5 income   393134   24801556   24848730
#> 25:      m age_group5 income   393134   24801556   24848730
#> 26:   male age_group6 income    33693    2532072    2511503
#> 27:      m age_group6 income    33693    2532072    2511503
#> 28: female      Total income 11690929  702949166  702952106
#> 29:      f      Total income 11690929  702949166  702952106
#> 30: female age_group1 income  4933383  302726741  302855037
#> 31:      f age_group1 income  4933383  302726741  302855037
#> 32: female       ag1a income  4933383  302726741  302855037
#> 33:      f       ag1a income  4933383  302726741  302855037
#> 34: female age_group2 income  2880740  169258947  169283500
#> 35:      f age_group2 income  2880740  169258947  169283500
#> 36: female       ag2a income  2880740  169258947  169283500
#> 37:      f       ag2a income  2880740  169258947  169283500
#> 38: female age_group3 income  2238777  136953499  136771001
#> 39:      f age_group3 income  2238777  136953499  136771001
#> 40: female age_group4 income  1155033   65679570   65682860
#> 41:      f age_group4 income  1155033   65679570   65682860
#> 42: female age_group5 income   455017   26572618   26554650
#> 43:      f age_group5 income   455017   26572618   26554650
#> 44: female age_group6 income    27979    1757791    1684555
#> 45:      f age_group6 income    27979    1757791    1684555
#>        sex        age  vname      uws         ws        pws
tab$numtab("income", mean_before_sum = FALSE)
#>        sex        age  vname      uws         ws        pws
#>  1:  Total      Total income 22952978 1385628863 1385716680
#>  2:  Total age_group1 income  9810547  600332303  600274912
#>  3:  Total       ag1a income  9810547  600332303  600274912
#>  4:  Total age_group2 income  5692119  335570325  335584428
#>  5:  Total       ag2a income  5692119  335570325  335584428
#>  6:  Total age_group3 income  4406946  266236092  266202348
#>  7:  Total age_group4 income  2133543  127826106  127818807
#>  8:  Total age_group5 income   848151   51374174   51357142
#>  9:  Total age_group6 income    61672    4289863    4339842
#> 10:   male      Total income 11262049  682679697  682698927
#> 11:      m      Total income 11262049  682679697  682698927
#> 12:   male age_group1 income  4877164  297605562  297601113
#> 13:      m age_group1 income  4877164  297605562  297601113
#> 14:   male       ag1a income  4877164  297605562  297601113
#> 15:      m       ag1a income  4877164  297605562  297601113
#> 16:   male age_group2 income  2811379  166311378  166306463
#> 17:      m age_group2 income  2811379  166311378  166306463
#> 18:   male       ag2a income  2811379  166311378  166306463
#> 19:      m       ag2a income  2811379  166311378  166306463
#> 20:   male age_group3 income  2168169  129282593  129315917
#> 21:      m age_group3 income  2168169  129282593  129315917
#> 22:   male age_group4 income   978510   62146536   62134421
#> 23:      m age_group4 income   978510   62146536   62134421
#> 24:   male age_group5 income   393134   24801556   24825132
#> 25:      m age_group5 income   393134   24801556   24825132
#> 26:   male age_group6 income    33693    2532072    2521767
#> 27:      m age_group6 income    33693    2532072    2521767
#> 28: female      Total income 11690929  702949166  702950636
#> 29:      f      Total income 11690929  702949166  702950636
#> 30: female age_group1 income  4933383  302726741  302790882
#> 31:      f age_group1 income  4933383  302726741  302790882
#> 32: female       ag1a income  4933383  302726741  302790882
#> 33:      f       ag1a income  4933383  302726741  302790882
#> 34: female age_group2 income  2880740  169258947  169271223
#> 35:      f age_group2 income  2880740  169258947  169271223
#> 36: female       ag2a income  2880740  169258947  169271223
#> 37:      f       ag2a income  2880740  169258947  169271223
#> 38: female age_group3 income  2238777  136953499  136862219
#> 39:      f age_group3 income  2238777  136953499  136862219
#> 40: female age_group4 income  1155033   65679570   65681215
#> 41:      f age_group4 income  1155033   65679570   65681215
#> 42: female age_group5 income   455017   26572618   26563633
#> 43:      f age_group5 income   455017   26572618   26563633
#> 44: female age_group6 income    27979    1757791    1720783
#> 45:      f age_group6 income    27979    1757791    1720783
#>        sex        age  vname      uws         ws        pws
tab$numtab("savings")
#>        sex        age   vname     uws        ws         pws
#>  1:  Total      Total savings 2273532 136356464 136356464.0
#>  2:  Total age_group1 savings  982386  59300639  59304455.4
#>  3:  Total       ag1a savings  982386  59300639  59304455.4
#>  4:  Total age_group2 savings  552336  32529052  32532255.1
#>  5:  Total       ag2a savings  552336  32529052  32532255.1
#>  6:  Total age_group3 savings  437101  26387359  26389986.7
#>  7:  Total age_group4 savings  214661  12899679  12899113.7
#>  8:  Total age_group5 savings   80451   4797470   4797196.9
#>  9:  Total age_group6 savings    6597    442265    442265.0
#> 10:   male      Total savings 1159816  70165683  70166040.4
#> 11:      m      Total savings 1159816  70165683  70166040.4
#> 12:   male age_group1 savings  517660  31173575  31173533.9
#> 13:      m age_group1 savings  517660  31173575  31173533.9
#> 14:   male       ag1a savings  517660  31173575  31173533.9
#> 15:      m       ag1a savings  517660  31173575  31173533.9
#> 16:   male age_group2 savings  280923  16638011  16638034.6
#> 17:      m age_group2 savings  280923  16638011  16638034.6
#> 18:   male       ag2a savings  280923  16638011  16638034.6
#> 19:      m       ag2a savings  280923  16638011  16638034.6
#> 20:   male age_group3 savings  214970  13155904  13154088.1
#> 21:      m age_group3 savings  214970  13155904  13154088.1
#> 22:   male age_group4 savings   99420   6274684   6271591.7
#> 23:      m age_group4 savings   99420   6274684   6271591.7
#> 24:   male age_group5 savings   43233   2658762   2660200.5
#> 25:      m age_group5 savings   43233   2658762   2660200.5
#> 26:   male age_group6 savings    3610    264747    264576.2
#> 27:      m age_group6 savings    3610    264747    264576.2
#> 28: female      Total savings 1113716  66190781  66190575.9
#> 29:      f      Total savings 1113716  66190781  66190575.9
#> 30: female age_group1 savings  464726  28127064  28122665.4
#> 31:      f age_group1 savings  464726  28127064  28122665.4
#> 32: female       ag1a savings  464726  28127064  28122665.4
#> 33:      f       ag1a savings  464726  28127064  28122665.4
#> 34: female age_group2 savings  271413  15891041  15891729.2
#> 35:      f age_group2 savings  271413  15891041  15891729.2
#> 36: female       ag2a savings  271413  15891041  15891729.2
#> 37:      f       ag2a savings  271413  15891041  15891729.2
#> 38: female age_group3 savings  222131  13231455  13225166.7
#> 39:      f age_group3 savings  222131  13231455  13225166.7
#> 40: female age_group4 savings  115241   6624995   6624652.3
#> 41:      f age_group4 savings  115241   6624995   6624652.3
#> 42: female age_group5 savings   37218   2138708   2138064.9
#> 43:      f age_group5 savings   37218   2138708   2138064.9
#> 44: female age_group6 savings    2987    177518    175307.2
#> 45:      f age_group6 savings    2987    177518    175307.2
#>        sex        age   vname     uws        ws         pws

# results can be reset, too
tab$reset_cntvars(v = "cnt_males")

# we can then set other parameters and perturb again
tab$params_cnts_set(val = p_cnts1, v = "cnt_males")
#> --> setting perturbation parameters for variable 'cnt_males'

tab$perturb(v = "cnt_males")
#> Count variable 'cnt_males' was perturbed.

# write results to a .csv file
tab$freqtab(
  v = c("total", "cnt_males"),
  path = file.path(tempdir(), "outtab.csv")
)
#> File '/tmp/RtmpV7PINT/outtab.csv' successfully written to disk.
#> NULL

# show the table with both unweighted and weighted counts
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname  uwc     wc puwc         pwc
#>  1:  Total      Total     total 4580 276163 4582 276283.5952
#>  2:  Total age_group1     total 1969 119337 1970 119397.6079
#>  3:  Total       ag1a     total 1969 119337 1970 119397.6079
#>  4:  Total age_group2     total 1143  68009 1142  67949.4996
#>  5:  Total       ag2a     total 1143  68009 1142  67949.4996
#>  6:  Total age_group3     total  864  52509  865  52569.7743
#>  7:  Total age_group4     total  423  25225  424  25284.6336
#>  8:  Total age_group5     total  168  10229  168  10229.0000
#>  9:  Total age_group6     total   13    854   14    919.6923
#> 10:   male      Total     total 2296 139992 2295 139931.0279
#> 11:      m      Total     total 2296 139992 2295 139931.0279
#> 12:   male age_group1     total 1015  61607 1016  61667.6966
#> 13:      m age_group1     total 1015  61607 1016  61667.6966
#> 14:   male       ag1a     total 1015  61607 1016  61667.6966
#> 15:      m       ag1a     total 1015  61607 1016  61667.6966
#> 16:   male age_group2     total  571  34328  571  34328.0000
#> 17:      m age_group2     total  571  34328  571  34328.0000
#> 18:   male       ag2a     total  571  34328  571  34328.0000
#> 19:      m       ag2a     total  571  34328  571  34328.0000
#> 20:   male age_group3     total  424  25877  424  25877.0000
#> 21:      m age_group3     total  424  25877  424  25877.0000
#> 22:   male age_group4     total  195  12397  196  12460.5744
#> 23:      m age_group4     total  195  12397  196  12460.5744
#> 24:   male age_group5     total   84   5296   84   5296.0000
#> 25:      m age_group5     total   84   5296   84   5296.0000
#> 26:   male age_group6     total    7    487    6    417.4286
#> 27:      m age_group6     total    7    487    6    417.4286
#> 28: female      Total     total 2284 136171 2285 136230.6195
#> 29:      f      Total     total 2284 136171 2285 136230.6195
#> 30: female age_group1     total  954  57730  955  57790.5136
#> 31:      f age_group1     total  954  57730  955  57790.5136
#> 32: female       ag1a     total  954  57730  955  57790.5136
#> 33:      f       ag1a     total  954  57730  955  57790.5136
#> 34: female age_group2     total  572  33681  574  33798.7657
#> 35:      f age_group2     total  572  33681  574  33798.7657
#> 36: female       ag2a     total  572  33681  574  33798.7657
#> 37:      f       ag2a     total  572  33681  574  33798.7657
#> 38: female age_group3     total  440  26632  440  26632.0000
#> 39:      f age_group3     total  440  26632  440  26632.0000
#> 40: female age_group4     total  228  12828  226  12715.4737
#> 41:      f age_group4     total  228  12828  226  12715.4737
#> 42: female age_group5     total   84   4933   86   5050.4524
#> 43:      f age_group5     total   84   4933   86   5050.4524
#> 44: female age_group6     total    6    367    6    367.0000
#> 45:      f age_group6     total    6    367    6    367.0000
#> 46:  Total      Total cnt_males 2296 139992 2295 139931.0279
#> 47:  Total age_group1 cnt_males 1015  61607 1016  61667.6966
#> 48:  Total       ag1a cnt_males 1015  61607 1016  61667.6966
#> 49:  Total age_group2 cnt_males  571  34328  571  34328.0000
#> 50:  Total       ag2a cnt_males  571  34328  571  34328.0000
#> 51:  Total age_group3 cnt_males  424  25877  424  25877.0000
#> 52:  Total age_group4 cnt_males  195  12397  196  12460.5744
#> 53:  Total age_group5 cnt_males   84   5296   84   5296.0000
#> 54:  Total age_group6 cnt_males    7    487    6    417.4286
#> 55:   male      Total cnt_males 2296 139992 2295 139931.0279
#> 56:      m      Total cnt_males 2296 139992 2295 139931.0279
#> 57:   male age_group1 cnt_males 1015  61607 1016  61667.6966
#> 58:      m age_group1 cnt_males 1015  61607 1016  61667.6966
#> 59:   male       ag1a cnt_males 1015  61607 1016  61667.6966
#> 60:      m       ag1a cnt_males 1015  61607 1016  61667.6966
#> 61:   male age_group2 cnt_males  571  34328  571  34328.0000
#> 62:      m age_group2 cnt_males  571  34328  571  34328.0000
#> 63:   male       ag2a cnt_males  571  34328  571  34328.0000
#> 64:      m       ag2a cnt_males  571  34328  571  34328.0000
#> 65:   male age_group3 cnt_males  424  25877  424  25877.0000
#> 66:      m age_group3 cnt_males  424  25877  424  25877.0000
#> 67:   male age_group4 cnt_males  195  12397  196  12460.5744
#> 68:      m age_group4 cnt_males  195  12397  196  12460.5744
#> 69:   male age_group5 cnt_males   84   5296   84   5296.0000
#> 70:      m age_group5 cnt_males   84   5296   84   5296.0000
#> 71:   male age_group6 cnt_males    7    487    6    417.4286
#> 72:      m age_group6 cnt_males    7    487    6    417.4286
#> 73: female      Total cnt_males    0      0    0      0.0000
#> 74:      f      Total cnt_males    0      0    0      0.0000
#> 75: female age_group1 cnt_males    0      0    0      0.0000
#> 76:      f age_group1 cnt_males    0      0    0      0.0000
#> 77: female       ag1a cnt_males    0      0    0      0.0000
#> 78:      f       ag1a cnt_males    0      0    0      0.0000
#> 79: female age_group2 cnt_males    0      0    0      0.0000
#> 80:      f age_group2 cnt_males    0      0    0      0.0000
#> 81: female       ag2a cnt_males    0      0    0      0.0000
#> 82:      f       ag2a cnt_males    0      0    0      0.0000
#> 83: female age_group3 cnt_males    0      0    0      0.0000
#> 84:      f age_group3 cnt_males    0      0    0      0.0000
#> 85: female age_group4 cnt_males    0      0    0      0.0000
#> 86:      f age_group4 cnt_males    0      0    0      0.0000
#> 87: female age_group5 cnt_males    0      0    0      0.0000
#> 88:      f age_group5 cnt_males    0      0    0      0.0000
#> 89: female age_group6 cnt_males    0      0    0      0.0000
#> 90:      f age_group6 cnt_males    0      0    0      0.0000
#>        sex        age     vname  uwc     wc puwc         pwc

# utility measures for a count variable
tab$measures_cnts(v = "total", exclude_zeros = TRUE)
#> $overview
#>    noise cnt        pct
#> 1:    -2   7 0.15555556
#> 2:    -1  17 0.37777778
#> 3:     0  13 0.28888889
#> 4:     1   6 0.13333333
#> 5:     2   2 0.04444444
#> 
#> $measures
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 1.000 0.000 0.010
#>  5:    Q40 1.000 0.000 0.011
#>  6:   Mean 0.911 0.010 0.031
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.023
#> 10:    Q80 1.200 0.004 0.042
#> 11:    Q90 2.000 0.018 0.092
#> 12:    Q95 2.000 0.066 0.131
#> 13:    Q99 2.000 0.143 0.196
#> 14:    Max 2.000 0.143 0.196
#> 
#> $cumdistr_d1
#>    cat cnt       pct
#> 1:   0  13 0.2888889
#> 2:   1  36 0.8000000
#> 3:   2  45 1.0000000
#> 
#> $cumdistr_d2
#>            cat cnt       pct
#> 1:    [0,0.02]  40 0.8888889
#> 2: (0.02,0.05]  42 0.9333333
#> 3:  (0.05,0.1]  43 0.9555556
#> 4:   (0.1,0.2]  45 1.0000000
#> 5:   (0.2,0.3]  45 1.0000000
#> 6:   (0.3,0.4]  45 1.0000000
#> 7:   (0.4,0.5]  45 1.0000000
#> 8:   (0.5,Inf]  45 1.0000000
#> 
#> $cumdistr_d3
#>            cat cnt       pct
#> 1:    [0,0.02]  31 0.6888889
#> 2: (0.02,0.05]  38 0.8444444
#> 3:  (0.05,0.1]  40 0.8888889
#> 4:   (0.1,0.2]  45 1.0000000
#> 5:   (0.2,0.3]  45 1.0000000
#> 6:   (0.3,0.4]  45 1.0000000
#> 7:   (0.4,0.5]  45 1.0000000
#> 8:   (0.5,Inf]  45 1.0000000
#> 
#> $false_zero
#> [1] 0
#> 
#> $false_nonzero
#> [1] 0
#> 
#> $exclude_zeros
#> [1] TRUE
#> 

# modifications for perturbed count variables
tab$mod_cnts()
#>         sex        age row_nr pert      ckey  countvar
#>   1:  Total      Total     17    2 0.9921385     total
#>   2:  Total age_group1     16    1 0.7232436     total
#>   3:  Total       ag1a     16    1 0.7232436     total
#>   4:  Total age_group2     14   -1 0.2842725     total
#>   5:  Total       ag2a     14   -1 0.2842725     total
#>  ---                                                  
#> 131:      f age_group4     -1    0 0.0000000 cnt_males
#> 132: female age_group5     -1    0 0.0000000 cnt_males
#> 133:      f age_group5     -1    0 0.0000000 cnt_males
#> 134: female age_group6     -1    0 0.0000000 cnt_males
#> 135:      f age_group6     -1    0 0.0000000 cnt_males

# display a summary of utility measures
tab$summary()
#> ┌──────────────────────────────────────────────┐
#> │Utility measures for perturbed count variables│
#> └──────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#>          countvar Min  Q10 Q20 Q30 Q40  Mean Median Q60 Q70 Q80 Q90 Q95 Q99 Max
#> 1:          total  -2 -1.0   0   0   0 0.467      1   1   1 1.0   2   2   2   2
#> 2: cnt_highincome  -2 -1.6  -1   0   0 0.267      0   1   1 1.0   2   2   2   2
#> 3:      cnt_males  -1 -1.0   0   0   0 0.067      0   0   0 0.2   1   1   1   1
#> 
#> ── Distance-based measures ─────────────────────────────────────────────────────
#> ✔ Variable: 'total'
#> 
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 1.000 0.000 0.010
#>  5:    Q40 1.000 0.000 0.011
#>  6:   Mean 0.911 0.010 0.031
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.023
#> 10:    Q80 1.200 0.004 0.042
#> 11:    Q90 2.000 0.018 0.092
#> 12:    Q95 2.000 0.066 0.131
#> 13:    Q99 2.000 0.143 0.196
#> 14:    Max 2.000 0.143 0.196
#> 
#> ✔ Variable: 'cnt_males'
#> 
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.556 0.017 0.030
#>  7: Median 1.000 0.000 0.010
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.001 0.016
#> 10:    Q80 1.000 0.004 0.032
#> 11:    Q90 1.000 0.060 0.100
#> 12:    Q95 1.000 0.143 0.196
#> 13:    Q99 1.000 0.143 0.196
#> 14:    Max 1.000 0.143 0.196
#> 
#> ✔ Variable: 'cnt_highincome'
#> 
#>       what  d1    d2    d3
#>  1:    Min 0.0 0.000 0.000
#>  2:    Q10 0.0 0.000 0.000
#>  3:    Q20 0.0 0.000 0.000
#>  4:    Q30 1.0 0.010 0.053
#>  5:    Q40 1.0 0.011 0.062
#>  6:   Mean 1.1 0.039 0.087
#>  7: Median 1.0 0.015 0.072
#>  8:    Q60 1.0 0.024 0.078
#>  9:    Q70 2.0 0.035 0.113
#> 10:    Q80 2.0 0.053 0.131
#> 11:    Q90 2.0 0.074 0.175
#> 12:    Q95 2.0 0.150 0.191
#> 13:    Q99 2.0 0.286 0.354
#> 14:    Max 2.0 0.286 0.354
#> 
#> ┌──────────────────────────────────────────────────┐
#> │Utility measures for perturbed numerical variables│
#> └──────────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      vname        Min       Q10        Q20      Q30       Q40     Mean
#> 1:  expend        Inf        NA         NA       NA        NA      NaN
#> 2:  income -91279.514 -37007.77 -12115.073 -8648.08 -4915.396 2232.401
#> 3: savings  -6288.311  -4398.60  -2210.794  -627.50  -232.296 -556.204
#> 4:   mixed        Inf        NA         NA       NA        NA      NaN
#>       Median      Q60      Q70       Q80       Q90      Q95       Q99       Max
#> 1:        NA       NA       NA        NA        NA       NA        NA      -Inf
#> 2: -4449.195 5897.367 13737.35 23575.702 58476.343 64141.00 77399.388 87816.696
#> 3:   -41.084    0.000    23.63   688.183  2152.026  3203.07  3816.446  3816.446
#> 4:        NA       NA       NA        NA        NA       NA        NA      -Inf
# }