ck_flexparams() defines a flex function that is used to look up the perturbation magnitudes (percentages) applied when perturbing continuous variables.

ck_flexparams(fp, p = c(0.25, 0.05), epsilon = 1, q = 3)

Arguments

fp

(numeric scalar); the point at which the noise coefficient function should reach its desired maximum (defined by the first element of p)

p

a numeric vector of length 2 where both elements specify a percentage. The first value refers to the desired maximum perturbation percentage for small cells (depending on fp) while the second element refers to the desired maximum perturbation percentage for large cells. Both values must be between 0 and 1 and need to be in descending order.

epsilon

a numeric vector in descending order with all values >= 0 and <= 1, where the first element must equal 1. The length of this vector must match the value of top_k specified in ck_params_nums() when creating parameters for type == "top_contr"; this is checked at runtime. This setting makes it possible to use different flex functions for the largest top_k contributors.

q

(numeric scalar); a parameter of the flex function; q must be >= 1
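
The constraints on the arguments described above can be collected in a small pre-check. The following is only a sketch in base R: check_flex_args() is a hypothetical helper that is not part of the package, and it does not reproduce the package's own validation. It assumes that "between 0 and 1" is meant inclusively and that ties are acceptable in a descending vector.

```r
# Hypothetical helper (not part of the package): collects violations of the
# documented constraints on fp, p, epsilon and q and returns them as messages.
check_flex_args <- function(fp, p = c(0.25, 0.05), epsilon = 1, q = 3,
                            top_k = length(epsilon)) {
  is_scalar <- function(v) is.numeric(v) && length(v) == 1L && !is.na(v)
  problems <- character(0)

  if (!is_scalar(fp)) {
    problems <- c(problems, "fp must be a numeric scalar")
  }

  # p: two percentages in [0, 1] (assumed inclusive), first >= second
  p_ok <- is.numeric(p) && length(p) == 2L && !anyNA(p) &&
    all(p >= 0 & p <= 1) && p[1] >= p[2]
  if (!p_ok) {
    problems <- c(problems, "p must hold two values in [0, 1] in descending order")
  }

  # epsilon: values in [0, 1], first element equal to 1, descending
  # (ties are tolerated here, which is an assumption)
  eps_ok <- is.numeric(epsilon) && length(epsilon) >= 1L && !anyNA(epsilon) &&
    all(epsilon >= 0 & epsilon <= 1) && epsilon[1] == 1 &&
    !is.unsorted(rev(epsilon))
  if (!eps_ok) {
    problems <- c(problems, "epsilon must be descending in [0, 1] and start with 1")
  }

  # the length of epsilon has to match top_k from ck_params_nums()
  if (length(epsilon) != top_k) {
    problems <- c(problems, "length(epsilon) must equal top_k")
  }

  if (!(is_scalar(q) && q >= 1)) {
    problems <- c(problems, "q must be a numeric scalar >= 1")
  }

  problems
}

# settings that respect all documented constraints
ok <- check_flex_args(fp = 1000, p = c(0.30, 0.03),
                      epsilon = c(1, 0.5, 0.2), q = 3, top_k = 3)

# p given in ascending order violates the ordering constraint
bad <- check_flex_args(fp = 1000, p = c(0.03, 0.30))
```

Here ok is an empty character vector, while bad contains a single message about p. Returning messages instead of stopping at the first problem makes it easier to report all violations at once before the arguments are handed to ck_flexparams().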

Value

an object suitable as input for ck_params_nums().

Details

Details about the flex function can be found in Deliverable D4.2, Part I of the SGA "Open Source tools for perturbative confidentiality methods".

Examples

# \donttest{
x <- ck_create_testdata()

# create some 0/1 variables that should be perturbed later
x[, cnt_females := ifelse(sex == "male", 0, 1)]
#>       urbrur roof walls water electcon relat    sex        age hhcivil expend
#>    1:      2    4     3     3        1     1   male age_group3       2   9093
#>    2:      2    4     3     3        1     2 female age_group3       2   2734
#>    3:      2    4     3     3        1     3   male age_group1       1   2652
#>    4:      2    4     3     3        1     3   male age_group1       1   1807
#>    5:      2    4     2     3        1     1   male age_group4       2    671
#>   ---                                                                        
#> 4576:      2    4     3     4        1     2 female age_group3       2   3696
#> 4577:      2    4     3     4        1     3   male age_group1       1    282
#> 4578:      2    4     3     4        1     3   male age_group1       1    840
#> 4579:      2    4     3     4        1     3 female age_group1       1   6258
#> 4580:      2    4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>    1:   5780      12       1              96          25.00000           0
#>    2:   2530      28       1              75          25.00000           1
#>    3:   6920     550       1              68          25.00000           0
#>    4:   7960     870       1              29          25.00000           0
#>    5:   9030      20       2              34          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              93          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              60          16.66667           0
#> 4579:   3880     294    1000              26          16.66667           1
#> 4580:   4830     911    1000              64          16.66667           0
x[, cnt_males := ifelse(sex == "male", 1, 0)]
#>       urbrur roof walls water electcon relat    sex        age hhcivil expend
#>    1:      2    4     3     3        1     1   male age_group3       2   9093
#>    2:      2    4     3     3        1     2 female age_group3       2   2734
#>    3:      2    4     3     3        1     3   male age_group1       1   2652
#>    4:      2    4     3     3        1     3   male age_group1       1   1807
#>    5:      2    4     2     3        1     1   male age_group4       2    671
#>   ---                                                                        
#> 4576:      2    4     3     4        1     2 female age_group3       2   3696
#> 4577:      2    4     3     4        1     3   male age_group1       1    282
#> 4578:      2    4     3     4        1     3   male age_group1       1    840
#> 4579:      2    4     3     4        1     3 female age_group1       1   6258
#> 4580:      2    4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>    1:   5780      12       1              96          25.00000           0
#>    2:   2530      28       1              75          25.00000           1
#>    3:   6920     550       1              68          25.00000           0
#>    4:   7960     870       1              29          25.00000           0
#>    5:   9030      20       2              34          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              93          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              60          16.66667           0
#> 4579:   3880     294    1000              26          16.66667           1
#> 4580:   4830     911    1000              64          16.66667           0
#>       cnt_males
#>    1:         1
#>    2:         0
#>    3:         1
#>    4:         1
#>    5:         1
#>   ---          
#> 4576:         0
#> 4577:         1
#> 4578:         1
#> 4579:         0
#> 4580:         1
x[, cnt_highincome := ifelse(income >= 9000, 1, 0)]
#>       urbrur roof walls water electcon relat    sex        age hhcivil expend
#>    1:      2    4     3     3        1     1   male age_group3       2   9093
#>    2:      2    4     3     3        1     2 female age_group3       2   2734
#>    3:      2    4     3     3        1     3   male age_group1       1   2652
#>    4:      2    4     3     3        1     3   male age_group1       1   1807
#>    5:      2    4     2     3        1     1   male age_group4       2    671
#>   ---                                                                        
#> 4576:      2    4     3     4        1     2 female age_group3       2   3696
#> 4577:      2    4     3     4        1     3   male age_group1       1    282
#> 4578:      2    4     3     4        1     3   male age_group1       1    840
#> 4579:      2    4     3     4        1     3 female age_group1       1   6258
#> 4580:      2    4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>    1:   5780      12       1              96          25.00000           0
#>    2:   2530      28       1              75          25.00000           1
#>    3:   6920     550       1              68          25.00000           0
#>    4:   7960     870       1              29          25.00000           0
#>    5:   9030      20       2              34          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              93          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              60          16.66667           0
#> 4579:   3880     294    1000              26          16.66667           1
#> 4580:   4830     911    1000              64          16.66667           0
#>       cnt_males cnt_highincome
#>    1:         1              0
#>    2:         0              0
#>    3:         1              0
#>    4:         1              0
#>    5:         1              1
#>   ---                         
#> 4576:         0              0
#> 4577:         1              0
#> 4578:         1              0
#> 4579:         0              0
#> 4580:         1              0
# a variable with positive and negative contributions
x[, mixed := sample(-10:10, nrow(x), replace = TRUE)]
#>       urbrur roof walls water electcon relat    sex        age hhcivil expend
#>    1:      2    4     3     3        1     1   male age_group3       2   9093
#>    2:      2    4     3     3        1     2 female age_group3       2   2734
#>    3:      2    4     3     3        1     3   male age_group1       1   2652
#>    4:      2    4     3     3        1     3   male age_group1       1   1807
#>    5:      2    4     2     3        1     1   male age_group4       2    671
#>   ---                                                                        
#> 4576:      2    4     3     4        1     2 female age_group3       2   3696
#> 4577:      2    4     3     4        1     3   male age_group1       1    282
#> 4578:      2    4     3     4        1     3   male age_group1       1    840
#> 4579:      2    4     3     4        1     3 female age_group1       1   6258
#> 4580:      2    4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>    1:   5780      12       1              96          25.00000           0
#>    2:   2530      28       1              75          25.00000           1
#>    3:   6920     550       1              68          25.00000           0
#>    4:   7960     870       1              29          25.00000           0
#>    5:   9030      20       2              34          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              93          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              60          16.66667           0
#> 4579:   3880     294    1000              26          16.66667           1
#> 4580:   4830     911    1000              64          16.66667           0
#>       cnt_males cnt_highincome mixed
#>    1:         1              0    -9
#>    2:         0              0     2
#>    3:         1              0    -5
#>    4:         1              0    -2
#>    5:         1              1    10
#>   ---                               
#> 4576:         0              0    -3
#> 4577:         1              0     3
#> 4578:         1              0    -3
#> 4579:         0              0    -7
#> 4580:         1              0     0

# create record keys
x$rkey <- ck_generate_rkeys(dat = x)

# define required inputs

# hierarchy with some bogus codes
d_sex <- hier_create(root = "Total", nodes = c("male", "female"))
d_sex <- hier_add(d_sex, root = "female", "f")
d_sex <- hier_add(d_sex, root = "male", "m")

d_age <- hier_create(root = "Total", nodes = paste0("age_group", 1:6))
d_age <- hier_add(d_age, root = "age_group1", "ag1a")
d_age <- hier_add(d_age, root = "age_group2", "ag2a")

# define the cell key object
countvars <- c("cnt_females", "cnt_males", "cnt_highincome")
numvars <- c("expend", "income", "savings", "mixed")
tab <- ck_setup(
  x = x,
  rkey = "rkey",
  dims = list(sex = d_sex, age = d_age),
  w = "sampling_weight",
  countvars = countvars,
  numvars = numvars)
#> computing contributing indices | rawdata <--> table; this might take a while

# show some information about this table instance
tab$print() # identical to print(tab)
#> ── Table Information ───────────────────────────────────────────────────────────
#> ✔ 45 cells in 2 dimensions ('sex', 'age')
#> ✔ weights: yes
#> ── Tabulated / Perturbed countvars ─────────────────────────────────────────────
#> ☐ 'total'
#> ☐ 'cnt_females'
#> ☐ 'cnt_males'
#> ☐ 'cnt_highincome'
#> ── Tabulated / Perturbed numvars ───────────────────────────────────────────────
#> ☐ 'expend'
#> ☐ 'income'
#> ☐ 'savings'
#> ☐ 'mixed'

# information about the hierarchies
tab$hierarchy_info()
#> $sex
#>      code level is_leaf parent
#> 1:  Total     1   FALSE  Total
#> 2:   male     2   FALSE  Total
#> 3:      m     3    TRUE   male
#> 4: female     2   FALSE  Total
#> 5:      f     3    TRUE female
#> 
#> $age
#>          code level is_leaf     parent
#> 1:      Total     1   FALSE      Total
#> 2: age_group1     2   FALSE      Total
#> 3:       ag1a     3    TRUE age_group1
#> 4: age_group2     2   FALSE      Total
#> 5:       ag2a     3    TRUE age_group2
#> 6: age_group3     2    TRUE      Total
#> 7: age_group4     2    TRUE      Total
#> 8: age_group5     2    TRUE      Total
#> 9: age_group6     2    TRUE      Total
#> 

# which variables have been defined?
tab$allvars()
#> $cntvars
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"
#> 
#> $numvars
#> [1] "expend"  "income"  "savings" "mixed"  
#> 

# count variables
tab$cntvars()
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"

# continuous variables
tab$numvars()
#> [1] "expend"  "income"  "savings" "mixed"  

# create perturbation parameters for "total" variable and
# write to yaml-file

# create a ptable using functionality from the ptable-pkg
f_yaml <- tempfile(fileext = ".yaml")
p_cnts1 <- ck_params_cnts(
  ptab = ptable::pt_ex_cnts(),
  path = f_yaml)
#> yaml configuration '/tmp/RtmpV7PINT/file19623d81aba1.yaml' successfully written.

# read parameters from yaml-file and set them for variable `"total"`
p_cnts1 <- ck_read_yaml(path = f_yaml)

tab$params_cnts_set(val = p_cnts1, v = "total")
#> --> setting perturbation parameters for variable 'total'

# create alternative perturbation parameters by specifying parameters
para2 <- ptable::create_cnt_ptable(
  D = 8, V = 3, js = 2, create = FALSE)

p_cnts2 <- ck_params_cnts(ptab = para2)

# use this ptable for the remaining variables
tab$params_cnts_set(val = p_cnts2, v = countvars)
#> --> setting perturbation parameters for variable 'cnt_females'
#> --> setting perturbation parameters for variable 'cnt_males'
#> --> setting perturbation parameters for variable 'cnt_highincome'

# perturb a variable
tab$perturb(v = "total")
#> Count variable 'total' was perturbed.

# multiple variables can be perturbed as well
tab$perturb(v = c("cnt_males", "cnt_highincome"))
#> Count variable 'cnt_males' was perturbed.
#> Count variable 'cnt_highincome' was perturbed.

# return weighted and unweighted results
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname  uwc     wc puwc         pwc
#>  1:  Total      Total     total 4580 271794 4580 271794.0000
#>  2:  Total age_group1     total 1969 115924 1971 116041.7491
#>  3:  Total       ag1a     total 1969 115924 1971 116041.7491
#>  4:  Total age_group2     total 1143  68184 1144  68243.6535
#>  5:  Total       ag2a     total 1143  68184 1144  68243.6535
#>  6:  Total age_group3     total  864  52054  866  52174.4954
#>  7:  Total age_group4     total  423  24571  423  24571.0000
#>  8:  Total age_group5     total  168  10321  169  10382.4345
#>  9:  Total age_group6     total   13    740   12    683.0769
#> 10:   male      Total     total 2296 135847 2298 135965.3336
#> 11:      m      Total     total 2296 135847 2298 135965.3336
#> 12:   male age_group1     total 1015  59743 1016  59801.8601
#> 13:      m age_group1     total 1015  59743 1016  59801.8601
#> 14:   male       ag1a     total 1015  59743 1016  59801.8601
#> 15:      m       ag1a     total 1015  59743 1016  59801.8601
#> 16:   male age_group2     total  571  33900  571  33900.0000
#> 17:      m age_group2     total  571  33900  571  33900.0000
#> 18:   male       ag2a     total  571  33900  571  33900.0000
#> 19:      m       ag2a     total  571  33900  571  33900.0000
#> 20:   male age_group3     total  424  25549  425  25609.2571
#> 21:      m age_group3     total  424  25549  425  25609.2571
#> 22:   male age_group4     total  195  11317  194  11258.9641
#> 23:      m age_group4     total  195  11317  194  11258.9641
#> 24:   male age_group5     total   84   4958   84   4958.0000
#> 25:      m age_group5     total   84   4958   84   4958.0000
#> 26:   male age_group6     total    7    380    6    325.7143
#> 27:      m age_group6     total    7    380    6    325.7143
#> 28: female      Total     total 2284 135947 2284 135947.0000
#> 29:      f      Total     total 2284 135947 2284 135947.0000
#> 30: female age_group1     total  954  56181  953  56122.1101
#> 31:      f age_group1     total  954  56181  953  56122.1101
#> 32: female       ag1a     total  954  56181  953  56122.1101
#> 33:      f       ag1a     total  954  56181  953  56122.1101
#> 34: female age_group2     total  572  34284  572  34284.0000
#> 35:      f age_group2     total  572  34284  572  34284.0000
#> 36: female       ag2a     total  572  34284  572  34284.0000
#> 37:      f       ag2a     total  572  34284  572  34284.0000
#> 38: female age_group3     total  440  26505  439  26444.7614
#> 39:      f age_group3     total  440  26505  439  26444.7614
#> 40: female age_group4     total  228  13254  227  13195.8684
#> 41:      f age_group4     total  228  13254  227  13195.8684
#> 42: female age_group5     total   84   5363   84   5363.0000
#> 43:      f age_group5     total   84   5363   84   5363.0000
#> 44: female age_group6     total    6    360    5    300.0000
#> 45:      f age_group6     total    6    360    5    300.0000
#> 46:  Total      Total cnt_males 2296 135847 2300 136083.6672
#> 47:  Total age_group1 cnt_males 1015  59743 1016  59801.8601
#> 48:  Total       ag1a cnt_males 1015  59743 1016  59801.8601
#> 49:  Total age_group2 cnt_males  571  33900  571  33900.0000
#> 50:  Total       ag2a cnt_males  571  33900  571  33900.0000
#> 51:  Total age_group3 cnt_males  424  25549  426  25669.5142
#> 52:  Total age_group4 cnt_males  195  11317  194  11258.9641
#> 53:  Total age_group5 cnt_males   84   4958   84   4958.0000
#> 54:  Total age_group6 cnt_males    7    380    5    271.4286
#> 55:   male      Total cnt_males 2296 135847 2300 136083.6672
#> 56:      m      Total cnt_males 2296 135847 2300 136083.6672
#> 57:   male age_group1 cnt_males 1015  59743 1016  59801.8601
#> 58:      m age_group1 cnt_males 1015  59743 1016  59801.8601
#> 59:   male       ag1a cnt_males 1015  59743 1016  59801.8601
#> 60:      m       ag1a cnt_males 1015  59743 1016  59801.8601
#> 61:   male age_group2 cnt_males  571  33900  571  33900.0000
#> 62:      m age_group2 cnt_males  571  33900  571  33900.0000
#> 63:   male       ag2a cnt_males  571  33900  571  33900.0000
#> 64:      m       ag2a cnt_males  571  33900  571  33900.0000
#> 65:   male age_group3 cnt_males  424  25549  426  25669.5142
#> 66:      m age_group3 cnt_males  424  25549  426  25669.5142
#> 67:   male age_group4 cnt_males  195  11317  194  11258.9641
#> 68:      m age_group4 cnt_males  195  11317  194  11258.9641
#> 69:   male age_group5 cnt_males   84   4958   84   4958.0000
#> 70:      m age_group5 cnt_males   84   4958   84   4958.0000
#> 71:   male age_group6 cnt_males    7    380    5    271.4286
#> 72:      m age_group6 cnt_males    7    380    5    271.4286
#> 73: female      Total cnt_males    0      0    0      0.0000
#> 74:      f      Total cnt_males    0      0    0      0.0000
#> 75: female age_group1 cnt_males    0      0    0      0.0000
#> 76:      f age_group1 cnt_males    0      0    0      0.0000
#> 77: female       ag1a cnt_males    0      0    0      0.0000
#> 78:      f       ag1a cnt_males    0      0    0      0.0000
#> 79: female age_group2 cnt_males    0      0    0      0.0000
#> 80:      f age_group2 cnt_males    0      0    0      0.0000
#> 81: female       ag2a cnt_males    0      0    0      0.0000
#> 82:      f       ag2a cnt_males    0      0    0      0.0000
#> 83: female age_group3 cnt_males    0      0    0      0.0000
#> 84:      f age_group3 cnt_males    0      0    0      0.0000
#> 85: female age_group4 cnt_males    0      0    0      0.0000
#> 86:      f age_group4 cnt_males    0      0    0      0.0000
#> 87: female age_group5 cnt_males    0      0    0      0.0000
#> 88:      f age_group5 cnt_males    0      0    0      0.0000
#> 89: female age_group6 cnt_males    0      0    0      0.0000
#> 90:      f age_group6 cnt_males    0      0    0      0.0000
#>        sex        age     vname  uwc     wc puwc         pwc

# numerical variables (positive variables using flex-function)
# we also write the config to a yaml file
f_yaml <- tempfile(fileext = ".yaml")

# create a ptable using functionality from the ptable-pkg
# a single ptable for all cells
ptab1 <- ptable::pt_ex_nums(parity = TRUE, separation = FALSE)

# a single ptab for all cells except for very small ones
ptab2 <- ptable::pt_ex_nums(parity = TRUE, separation = TRUE)

# different ptables for cells with even/odd number of contributors
# and very small cells
ptab3 <- ptable::pt_ex_nums(parity = FALSE, separation = TRUE)

p_nums1 <- ck_params_nums(
  ptab = ptab1,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.30, 0.03),
    epsilon = c(1, 0.5, 0.2),
    q = 3),
  mu_c = 2,
  same_key = FALSE,
  use_zero_rkeys = FALSE,
  path = f_yaml)
#> yaml configuration '/tmp/RtmpV7PINT/file196210088919.yaml' successfully written.

# we read the parameters from the yaml-file
p_nums1 <- ck_read_yaml(path = f_yaml)

# for variables with positive and negative values
p_nums2 <- ck_params_nums(
  ptab = ptab2,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.15, 0.02),
    epsilon = c(1, 0.4, 0.15),
    q = 3),
  mu_c = 2,
  same_key = FALSE)

# simple perturbation parameters (not using the flex-function approach)
p_nums3 <- ck_params_nums(
  ptab = ptab3,
  type = "mean",
  mult_params = ck_simpleparams(p = 0.25),
  mu_c = 2,
  same_key = FALSE)

# use `p_nums1` for the variables `savings`, `income` and `expend`
tab$params_nums_set(p_nums1, c("savings", "income", "expend"))
#> --> setting perturbation parameters for variable 'savings'
#> --> setting perturbation parameters for variable 'income'
#> --> setting perturbation parameters for variable 'expend'

# use different parameters for variable `mixed`
tab$params_nums_set(p_nums2, v = "mixed")
#> --> setting perturbation parameters for variable 'mixed'

# identify sensitive cells to which extra protection (`mu_c`) is added.
tab$supp_p(v = "income", p = 85)
#> computing contributing indices | rawdata <--> table; this might take a while
#> p%-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_pq(v = "income", p = 85, q = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> pq-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_nk(v = "income", n = 2, k = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> nk-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_freq(v = "income", n = 14, weighted = FALSE)
#> freq-rule: 5 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_val(v = "income", n = 10000, weighted = TRUE)
#> val-rule: 0 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_cells(
  v = "income",
  inp = data.frame(
    sex = c("female", "female"),
    "age" = c("age_group1", "age_group3")
  )
)
#> cell-rule: 2 new sensitive cells (incl. duplicates) found (total: 7)

# perturb variables
tab$perturb(v = c("income", "savings"))
#> Numeric variable 'income' was perturbed.
#> Numeric variable 'savings' was perturbed.

# extract results
tab$numtab("income", mean_before_sum = TRUE)
#>        sex        age  vname      uws         ws        pws
#>  1:  Total      Total income 22952978 1362164655 1362309587
#>  2:  Total age_group1 income  9810547  576139868  576201080
#>  3:  Total       ag1a income  9810547  576139868  576201080
#>  4:  Total age_group2 income  5692119  339190840  339210606
#>  5:  Total       ag2a income  5692119  339190840  339210606
#>  6:  Total age_group3 income  4406946  266009840  265934534
#>  7:  Total age_group4 income  2133543  124517758  124388087
#>  8:  Total age_group5 income   848151   52502198   52538133
#>  9:  Total age_group6 income    61672    3804151    3997489
#> 10:   male      Total income 11262049  669230647  669162762
#> 11:      m      Total income 11262049  669230647  669162762
#> 12:   male age_group1 income  4877164  287875791  287821837
#> 13:      m age_group1 income  4877164  287875791  287821837
#> 14:   male       ag1a income  4877164  287875791  287821837
#> 15:      m       ag1a income  4877164  287875791  287821837
#> 16:   male age_group2 income  2811379  167710504  167779242
#> 17:      m age_group2 income  2811379  167710504  167779242
#> 18:   male       ag2a income  2811379  167710504  167779242
#> 19:      m       ag2a income  2811379  167710504  167779242
#> 20:   male age_group3 income  2168169  130878177  130799853
#> 21:      m age_group3 income  2168169  130878177  130799853
#> 22:   male age_group4 income   978510   57071677   56976920
#> 23:      m age_group4 income   978510   57071677   56976920
#> 24:   male age_group5 income   393134   23652582   23252574
#> 25:      m age_group5 income   393134   23652582   23252574
#> 26:   male age_group6 income    33693    2041916    2180102
#> 27:      m age_group6 income    33693    2041916    2180102
#> 28: female      Total income 11690929  692934008  693064101
#> 29:      f      Total income 11690929  692934008  693064101
#> 30: female age_group1 income  4933383  288264077  288175524
#> 31:      f age_group1 income  4933383  288264077  288175524
#> 32: female       ag1a income  4933383  288264077  288175524
#> 33:      f       ag1a income  4933383  288264077  288175524
#> 34: female age_group2 income  2880740  171480336  171525493
#> 35:      f age_group2 income  2880740  171480336  171525493
#> 36: female       ag2a income  2880740  171480336  171525493
#> 37:      f       ag2a income  2880740  171480336  171525493
#> 38: female age_group3 income  2238777  135131663  135018748
#> 39:      f age_group3 income  2238777  135131663  135018748
#> 40: female age_group4 income  1155033   67446081   67447018
#> 41:      f age_group4 income  1155033   67446081   67447018
#> 42: female age_group5 income   455017   28849616   28870115
#> 43:      f age_group5 income   455017   28849616   28870115
#> 44: female age_group6 income    27979    1762235    1572808
#> 45:      f age_group6 income    27979    1762235    1572808
#>        sex        age  vname      uws         ws        pws
tab$numtab("income", mean_before_sum = FALSE)
#>        sex        age  vname      uws         ws        pws
#>  1:  Total      Total income 22952978 1362164655 1362237119
#>  2:  Total age_group1 income  9810547  576139868  576170473
#>  3:  Total       ag1a income  9810547  576139868  576170473
#>  4:  Total age_group2 income  5692119  339190840  339200723
#>  5:  Total       ag2a income  5692119  339190840  339200723
#>  6:  Total age_group3 income  4406946  266009840  265972184
#>  7:  Total age_group4 income  2133543  124517758  124452906
#>  8:  Total age_group5 income   848151   52502198   52520162
#>  9:  Total age_group6 income    61672    3804151    3899622
#> 10:   male      Total income 11262049  669230647  669196704
#> 11:      m      Total income 11262049  669230647  669196704
#> 12:   male age_group1 income  4877164  287875791  287848813
#> 13:      m age_group1 income  4877164  287875791  287848813
#> 14:   male       ag1a income  4877164  287875791  287848813
#> 15:      m       ag1a income  4877164  287875791  287848813
#> 16:   male age_group2 income  2811379  167710504  167744869
#> 17:      m age_group2 income  2811379  167710504  167744869
#> 18:   male       ag2a income  2811379  167710504  167744869
#> 19:      m       ag2a income  2811379  167710504  167744869
#> 20:   male age_group3 income  2168169  130878177  130839009
#> 21:      m age_group3 income  2168169  130878177  130839009
#> 22:   male age_group4 income   978510   57071677   57024279
#> 23:      m age_group4 income   978510   57071677   57024279
#> 24:   male age_group5 income   393134   23652582   23451725
#> 25:      m age_group5 income   393134   23652582   23451725
#> 26:   male age_group6 income    33693    2041916    2109878
#> 27:      m age_group6 income    33693    2041916    2109878
#> 28: female      Total income 11690929  692934008  692999051
#> 29:      f      Total income 11690929  692934008  692999051
#> 30: female age_group1 income  4933383  288264077  288219797
#> 31:      f age_group1 income  4933383  288264077  288219797
#> 32: female       ag1a income  4933383  288264077  288219797
#> 33:      f       ag1a income  4933383  288264077  288219797
#> 34: female age_group2 income  2880740  171480336  171502913
#> 35:      f age_group2 income  2880740  171480336  171502913
#> 36: female       ag2a income  2880740  171480336  171502913
#> 37:      f       ag2a income  2880740  171480336  171502913
#> 38: female age_group3 income  2238777  135131663  135075194
#> 39:      f age_group3 income  2238777  135131663  135075194
#> 40: female age_group4 income  1155033   67446081   67446549
#> 41:      f age_group4 income  1155033   67446081   67446549
#> 42: female age_group5 income   455017   28849616   28859864
#> 43:      f age_group5 income   455017   28849616   28859864
#> 44: female age_group6 income    27979    1762235    1664830
#> 45:      f age_group6 income    27979    1762235    1664830
#>        sex        age  vname      uws         ws        pws
tab$numtab("savings")
#>        sex        age   vname     uws        ws         pws
#>  1:  Total      Total savings 2273532 134518023 134513574.8
#>  2:  Total age_group1 savings  982386  57935027  57935605.0
#>  3:  Total       ag1a savings  982386  57935027  57935605.0
#>  4:  Total age_group2 savings  552336  32523348  32525608.1
#>  5:  Total       ag2a savings  552336  32523348  32525608.1
#>  6:  Total age_group3 savings  437101  26265889  26272668.8
#>  7:  Total age_group4 savings  214661  12467508  12464910.1
#>  8:  Total age_group5 savings   80451   4960971   4962553.6
#>  9:  Total age_group6 savings    6597    365280    369217.0
#> 10:   male      Total savings 1159816  68512317  68515227.4
#> 11:      m      Total savings 1159816  68512317  68515227.4
#> 12:   male age_group1 savings  517660  30546216  30543797.5
#> 13:      m age_group1 savings  517660  30546216  30543797.5
#> 14:   male       ag1a savings  517660  30546216  30543797.5
#> 15:      m       ag1a savings  517660  30546216  30543797.5
#> 16:   male age_group2 savings  280923  16551066  16553928.8
#> 17:      m age_group2 savings  280923  16551066  16553928.8
#> 18:   male       ag2a savings  280923  16551066  16553928.8
#> 19:      m       ag2a savings  280923  16551066  16553928.8
#> 20:   male age_group3 savings  214970  12836619  12842366.6
#> 21:      m age_group3 savings  214970  12836619  12842366.6
#> 22:   male age_group4 savings   99420   5795677   5790691.0
#> 23:      m age_group4 savings   99420   5795677   5790691.0
#> 24:   male age_group5 savings   43233   2604461   2589031.6
#> 25:      m age_group5 savings   43233   2604461   2589031.6
#> 26:   male age_group6 savings    3610    178278    179192.5
#> 27:      m age_group6 savings    3610    178278    179192.5
#> 28: female      Total savings 1113716  66005706  66009443.1
#> 29:      f      Total savings 1113716  66005706  66009443.1
#> 30: female age_group1 savings  464726  27388811  27394755.1
#> 31:      f age_group1 savings  464726  27388811  27394755.1
#> 32: female       ag1a savings  464726  27388811  27394755.1
#> 33:      f       ag1a savings  464726  27388811  27394755.1
#> 34: female age_group2 savings  271413  15972282  15974225.2
#> 35:      f age_group2 savings  271413  15972282  15974225.2
#> 36: female       ag2a savings  271413  15972282  15974225.2
#> 37:      f       ag2a savings  271413  15972282  15974225.2
#> 38: female age_group3 savings  222131  13429270  13423492.6
#> 39:      f age_group3 savings  222131  13429270  13423492.6
#> 40: female age_group4 savings  115241   6671831   6673815.4
#> 41:      f age_group4 savings  115241   6671831   6673815.4
#> 42: female age_group5 savings   37218   2356510   2357421.8
#> 43:      f age_group5 savings   37218   2356510   2357421.8
#> 44: female age_group6 savings    2987    187002    180198.5
#> 45:      f age_group6 savings    2987    187002    180198.5
#>        sex        age   vname     uws        ws         pws

# results can be reset, too
tab$reset_cntvars(v = "cnt_males")

# we can then set other parameters and perturb again
tab$params_cnts_set(val = p_cnts1, v = "cnt_males")
#> --> setting perturbation parameters for variable 'cnt_males'

tab$perturb(v = "cnt_males")
#> Count variable 'cnt_males' was perturbed.

# write results to a .csv file
tab$freqtab(
  v = c("total", "cnt_males"),
  path = file.path(tempdir(), "outtab.csv")
)
#> File '/tmp/RtmpV7PINT/outtab.csv' successfully written to disk.
#> NULL

# show a table with original and perturbed counts, both unweighted and weighted
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname  uwc     wc puwc         pwc
#>  1:  Total      Total     total 4580 271794 4580 271794.0000
#>  2:  Total age_group1     total 1969 115924 1971 116041.7491
#>  3:  Total       ag1a     total 1969 115924 1971 116041.7491
#>  4:  Total age_group2     total 1143  68184 1144  68243.6535
#>  5:  Total       ag2a     total 1143  68184 1144  68243.6535
#>  6:  Total age_group3     total  864  52054  866  52174.4954
#>  7:  Total age_group4     total  423  24571  423  24571.0000
#>  8:  Total age_group5     total  168  10321  169  10382.4345
#>  9:  Total age_group6     total   13    740   12    683.0769
#> 10:   male      Total     total 2296 135847 2298 135965.3336
#> 11:      m      Total     total 2296 135847 2298 135965.3336
#> 12:   male age_group1     total 1015  59743 1016  59801.8601
#> 13:      m age_group1     total 1015  59743 1016  59801.8601
#> 14:   male       ag1a     total 1015  59743 1016  59801.8601
#> 15:      m       ag1a     total 1015  59743 1016  59801.8601
#> 16:   male age_group2     total  571  33900  571  33900.0000
#> 17:      m age_group2     total  571  33900  571  33900.0000
#> 18:   male       ag2a     total  571  33900  571  33900.0000
#> 19:      m       ag2a     total  571  33900  571  33900.0000
#> 20:   male age_group3     total  424  25549  425  25609.2571
#> 21:      m age_group3     total  424  25549  425  25609.2571
#> 22:   male age_group4     total  195  11317  194  11258.9641
#> 23:      m age_group4     total  195  11317  194  11258.9641
#> 24:   male age_group5     total   84   4958   84   4958.0000
#> 25:      m age_group5     total   84   4958   84   4958.0000
#> 26:   male age_group6     total    7    380    6    325.7143
#> 27:      m age_group6     total    7    380    6    325.7143
#> 28: female      Total     total 2284 135947 2284 135947.0000
#> 29:      f      Total     total 2284 135947 2284 135947.0000
#> 30: female age_group1     total  954  56181  953  56122.1101
#> 31:      f age_group1     total  954  56181  953  56122.1101
#> 32: female       ag1a     total  954  56181  953  56122.1101
#> 33:      f       ag1a     total  954  56181  953  56122.1101
#> 34: female age_group2     total  572  34284  572  34284.0000
#> 35:      f age_group2     total  572  34284  572  34284.0000
#> 36: female       ag2a     total  572  34284  572  34284.0000
#> 37:      f       ag2a     total  572  34284  572  34284.0000
#> 38: female age_group3     total  440  26505  439  26444.7614
#> 39:      f age_group3     total  440  26505  439  26444.7614
#> 40: female age_group4     total  228  13254  227  13195.8684
#> 41:      f age_group4     total  228  13254  227  13195.8684
#> 42: female age_group5     total   84   5363   84   5363.0000
#> 43:      f age_group5     total   84   5363   84   5363.0000
#> 44: female age_group6     total    6    360    5    300.0000
#> 45:      f age_group6     total    6    360    5    300.0000
#> 46:  Total      Total cnt_males 2296 135847 2298 135965.3336
#> 47:  Total age_group1 cnt_males 1015  59743 1016  59801.8601
#> 48:  Total       ag1a cnt_males 1015  59743 1016  59801.8601
#> 49:  Total age_group2 cnt_males  571  33900  571  33900.0000
#> 50:  Total       ag2a cnt_males  571  33900  571  33900.0000
#> 51:  Total age_group3 cnt_males  424  25549  425  25609.2571
#> 52:  Total age_group4 cnt_males  195  11317  194  11258.9641
#> 53:  Total age_group5 cnt_males   84   4958   84   4958.0000
#> 54:  Total age_group6 cnt_males    7    380    6    325.7143
#> 55:   male      Total cnt_males 2296 135847 2298 135965.3336
#> 56:      m      Total cnt_males 2296 135847 2298 135965.3336
#> 57:   male age_group1 cnt_males 1015  59743 1016  59801.8601
#> 58:      m age_group1 cnt_males 1015  59743 1016  59801.8601
#> 59:   male       ag1a cnt_males 1015  59743 1016  59801.8601
#> 60:      m       ag1a cnt_males 1015  59743 1016  59801.8601
#> 61:   male age_group2 cnt_males  571  33900  571  33900.0000
#> 62:      m age_group2 cnt_males  571  33900  571  33900.0000
#> 63:   male       ag2a cnt_males  571  33900  571  33900.0000
#> 64:      m       ag2a cnt_males  571  33900  571  33900.0000
#> 65:   male age_group3 cnt_males  424  25549  425  25609.2571
#> 66:      m age_group3 cnt_males  424  25549  425  25609.2571
#> 67:   male age_group4 cnt_males  195  11317  194  11258.9641
#> 68:      m age_group4 cnt_males  195  11317  194  11258.9641
#> 69:   male age_group5 cnt_males   84   4958   84   4958.0000
#> 70:      m age_group5 cnt_males   84   4958   84   4958.0000
#> 71:   male age_group6 cnt_males    7    380    6    325.7143
#> 72:      m age_group6 cnt_males    7    380    6    325.7143
#> 73: female      Total cnt_males    0      0    0      0.0000
#> 74:      f      Total cnt_males    0      0    0      0.0000
#> 75: female age_group1 cnt_males    0      0    0      0.0000
#> 76:      f age_group1 cnt_males    0      0    0      0.0000
#> 77: female       ag1a cnt_males    0      0    0      0.0000
#> 78:      f       ag1a cnt_males    0      0    0      0.0000
#> 79: female age_group2 cnt_males    0      0    0      0.0000
#> 80:      f age_group2 cnt_males    0      0    0      0.0000
#> 81: female       ag2a cnt_males    0      0    0      0.0000
#> 82:      f       ag2a cnt_males    0      0    0      0.0000
#> 83: female age_group3 cnt_males    0      0    0      0.0000
#> 84:      f age_group3 cnt_males    0      0    0      0.0000
#> 85: female age_group4 cnt_males    0      0    0      0.0000
#> 86:      f age_group4 cnt_males    0      0    0      0.0000
#> 87: female age_group5 cnt_males    0      0    0      0.0000
#> 88:      f age_group5 cnt_males    0      0    0      0.0000
#> 89: female age_group6 cnt_males    0      0    0      0.0000
#> 90:      f age_group6 cnt_males    0      0    0      0.0000
#>        sex        age     vname  uwc     wc puwc         pwc
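
# A hedged side note (not part of the original example): in the table above,
# the perturbed weighted count `pwc` appears to equal the perturbed unweighted
# count `puwc` scaled by the average cell weight `wc / uwc`. The check below
# only re-uses three rows printed above (rows 9, 26 and 2); `chk` and
# `pwc_manual` are hypothetical names, and this is an observation about the
# printed output, not a statement about the package internals.
chk <- data.frame(
  uwc  = c(13, 7, 1969),
  wc   = c(740, 380, 115924),
  puwc = c(12, 6, 1971),
  pwc  = c(683.0769, 325.7143, 116041.7491)
)
chk$pwc_manual <- chk$puwc * chk$wc / chk$uwc

# compare the manual values with the printed ones (up to print rounding)
all.equal(chk$pwc_manual, chk$pwc, tolerance = 1e-6)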

# utility measures for a count variable
tab$measures_cnts(v = "total", exclude_zeros = TRUE)
#> $overview
#>    noise cnt       pct
#> 1:    -2   5 0.1111111
#> 2:    -1   9 0.2000000
#> 3:     0  16 0.3555556
#> 4:     1  15 0.3333333
#> 
#> $measures
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.001 0.015
#>  6:   Mean 0.756 0.017 0.034
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.021
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.004 0.033
#> 11:    Q90 1.600 0.049 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.167 0.213
#> 14:    Max 2.000 0.167 0.213
#> 
#> $cumdistr_d1
#>    cat cnt       pct
#> 1:   0  16 0.3555556
#> 2:   1  40 0.8888889
#> 3:   2  45 1.0000000
#> 
#> $cumdistr_d2
#>            cat cnt       pct
#> 1:    [0,0.02]  40 0.8888889
#> 2: (0.02,0.05]  40 0.8888889
#> 3:  (0.05,0.1]  41 0.9111111
#> 4:   (0.1,0.2]  45 1.0000000
#> 5:   (0.2,0.3]  45 1.0000000
#> 6:   (0.3,0.4]  45 1.0000000
#> 7:   (0.4,0.5]  45 1.0000000
#> 8:   (0.5,Inf]  45 1.0000000
#> 
#> $cumdistr_d3
#>            cat cnt       pct
#> 1:    [0,0.02]  26 0.5777778
#> 2: (0.02,0.05]  40 0.8888889
#> 3:  (0.05,0.1]  40 0.8888889
#> 4:   (0.1,0.2]  43 0.9555556
#> 5:   (0.2,0.3]  45 1.0000000
#> 6:   (0.3,0.4]  45 1.0000000
#> 7:   (0.4,0.5]  45 1.0000000
#> 8:   (0.5,Inf]  45 1.0000000
#> 
#> $false_zero
#> [1] 0
#> 
#> $false_nonzero
#> [1] 0
#> 
#> $exclude_zeros
#> [1] TRUE
#> 
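
# A hedged side note (not part of the original example): the columns d1, d2
# and d3 above are consistent with the following distances between an original
# count x and its perturbed value y:
#   d1 = |x - y|, d2 = |x - y| / x, d3 = |sqrt(x) - sqrt(y)|
# This reading is an assumption inferred from the printed numbers. The sketch
# re-computes the three distances for the cell female x age_group6 of variable
# 'total' (uwc = 6, puwc = 5), which yields the maxima of d2 and d3 shown above.
x <- 6
y <- 5
d <- round(
  c(d1 = abs(x - y), d2 = abs(x - y) / x, d3 = abs(sqrt(x) - sqrt(y))),
  digits = 3
)
d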

# modifications for perturbed count variables
tab$mod_cnts()
#>         sex        age row_nr pert      ckey  countvar
#>   1:  Total      Total     15    0 0.4611279     total
#>   2:  Total age_group1     17    2 0.9584550     total
#>   3:  Total       ag1a     17    2 0.9584550     total
#>   4:  Total age_group2     16    1 0.9296913     total
#>   5:  Total       ag2a     16    1 0.9296913     total
#>  ---                                                  
#> 131:      f age_group4     -1    0 0.0000000 cnt_males
#> 132: female age_group5     -1    0 0.0000000 cnt_males
#> 133:      f age_group5     -1    0 0.0000000 cnt_males
#> 134: female age_group6     -1    0 0.0000000 cnt_males
#> 135:      f age_group6     -1    0 0.0000000 cnt_males

# display a summary of the utility measures
tab$summary()
#> ┌──────────────────────────────────────────────┐
#> │Utility measures for perturbed count variables│
#> └──────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#>          countvar Min Q10 Q20 Q30 Q40  Mean Median Q60 Q70 Q80 Q90 Q95 Q99 Max
#> 1:          total  -1  -1  -1  -1   0 0.089      0   0 0.8   1 1.6 2.0   2   2
#> 2: cnt_highincome  -1  -1  -1   0   0 0.311      0   1 1.0   1 2.0 2.0   2   2
#> 3:      cnt_males  -1  -1   0   0   0 0.200      0   0 0.0   1 1.0 1.8   2   2
#> 
#> ── Distance-based measures ─────────────────────────────────────────────────────
#> ✔ Variable: 'total'
#> 
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.001 0.015
#>  6:   Mean 0.756 0.017 0.034
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.021
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.004 0.033
#> 11:    Q90 1.600 0.049 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.167 0.213
#> 14:    Max 2.000 0.167 0.213
#> 
#> ✔ Variable: 'cnt_males'
#> 
#>       what    d1    d2    d3
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.001 0.016
#>  6:   Mean 0.778 0.017 0.034
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.021
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.005 0.034
#> 11:    Q90 1.400 0.060 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.143 0.196
#> 14:    Max 2.000 0.143 0.196
#> 
#> ✔ Variable: 'cnt_highincome'
#> 
#>       what  d1    d2    d3
#>  1:    Min 0.0 0.000 0.000
#>  2:    Q10 0.0 0.000 0.000
#>  3:    Q20 0.0 0.000 0.000
#>  4:    Q30 1.0 0.008 0.045
#>  5:    Q40 1.0 0.010 0.052
#>  6:   Mean 0.9 0.024 0.065
#>  7: Median 1.0 0.011 0.054
#>  8:    Q60 1.0 0.015 0.061
#>  9:    Q70 1.0 0.024 0.078
#> 10:    Q80 1.0 0.035 0.127
#> 11:    Q90 2.0 0.067 0.131
#> 12:    Q95 2.0 0.075 0.135
#> 13:    Q99 2.0 0.143 0.196
#> 14:    Max 2.0 0.143 0.196
#> 
#> ┌──────────────────────────────────────────────────┐
#> │Utility measures for perturbed numerical variables│
#> └──────────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      vname        Min        Q10        Q20        Q30        Q40       Mean
#> 1:  expend        Inf         NA         NA         NA         NA        NaN
#> 2:  income -200856.60 -61499.112 -44903.362 -39167.964 -29764.378 -12354.064
#> 3: savings  -15429.42  -5777.418  -2967.973  -1819.231    913.437    237.221
#> 4:   mixed        Inf         NA         NA         NA         NA        NaN
#>      Median       Q60       Q70       Q80       Q90       Q95       Q99
#> 1:       NA        NA        NA        NA        NA        NA        NA
#> 2:  468.356 10247.663 22577.239 34365.341 65043.398 67961.875 85347.938
#> 3: 1943.244  2094.693  2862.792  3737.088  5865.515  5944.149  6412.108
#> 4:       NA        NA        NA        NA        NA        NA        NA
#>         Max
#> 1:     -Inf
#> 2: 95471.07
#> 3:  6779.79
#> 4:     -Inf
# }