ck_flexparams() defines a flex function that is used to look up perturbation magnitudes (percentages) when perturbing continuous variables.

ck_flexparams(fp, p = c(0.25, 0.05), epsilon = 1, q = 3)

Arguments

fp

(numeric scalar); the point at which the noise coefficient function should reach its desired maximum (defined by the first element of p)

p

a numeric vector of length 2 where both elements specify a percentage. The first value refers to the desired maximum perturbation percentage for small cells (depending on fp) while the second element refers to the desired maximum perturbation percentage for large cells. Both values must be between 0 and 1 and need to be in descending order.

epsilon

a numeric vector in descending order with all values >= 0 and <= 1 and with the first element forced to equal 1. The length of this vector must match the number top_k specified in ck_params_nums() when creating parameters for type == "top_contr"; this is checked at runtime. This setting makes it possible to use different flex functions for the largest top_k contributors.

q

(numeric scalar); parameter of the flex function; q must be >= 1
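
The constraints listed above can be checked before calling ck_flexparams(). The following is a minimal sketch in base R only: check_flex_args() is a hypothetical helper that is not part of the package, and treating ties as valid "descending order" is an assumption.

```r
# Hypothetical helper, not part of cellKey: collects violations of the
# documented argument constraints and returns them as a character vector.
check_flex_args <- function(fp, p = c(0.25, 0.05), epsilon = 1, q = 3,
                            top_k = length(epsilon)) {
  is_scalar <- function(v) is.numeric(v) && length(v) == 1L && !is.na(v)
  problems <- character(0)

  if (!is_scalar(fp)) {
    problems <- c(problems, "fp must be a numeric scalar")
  }

  # p: two percentages in [0, 1]; first (small cells) >= second (large cells)
  p_ok <- is.numeric(p) && length(p) == 2L && !anyNA(p) &&
    all(p >= 0 & p <= 1) && p[1] >= p[2]
  if (!p_ok) {
    problems <- c(problems, "p must hold two values in [0, 1] in descending order")
  }

  # epsilon: values in [0, 1], non-increasing, first element equal to 1
  eps_ok <- is.numeric(epsilon) && length(epsilon) >= 1L && !anyNA(epsilon) &&
    all(epsilon >= 0 & epsilon <= 1) && !is.unsorted(rev(epsilon)) &&
    epsilon[1] == 1
  if (!eps_ok) {
    problems <- c(problems, "epsilon must be descending, within [0, 1], starting at 1")
  }

  # the length of epsilon has to match top_k (assumed to be a single number)
  if (length(epsilon) != top_k) {
    problems <- c(problems, "length(epsilon) must equal top_k")
  }

  if (!(is_scalar(q) && q >= 1)) {
    problems <- c(problems, "q must be a numeric scalar >= 1")
  }

  problems
}

# values taken from the example further below
check_flex_args(fp = 1000, p = c(0.30, 0.03), epsilon = c(1, 0.5, 0.2),
                q = 3, top_k = 3)

# p in ascending order violates the documented constraint
check_flex_args(fp = 1000, p = c(0.03, 0.30))
```

An empty result means that all documented constraints hold. The package performs its own checks, including the comparison with top_k at runtime, so this sketch is only a convenience for catching mistakes early.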

Value

an object suitable as input for the mult_params argument of ck_params_nums().

Details

Details about the flex function can be found in Deliverable D4.2, Part I of the SGA "Open Source tools for perturbative confidentiality methods".

Examples

# \donttest{
x <- ck_create_testdata()

# create some 0/1 variables that should be perturbed later
x[, cnt_females := ifelse(sex == "male", 0, 1)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              60          25.00000           0
#>    2:   2530      28       1              66          25.00000           1
#>    3:   6920     550       1              30          25.00000           0
#>    4:   7960     870       1              98          25.00000           0
#>    5:   9030      20       2              75          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              41          16.66667           1
#> 4577:   1420     987    1000              30          16.66667           0
#> 4578:   8900     684    1000              43          16.66667           0
#> 4579:   3880     294    1000              41          16.66667           1
#> 4580:   4830     911    1000              39          16.66667           0
x[, cnt_males := ifelse(sex == "male", 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              60          25.00000           0
#>    2:   2530      28       1              66          25.00000           1
#>    3:   6920     550       1              30          25.00000           0
#>    4:   7960     870       1              98          25.00000           0
#>    5:   9030      20       2              75          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              41          16.66667           1
#> 4577:   1420     987    1000              30          16.66667           0
#> 4578:   8900     684    1000              43          16.66667           0
#> 4579:   3880     294    1000              41          16.66667           1
#> 4580:   4830     911    1000              39          16.66667           0
#>       cnt_males
#>           <num>
#>    1:         1
#>    2:         0
#>    3:         1
#>    4:         1
#>    5:         1
#>   ---          
#> 4576:         0
#> 4577:         1
#> 4578:         1
#> 4579:         0
#> 4580:         1
x[, cnt_highincome := ifelse(income >= 9000, 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              60          25.00000           0
#>    2:   2530      28       1              66          25.00000           1
#>    3:   6920     550       1              30          25.00000           0
#>    4:   7960     870       1              98          25.00000           0
#>    5:   9030      20       2              75          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              41          16.66667           1
#> 4577:   1420     987    1000              30          16.66667           0
#> 4578:   8900     684    1000              43          16.66667           0
#> 4579:   3880     294    1000              41          16.66667           1
#> 4580:   4830     911    1000              39          16.66667           0
#>       cnt_males cnt_highincome
#>           <num>          <num>
#>    1:         1              0
#>    2:         0              0
#>    3:         1              0
#>    4:         1              0
#>    5:         1              1
#>   ---                         
#> 4576:         0              0
#> 4577:         1              0
#> 4578:         1              0
#> 4579:         0              0
#> 4580:         1              0
# a variable with positive and negative contributions
x[, mixed := sample(-10:10, nrow(x), replace = TRUE)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              60          25.00000           0
#>    2:   2530      28       1              66          25.00000           1
#>    3:   6920     550       1              30          25.00000           0
#>    4:   7960     870       1              98          25.00000           0
#>    5:   9030      20       2              75          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              41          16.66667           1
#> 4577:   1420     987    1000              30          16.66667           0
#> 4578:   8900     684    1000              43          16.66667           0
#> 4579:   3880     294    1000              41          16.66667           1
#> 4580:   4830     911    1000              39          16.66667           0
#>       cnt_males cnt_highincome mixed
#>           <num>          <num> <int>
#>    1:         1              0     0
#>    2:         0              0     0
#>    3:         1              0    -5
#>    4:         1              0     3
#>    5:         1              1     2
#>   ---                               
#> 4576:         0              0    10
#> 4577:         1              0    -8
#> 4578:         1              0    -3
#> 4579:         0              0    -8
#> 4580:         1              0    -5

# create record keys
x$rkey <- ck_generate_rkeys(dat = x)

# define required inputs

# hierarchy with some bogus codes
d_sex <- hier_create(root = "Total", nodes = c("male", "female"))
d_sex <- hier_add(d_sex, root = "female", "f")
d_sex <- hier_add(d_sex, root = "male", "m")

d_age <- hier_create(root = "Total", nodes = paste0("age_group", 1:6))
d_age <- hier_add(d_age, root = "age_group1", "ag1a")
d_age <- hier_add(d_age, root = "age_group2", "ag2a")

# define the cell key object
countvars <- c("cnt_females", "cnt_males", "cnt_highincome")
numvars <- c("expend", "income", "savings", "mixed")
tab <- ck_setup(
  x = x,
  rkey = "rkey",
  dims = list(sex = d_sex, age = d_age),
  w = "sampling_weight",
  countvars = countvars,
  numvars = numvars)
#> computing contributing indices | rawdata <--> table; this might take a while

# show some information about this table instance
tab$print() # identical with print(tab)
#> ── Table Information ───────────────────────────────────────────────────────────
#> ✔ 45 cells in 2 dimensions ('sex', 'age')
#> ✔ weights: yes
#> ── Tabulated / Perturbed countvars ─────────────────────────────────────────────
#> ☐ 'total'
#> ☐ 'cnt_females'
#> ☐ 'cnt_males'
#> ☐ 'cnt_highincome'
#> ── Tabulated / Perturbed numvars ───────────────────────────────────────────────
#> ☐ 'expend'
#> ☐ 'income'
#> ☐ 'savings'
#> ☐ 'mixed'

# information about the hierarchies
tab$hierarchy_info()
#> $sex
#>      code level is_leaf parent
#>    <char> <int>  <lgcl> <char>
#> 1:  Total     1   FALSE  Total
#> 2:   male     2   FALSE  Total
#> 3:      m     3    TRUE   male
#> 4: female     2   FALSE  Total
#> 5:      f     3    TRUE female
#> 
#> $age
#>          code level is_leaf     parent
#>        <char> <int>  <lgcl>     <char>
#> 1:      Total     1   FALSE      Total
#> 2: age_group1     2   FALSE      Total
#> 3:       ag1a     3    TRUE age_group1
#> 4: age_group2     2   FALSE      Total
#> 5:       ag2a     3    TRUE age_group2
#> 6: age_group3     2    TRUE      Total
#> 7: age_group4     2    TRUE      Total
#> 8: age_group5     2    TRUE      Total
#> 9: age_group6     2    TRUE      Total
#> 

# which variables have been defined?
tab$allvars()
#> $cntvars
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"
#> 
#> $numvars
#> [1] "expend"  "income"  "savings" "mixed"  
#> 

# count variables
tab$cntvars()
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"

# continuous variables
tab$numvars()
#> [1] "expend"  "income"  "savings" "mixed"  

# create perturbation parameters for "total" variable and
# write to yaml-file

# create a ptable using functionality from the ptable-pkg
f_yaml <- tempfile(fileext = ".yaml")
p_cnts1 <- ck_params_cnts(
  ptab = ptable::pt_ex_cnts(),
  path = f_yaml)
#> yaml configuration '/tmp/RtmpHZpsuV/file1d2c73f7f41a.yaml' successfully written.

# read parameters from yaml-file and set them for variable `"total"`
p_cnts1 <- ck_read_yaml(path = f_yaml)

tab$params_cnts_set(val = p_cnts1, v = "total")
#> --> setting perturbation parameters for variable 'total'

# create alternative perturbation parameters by specifying parameters
para2 <- ptable::create_cnt_ptable(
  D = 8, V = 3, js = 2, create = FALSE)

p_cnts2 <- ck_params_cnts(ptab = para2)

# use this ptable for the remaining variables
tab$params_cnts_set(val = p_cnts2, v = countvars)
#> --> setting perturbation parameters for variable 'cnt_females'
#> --> setting perturbation parameters for variable 'cnt_males'
#> --> setting perturbation parameters for variable 'cnt_highincome'

# perturb a variable
tab$perturb(v = "total")
#> Count variable 'total' was perturbed.

# multiple variables can be perturbed as well
tab$perturb(v = c("cnt_males", "cnt_highincome"))
#> Count variable 'cnt_males' was perturbed.
#> Count variable 'cnt_highincome' was perturbed.

# return weighted and unweighted results
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 275617  4580 275617.0000
#>  2:  Total age_group1     total  1969 118674  1969 118674.0000
#>  3:  Total       ag1a     total  1969 118674  1969 118674.0000
#>  4:  Total age_group2     total  1143  68018  1144  68077.5083
#>  5:  Total       ag2a     total  1143  68018  1144  68077.5083
#>  6:  Total age_group3     total   864  52588   864  52588.0000
#>  7:  Total age_group4     total   423  25354   424  25413.9385
#>  8:  Total age_group5     total   168  10148   167  10087.5952
#>  9:  Total age_group6     total    13    835    13    835.0000
#> 10:   male      Total     total  2296 138577  2296 138577.0000
#> 11:      m      Total     total  2296 138577  2296 138577.0000
#> 12:   male age_group1     total  1015  60945  1015  60945.0000
#> 13:      m age_group1     total  1015  60945  1015  60945.0000
#> 14:   male       ag1a     total  1015  60945  1015  60945.0000
#> 15:      m       ag1a     total  1015  60945  1015  60945.0000
#> 16:   male age_group2     total   571  34245   570  34185.0263
#> 17:      m age_group2     total   571  34245   570  34185.0263
#> 18:   male       ag2a     total   571  34245   570  34185.0263
#> 19:      m       ag2a     total   571  34245   570  34185.0263
#> 20:   male age_group3     total   424  25845   425  25905.9552
#> 21:      m age_group3     total   424  25845   425  25905.9552
#> 22:   male age_group4     total   195  11996   196  12057.5179
#> 23:      m age_group4     total   195  11996   196  12057.5179
#> 24:   male age_group5     total    84   5080    84   5080.0000
#> 25:      m age_group5     total    84   5080    84   5080.0000
#> 26:   male age_group6     total     7    466     8    532.5714
#> 27:      m age_group6     total     7    466     8    532.5714
#> 28: female      Total     total  2284 137040  2285 137100.0000
#> 29:      f      Total     total  2284 137040  2285 137100.0000
#> 30: female age_group1     total   954  57729   953  57668.4874
#> 31:      f age_group1     total   954  57729   953  57668.4874
#> 32: female       ag1a     total   954  57729   953  57668.4874
#> 33:      f       ag1a     total   954  57729   953  57668.4874
#> 34: female age_group2     total   572  33773   572  33773.0000
#> 35:      f age_group2     total   572  33773   572  33773.0000
#> 36: female       ag2a     total   572  33773   572  33773.0000
#> 37:      f       ag2a     total   572  33773   572  33773.0000
#> 38: female age_group3     total   440  26743   441  26803.7795
#> 39:      f age_group3     total   440  26743   441  26803.7795
#> 40: female age_group4     total   228  13358   230  13475.1754
#> 41:      f age_group4     total   228  13358   230  13475.1754
#> 42: female age_group5     total    84   5068    84   5068.0000
#> 43:      f age_group5     total    84   5068    84   5068.0000
#> 44: female age_group6     total     6    369     6    369.0000
#> 45:      f age_group6     total     6    369     6    369.0000
#> 46:  Total      Total cnt_males  2296 138577  2297 138637.3558
#> 47:  Total age_group1 cnt_males  1015  60945  1015  60945.0000
#> 48:  Total       ag1a cnt_males  1015  60945  1015  60945.0000
#> 49:  Total age_group2 cnt_males   571  34245   570  34185.0263
#> 50:  Total       ag2a cnt_males   571  34245   570  34185.0263
#> 51:  Total age_group3 cnt_males   424  25845   425  25905.9552
#> 52:  Total age_group4 cnt_males   195  11996   197  12119.0359
#> 53:  Total age_group5 cnt_males    84   5080    84   5080.0000
#> 54:  Total age_group6 cnt_males     7    466     8    532.5714
#> 55:   male      Total cnt_males  2296 138577  2297 138637.3558
#> 56:      m      Total cnt_males  2296 138577  2297 138637.3558
#> 57:   male age_group1 cnt_males  1015  60945  1015  60945.0000
#> 58:      m age_group1 cnt_males  1015  60945  1015  60945.0000
#> 59:   male       ag1a cnt_males  1015  60945  1015  60945.0000
#> 60:      m       ag1a cnt_males  1015  60945  1015  60945.0000
#> 61:   male age_group2 cnt_males   571  34245   570  34185.0263
#> 62:      m age_group2 cnt_males   571  34245   570  34185.0263
#> 63:   male       ag2a cnt_males   571  34245   570  34185.0263
#> 64:      m       ag2a cnt_males   571  34245   570  34185.0263
#> 65:   male age_group3 cnt_males   424  25845   425  25905.9552
#> 66:      m age_group3 cnt_males   424  25845   425  25905.9552
#> 67:   male age_group4 cnt_males   195  11996   197  12119.0359
#> 68:      m age_group4 cnt_males   195  11996   197  12119.0359
#> 69:   male age_group5 cnt_males    84   5080    84   5080.0000
#> 70:      m age_group5 cnt_males    84   5080    84   5080.0000
#> 71:   male age_group6 cnt_males     7    466     8    532.5714
#> 72:      m age_group6 cnt_males     7    466     8    532.5714
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc

# numerical variables (positive variables using flex-function)
# we also write the config to a yaml file
f_yaml <- tempfile(fileext = ".yaml")

# create a ptable using functionality from the ptable-pkg
# a single ptable for all cells
ptab1 <- ptable::pt_ex_nums(parity = TRUE, separation = FALSE)

# a single ptab for all cells except for very small ones
ptab2 <- ptable::pt_ex_nums(parity = TRUE, separation = TRUE)

# different ptables for cells with even/odd number of contributors
# and very small cells
ptab3 <- ptable::pt_ex_nums(parity = FALSE, separation = TRUE)

p_nums1 <- ck_params_nums(
  ptab = ptab1,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.30, 0.03),
    epsilon = c(1, 0.5, 0.2),
    q = 3),
  mu_c = 2,
  same_key = FALSE,
  use_zero_rkeys = FALSE,
  path = f_yaml)
#> yaml configuration '/tmp/RtmpHZpsuV/file1d2c5b91f12f.yaml' successfully written.

# we read the parameters from the yaml-file
p_nums1 <- ck_read_yaml(path = f_yaml)

# for variables with positive and negative values
p_nums2 <- ck_params_nums(
  ptab = ptab2,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.15, 0.02),
    epsilon = c(1, 0.4, 0.15),
    q = 3),
  mu_c = 2,
  same_key = FALSE)

# simple perturbation parameters (not using the flex-function approach)
p_nums3 <- ck_params_nums(
  ptab = ptab3,
  type = "mean",
  mult_params = ck_simpleparams(p = 0.25),
  mu_c = 2,
  same_key = FALSE)

# use `p_nums1` for the variables `savings`, `income` and `expend`
tab$params_nums_set(p_nums1, c("savings", "income", "expend"))
#> --> setting perturbation parameters for variable 'savings'
#> --> setting perturbation parameters for variable 'income'
#> --> setting perturbation parameters for variable 'expend'

# use different parameters for variable `mixed`
tab$params_nums_set(p_nums2, v = "mixed")
#> --> setting perturbation parameters for variable 'mixed'

# identify sensitive cells to which extra protection (`mu_c`) is added.
tab$supp_p(v = "income", p = 85)
#> computing contributing indices | rawdata <--> table; this might take a while
#> p%-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_pq(v = "income", p = 85, q = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> pq-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_nk(v = "income", n = 2, k = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> nk-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_freq(v = "income", n = 14, weighted = FALSE)
#> freq-rule: 5 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_val(v = "income", n = 10000, weighted = TRUE)
#> val-rule: 0 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_cells(
  v = "income",
  inp = data.frame(
    sex = c("female", "female"),
    "age" = c("age_group1", "age_group3")
  )
)
#> cell-rule: 2 new sensitive cells (incl. duplicates) found (total: 7)

# perturb variables
tab$perturb(v = c("income", "savings"))
#> Numeric variable 'income' was perturbed.
#> Numeric variable 'savings' was perturbed.

# extract results
tab$numtab("income", mean_before_sum = TRUE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1378728411 1378803432
#>  2:  Total age_group1 income  9810547  588645646  588641987
#>  3:  Total       ag1a income  9810547  588645646  588641987
#>  4:  Total age_group2 income  5692119  339113890  339004611
#>  5:  Total       ag2a income  5692119  339113890  339004611
#>  6:  Total age_group3 income  4406946  268059460  268125525
#>  7:  Total age_group4 income  2133543  128126535  128233857
#>  8:  Total age_group5 income   848151   50544273   50558527
#>  9:  Total age_group6 income    61672    4238607    4002653
#> 10:   male      Total income 11262049  682659125  682642380
#> 11:      m      Total income 11262049  682659125  682642380
#> 12:   male age_group1 income  4877164  292653191  292688150
#> 13:      m age_group1 income  4877164  292653191  292688150
#> 14:   male       ag1a income  4877164  292653191  292688150
#> 15:      m       ag1a income  4877164  292653191  292688150
#> 16:   male age_group2 income  2811379  169879170  169710875
#> 17:      m age_group2 income  2811379  169879170  169710875
#> 18:   male       ag2a income  2811379  169879170  169710875
#> 19:      m       ag2a income  2811379  169879170  169710875
#> 20:   male age_group3 income  2168169  134579789  134535983
#> 21:      m age_group3 income  2168169  134579789  134535983
#> 22:   male age_group4 income   978510   60132299   60243529
#> 23:      m age_group4 income   978510   60132299   60243529
#> 24:   male age_group5 income   393134   22911025   22891974
#> 25:      m age_group5 income   393134   22911025   22891974
#> 26:   male age_group6 income    33693    2503651    2611269
#> 27:      m age_group6 income    33693    2503651    2611269
#> 28: female      Total income 11690929  696069286  696024828
#> 29:      f      Total income 11690929  696069286  696024828
#> 30: female age_group1 income  4933383  295992455  295984544
#> 31:      f age_group1 income  4933383  295992455  295984544
#> 32: female       ag1a income  4933383  295992455  295984544
#> 33:      f       ag1a income  4933383  295992455  295984544
#> 34: female age_group2 income  2880740  169234720  169277054
#> 35:      f age_group2 income  2880740  169234720  169277054
#> 36: female       ag2a income  2880740  169234720  169277054
#> 37:      f       ag2a income  2880740  169234720  169277054
#> 38: female age_group3 income  2238777  133479671  133524702
#> 39:      f age_group3 income  2238777  133479671  133524702
#> 40: female age_group4 income  1155033   67994236   68013035
#> 41:      f age_group4 income  1155033   67994236   68013035
#> 42: female age_group5 income   455017   27633248   27436426
#> 43:      f age_group5 income   455017   27633248   27436426
#> 44: female age_group6 income    27979    1734956    1884549
#> 45:      f age_group6 income    27979    1734956    1884549
#>        sex        age  vname      uws         ws        pws
tab$numtab("income", mean_before_sum = FALSE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1378728411 1378765921
#>  2:  Total age_group1 income  9810547  588645646  588643817
#>  3:  Total       ag1a income  9810547  588645646  588643817
#>  4:  Total age_group2 income  5692119  339113890  339059246
#>  5:  Total       ag2a income  5692119  339113890  339059246
#>  6:  Total age_group3 income  4406946  268059460  268092490
#>  7:  Total age_group4 income  2133543  128126535  128180185
#>  8:  Total age_group5 income   848151   50544273   50551399
#>  9:  Total age_group6 income    61672    4238607    4118941
#> 10:   male      Total income 11262049  682659125  682650752
#> 11:      m      Total income 11262049  682659125  682650752
#> 12:   male age_group1 income  4877164  292653191  292670670
#> 13:      m age_group1 income  4877164  292653191  292670670
#> 14:   male       ag1a income  4877164  292653191  292670670
#> 15:      m       ag1a income  4877164  292653191  292670670
#> 16:   male age_group2 income  2811379  169879170  169795002
#> 17:      m age_group2 income  2811379  169879170  169795002
#> 18:   male       ag2a income  2811379  169879170  169795002
#> 19:      m       ag2a income  2811379  169879170  169795002
#> 20:   male age_group3 income  2168169  134579789  134557884
#> 21:      m age_group3 income  2168169  134579789  134557884
#> 22:   male age_group4 income   978510   60132299   60187888
#> 23:      m age_group4 income   978510   60132299   60187888
#> 24:   male age_group5 income   393134   22911025   22901497
#> 25:      m age_group5 income   393134   22911025   22901497
#> 26:   male age_group6 income    33693    2503651    2556894
#> 27:      m age_group6 income    33693    2503651    2556894
#> 28: female      Total income 11690929  696069286  696047056
#> 29:      f      Total income 11690929  696069286  696047056
#> 30: female age_group1 income  4933383  295992455  295988500
#> 31:      f age_group1 income  4933383  295992455  295988500
#> 32: female       ag1a income  4933383  295992455  295988500
#> 33:      f       ag1a income  4933383  295992455  295988500
#> 34: female age_group2 income  2880740  169234720  169255886
#> 35:      f age_group2 income  2880740  169234720  169255886
#> 36: female       ag2a income  2880740  169234720  169255886
#> 37:      f       ag2a income  2880740  169234720  169255886
#> 38: female age_group3 income  2238777  133479671  133502185
#> 39:      f age_group3 income  2238777  133479671  133502185
#> 40: female age_group4 income  1155033   67994236   68003635
#> 41:      f age_group4 income  1155033   67994236   68003635
#> 42: female age_group5 income   455017   27633248   27534661
#> 43:      f age_group5 income   455017   27633248   27534661
#> 44: female age_group6 income    27979    1734956    1808206
#> 45:      f age_group6 income    27979    1734956    1808206
#>        sex        age  vname      uws         ws        pws
tab$numtab("savings")
#>        sex        age   vname     uws        ws         pws
#>     <char>     <char>  <char>   <num>     <num>       <num>
#>  1:  Total      Total savings 2273532 137026795 137032535.3
#>  2:  Total age_group1 savings  982386  59436797  59435344.1
#>  3:  Total       ag1a savings  982386  59436797  59435344.1
#>  4:  Total age_group2 savings  552336  32886105  32875905.7
#>  5:  Total       ag2a savings  552336  32886105  32875905.7
#>  6:  Total age_group3 savings  437101  26457789  26452807.1
#>  7:  Total age_group4 savings  214661  13014851  13024613.8
#>  8:  Total age_group5 savings   80451   4819415   4819584.3
#>  9:  Total age_group6 savings    6597    411838    406425.0
#> 10:   male      Total savings 1159816  70055883  70056754.7
#> 11:      m      Total savings 1159816  70055883  70056754.7
#> 12:   male age_group1 savings  517660  31197472  31200201.3
#> 13:      m age_group1 savings  517660  31197472  31200201.3
#> 14:   male       ag1a savings  517660  31197472  31200201.3
#> 15:      m       ag1a savings  517660  31197472  31200201.3
#> 16:   male age_group2 savings  280923  16723727  16719188.0
#> 17:      m age_group2 savings  280923  16723727  16719188.0
#> 18:   male       ag2a savings  280923  16723727  16719188.0
#> 19:      m       ag2a savings  280923  16723727  16719188.0
#> 20:   male age_group3 savings  214970  13109917  13108526.7
#> 21:      m age_group3 savings  214970  13109917  13108526.7
#> 22:   male age_group4 savings   99420   6192071   6202017.0
#> 23:      m age_group4 savings   99420   6192071   6202017.0
#> 24:   male age_group5 savings   43233   2619083   2618672.9
#> 25:      m age_group5 savings   43233   2619083   2618672.9
#> 26:   male age_group6 savings    3610    213613    213375.7
#> 27:      m age_group6 savings    3610    213613    213375.7
#> 28: female      Total savings 1113716  66970912  66962502.2
#> 29:      f      Total savings 1113716  66970912  66962502.2
#> 30: female age_group1 savings  464726  28239325  28241487.4
#> 31:      f age_group1 savings  464726  28239325  28241487.4
#> 32: female       ag1a savings  464726  28239325  28241487.4
#> 33:      f       ag1a savings  464726  28239325  28241487.4
#> 34: female age_group2 savings  271413  16162378  16165437.9
#> 35:      f age_group2 savings  271413  16162378  16165437.9
#> 36: female       ag2a savings  271413  16162378  16165437.9
#> 37:      f       ag2a savings  271413  16162378  16165437.9
#> 38: female age_group3 savings  222131  13347872  13350884.5
#> 39:      f age_group3 savings  222131  13347872  13350884.5
#> 40: female age_group4 savings  115241   6822780   6824733.3
#> 41:      f age_group4 savings  115241   6822780   6824733.3
#> 42: female age_group5 savings   37218   2200332   2190989.1
#> 43:      f age_group5 savings   37218   2200332   2190989.1
#> 44: female age_group6 savings    2987    198225    200052.5
#> 45:      f age_group6 savings    2987    198225    200052.5
#>        sex        age   vname     uws        ws         pws

# results can be reset, too
tab$reset_cntvars(v = "cnt_males")

# we can then set other parameters and perturb again
tab$params_cnts_set(val = p_cnts1, v = "cnt_males")
#> --> setting perturbation parameters for variable 'cnt_males'

tab$perturb(v = "cnt_males")
#> Count variable 'cnt_males' was perturbed.

# write results to a .csv file
tab$freqtab(
  v = c("total", "cnt_males"),
  path = file.path(tempdir(), "outtab.csv")
)
#> File '/tmp/RtmpHZpsuV/outtab.csv' successfully written to disk.
#> NULL

# show a table containing both unweighted and weighted counts
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 275617  4580 275617.0000
#>  2:  Total age_group1     total  1969 118674  1969 118674.0000
#>  3:  Total       ag1a     total  1969 118674  1969 118674.0000
#>  4:  Total age_group2     total  1143  68018  1144  68077.5083
#>  5:  Total       ag2a     total  1143  68018  1144  68077.5083
#>  6:  Total age_group3     total   864  52588   864  52588.0000
#>  7:  Total age_group4     total   423  25354   424  25413.9385
#>  8:  Total age_group5     total   168  10148   167  10087.5952
#>  9:  Total age_group6     total    13    835    13    835.0000
#> 10:   male      Total     total  2296 138577  2296 138577.0000
#> 11:      m      Total     total  2296 138577  2296 138577.0000
#> 12:   male age_group1     total  1015  60945  1015  60945.0000
#> 13:      m age_group1     total  1015  60945  1015  60945.0000
#> 14:   male       ag1a     total  1015  60945  1015  60945.0000
#> 15:      m       ag1a     total  1015  60945  1015  60945.0000
#> 16:   male age_group2     total   571  34245   570  34185.0263
#> 17:      m age_group2     total   571  34245   570  34185.0263
#> 18:   male       ag2a     total   571  34245   570  34185.0263
#> 19:      m       ag2a     total   571  34245   570  34185.0263
#> 20:   male age_group3     total   424  25845   425  25905.9552
#> 21:      m age_group3     total   424  25845   425  25905.9552
#> 22:   male age_group4     total   195  11996   196  12057.5179
#> 23:      m age_group4     total   195  11996   196  12057.5179
#> 24:   male age_group5     total    84   5080    84   5080.0000
#> 25:      m age_group5     total    84   5080    84   5080.0000
#> 26:   male age_group6     total     7    466     8    532.5714
#> 27:      m age_group6     total     7    466     8    532.5714
#> 28: female      Total     total  2284 137040  2285 137100.0000
#> 29:      f      Total     total  2284 137040  2285 137100.0000
#> 30: female age_group1     total   954  57729   953  57668.4874
#> 31:      f age_group1     total   954  57729   953  57668.4874
#> 32: female       ag1a     total   954  57729   953  57668.4874
#> 33:      f       ag1a     total   954  57729   953  57668.4874
#> 34: female age_group2     total   572  33773   572  33773.0000
#> 35:      f age_group2     total   572  33773   572  33773.0000
#> 36: female       ag2a     total   572  33773   572  33773.0000
#> 37:      f       ag2a     total   572  33773   572  33773.0000
#> 38: female age_group3     total   440  26743   441  26803.7795
#> 39:      f age_group3     total   440  26743   441  26803.7795
#> 40: female age_group4     total   228  13358   230  13475.1754
#> 41:      f age_group4     total   228  13358   230  13475.1754
#> 42: female age_group5     total    84   5068    84   5068.0000
#> 43:      f age_group5     total    84   5068    84   5068.0000
#> 44: female age_group6     total     6    369     6    369.0000
#> 45:      f age_group6     total     6    369     6    369.0000
#> 46:  Total      Total cnt_males  2296 138577  2296 138577.0000
#> 47:  Total age_group1 cnt_males  1015  60945  1015  60945.0000
#> 48:  Total       ag1a cnt_males  1015  60945  1015  60945.0000
#> 49:  Total age_group2 cnt_males   571  34245   570  34185.0263
#> 50:  Total       ag2a cnt_males   571  34245   570  34185.0263
#> 51:  Total age_group3 cnt_males   424  25845   425  25905.9552
#> 52:  Total age_group4 cnt_males   195  11996   196  12057.5179
#> 53:  Total age_group5 cnt_males    84   5080    84   5080.0000
#> 54:  Total age_group6 cnt_males     7    466     8    532.5714
#> 55:   male      Total cnt_males  2296 138577  2296 138577.0000
#> 56:      m      Total cnt_males  2296 138577  2296 138577.0000
#> 57:   male age_group1 cnt_males  1015  60945  1015  60945.0000
#> 58:      m age_group1 cnt_males  1015  60945  1015  60945.0000
#> 59:   male       ag1a cnt_males  1015  60945  1015  60945.0000
#> 60:      m       ag1a cnt_males  1015  60945  1015  60945.0000
#> 61:   male age_group2 cnt_males   571  34245   570  34185.0263
#> 62:      m age_group2 cnt_males   571  34245   570  34185.0263
#> 63:   male       ag2a cnt_males   571  34245   570  34185.0263
#> 64:      m       ag2a cnt_males   571  34245   570  34185.0263
#> 65:   male age_group3 cnt_males   424  25845   425  25905.9552
#> 66:      m age_group3 cnt_males   424  25845   425  25905.9552
#> 67:   male age_group4 cnt_males   195  11996   196  12057.5179
#> 68:      m age_group4 cnt_males   195  11996   196  12057.5179
#> 69:   male age_group5 cnt_males    84   5080    84   5080.0000
#> 70:      m age_group5 cnt_males    84   5080    84   5080.0000
#> 71:   male age_group6 cnt_males     7    466     8    532.5714
#> 72:      m age_group6 cnt_males     7    466     8    532.5714
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc
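
# a base-R sketch (not package code) of how the perturbed weighted count
# relates to the other columns in the table above; it assumes that
# pwc = puwc * wc / uwc, which is consistent with the printed rows;
# the helper name `pwc_sketch` is hypothetical and empty cells are set to 0
pwc_sketch <- function(uwc, wc, puwc) {
  ifelse(uwc == 0, 0, puwc * wc / uwc)
}

# rows 4, 26 and 28 of the table; rounded to four digits this should
# reproduce the pwc values 68077.5083, 532.5714 and 137100
pwc_sketch(
  uwc = c(1143, 7, 2284),
  wc = c(68018, 466, 137040),
  puwc = c(1144, 8, 2285)
)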

# utility measures for a count variable
tab$measures_cnts(v = "total", exclude_zeros = TRUE)
#> $overview
#>     noise   cnt        pct
#>    <fctr> <int>      <num>
#> 1:     -2     2 0.04444444
#> 2:     -1    13 0.28888889
#> 3:      0    21 0.46666667
#> 4:      1     9 0.20000000
#> 
#> $measures
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.578 0.008 0.021
#>  7: Median 1.000 0.000 0.010
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.021
#> 10:    Q80 1.000 0.002 0.024
#> 11:    Q90 1.000 0.006 0.037
#> 12:    Q95 1.000 0.009 0.066
#> 13:    Q99 2.000 0.143 0.183
#> 14:    Max 2.000 0.143 0.183
#> 
#> $cumdistr_d1
#>       cat   cnt       pct
#>    <char> <int>     <num>
#> 1:      0    21 0.4666667
#> 2:      1    43 0.9555556
#> 3:      2    45 1.0000000
#> 
#> $cumdistr_d2
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    43 0.9555556
#> 2: (0.02,0.05]    43 0.9555556
#> 3:  (0.05,0.1]    43 0.9555556
#> 4:   (0.1,0.2]    45 1.0000000
#> 5:   (0.2,0.3]    45 1.0000000
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $cumdistr_d3
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    29 0.6444444
#> 2: (0.02,0.05]    41 0.9111111
#> 3:  (0.05,0.1]    43 0.9555556
#> 4:   (0.1,0.2]    45 1.0000000
#> 5:   (0.2,0.3]    45 1.0000000
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $false_zero
#> [1] 0
#> 
#> $false_nonzero
#> [1] 0
#> 
#> $exclude_zeros
#> [1] TRUE
#> 
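
# the distance measures above can be checked by hand; this is a base-R sketch
# (not package code) assuming that d1 is the absolute distance, d2 the
# relative absolute distance and d3 the absolute distance between square
# roots of original and perturbed unweighted counts; the helper name
# `dist_sketch` is hypothetical
dist_sketch <- function(orig, pert) {
  d1 <- abs(pert - orig)
  d2 <- ifelse(orig == 0, NA_real_, d1 / orig) # undefined for empty cells
  d3 <- abs(sqrt(pert) - sqrt(orig))
  data.frame(orig = orig, pert = pert, d1 = d1, d2 = d2, d3 = d3)
}

# three cells of variable 'total' (uwc and puwc from the frequency table);
# under these assumptions the cell 7 -> 8 yields d2 = 0.143 and d3 = 0.183
# after rounding, and the cell 228 -> 230 yields d1 = 2
dist_sketch(orig = c(7, 228, 4580), pert = c(8, 230, 4580))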

# modifications for perturbed count variables
tab$mod_cnts()
#>         sex        age row_nr  pert      ckey  countvar
#>      <char>     <char>  <num> <int>     <num>    <char>
#>   1:  Total      Total     15     0 0.4894645     total
#>   2:  Total age_group1     15     0 0.6373559     total
#>   3:  Total       ag1a     15     0 0.6373559     total
#>   4:  Total age_group2     16     1 0.8031745     total
#>   5:  Total       ag2a     16     1 0.8031745     total
#>  ---                                                   
#> 131:      f age_group4     -1     0 0.0000000 cnt_males
#> 132: female age_group5     -1     0 0.0000000 cnt_males
#> 133:      f age_group5     -1     0 0.0000000 cnt_males
#> 134: female age_group6     -1     0 0.0000000 cnt_males
#> 135:      f age_group6     -1     0 0.0000000 cnt_males

# display a summary of the utility measures
tab$summary()
#> ┌──────────────────────────────────────────────┐
#> │Utility measures for perturbed count variables│
#> └──────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#>          countvar   Min   Q10   Q20   Q30   Q40   Mean Median   Q60   Q70   Q80
#>            <char> <num> <num> <num> <num> <num>  <num>  <num> <num> <num> <num>
#> 1:          total    -1  -1.0  -0.2     0     0  0.178      0     0     1   1.0
#> 2: cnt_highincome    -3  -1.6  -1.0     0     0 -0.044      0     0     0   1.0
#> 3:      cnt_males    -1  -1.0   0.0     0     0  0.067      0     0     0   0.2
#>      Q90   Q95   Q99   Max
#>    <num> <num> <num> <num>
#> 1:     1     1  2.00     2
#> 2:     2     2  2.56     3
#> 3:     1     1  1.00     1
#> 
#> ── Distance-based measures ─────────────────────────────────────────────────────
#> ✔ Variable: 'total'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.578 0.008 0.021
#>  7: Median 1.000 0.000 0.010
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.021
#> 10:    Q80 1.000 0.002 0.024
#> 11:    Q90 1.000 0.006 0.037
#> 12:    Q95 1.000 0.009 0.066
#> 13:    Q99 2.000 0.143 0.183
#> 14:    Max 2.000 0.143 0.183
#> 
#> ✔ Variable: 'cnt_males'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.556 0.017 0.032
#>  7: Median 1.000 0.002 0.021
#>  8:    Q60 1.000 0.002 0.021
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.005 0.033
#> 11:    Q90 1.000 0.060 0.095
#> 12:    Q95 1.000 0.143 0.183
#> 13:    Q99 1.000 0.143 0.183
#> 14:    Max 1.000 0.143 0.183
#> 
#> ✔ Variable: 'cnt_highincome'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min   0.0 0.000 0.000
#>  2:    Q10   0.0 0.000 0.000
#>  3:    Q20   0.0 0.000 0.000
#>  4:    Q30   0.0 0.000 0.000
#>  5:    Q40   1.0 0.005 0.034
#>  6:   Mean   1.0 0.038 0.084
#>  7: Median   1.0 0.009 0.053
#>  8:    Q60   1.0 0.011 0.066
#>  9:    Q70   1.0 0.024 0.131
#> 10:    Q80   2.0 0.052 0.154
#> 11:    Q90   2.1 0.143 0.190
#> 12:    Q95   3.0 0.150 0.266
#> 13:    Q99   3.0 0.286 0.410
#> 14:    Max   3.0 0.286 0.410
#> 
#> ┌──────────────────────────────────────────────────┐
#> │Utility measures for perturbed numerical variables│
#> └──────────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      vname        Min       Q10        Q20       Q30       Q40      Mean
#>     <char>      <num>     <num>      <num>     <num>     <num>     <num>
#> 1:  expend        Inf        NA         NA        NA        NA       NaN
#> 2:  income -119666.44 -84168.15 -28712.405 -9527.670 -3955.398 -4277.001
#> 3: savings  -10199.26  -8409.78  -4538.963 -1440.378  -306.416  -194.113
#> 4:   mixed        Inf        NA         NA        NA        NA       NaN
#>       Median       Q60       Q70       Q80       Q90       Q95       Q99
#>        <num>     <num>     <num>     <num>     <num>     <num>     <num>
#> 1:        NA        NA        NA        NA        NA        NA        NA
#> 2: -1829.318 17479.098 21165.646 24616.965 53486.978 55589.280 73250.364
#> 3:   871.673  2036.928  2615.948  3012.478  3059.882  8958.283  9945.989
#> 4:        NA        NA        NA        NA        NA        NA        NA
#>          Max
#>        <num>
#> 1:      -Inf
#> 2: 73250.364
#> 3:  9945.989
#> 4:      -Inf
# }