ck_simpleparams() defines the parameters for a simple perturbation approach based on a single magnitude parameter (m), which is supplied through the argument p. The values of epsilon are used to "weight" the parameter m when type == "top_contr" is set in ck_params_nums().

ck_simpleparams(p, epsilon = 1)

Arguments

p

a percentage value used as the magnitude of the perturbation (the example below passes p = 0.25)

epsilon

a numeric vector in descending order with all values >= 0 and <= 1 and the first element forced to equal 1. The length of this vector must match the number top_k specified in ck_params_nums() when creating parameters for type == "top_contr"; this is checked at runtime. This setting allows different flex-functions to be used for the largest top_k contributors.
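
The constraints on epsilon listed above can be checked before calling ck_simpleparams(). The following is a minimal sketch in base R; is_valid_epsilon() is a hypothetical helper written for illustration and is not part of cellKey, and it assumes that "descending order" permits ties.

```r
# Hypothetical helper (not part of cellKey): checks the documented
# constraints on `epsilon` for a given `top_k`.
# Assumption: "descending order" is read as non-increasing (ties allowed).
is_valid_epsilon <- function(epsilon, top_k) {
  is.numeric(epsilon) &&
    length(epsilon) >= 1 &&
    length(epsilon) == top_k &&          # one weight per top contributor
    !anyNA(epsilon) &&
    all(epsilon >= 0 & epsilon <= 1) &&  # all values within [0, 1]
    epsilon[1] == 1 &&                   # first element must equal 1
    !is.unsorted(rev(epsilon))           # reversed vector is non-decreasing
}

is_valid_epsilon(c(1, 0.5, 0.2), top_k = 3)   # TRUE
is_valid_epsilon(c(1, 0.5), top_k = 3)        # FALSE: length differs from top_k
is_valid_epsilon(c(0.8, 0.5, 0.2), top_k = 3) # FALSE: first element is not 1
is_valid_epsilon(c(1, 0.2, 0.5), top_k = 3)   # FALSE: not in descending order
```

The package performs its own checks at runtime; a helper like this only serves to make the rules explicit.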

Value

an object suitable as input for ck_params_nums().
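
As a short sketch of how the returned object might be passed on, the call below combines ck_simpleparams() with type = "top_contr" in ck_params_nums(). The argument values are illustrative and borrowed from the other calls in the Examples section; it assumes the cellKey and ptable packages are installed.

```r
library(cellKey)

# simple parameters with one weight per top contributor;
# the values of p and epsilon are illustrative only
mp <- ck_simpleparams(p = 0.25, epsilon = c(1, 0.5, 0.2))

# length(epsilon) has to match top_k (here: 3)
p_nums <- ck_params_nums(
  ptab = ptable::pt_ex_nums(parity = TRUE, separation = FALSE),
  type = "top_contr",
  top_k = 3,
  mult_params = mp,
  mu_c = 2,
  same_key = FALSE)
```

The resulting object can then be assigned to numeric variables with the params_nums_set() method, as done for the other parameter objects in the Examples section.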

Details

Details about the flex function can be found in Deliverable D4.2, Part I, in SGA "Open Source tools for perturbative confidentiality methods".
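
To give an intuition for the weighting of the magnitude by epsilon mentioned in the description, here is a small sketch in base R. It assumes that the weight is applied multiplicatively, so that the j-th largest contributor is perturbed with magnitude p * epsilon[j]; this is a simplifying assumption for illustration, and the exact formula is given in the deliverable cited above.

```r
# illustrative values, not recommendations
p <- 0.25
epsilon <- c(1, 0.5, 0.2)

# Assumption: the weights scale the magnitude multiplicatively,
# one value per top contributor (largest contributor first)
effective_magnitude <- p * epsilon

# largest contributor: 0.25, second: 0.125, third: 0.05
print(effective_magnitude)
```

Under this reading, the largest contributor receives the full magnitude and smaller contributors are perturbed less.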

Examples

# \donttest{
x <- ck_create_testdata()

# create some 0/1 variables that should be perturbed later
x[, cnt_females := ifelse(sex == "male", 0, 1)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              35          25.00000           0
#>    2:   2530      28       1              40          25.00000           1
#>    3:   6920     550       1              52          25.00000           0
#>    4:   7960     870       1              91          25.00000           0
#>    5:   9030      20       2              70          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              54          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              33          16.66667           0
#> 4579:   3880     294    1000              73          16.66667           1
#> 4580:   4830     911    1000              32          16.66667           0
x[, cnt_males := ifelse(sex == "male", 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              35          25.00000           0
#>    2:   2530      28       1              40          25.00000           1
#>    3:   6920     550       1              52          25.00000           0
#>    4:   7960     870       1              91          25.00000           0
#>    5:   9030      20       2              70          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              54          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              33          16.66667           0
#> 4579:   3880     294    1000              73          16.66667           1
#> 4580:   4830     911    1000              32          16.66667           0
#>       cnt_males
#>           <num>
#>    1:         1
#>    2:         0
#>    3:         1
#>    4:         1
#>    5:         1
#>   ---          
#> 4576:         0
#> 4577:         1
#> 4578:         1
#> 4579:         0
#> 4580:         1
x[, cnt_highincome := ifelse(income >= 9000, 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              35          25.00000           0
#>    2:   2530      28       1              40          25.00000           1
#>    3:   6920     550       1              52          25.00000           0
#>    4:   7960     870       1              91          25.00000           0
#>    5:   9030      20       2              70          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              54          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              33          16.66667           0
#> 4579:   3880     294    1000              73          16.66667           1
#> 4580:   4830     911    1000              32          16.66667           0
#>       cnt_males cnt_highincome
#>           <num>          <num>
#>    1:         1              0
#>    2:         0              0
#>    3:         1              0
#>    4:         1              0
#>    5:         1              1
#>   ---                         
#> 4576:         0              0
#> 4577:         1              0
#> 4578:         1              0
#> 4579:         0              0
#> 4580:         1              0
# a variable with positive and negative contributions
x[, mixed := sample(-10:10, nrow(x), replace = TRUE)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              35          25.00000           0
#>    2:   2530      28       1              40          25.00000           1
#>    3:   6920     550       1              52          25.00000           0
#>    4:   7960     870       1              91          25.00000           0
#>    5:   9030      20       2              70          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              54          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              33          16.66667           0
#> 4579:   3880     294    1000              73          16.66667           1
#> 4580:   4830     911    1000              32          16.66667           0
#>       cnt_males cnt_highincome mixed
#>           <num>          <num> <int>
#>    1:         1              0    -2
#>    2:         0              0   -10
#>    3:         1              0    -3
#>    4:         1              0     2
#>    5:         1              1     7
#>   ---                               
#> 4576:         0              0    -4
#> 4577:         1              0    -8
#> 4578:         1              0    -7
#> 4579:         0              0    10
#> 4580:         1              0     0

# create record keys
x$rkey <- ck_generate_rkeys(dat = x)

# define required inputs

# hierarchy with some bogus codes
d_sex <- hier_create(root = "Total", nodes = c("male", "female"))
d_sex <- hier_add(d_sex, root = "female", "f")
d_sex <- hier_add(d_sex, root = "male", "m")

d_age <- hier_create(root = "Total", nodes = paste0("age_group", 1:6))
d_age <- hier_add(d_age, root = "age_group1", "ag1a")
d_age <- hier_add(d_age, root = "age_group2", "ag2a")

# define the cell key object
countvars <- c("cnt_females", "cnt_males", "cnt_highincome")
numvars <- c("expend", "income", "savings", "mixed")
tab <- ck_setup(
  x = x,
  rkey = "rkey",
  dims = list(sex = d_sex, age = d_age),
  w = "sampling_weight",
  countvars = countvars,
  numvars = numvars)
#> computing contributing indices | rawdata <--> table; this might take a while

# show some information about this table instance
tab$print() # identical with print(tab)
#> ── Table Information ───────────────────────────────────────────────────────────
#> ✔ 45 cells in 2 dimensions ('sex', 'age')
#> ✔ weights: yes
#> ── Tabulated / Perturbed countvars ─────────────────────────────────────────────
#> ☐ 'total'
#> ☐ 'cnt_females'
#> ☐ 'cnt_males'
#> ☐ 'cnt_highincome'
#> ── Tabulated / Perturbed numvars ───────────────────────────────────────────────
#> ☐ 'expend'
#> ☐ 'income'
#> ☐ 'savings'
#> ☐ 'mixed'

# information about the hierarchies
tab$hierarchy_info()
#> $sex
#>      code level is_leaf parent
#>    <char> <int>  <lgcl> <char>
#> 1:  Total     1   FALSE  Total
#> 2:   male     2   FALSE  Total
#> 3:      m     3    TRUE   male
#> 4: female     2   FALSE  Total
#> 5:      f     3    TRUE female
#> 
#> $age
#>          code level is_leaf     parent
#>        <char> <int>  <lgcl>     <char>
#> 1:      Total     1   FALSE      Total
#> 2: age_group1     2   FALSE      Total
#> 3:       ag1a     3    TRUE age_group1
#> 4: age_group2     2   FALSE      Total
#> 5:       ag2a     3    TRUE age_group2
#> 6: age_group3     2    TRUE      Total
#> 7: age_group4     2    TRUE      Total
#> 8: age_group5     2    TRUE      Total
#> 9: age_group6     2    TRUE      Total
#> 

# which variables have been defined?
tab$allvars()
#> $cntvars
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"
#> 
#> $numvars
#> [1] "expend"  "income"  "savings" "mixed"  
#> 

# count variables
tab$cntvars()
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"

# continuous variables
tab$numvars()
#> [1] "expend"  "income"  "savings" "mixed"  

# create perturbation parameters for "total" variable and
# write to yaml-file

# create a ptable using functionality from the ptable-pkg
f_yaml <- tempfile(fileext = ".yaml")
p_cnts1 <- ck_params_cnts(
  ptab = ptable::pt_ex_cnts(),
  path = f_yaml)
#> yaml configuration '/tmp/RtmpHZpsuV/file1d2c1e8b70d2.yaml' successfully written.

# read parameters from yaml-file and set them for variable `"total"`
p_cnts1 <- ck_read_yaml(path = f_yaml)

tab$params_cnts_set(val = p_cnts1, v = "total")
#> --> setting perturbation parameters for variable 'total'

# create alternative perturbation parameters by specifying the ptable settings directly
para2 <- ptable::create_cnt_ptable(
  D = 8, V = 3, js = 2, create = FALSE)

p_cnts2 <- ck_params_cnts(ptab = para2)

# use this ptable for the remaining variables
tab$params_cnts_set(val = p_cnts2, v = countvars)
#> --> setting perturbation parameters for variable 'cnt_females'
#> --> setting perturbation parameters for variable 'cnt_males'
#> --> setting perturbation parameters for variable 'cnt_highincome'

# perturb a variable
tab$perturb(v = "total")
#> Count variable 'total' was perturbed.

# multiple variables can be perturbed as well
tab$perturb(v = c("cnt_males", "cnt_highincome"))
#> Count variable 'cnt_males' was perturbed.
#> Count variable 'cnt_highincome' was perturbed.

# return weighted and unweighted results
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 275388  4579 275327.8716
#>  2:  Total age_group1     total  1969 119081  1968 119020.5221
#>  3:  Total       ag1a     total  1969 119081  1968 119020.5221
#>  4:  Total age_group2     total  1143  68090  1144  68149.5713
#>  5:  Total       ag2a     total  1143  68090  1144  68149.5713
#>  6:  Total age_group3     total   864  52139   864  52139.0000
#>  7:  Total age_group4     total   423  25276   424  25335.7541
#>  8:  Total age_group5     total   168  10002   169  10061.5357
#>  9:  Total age_group6     total    13    800    12    738.4615
#> 10:   male      Total     total  2296 138648  2296 138648.0000
#> 11:      m      Total     total  2296 138648  2296 138648.0000
#> 12:   male age_group1     total  1015  61556  1013  61434.7074
#> 13:      m age_group1     total  1015  61556  1013  61434.7074
#> 14:   male       ag1a     total  1015  61556  1013  61434.7074
#> 15:      m       ag1a     total  1015  61556  1013  61434.7074
#> 16:   male age_group2     total   571  34425   571  34425.0000
#> 17:      m age_group2     total   571  34425   571  34425.0000
#> 18:   male       ag2a     total   571  34425   571  34425.0000
#> 19:      m       ag2a     total   571  34425   571  34425.0000
#> 20:   male age_group3     total   424  25639   423  25578.5307
#> 21:      m age_group3     total   424  25639   423  25578.5307
#> 22:   male age_group4     total   195  11829   194  11768.3385
#> 23:      m age_group4     total   195  11829   194  11768.3385
#> 24:   male age_group5     total    84   4803    84   4803.0000
#> 25:      m age_group5     total    84   4803    84   4803.0000
#> 26:   male age_group6     total     7    396     6    339.4286
#> 27:      m age_group6     total     7    396     6    339.4286
#> 28: female      Total     total  2284 136740  2285 136799.8687
#> 29:      f      Total     total  2284 136740  2285 136799.8687
#> 30: female age_group1     total   954  57525   953  57464.7013
#> 31:      f age_group1     total   954  57525   953  57464.7013
#> 32: female       ag1a     total   954  57525   953  57464.7013
#> 33:      f       ag1a     total   954  57525   953  57464.7013
#> 34: female age_group2     total   572  33665   572  33665.0000
#> 35:      f age_group2     total   572  33665   572  33665.0000
#> 36: female       ag2a     total   572  33665   572  33665.0000
#> 37:      f       ag2a     total   572  33665   572  33665.0000
#> 38: female age_group3     total   440  26500   439  26439.7727
#> 39:      f age_group3     total   440  26500   439  26439.7727
#> 40: female age_group4     total   228  13447   228  13447.0000
#> 41:      f age_group4     total   228  13447   228  13447.0000
#> 42: female age_group5     total    84   5199    84   5199.0000
#> 43:      f age_group5     total    84   5199    84   5199.0000
#> 44: female age_group6     total     6    404     8    538.6667
#> 45:      f age_group6     total     6    404     8    538.6667
#> 46:  Total      Total cnt_males  2296 138648  2295 138587.6132
#> 47:  Total age_group1 cnt_males  1015  61556  1012  61374.0611
#> 48:  Total       ag1a cnt_males  1015  61556  1012  61374.0611
#> 49:  Total age_group2 cnt_males   571  34425   571  34425.0000
#> 50:  Total       ag2a cnt_males   571  34425   571  34425.0000
#> 51:  Total age_group3 cnt_males   424  25639   422  25518.0613
#> 52:  Total age_group4 cnt_males   195  11829   194  11768.3385
#> 53:  Total age_group5 cnt_males    84   4803    83   4745.8214
#> 54:  Total age_group6 cnt_males     7    396     6    339.4286
#> 55:   male      Total cnt_males  2296 138648  2295 138587.6132
#> 56:      m      Total cnt_males  2296 138648  2295 138587.6132
#> 57:   male age_group1 cnt_males  1015  61556  1012  61374.0611
#> 58:      m age_group1 cnt_males  1015  61556  1012  61374.0611
#> 59:   male       ag1a cnt_males  1015  61556  1012  61374.0611
#> 60:      m       ag1a cnt_males  1015  61556  1012  61374.0611
#> 61:   male age_group2 cnt_males   571  34425   571  34425.0000
#> 62:      m age_group2 cnt_males   571  34425   571  34425.0000
#> 63:   male       ag2a cnt_males   571  34425   571  34425.0000
#> 64:      m       ag2a cnt_males   571  34425   571  34425.0000
#> 65:   male age_group3 cnt_males   424  25639   422  25518.0613
#> 66:      m age_group3 cnt_males   424  25639   422  25518.0613
#> 67:   male age_group4 cnt_males   195  11829   194  11768.3385
#> 68:      m age_group4 cnt_males   195  11829   194  11768.3385
#> 69:   male age_group5 cnt_males    84   4803    83   4745.8214
#> 70:      m age_group5 cnt_males    84   4803    83   4745.8214
#> 71:   male age_group6 cnt_males     7    396     6    339.4286
#> 72:      m age_group6 cnt_males     7    396     6    339.4286
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc

# numerical variables (positive variables using flex-function)
# we also write the config to a yaml file
f_yaml <- tempfile(fileext = ".yaml")

# create a ptable using functionality from the ptable-pkg
# a single ptable for all cells
ptab1 <- ptable::pt_ex_nums(parity = TRUE, separation = FALSE)

# a single ptab for all cells except for very small ones
ptab2 <- ptable::pt_ex_nums(parity = TRUE, separation = TRUE)

# different ptables for cells with even/odd number of contributors
# and very small cells
ptab3 <- ptable::pt_ex_nums(parity = FALSE, separation = TRUE)

p_nums1 <- ck_params_nums(
  ptab = ptab1,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.30, 0.03),
    epsilon = c(1, 0.5, 0.2),
    q = 3),
  mu_c = 2,
  same_key = FALSE,
  use_zero_rkeys = FALSE,
  path = f_yaml)
#> yaml configuration '/tmp/RtmpHZpsuV/file1d2c44627d20.yaml' successfully written.

# we read the parameters from the yaml-file
p_nums1 <- ck_read_yaml(path = f_yaml)

# for variables with positive and negative values
p_nums2 <- ck_params_nums(
  ptab = ptab2,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.15, 0.02),
    epsilon = c(1, 0.4, 0.15),
    q = 3),
  mu_c = 2,
  same_key = FALSE)

# simple perturbation parameters (not using the flex-function approach)
p_nums3 <- ck_params_nums(
  ptab = ptab3,
  type = "mean",
  mult_params = ck_simpleparams(p = 0.25),
  mu_c = 2,
  same_key = FALSE)

# use `p_nums1` for the variables `savings`, `income` and `expend`
tab$params_nums_set(p_nums1, c("savings", "income", "expend"))
#> --> setting perturbation parameters for variable 'savings'
#> --> setting perturbation parameters for variable 'income'
#> --> setting perturbation parameters for variable 'expend'

# use different parameters for variable `mixed`
tab$params_nums_set(p_nums2, v = "mixed")
#> --> setting perturbation parameters for variable 'mixed'

# identify sensitive cells to which extra protection (`mu_c`) is added.
tab$supp_p(v = "income", p = 85)
#> computing contributing indices | rawdata <--> table; this might take a while
#> p%-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_pq(v = "income", p = 85, q = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> pq-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_nk(v = "income", n = 2, k = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> nk-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_freq(v = "income", n = 14, weighted = FALSE)
#> freq-rule: 5 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_val(v = "income", n = 10000, weighted = TRUE)
#> val-rule: 0 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_cells(
  v = "income",
  inp = data.frame(
    sex = c("female", "female"),
    "age" = c("age_group1", "age_group3")
  )
)
#> cell-rule: 2 new sensitive cells (incl. duplicates) found (total: 7)

# perturb variables
tab$perturb(v = c("income", "savings"))
#> Numeric variable 'income' was perturbed.
#> Numeric variable 'savings' was perturbed.

# extract results
tab$numtab("income", mean_before_sum = TRUE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1385479597 1385487844
#>  2:  Total age_group1 income  9810547  599164816  599063528
#>  3:  Total       ag1a income  9810547  599164816  599063528
#>  4:  Total age_group2 income  5692119  339402737  339371490
#>  5:  Total       ag2a income  5692119  339402737  339371490
#>  6:  Total age_group3 income  4406946  266011965  266048324
#>  7:  Total age_group4 income  2133543  127438140  127451574
#>  8:  Total age_group5 income   848151   49555312   49548158
#>  9:  Total age_group6 income    61672    3906627    3874646
#> 10:   male      Total income 11262049  681348589  681368954
#> 11:      m      Total income 11262049  681348589  681368954
#> 12:   male age_group1 income  4877164  298807797  298740870
#> 13:      m age_group1 income  4877164  298807797  298740870
#> 14:   male       ag1a income  4877164  298807797  298740870
#> 15:      m       ag1a income  4877164  298807797  298740870
#> 16:   male age_group2 income  2811379  169773423  169871902
#> 17:      m age_group2 income  2811379  169773423  169871902
#> 18:   male       ag2a income  2811379  169773423  169871902
#> 19:      m       ag2a income  2811379  169773423  169871902
#> 20:   male age_group3 income  2168169  129581484  129513635
#> 21:      m age_group3 income  2168169  129581484  129513635
#> 22:   male age_group4 income   978510   59163672   59055514
#> 23:      m age_group4 income   978510   59163672   59055514
#> 24:   male age_group5 income   393134   21886601   21868447
#> 25:      m age_group5 income   393134   21886601   21868447
#> 26:   male age_group6 income    33693    2135612    2164973
#> 27:      m age_group6 income    33693    2135612    2164973
#> 28: female      Total income 11690929  704131008  704029777
#> 29:      f      Total income 11690929  704131008  704029777
#> 30: female age_group1 income  4933383  300357019  300381267
#> 31:      f age_group1 income  4933383  300357019  300381267
#> 32: female       ag1a income  4933383  300357019  300381267
#> 33:      f       ag1a income  4933383  300357019  300381267
#> 34: female age_group2 income  2880740  169629314  169629342
#> 35:      f age_group2 income  2880740  169629314  169629342
#> 36: female       ag2a income  2880740  169629314  169629342
#> 37:      f       ag2a income  2880740  169629314  169629342
#> 38: female age_group3 income  2238777  136430481  136360693
#> 39:      f age_group3 income  2238777  136430481  136360693
#> 40: female age_group4 income  1155033   68274468   68227703
#> 41:      f age_group4 income  1155033   68274468   68227703
#> 42: female age_group5 income   455017   27668711   27609571
#> 43:      f age_group5 income   455017   27668711   27609571
#> 44: female age_group6 income    27979    1771015    1845064
#> 45:      f age_group6 income    27979    1771015    1845064
#>        sex        age  vname      uws         ws        pws
tab$numtab("income", mean_before_sum = FALSE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1385479597 1385483720
#>  2:  Total age_group1 income  9810547  599164816  599114170
#>  3:  Total       ag1a income  9810547  599164816  599114170
#>  4:  Total age_group2 income  5692119  339402737  339387113
#>  5:  Total       ag2a income  5692119  339402737  339387113
#>  6:  Total age_group3 income  4406946  266011965  266030144
#>  7:  Total age_group4 income  2133543  127438140  127444857
#>  8:  Total age_group5 income   848151   49555312   49551735
#>  9:  Total age_group6 income    61672    3906627    3890603
#> 10:   male      Total income 11262049  681348589  681358771
#> 11:      m      Total income 11262049  681348589  681358771
#> 12:   male age_group1 income  4877164  298807797  298774331
#> 13:      m age_group1 income  4877164  298807797  298774331
#> 14:   male       ag1a income  4877164  298807797  298774331
#> 15:      m       ag1a income  4877164  298807797  298774331
#> 16:   male age_group2 income  2811379  169773423  169822655
#> 17:      m age_group2 income  2811379  169773423  169822655
#> 18:   male       ag2a income  2811379  169773423  169822655
#> 19:      m       ag2a income  2811379  169773423  169822655
#> 20:   male age_group3 income  2168169  129581484  129547555
#> 21:      m age_group3 income  2168169  129581484  129547555
#> 22:   male age_group4 income   978510   59163672   59109568
#> 23:      m age_group4 income   978510   59163672   59109568
#> 24:   male age_group5 income   393134   21886601   21877522
#> 25:      m age_group5 income   393134   21886601   21877522
#> 26:   male age_group6 income    33693    2135612    2150243
#> 27:      m age_group6 income    33693    2135612    2150243
#> 28: female      Total income 11690929  704131008  704080391
#> 29:      f      Total income 11690929  704131008  704080391
#> 30: female age_group1 income  4933383  300357019  300369143
#> 31:      f age_group1 income  4933383  300357019  300369143
#> 32: female       ag1a income  4933383  300357019  300369143
#> 33:      f       ag1a income  4933383  300357019  300369143
#> 34: female age_group2 income  2880740  169629314  169629328
#> 35:      f age_group2 income  2880740  169629314  169629328
#> 36: female       ag2a income  2880740  169629314  169629328
#> 37:      f       ag2a income  2880740  169629314  169629328
#> 38: female age_group3 income  2238777  136430481  136395583
#> 39:      f age_group3 income  2238777  136430481  136395583
#> 40: female age_group4 income  1155033   68274468   68251082
#> 41:      f age_group4 income  1155033   68274468   68251082
#> 42: female age_group5 income   455017   27668711   27639125
#> 43:      f age_group5 income   455017   27668711   27639125
#> 44: female age_group6 income    27979    1771015    1807660
#> 45:      f age_group6 income    27979    1771015    1807660
#>        sex        age  vname      uws         ws        pws
tab$numtab("savings")
#>        sex        age   vname     uws        ws         pws
#>     <char>     <char>  <char>   <num>     <num>       <num>
#>  1:  Total      Total savings 2273532 136322946 136323147.9
#>  2:  Total age_group1 savings  982386  59440804  59439264.6
#>  3:  Total       ag1a savings  982386  59440804  59439264.6
#>  4:  Total age_group2 savings  552336  32769330  32767170.2
#>  5:  Total       ag2a savings  552336  32769330  32767170.2
#>  6:  Total age_group3 savings  437101  26279763  26276838.7
#>  7:  Total age_group4 savings  214661  12780988  12781410.0
#>  8:  Total age_group5 savings   80451   4686870   4687282.2
#>  9:  Total age_group6 savings    6597    365191    364874.5
#> 10:   male      Total savings 1159816  69712514  69712216.4
#> 11:      m      Total savings 1159816  69712514  69712216.4
#> 12:   male age_group1 savings  517660  31183289  31177654.9
#> 13:      m age_group1 savings  517660  31183289  31177654.9
#> 14:   male       ag1a savings  517660  31183289  31177654.9
#> 15:      m       ag1a savings  517660  31183289  31177654.9
#> 16:   male age_group2 savings  280923  16888662  16892234.7
#> 17:      m age_group2 savings  280923  16888662  16892234.7
#> 18:   male       ag2a savings  280923  16888662  16892234.7
#> 19:      m       ag2a savings  280923  16888662  16892234.7
#> 20:   male age_group3 savings  214970  12985674  12980654.7
#> 21:      m age_group3 savings  214970  12985674  12980654.7
#> 22:   male age_group4 savings   99420   5960930   5956599.3
#> 23:      m age_group4 savings   99420   5960930   5956599.3
#> 24:   male age_group5 savings   43233   2511149   2510044.8
#> 25:      m age_group5 savings   43233   2511149   2510044.8
#> 26:   male age_group6 savings    3610    182810    182156.8
#> 27:      m age_group6 savings    3610    182810    182156.8
#> 28: female      Total savings 1113716  66610432  66604784.4
#> 29:      f      Total savings 1113716  66610432  66604784.4
#> 30: female age_group1 savings  464726  28257515  28265864.5
#> 31:      f age_group1 savings  464726  28257515  28265864.5
#> 32: female       ag1a savings  464726  28257515  28265864.5
#> 33:      f       ag1a savings  464726  28257515  28265864.5
#> 34: female age_group2 savings  271413  15880668  15881162.7
#> 35:      f age_group2 savings  271413  15880668  15881162.7
#> 36: female       ag2a savings  271413  15880668  15881162.7
#> 37:      f       ag2a savings  271413  15880668  15881162.7
#> 38: female age_group3 savings  222131  13294089  13289180.7
#> 39:      f age_group3 savings  222131  13294089  13289180.7
#> 40: female age_group4 savings  115241   6820058   6819140.3
#> 41:      f age_group4 savings  115241   6820058   6819140.3
#> 42: female age_group5 savings   37218   2175721   2173345.2
#> 43:      f age_group5 savings   37218   2175721   2173345.2
#> 44: female age_group6 savings    2987    182381    183525.1
#> 45:      f age_group6 savings    2987    182381    183525.1
#>        sex        age   vname     uws        ws         pws

# results can be reset, too
tab$reset_cntvars(v = "cnt_males")

# we can then set other parameters and perturb again
tab$params_cnts_set(val = p_cnts1, v = "cnt_males")
#> --> setting perturbation parameters for variable 'cnt_males'

tab$perturb(v = "cnt_males")
#> Count variable 'cnt_males' was perturbed.

# write results to a .csv file
tab$freqtab(
  v = c("total", "cnt_males"),
  path = file.path(tempdir(), "outtab.csv")
)
#> File '/tmp/RtmpHZpsuV/outtab.csv' successfully written to disk.
#> NULL

# show weighted and unweighted results (original and perturbed)
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 275388  4579 275327.8716
#>  2:  Total age_group1     total  1969 119081  1968 119020.5221
#>  3:  Total       ag1a     total  1969 119081  1968 119020.5221
#>  4:  Total age_group2     total  1143  68090  1144  68149.5713
#>  5:  Total       ag2a     total  1143  68090  1144  68149.5713
#>  6:  Total age_group3     total   864  52139   864  52139.0000
#>  7:  Total age_group4     total   423  25276   424  25335.7541
#>  8:  Total age_group5     total   168  10002   169  10061.5357
#>  9:  Total age_group6     total    13    800    12    738.4615
#> 10:   male      Total     total  2296 138648  2296 138648.0000
#> 11:      m      Total     total  2296 138648  2296 138648.0000
#> 12:   male age_group1     total  1015  61556  1013  61434.7074
#> 13:      m age_group1     total  1015  61556  1013  61434.7074
#> 14:   male       ag1a     total  1015  61556  1013  61434.7074
#> 15:      m       ag1a     total  1015  61556  1013  61434.7074
#> 16:   male age_group2     total   571  34425   571  34425.0000
#> 17:      m age_group2     total   571  34425   571  34425.0000
#> 18:   male       ag2a     total   571  34425   571  34425.0000
#> 19:      m       ag2a     total   571  34425   571  34425.0000
#> 20:   male age_group3     total   424  25639   423  25578.5307
#> 21:      m age_group3     total   424  25639   423  25578.5307
#> 22:   male age_group4     total   195  11829   194  11768.3385
#> 23:      m age_group4     total   195  11829   194  11768.3385
#> 24:   male age_group5     total    84   4803    84   4803.0000
#> 25:      m age_group5     total    84   4803    84   4803.0000
#> 26:   male age_group6     total     7    396     6    339.4286
#> 27:      m age_group6     total     7    396     6    339.4286
#> 28: female      Total     total  2284 136740  2285 136799.8687
#> 29:      f      Total     total  2284 136740  2285 136799.8687
#> 30: female age_group1     total   954  57525   953  57464.7013
#> 31:      f age_group1     total   954  57525   953  57464.7013
#> 32: female       ag1a     total   954  57525   953  57464.7013
#> 33:      f       ag1a     total   954  57525   953  57464.7013
#> 34: female age_group2     total   572  33665   572  33665.0000
#> 35:      f age_group2     total   572  33665   572  33665.0000
#> 36: female       ag2a     total   572  33665   572  33665.0000
#> 37:      f       ag2a     total   572  33665   572  33665.0000
#> 38: female age_group3     total   440  26500   439  26439.7727
#> 39:      f age_group3     total   440  26500   439  26439.7727
#> 40: female age_group4     total   228  13447   228  13447.0000
#> 41:      f age_group4     total   228  13447   228  13447.0000
#> 42: female age_group5     total    84   5199    84   5199.0000
#> 43:      f age_group5     total    84   5199    84   5199.0000
#> 44: female age_group6     total     6    404     8    538.6667
#> 45:      f age_group6     total     6    404     8    538.6667
#> 46:  Total      Total cnt_males  2296 138648  2296 138648.0000
#> 47:  Total age_group1 cnt_males  1015  61556  1013  61434.7074
#> 48:  Total       ag1a cnt_males  1015  61556  1013  61434.7074
#> 49:  Total age_group2 cnt_males   571  34425   571  34425.0000
#> 50:  Total       ag2a cnt_males   571  34425   571  34425.0000
#> 51:  Total age_group3 cnt_males   424  25639   423  25578.5307
#> 52:  Total age_group4 cnt_males   195  11829   194  11768.3385
#> 53:  Total age_group5 cnt_males    84   4803    84   4803.0000
#> 54:  Total age_group6 cnt_males     7    396     6    339.4286
#> 55:   male      Total cnt_males  2296 138648  2296 138648.0000
#> 56:      m      Total cnt_males  2296 138648  2296 138648.0000
#> 57:   male age_group1 cnt_males  1015  61556  1013  61434.7074
#> 58:      m age_group1 cnt_males  1015  61556  1013  61434.7074
#> 59:   male       ag1a cnt_males  1015  61556  1013  61434.7074
#> 60:      m       ag1a cnt_males  1015  61556  1013  61434.7074
#> 61:   male age_group2 cnt_males   571  34425   571  34425.0000
#> 62:      m age_group2 cnt_males   571  34425   571  34425.0000
#> 63:   male       ag2a cnt_males   571  34425   571  34425.0000
#> 64:      m       ag2a cnt_males   571  34425   571  34425.0000
#> 65:   male age_group3 cnt_males   424  25639   423  25578.5307
#> 66:      m age_group3 cnt_males   424  25639   423  25578.5307
#> 67:   male age_group4 cnt_males   195  11829   194  11768.3385
#> 68:      m age_group4 cnt_males   195  11829   194  11768.3385
#> 69:   male age_group5 cnt_males    84   4803    84   4803.0000
#> 70:      m age_group5 cnt_males    84   4803    84   4803.0000
#> 71:   male age_group6 cnt_males     7    396     6    339.4286
#> 72:      m age_group6 cnt_males     7    396     6    339.4286
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc
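
# A minimal sketch, inferred from the table above and not taken from the
# package source: the perturbed weighted count (pwc) appears to equal the
# perturbed unweighted count (puwc) scaled by the cell's mean weight,
# i.e. pwc = puwc * wc / uwc. The values below are copied from rows 1 and 9
# of the output above; the variable names are only illustrative.
uwc  <- c(4580, 13)      # original unweighted counts
wc   <- c(275388, 800)   # original weighted counts
puwc <- c(4579, 12)      # perturbed unweighted counts
pwc  <- puwc * wc / uwc  # reproduces the pwc column for these two cells
round(pwc, 4)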

# utility measures for a count variable
tab$measures_cnts(v = "total", exclude_zeros = TRUE)
#> $overview
#>     noise   cnt        pct
#>    <fctr> <int>      <num>
#> 1:     -2     2 0.04444444
#> 2:     -1     6 0.13333333
#> 3:      0    17 0.37777778
#> 4:      1    16 0.35555556
#> 5:      2     4 0.08888889
#> 
#> $measures
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.000 0.009
#>  6:   Mean 0.756 0.024 0.040
#>  7: Median 1.000 0.001 0.015
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.002 0.031
#> 11:    Q90 2.000 0.049 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.333 0.379
#> 14:    Max 2.000 0.333 0.379
#> 
#> $cumdistr_d1
#>       cat   cnt       pct
#>    <char> <int>     <num>
#> 1:      0    17 0.3777778
#> 2:      1    39 0.8666667
#> 3:      2    45 1.0000000
#> 
#> $cumdistr_d2
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    40 0.8888889
#> 2: (0.02,0.05]    40 0.8888889
#> 3:  (0.05,0.1]    41 0.9111111
#> 4:   (0.1,0.2]    43 0.9555556
#> 5:   (0.2,0.3]    43 0.9555556
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $cumdistr_d3
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    28 0.6222222
#> 2: (0.02,0.05]    40 0.8888889
#> 3:  (0.05,0.1]    40 0.8888889
#> 4:   (0.1,0.2]    43 0.9555556
#> 5:   (0.2,0.3]    43 0.9555556
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $false_zero
#> [1] 0
#> 
#> $false_nonzero
#> [1] 0
#> 
#> $exclude_zeros
#> [1] TRUE
#> 
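
# A minimal sketch, inferred from the output above and not taken from the
# package source: d1, d2 and d3 appear to be the absolute distance, the
# relative absolute distance and the absolute distance of square roots
# between original (uwc) and perturbed (puwc) unweighted counts.
# The two cells are copied from the freqtab() output for "total"
# (Total/age_group6 and female/age_group6); names are only illustrative.
uwc  <- c(13, 6)
puwc <- c(12, 8)
d1 <- abs(puwc - uwc)              # absolute distance
d2 <- d1 / uwc                     # relative absolute distance
d3 <- abs(sqrt(puwc) - sqrt(uwc))  # absolute distance of square roots
# the second cell reproduces the "Max" row of $measures (2, 0.333, 0.379)
round(cbind(d1, d2, d3), 3)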

# modifications for perturbed count variables
tab$mod_cnts()
#>         sex        age row_nr  pert      ckey  countvar
#>      <char>     <char>  <num> <int>     <num>    <char>
#>   1:  Total      Total     14    -1 0.2343841     total
#>   2:  Total age_group1     14    -1 0.2055782     total
#>   3:  Total       ag1a     14    -1 0.2055782     total
#>   4:  Total age_group2     16     1 0.7604279     total
#>   5:  Total       ag2a     16     1 0.7604279     total
#>  ---                                                   
#> 131:      f age_group4     -1     0 0.0000000 cnt_males
#> 132: female age_group5     -1     0 0.0000000 cnt_males
#> 133:      f age_group5     -1     0 0.0000000 cnt_males
#> 134: female age_group6     -1     0 0.0000000 cnt_males
#> 135:      f age_group6     -1     0 0.0000000 cnt_males

# display a summary of utility measures
tab$summary()
#> ┌──────────────────────────────────────────────┐
#> │Utility measures for perturbed count variables│
#> └──────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#>          countvar   Min   Q10   Q20   Q30   Q40   Mean Median   Q60   Q70   Q80
#>            <char> <num> <num> <num> <num> <num>  <num>  <num> <num> <num> <num>
#> 1:          total    -2    -1    -1    -1    -1 -0.311      0     0     0     0
#> 2: cnt_highincome    -3    -2    -1    -1     0  0.822      0     2     2     3
#> 3:      cnt_males    -2    -2    -1    -1     0 -0.467      0     0     0     0
#>      Q90   Q95   Q99   Max
#>    <num> <num> <num> <num>
#> 1:     1     1     2     2
#> 2:     4     4     4     4
#> 3:     0     0     0     0
#> 
#> ── Distance-based measures ─────────────────────────────────────────────────────
#> ✔ Variable: 'total'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.000 0.009
#>  6:   Mean 0.756 0.024 0.040
#>  7: Median 1.000 0.001 0.015
#>  8:    Q60 1.000 0.001 0.016
#>  9:    Q70 1.000 0.002 0.024
#> 10:    Q80 1.000 0.002 0.031
#> 11:    Q90 2.000 0.049 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.333 0.379
#> 14:    Max 2.000 0.333 0.379
#> 
#> ✔ Variable: 'cnt_males'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.778 0.017 0.035
#>  7: Median 1.000 0.002 0.024
#>  8:    Q60 1.000 0.002 0.031
#>  9:    Q70 1.000 0.002 0.031
#> 10:    Q80 1.800 0.005 0.035
#> 11:    Q90 2.000 0.060 0.100
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.143 0.196
#> 14:    Max 2.000 0.143 0.196
#> 
#> ✔ Variable: 'cnt_highincome'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 1.000 0.009 0.068
#>  3:    Q20 1.000 0.013 0.077
#>  4:    Q30 2.000 0.022 0.084
#>  5:    Q40 2.000 0.024 0.105
#>  6:   Mean 2.075 0.051 0.137
#>  7: Median 2.000 0.027 0.116
#>  8:    Q60 2.000 0.030 0.124
#>  9:    Q70 2.000 0.035 0.131
#> 10:    Q80 3.000 0.039 0.148
#> 11:    Q90 4.000 0.091 0.212
#> 12:    Q95 4.000 0.268 0.361
#> 13:    Q99 4.000 0.286 0.486
#> 14:    Max 4.000 0.286 0.486
#> 
#> ┌──────────────────────────────────────────────────┐
#> │Utility measures for perturbed numerical variables│
#> └──────────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      vname        Min        Q10       Q20        Q30        Q40      Mean
#>     <char>      <num>      <num>     <num>      <num>      <num>     <num>
#> 1:  expend        Inf         NA        NA         NA         NA       NaN
#> 2:  income -54103.509 -50617.416 -33929.00 -32689.641 -18968.636 -7995.237
#> 3: savings  -5647.586  -5634.091  -4908.27  -2375.849  -1539.437  -682.058
#> 4:   mixed        Inf         NA        NA         NA         NA       NaN
#>       Median      Q60       Q70       Q80       Q90       Q95       Q99
#>        <num>    <num>     <num>     <num>     <num>     <num>     <num>
#> 1:        NA       NA        NA        NA        NA        NA        NA
#> 2: -9078.929   14.129 10182.308 12625.137 36645.246 49232.255 49232.255
#> 3:  -917.712 -297.576   480.159  1144.127  3572.691  8349.515  8349.515
#> 4:        NA       NA        NA        NA        NA        NA        NA
#>          Max
#>        <num>
#> 1:      -Inf
#> 2: 49232.255
#> 3:  8349.515
#> 4:      -Inf
# }