ck_flexparams() defines a flex function that is used to look up the perturbation magnitudes (percentages) applied when perturbing continuous variables.

ck_flexparams(fp, p = c(0.25, 0.05), epsilon = 1, q = 3)

Arguments

fp

(numeric scalar); the point at which the noise coefficient function should reach its desired maximum (defined by the first element of p)

p

a numeric vector of length 2 where both elements specify a percentage. The first value refers to the desired maximum perturbation percentage for small cells (depending on fp) while the second element refers to the desired maximum perturbation percentage for large cells. Both values must be between 0 and 1 and need to be in descending order.

epsilon

a numeric vector in descending order with all values >= 0 and <= 1, with the first element forced to equal 1. The length of this vector must match the number top_k specified in ck_params_nums() when creating parameters for type == "top_contr"; this is checked at runtime. This setting allows different flex functions to be used for the largest top_k contributors.

q

(numeric scalar); parameter of the flex function; q must be >= 1
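
The constraints listed for the arguments above can be summarised in a small base R sketch. `check_flex_args()` is a hypothetical helper written for illustration only; it is not part of the package and does not claim to reproduce the package's own checks. Two points are assumptions: the bounds 0 and 1 are treated as inclusive for p, and "descending" is read as non-increasing.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# argument constraints of ck_flexparams() using base R only.
check_flex_args <- function(fp, p, epsilon = 1, q = 3, top_k = length(epsilon)) {
  stopifnot(
    is.numeric(fp), length(fp) == 1,
    # p: two percentages in [0, 1] (inclusive bounds are an assumption),
    # given in descending order
    is.numeric(p), length(p) == 2,
    all(p >= 0 & p <= 1), p[1] >= p[2],
    # epsilon: one value per top contributor, in [0, 1], first element 1,
    # descending (read here as non-increasing)
    is.numeric(epsilon), length(epsilon) == top_k,
    all(epsilon >= 0 & epsilon <= 1),
    epsilon[1] == 1, !is.unsorted(rev(epsilon)),
    # q: scalar >= 1
    is.numeric(q), length(q) == 1, q >= 1
  )
  invisible(TRUE)
}

# values taken from the examples on this page
ok_good <- check_flex_args(
  fp = 1000, p = c(0.30, 0.03), epsilon = c(1, 0.5, 0.2), q = 3, top_k = 3)

# p in ascending order violates the documented constraint
ok_bad <- tryCatch(
  check_flex_args(fp = 1000, p = c(0.03, 0.30)),
  error = function(e) FALSE)
```

Here `ok_good` is `TRUE`, while `ok_bad` is `FALSE` because the second call fails the ordering check on p.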

Value

an object suitable as input for the mult_params argument of ck_params_nums().

Details

Details about the flex function can be found in Deliverable D4.2, Part I in SGA "Open Source tools for perturbative confidentiality methods".
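
To give a rough feel for the role of epsilon, the sketch below assumes that each element of epsilon acts as a multiplicative scaling factor on the percentage used for the corresponding top contributor. This reading is an assumption made for illustration; the authoritative definition is in the deliverable cited above. `base_pct` is a hypothetical value standing in for whatever percentage the flex function returns for the largest contributor.

```r
# Illustration only. Assumption: epsilon[j] scales the perturbation
# percentage that applies to the j-th largest contributor.
base_pct <- 0.30            # hypothetical percentage for the largest contributor
epsilon  <- c(1, 0.5, 0.2)  # as in the examples on this page (top_k = 3)

# one scaled percentage per top contributor
scaled_pct <- base_pct * epsilon
scaled_pct
```

Because the first element of epsilon is forced to 1, the largest contributor keeps the unscaled percentage under this reading, and the remaining contributors receive progressively smaller ones.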

Examples

# \donttest{
x <- ck_create_testdata()

# create some 0/1 variables that should be perturbed later
x[, cnt_females := ifelse(sex == "male", 0, 1)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              46          25.00000           0
#>    2:   2530      28       1              82          25.00000           1
#>    3:   6920     550       1              38          25.00000           0
#>    4:   7960     870       1              37          25.00000           0
#>    5:   9030      20       2              36          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              65          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              31          16.66667           0
#> 4579:   3880     294    1000              34          16.66667           1
#> 4580:   4830     911    1000              73          16.66667           0
x[, cnt_males := ifelse(sex == "male", 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              46          25.00000           0
#>    2:   2530      28       1              82          25.00000           1
#>    3:   6920     550       1              38          25.00000           0
#>    4:   7960     870       1              37          25.00000           0
#>    5:   9030      20       2              36          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              65          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              31          16.66667           0
#> 4579:   3880     294    1000              34          16.66667           1
#> 4580:   4830     911    1000              73          16.66667           0
#>       cnt_males
#>           <num>
#>    1:         1
#>    2:         0
#>    3:         1
#>    4:         1
#>    5:         1
#>   ---          
#> 4576:         0
#> 4577:         1
#> 4578:         1
#> 4579:         0
#> 4580:         1
x[, cnt_highincome := ifelse(income >= 9000, 1, 0)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              46          25.00000           0
#>    2:   2530      28       1              82          25.00000           1
#>    3:   6920     550       1              38          25.00000           0
#>    4:   7960     870       1              37          25.00000           0
#>    5:   9030      20       2              36          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              65          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              31          16.66667           0
#> 4579:   3880     294    1000              34          16.66667           1
#> 4580:   4830     911    1000              73          16.66667           0
#>       cnt_males cnt_highincome
#>           <num>          <num>
#>    1:         1              0
#>    2:         0              0
#>    3:         1              0
#>    4:         1              0
#>    5:         1              1
#>   ---                         
#> 4576:         0              0
#> 4577:         1              0
#> 4578:         1              0
#> 4579:         0              0
#> 4580:         1              0
# a variable with positive and negative contributions
x[, mixed := sample(-10:10, nrow(x), replace = TRUE)]
#>       urbrur  roof walls water electcon relat    sex        age hhcivil expend
#>        <int> <int> <int> <int>    <int> <int> <fctr>     <fctr>   <int>  <num>
#>    1:      2     4     3     3        1     1   male age_group3       2   9093
#>    2:      2     4     3     3        1     2 female age_group3       2   2734
#>    3:      2     4     3     3        1     3   male age_group1       1   2652
#>    4:      2     4     3     3        1     3   male age_group1       1   1807
#>    5:      2     4     2     3        1     1   male age_group4       2    671
#>   ---                                                                         
#> 4576:      2     4     3     4        1     2 female age_group3       2   3696
#> 4577:      2     4     3     4        1     3   male age_group1       1    282
#> 4578:      2     4     3     4        1     3   male age_group1       1    840
#> 4579:      2     4     3     4        1     3 female age_group1       1   6258
#> 4580:      2     4     3     4        1     3   male age_group1       1   7019
#>       income savings ori_hid sampling_weight household_weights cnt_females
#>        <num>   <num>   <int>           <int>             <num>       <num>
#>    1:   5780      12       1              46          25.00000           0
#>    2:   2530      28       1              82          25.00000           1
#>    3:   6920     550       1              38          25.00000           0
#>    4:   7960     870       1              37          25.00000           0
#>    5:   9030      20       2              36          16.66667           0
#>   ---                                                                     
#> 4576:   7900     278    1000              65          16.66667           1
#> 4577:   1420     987    1000              62          16.66667           0
#> 4578:   8900     684    1000              31          16.66667           0
#> 4579:   3880     294    1000              34          16.66667           1
#> 4580:   4830     911    1000              73          16.66667           0
#>       cnt_males cnt_highincome mixed
#>           <num>          <num> <int>
#>    1:         1              0    -7
#>    2:         0              0    -9
#>    3:         1              0     2
#>    4:         1              0    -7
#>    5:         1              1    -3
#>   ---                               
#> 4576:         0              0    -6
#> 4577:         1              0   -10
#> 4578:         1              0     9
#> 4579:         0              0     1
#> 4580:         1              0     0

# create record keys
x$rkey <- ck_generate_rkeys(dat = x)

# define required inputs

# hierarchy with some bogus codes
d_sex <- hier_create(root = "Total", nodes = c("male", "female"))
d_sex <- hier_add(d_sex, root = "female", "f")
d_sex <- hier_add(d_sex, root = "male", "m")

d_age <- hier_create(root = "Total", nodes = paste0("age_group", 1:6))
d_age <- hier_add(d_age, root = "age_group1", "ag1a")
d_age <- hier_add(d_age, root = "age_group2", "ag2a")

# define the cell key object
countvars <- c("cnt_females", "cnt_males", "cnt_highincome")
numvars <- c("expend", "income", "savings", "mixed")
tab <- ck_setup(
  x = x,
  rkey = "rkey",
  dims = list(sex = d_sex, age = d_age),
  w = "sampling_weight",
  countvars = countvars,
  numvars = numvars)
#> computing contributing indices | rawdata <--> table; this might take a while

# show some information about this table instance
tab$print() # identical to print(tab)
#> ── Table Information ───────────────────────────────────────────────────────────
#> ✔ 45 cells in 2 dimensions ('sex', 'age')
#> ✔ weights: yes
#> ── Tabulated / Perturbed countvars ─────────────────────────────────────────────
#> ☐ 'total'
#> ☐ 'cnt_females'
#> ☐ 'cnt_males'
#> ☐ 'cnt_highincome'
#> ── Tabulated / Perturbed numvars ───────────────────────────────────────────────
#> ☐ 'expend'
#> ☐ 'income'
#> ☐ 'savings'
#> ☐ 'mixed'

# information about the hierarchies
tab$hierarchy_info()
#> $sex
#>      code level is_leaf parent
#>    <char> <int>  <lgcl> <char>
#> 1:  Total     1   FALSE  Total
#> 2:   male     2   FALSE  Total
#> 3:      m     3    TRUE   male
#> 4: female     2   FALSE  Total
#> 5:      f     3    TRUE female
#> 
#> $age
#>          code level is_leaf     parent
#>        <char> <int>  <lgcl>     <char>
#> 1:      Total     1   FALSE      Total
#> 2: age_group1     2   FALSE      Total
#> 3:       ag1a     3    TRUE age_group1
#> 4: age_group2     2   FALSE      Total
#> 5:       ag2a     3    TRUE age_group2
#> 6: age_group3     2    TRUE      Total
#> 7: age_group4     2    TRUE      Total
#> 8: age_group5     2    TRUE      Total
#> 9: age_group6     2    TRUE      Total
#> 

# which variables have been defined?
tab$allvars()
#> $cntvars
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"
#> 
#> $numvars
#> [1] "expend"  "income"  "savings" "mixed"  
#> 

# count variables
tab$cntvars()
#> [1] "total"          "cnt_females"    "cnt_males"      "cnt_highincome"

# continuous variables
tab$numvars()
#> [1] "expend"  "income"  "savings" "mixed"  

# create perturbation parameters for "total" variable and
# write to yaml-file

# create a ptable using functionality from the ptable-pkg
f_yaml <- tempfile(fileext = ".yaml")
p_cnts1 <- ck_params_cnts(
  ptab = ptable::pt_ex_cnts(),
  path = f_yaml)
#> yaml configuration '/tmp/Rtmpu511Tl/file1fa54433c4bb.yaml' successfully written.

# read parameters from yaml-file and set them for variable `"total"`
p_cnts1 <- ck_read_yaml(path = f_yaml)

tab$params_cnts_set(val = p_cnts1, v = "total")
#> --> setting perturbation parameters for variable 'total'

# create alternative perturbation parameters by specifying parameters
para2 <- ptable::create_cnt_ptable(
  D = 8, V = 3, js = 2, create = FALSE)

p_cnts2 <- ck_params_cnts(ptab = para2)

# use this ptable for the remaining variables
tab$params_cnts_set(val = p_cnts2, v = countvars)
#> --> setting perturbation parameters for variable 'cnt_females'
#> --> setting perturbation parameters for variable 'cnt_males'
#> --> setting perturbation parameters for variable 'cnt_highincome'

# perturb a variable
tab$perturb(v = "total")
#> Count variable 'total' was perturbed.

# multiple variables can be perturbed as well
tab$perturb(v = c("cnt_males", "cnt_highincome"))
#> Count variable 'cnt_males' was perturbed.
#> Count variable 'cnt_highincome' was perturbed.

# return weighted and unweighted results
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 272686  4581 272745.5384
#>  2:  Total age_group1     total  1969 117288  1970 117347.5673
#>  3:  Total       ag1a     total  1969 117288  1970 117347.5673
#>  4:  Total age_group2     total  1143  68941  1142  68880.6842
#>  5:  Total       ag2a     total  1143  68941  1142  68880.6842
#>  6:  Total age_group3     total   864  51132   863  51072.8194
#>  7:  Total age_group4     total   423  24549   421  24432.9291
#>  8:  Total age_group5     total   168  10075   168  10075.0000
#>  9:  Total age_group6     total    13    701    12    647.0769
#> 10:   male      Total     total  2296 137868  2295 137807.9530
#> 11:      m      Total     total  2296 137868  2295 137807.9530
#> 12:   male age_group1     total  1015  60570  1015  60570.0000
#> 13:      m age_group1     total  1015  60570  1015  60570.0000
#> 14:   male       ag1a     total  1015  60570  1015  60570.0000
#> 15:      m       ag1a     total  1015  60570  1015  60570.0000
#> 16:   male age_group2     total   571  34704   571  34704.0000
#> 17:      m age_group2     total   571  34704   571  34704.0000
#> 18:   male       ag2a     total   571  34704   571  34704.0000
#> 19:      m       ag2a     total   571  34704   571  34704.0000
#> 20:   male age_group3     total   424  25995   424  25995.0000
#> 21:      m age_group3     total   424  25995   424  25995.0000
#> 22:   male age_group4     total   195  11353   197  11469.4410
#> 23:      m age_group4     total   195  11353   197  11469.4410
#> 24:   male age_group5     total    84   4788    82   4674.0000
#> 25:      m age_group5     total    84   4788    82   4674.0000
#> 26:   male age_group6     total     7    458     6    392.5714
#> 27:      m age_group6     total     7    458     6    392.5714
#> 28: female      Total     total  2284 134818  2284 134818.0000
#> 29:      f      Total     total  2284 134818  2284 134818.0000
#> 30: female age_group1     total   954  56718   953  56658.5472
#> 31:      f age_group1     total   954  56718   953  56658.5472
#> 32: female       ag1a     total   954  56718   953  56658.5472
#> 33:      f       ag1a     total   954  56718   953  56658.5472
#> 34: female age_group2     total   572  34237   573  34296.8549
#> 35:      f age_group2     total   572  34237   573  34296.8549
#> 36: female       ag2a     total   572  34237   573  34296.8549
#> 37:      f       ag2a     total   572  34237   573  34296.8549
#> 38: female age_group3     total   440  25137   441  25194.1295
#> 39:      f age_group3     total   440  25137   441  25194.1295
#> 40: female age_group4     total   228  13196   227  13138.1228
#> 41:      f age_group4     total   228  13196   227  13138.1228
#> 42: female age_group5     total    84   5287    84   5287.0000
#> 43:      f age_group5     total    84   5287    84   5287.0000
#> 44: female age_group6     total     6    243     8    324.0000
#> 45:      f age_group6     total     6    243     8    324.0000
#> 46:  Total      Total cnt_males  2296 137868  2295 137807.9530
#> 47:  Total age_group1 cnt_males  1015  60570  1015  60570.0000
#> 48:  Total       ag1a cnt_males  1015  60570  1015  60570.0000
#> 49:  Total age_group2 cnt_males   571  34704   570  34643.2224
#> 50:  Total       ag2a cnt_males   571  34704   570  34643.2224
#> 51:  Total age_group3 cnt_males   424  25995   423  25933.6910
#> 52:  Total age_group4 cnt_males   195  11353   198  11527.6615
#> 53:  Total age_group5 cnt_males    84   4788    81   4617.0000
#> 54:  Total age_group6 cnt_males     7    458     5    327.1429
#> 55:   male      Total cnt_males  2296 137868  2295 137807.9530
#> 56:      m      Total cnt_males  2296 137868  2295 137807.9530
#> 57:   male age_group1 cnt_males  1015  60570  1015  60570.0000
#> 58:      m age_group1 cnt_males  1015  60570  1015  60570.0000
#> 59:   male       ag1a cnt_males  1015  60570  1015  60570.0000
#> 60:      m       ag1a cnt_males  1015  60570  1015  60570.0000
#> 61:   male age_group2 cnt_males   571  34704   570  34643.2224
#> 62:      m age_group2 cnt_males   571  34704   570  34643.2224
#> 63:   male       ag2a cnt_males   571  34704   570  34643.2224
#> 64:      m       ag2a cnt_males   571  34704   570  34643.2224
#> 65:   male age_group3 cnt_males   424  25995   423  25933.6910
#> 66:      m age_group3 cnt_males   424  25995   423  25933.6910
#> 67:   male age_group4 cnt_males   195  11353   198  11527.6615
#> 68:      m age_group4 cnt_males   195  11353   198  11527.6615
#> 69:   male age_group5 cnt_males    84   4788    81   4617.0000
#> 70:      m age_group5 cnt_males    84   4788    81   4617.0000
#> 71:   male age_group6 cnt_males     7    458     5    327.1429
#> 72:      m age_group6 cnt_males     7    458     5    327.1429
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc

# numerical variables (positive variables using flex-function)
# we also write the config to a yaml file
f_yaml <- tempfile(fileext = ".yaml")

# create a ptable using functionality from the ptable-pkg
# a single ptable for all cells
ptab1 <- ptable::pt_ex_nums(parity = TRUE, separation = FALSE)

# a single ptab for all cells except for very small ones
ptab2 <- ptable::pt_ex_nums(parity = TRUE, separation = TRUE)

# different ptables for cells with even/odd number of contributors
# and very small cells
ptab3 <- ptable::pt_ex_nums(parity = FALSE, separation = TRUE)

p_nums1 <- ck_params_nums(
  ptab = ptab1,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.30, 0.03),
    epsilon = c(1, 0.5, 0.2),
    q = 3),
  mu_c = 2,
  same_key = FALSE,
  use_zero_rkeys = FALSE,
  path = f_yaml)
#> yaml configuration '/tmp/Rtmpu511Tl/file1fa56c97848d.yaml' successfully written.

# we read the parameters from the yaml-file
p_nums1 <- ck_read_yaml(path = f_yaml)

# for variables with positive and negative values
p_nums2 <- ck_params_nums(
  ptab = ptab2,
  type = "top_contr",
  top_k = 3,
  mult_params = ck_flexparams(
    fp = 1000,
    p = c(0.15, 0.02),
    epsilon = c(1, 0.4, 0.15),
    q = 3),
  mu_c = 2,
  same_key = FALSE)

# simple perturbation parameters (not using the flex-function approach)
p_nums3 <- ck_params_nums(
  ptab = ptab3,
  type = "mean",
  mult_params = ck_simpleparams(p = 0.25),
  mu_c = 2,
  same_key = FALSE)

# use `p_nums1` for the variables `savings`, `income` and `expend`
tab$params_nums_set(p_nums1, c("savings", "income", "expend"))
#> --> setting perturbation parameters for variable 'savings'
#> --> setting perturbation parameters for variable 'income'
#> --> setting perturbation parameters for variable 'expend'

# use different parameters for variable `mixed`
tab$params_nums_set(p_nums2, v = "mixed")
#> --> setting perturbation parameters for variable 'mixed'

# identify sensitive cells to which extra protection (`mu_c`) is added.
tab$supp_p(v = "income", p = 85)
#> computing contributing indices | rawdata <--> table; this might take a while
#> p%-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_pq(v = "income", p = 85, q = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> pq-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_nk(v = "income", n = 2, k = 90)
#> computing contributing indices | rawdata <--> table; this might take a while
#> nk-rule: 0 new sensitive cells (incl. duplicates) found (total: 0)
tab$supp_freq(v = "income", n = 14, weighted = FALSE)
#> freq-rule: 5 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_val(v = "income", n = 10000, weighted = TRUE)
#> val-rule: 0 new sensitive cells (incl. duplicates) found (total: 5)
tab$supp_cells(
  v = "income",
  inp = data.frame(
    sex = c("female", "female"),
    "age" = c("age_group1", "age_group3")
  )
)
#> cell-rule: 2 new sensitive cells (incl. duplicates) found (total: 7)

# perturb variables
tab$perturb(v = c("income", "savings"))
#> Numeric variable 'income' was perturbed.
#> Numeric variable 'savings' was perturbed.

# extract results
tab$numtab("income", mean_before_sum = TRUE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1367927750 1367774737
#>  2:  Total age_group1 income  9810547  586291459  586497067
#>  3:  Total       ag1a income  9810547  586291459  586497067
#>  4:  Total age_group2 income  5692119  341887456  342013470
#>  5:  Total       ag2a income  5692119  341887456  342013470
#>  6:  Total age_group3 income  4406946  260895581  261028817
#>  7:  Total age_group4 income  2133543  123691715  123775978
#>  8:  Total age_group5 income   848151   51976816   51972573
#>  9:  Total age_group6 income    61672    3184723    3026105
#> 10:   male      Total income 11262049  677390855  677499342
#> 11:      m      Total income 11262049  677390855  677499342
#> 12:   male age_group1 income  4877164  290785366  290749848
#> 13:      m age_group1 income  4877164  290785366  290749848
#> 14:   male       ag1a income  4877164  290785366  290749848
#> 15:      m       ag1a income  4877164  290785366  290749848
#> 16:   male age_group2 income  2811379  171754899  171787838
#> 17:      m age_group2 income  2811379  171754899  171787838
#> 18:   male       ag2a income  2811379  171754899  171787838
#> 19:      m       ag2a income  2811379  171754899  171787838
#> 20:   male age_group3 income  2168169  132170466  132182213
#> 21:      m age_group3 income  2168169  132170466  132182213
#> 22:   male age_group4 income   978510   57876940   57915815
#> 23:      m age_group4 income   978510   57876940   57915815
#> 24:   male age_group5 income   393134   22740934   22753298
#> 25:      m age_group5 income   393134   22740934   22753298
#> 26:   male age_group6 income    33693    2062250    2169036
#> 27:      m age_group6 income    33693    2062250    2169036
#> 28: female      Total income 11690929  690536895  690529591
#> 29:      f      Total income 11690929  690536895  690529591
#> 30: female age_group1 income  4933383  295506093  295561848
#> 31:      f age_group1 income  4933383  295506093  295561848
#> 32: female       ag1a income  4933383  295506093  295561848
#> 33:      f       ag1a income  4933383  295506093  295561848
#> 34: female age_group2 income  2880740  170132557  170012752
#> 35:      f age_group2 income  2880740  170132557  170012752
#> 36: female       ag2a income  2880740  170132557  170012752
#> 37:      f       ag2a income  2880740  170132557  170012752
#> 38: female age_group3 income  2238777  128725115  128680643
#> 39:      f age_group3 income  2238777  128725115  128680643
#> 40: female age_group4 income  1155033   65814775   65778356
#> 41:      f age_group4 income  1155033   65814775   65778356
#> 42: female age_group5 income   455017   29235882   29125162
#> 43:      f age_group5 income   455017   29235882   29125162
#> 44: female age_group6 income    27979    1122473    1135584
#> 45:      f age_group6 income    27979    1122473    1135584
#>        sex        age  vname      uws         ws        pws
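
# a hedged aside (not part of the original example): the relative deviation
# of the perturbed weighted sum (`pws`) from the unperturbed one (`ws`) can be
# computed with base R; the three rows below are copied by hand from the
# `mean_before_sum = TRUE` output above, and `chk` and `rel_dev` are
# illustrative names only
chk <- data.frame(
  age = c("Total", "age_group5", "age_group6"),
  ws  = c(1367927750, 51976816, 3184723),
  pws = c(1367774737, 51972573, 3026105))
chk$rel_dev <- (chk$pws - chk$ws) / chk$ws
chk[order(abs(chk$rel_dev)), ]
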
tab$numtab("income", mean_before_sum = FALSE)
#>        sex        age  vname      uws         ws        pws
#>     <char>     <char> <char>    <num>      <num>      <num>
#>  1:  Total      Total income 22952978 1367927750 1367851241
#>  2:  Total age_group1 income  9810547  586291459  586394254
#>  3:  Total       ag1a income  9810547  586291459  586394254
#>  4:  Total age_group2 income  5692119  341887456  341950457
#>  5:  Total       ag2a income  5692119  341887456  341950457
#>  6:  Total age_group3 income  4406946  260895581  260962191
#>  7:  Total age_group4 income  2133543  123691715  123733840
#>  8:  Total age_group5 income   848151   51976816   51974694
#>  9:  Total age_group6 income    61672    3184723    3104401
#> 10:   male      Total income 11262049  677390855  677445096
#> 11:      m      Total income 11262049  677390855  677445096
#> 12:   male age_group1 income  4877164  290785366  290767607
#> 13:      m age_group1 income  4877164  290785366  290767607
#> 14:   male       ag1a income  4877164  290785366  290767607
#> 15:      m       ag1a income  4877164  290785366  290767607
#> 16:   male age_group2 income  2811379  171754899  171771367
#> 17:      m age_group2 income  2811379  171754899  171771367
#> 18:   male       ag2a income  2811379  171754899  171771367
#> 19:      m       ag2a income  2811379  171754899  171771367
#> 20:   male age_group3 income  2168169  132170466  132176340
#> 21:      m age_group3 income  2168169  132170466  132176340
#> 22:   male age_group4 income   978510   57876940   57896374
#> 23:      m age_group4 income   978510   57876940   57896374
#> 24:   male age_group5 income   393134   22740934   22747115
#> 25:      m age_group5 income   393134   22740934   22747115
#> 26:   male age_group6 income    33693    2062250    2114969
#> 27:      m age_group6 income    33693    2062250    2114969
#> 28: female      Total income 11690929  690536895  690533243
#> 29:      f      Total income 11690929  690536895  690533243
#> 30: female age_group1 income  4933383  295506093  295533969
#> 31:      f age_group1 income  4933383  295506093  295533969
#> 32: female       ag1a income  4933383  295506093  295533969
#> 33:      f       ag1a income  4933383  295506093  295533969
#> 34: female age_group2 income  2880740  170132557  170072644
#> 35:      f age_group2 income  2880740  170132557  170072644
#> 36: female       ag2a income  2880740  170132557  170072644
#> 37:      f       ag2a income  2880740  170132557  170072644
#> 38: female age_group3 income  2238777  128725115  128702877
#> 39:      f age_group3 income  2238777  128725115  128702877
#> 40: female age_group4 income  1155033   65814775   65796563
#> 41:      f age_group4 income  1155033   65814775   65796563
#> 42: female age_group5 income   455017   29235882   29180469
#> 43:      f age_group5 income   455017   29235882   29180469
#> 44: female age_group6 income    27979    1122473    1129009
#> 45:      f age_group6 income    27979    1122473    1129009
#>        sex        age  vname      uws         ws        pws
tab$numtab("savings")
#>        sex        age   vname     uws        ws         pws
#>     <char>     <char>  <char>   <num>     <num>       <num>
#>  1:  Total      Total savings 2273532 136102860 136097984.5
#>  2:  Total age_group1 savings  982386  59040646  59031680.8
#>  3:  Total       ag1a savings  982386  59040646  59031680.8
#>  4:  Total age_group2 savings  552336  33278908  33281863.8
#>  5:  Total       ag2a savings  552336  33278908  33281863.8
#>  6:  Total age_group3 savings  437101  26060353  26065012.5
#>  7:  Total age_group4 savings  214661  12539034  12542412.2
#>  8:  Total age_group5 savings   80451   4835524   4835218.1
#>  9:  Total age_group6 savings    6597    348395    344777.2
#> 10:   male      Total savings 1159816  70215220  70217700.1
#> 11:      m      Total savings 1159816  70215220  70217700.1
#> 12:   male age_group1 savings  517660  31254331  31250218.6
#> 13:      m age_group1 savings  517660  31254331  31250218.6
#> 14:   male       ag1a savings  517660  31254331  31250218.6
#> 15:      m       ag1a savings  517660  31254331  31250218.6
#> 16:   male age_group2 savings  280923  17085858  17089070.8
#> 17:      m age_group2 savings  280923  17085858  17089070.8
#> 18:   male       ag2a savings  280923  17085858  17089070.8
#> 19:      m       ag2a savings  280923  17085858  17089070.8
#> 20:   male age_group3 savings  214970  13300653  13299129.9
#> 21:      m age_group3 savings  214970  13300653  13299129.9
#> 22:   male age_group4 savings   99420   5826952   5828408.7
#> 23:      m age_group4 savings   99420   5826952   5828408.7
#> 24:   male age_group5 savings   43233   2511432   2511919.0
#> 25:      m age_group5 savings   43233   2511432   2511919.0
#> 26:   male age_group6 savings    3610    235994    236862.2
#> 27:      m age_group6 savings    3610    235994    236862.2
#> 28: female      Total savings 1113716  65887640  65885982.8
#> 29:      f      Total savings 1113716  65887640  65885982.8
#> 30: female age_group1 savings  464726  27786315  27785528.7
#> 31:      f age_group1 savings  464726  27786315  27785528.7
#> 32: female       ag1a savings  464726  27786315  27785528.7
#> 33:      f       ag1a savings  464726  27786315  27785528.7
#> 34: female age_group2 savings  271413  16193050  16189740.6
#> 35:      f age_group2 savings  271413  16193050  16189740.6
#> 36: female       ag2a savings  271413  16193050  16189740.6
#> 37:      f       ag2a savings  271413  16193050  16189740.6
#> 38: female age_group3 savings  222131  12759700  12758283.1
#> 39:      f age_group3 savings  222131  12759700  12758283.1
#> 40: female age_group4 savings  115241   6712082   6712501.1
#> 41:      f age_group4 savings  115241   6712082   6712501.1
#> 42: female age_group5 savings   37218   2324092   2318978.4
#> 43:      f age_group5 savings   37218   2324092   2318978.4
#> 44: female age_group6 savings    2987    112401    112641.0
#> 45:      f age_group6 savings    2987    112401    112641.0
#>        sex        age   vname     uws        ws         pws
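
# A hedged aside, not part of the original example: the perturbation
# statistics that tab$summary() reports for numeric variables appear to
# describe the difference pws - ws. This is inferred from the printed values,
# not taken from the package source. `savings_sketch` is a hypothetical name;
# its values are copied (as rounded in the printout) from the table above.
savings_sketch <- data.frame(
  age = c("Total", "age_group1", "age_group2"),
  ws  = c(136102860, 59040646, 33278908),
  pws = c(136097984.5, 59031680.8, 33281863.8)
)
# absolute perturbation per cell
savings_sketch$diff <- savings_sketch$pws - savings_sketch$ws
# perturbation relative to the original weighted sum
savings_sketch$rel_diff <- savings_sketch$diff / savings_sketch$ws
savings_sketch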

# results can be reset, too
tab$reset_cntvars(v = "cnt_males")

# we can then set other parameters and perturb again
tab$params_cnts_set(val = p_cnts1, v = "cnt_males")
#> --> setting perturbation parameters for variable 'cnt_males'

tab$perturb(v = "cnt_males")
#> Count variable 'cnt_males' was perturbed.

# write results to a .csv file
tab$freqtab(
  v = c("total", "cnt_males"),
  path = file.path(tempdir(), "outtab.csv")
)
#> File '/tmp/Rtmpu511Tl/outtab.csv' successfully written to disk.
#> NULL

# show a table containing weighted and unweighted counts
tab$freqtab(v = c("total", "cnt_males"))
#>        sex        age     vname   uwc     wc  puwc         pwc
#>     <char>     <char>    <char> <num>  <num> <num>       <num>
#>  1:  Total      Total     total  4580 272686  4581 272745.5384
#>  2:  Total age_group1     total  1969 117288  1970 117347.5673
#>  3:  Total       ag1a     total  1969 117288  1970 117347.5673
#>  4:  Total age_group2     total  1143  68941  1142  68880.6842
#>  5:  Total       ag2a     total  1143  68941  1142  68880.6842
#>  6:  Total age_group3     total   864  51132   863  51072.8194
#>  7:  Total age_group4     total   423  24549   421  24432.9291
#>  8:  Total age_group5     total   168  10075   168  10075.0000
#>  9:  Total age_group6     total    13    701    12    647.0769
#> 10:   male      Total     total  2296 137868  2295 137807.9530
#> 11:      m      Total     total  2296 137868  2295 137807.9530
#> 12:   male age_group1     total  1015  60570  1015  60570.0000
#> 13:      m age_group1     total  1015  60570  1015  60570.0000
#> 14:   male       ag1a     total  1015  60570  1015  60570.0000
#> 15:      m       ag1a     total  1015  60570  1015  60570.0000
#> 16:   male age_group2     total   571  34704   571  34704.0000
#> 17:      m age_group2     total   571  34704   571  34704.0000
#> 18:   male       ag2a     total   571  34704   571  34704.0000
#> 19:      m       ag2a     total   571  34704   571  34704.0000
#> 20:   male age_group3     total   424  25995   424  25995.0000
#> 21:      m age_group3     total   424  25995   424  25995.0000
#> 22:   male age_group4     total   195  11353   197  11469.4410
#> 23:      m age_group4     total   195  11353   197  11469.4410
#> 24:   male age_group5     total    84   4788    82   4674.0000
#> 25:      m age_group5     total    84   4788    82   4674.0000
#> 26:   male age_group6     total     7    458     6    392.5714
#> 27:      m age_group6     total     7    458     6    392.5714
#> 28: female      Total     total  2284 134818  2284 134818.0000
#> 29:      f      Total     total  2284 134818  2284 134818.0000
#> 30: female age_group1     total   954  56718   953  56658.5472
#> 31:      f age_group1     total   954  56718   953  56658.5472
#> 32: female       ag1a     total   954  56718   953  56658.5472
#> 33:      f       ag1a     total   954  56718   953  56658.5472
#> 34: female age_group2     total   572  34237   573  34296.8549
#> 35:      f age_group2     total   572  34237   573  34296.8549
#> 36: female       ag2a     total   572  34237   573  34296.8549
#> 37:      f       ag2a     total   572  34237   573  34296.8549
#> 38: female age_group3     total   440  25137   441  25194.1295
#> 39:      f age_group3     total   440  25137   441  25194.1295
#> 40: female age_group4     total   228  13196   227  13138.1228
#> 41:      f age_group4     total   228  13196   227  13138.1228
#> 42: female age_group5     total    84   5287    84   5287.0000
#> 43:      f age_group5     total    84   5287    84   5287.0000
#> 44: female age_group6     total     6    243     8    324.0000
#> 45:      f age_group6     total     6    243     8    324.0000
#> 46:  Total      Total cnt_males  2296 137868  2295 137807.9530
#> 47:  Total age_group1 cnt_males  1015  60570  1015  60570.0000
#> 48:  Total       ag1a cnt_males  1015  60570  1015  60570.0000
#> 49:  Total age_group2 cnt_males   571  34704   571  34704.0000
#> 50:  Total       ag2a cnt_males   571  34704   571  34704.0000
#> 51:  Total age_group3 cnt_males   424  25995   424  25995.0000
#> 52:  Total age_group4 cnt_males   195  11353   197  11469.4410
#> 53:  Total age_group5 cnt_males    84   4788    82   4674.0000
#> 54:  Total age_group6 cnt_males     7    458     6    392.5714
#> 55:   male      Total cnt_males  2296 137868  2295 137807.9530
#> 56:      m      Total cnt_males  2296 137868  2295 137807.9530
#> 57:   male age_group1 cnt_males  1015  60570  1015  60570.0000
#> 58:      m age_group1 cnt_males  1015  60570  1015  60570.0000
#> 59:   male       ag1a cnt_males  1015  60570  1015  60570.0000
#> 60:      m       ag1a cnt_males  1015  60570  1015  60570.0000
#> 61:   male age_group2 cnt_males   571  34704   571  34704.0000
#> 62:      m age_group2 cnt_males   571  34704   571  34704.0000
#> 63:   male       ag2a cnt_males   571  34704   571  34704.0000
#> 64:      m       ag2a cnt_males   571  34704   571  34704.0000
#> 65:   male age_group3 cnt_males   424  25995   424  25995.0000
#> 66:      m age_group3 cnt_males   424  25995   424  25995.0000
#> 67:   male age_group4 cnt_males   195  11353   197  11469.4410
#> 68:      m age_group4 cnt_males   195  11353   197  11469.4410
#> 69:   male age_group5 cnt_males    84   4788    82   4674.0000
#> 70:      m age_group5 cnt_males    84   4788    82   4674.0000
#> 71:   male age_group6 cnt_males     7    458     6    392.5714
#> 72:      m age_group6 cnt_males     7    458     6    392.5714
#> 73: female      Total cnt_males     0      0     0      0.0000
#> 74:      f      Total cnt_males     0      0     0      0.0000
#> 75: female age_group1 cnt_males     0      0     0      0.0000
#> 76:      f age_group1 cnt_males     0      0     0      0.0000
#> 77: female       ag1a cnt_males     0      0     0      0.0000
#> 78:      f       ag1a cnt_males     0      0     0      0.0000
#> 79: female age_group2 cnt_males     0      0     0      0.0000
#> 80:      f age_group2 cnt_males     0      0     0      0.0000
#> 81: female       ag2a cnt_males     0      0     0      0.0000
#> 82:      f       ag2a cnt_males     0      0     0      0.0000
#> 83: female age_group3 cnt_males     0      0     0      0.0000
#> 84:      f age_group3 cnt_males     0      0     0      0.0000
#> 85: female age_group4 cnt_males     0      0     0      0.0000
#> 86:      f age_group4 cnt_males     0      0     0      0.0000
#> 87: female age_group5 cnt_males     0      0     0      0.0000
#> 88:      f age_group5 cnt_males     0      0     0      0.0000
#> 89: female age_group6 cnt_males     0      0     0      0.0000
#> 90:      f age_group6 cnt_males     0      0     0      0.0000
#>        sex        age     vname   uwc     wc  puwc         pwc
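
# A hedged aside, not part of the original example: the printed values are
# consistent with the perturbed weighted count being the perturbed unweighted
# count scaled by the average weight of the cell, pwc = puwc * wc / uwc.
# This is inferred from the output above, not taken from the package source.
# `cells` is a hypothetical name; its values are copied from the table above.
cells <- data.frame(
  uwc  = c(4580, 13, 195),
  wc   = c(272686, 701, 11353),
  puwc = c(4581, 12, 197)
)
# cells with uwc == 0 would give NaN here and need separate handling
cells$pwc_sketch <- cells$puwc * cells$wc / cells$uwc
round(cells$pwc_sketch, 4)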

# utility measures for a count variable
tab$measures_cnts(v = "total", exclude_zeros = TRUE)
#> $overview
#>     noise   cnt        pct
#>    <fctr> <int>      <num>
#> 1:     -2     4 0.08888889
#> 2:     -1     9 0.20000000
#> 3:      0    15 0.33333333
#> 4:      1    14 0.31111111
#> 5:      2     3 0.06666667
#> 
#> $measures
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.000 0.011
#>  6:   Mean 0.822 0.025 0.046
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.019
#>  9:    Q70 1.000 0.002 0.023
#> 10:    Q80 1.000 0.006 0.053
#> 11:    Q90 2.000 0.056 0.129
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.333 0.379
#> 14:    Max 2.000 0.333 0.379
#> 
#> $cumdistr_d1
#>       cat   cnt       pct
#>    <char> <int>     <num>
#> 1:      0    15 0.3333333
#> 2:      1    38 0.8444444
#> 3:      2    45 1.0000000
#> 
#> $cumdistr_d2
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    38 0.8444444
#> 2: (0.02,0.05]    40 0.8888889
#> 3:  (0.05,0.1]    41 0.9111111
#> 4:   (0.1,0.2]    43 0.9555556
#> 5:   (0.2,0.3]    43 0.9555556
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $cumdistr_d3
#>            cat   cnt       pct
#>         <char> <int>     <num>
#> 1:    [0,0.02]    27 0.6000000
#> 2: (0.02,0.05]    36 0.8000000
#> 3:  (0.05,0.1]    38 0.8444444
#> 4:   (0.1,0.2]    43 0.9555556
#> 5:   (0.2,0.3]    43 0.9555556
#> 6:   (0.3,0.4]    45 1.0000000
#> 7:   (0.4,0.5]    45 1.0000000
#> 8:   (0.5,Inf]    45 1.0000000
#> 
#> $false_zero
#> [1] 0
#> 
#> $false_nonzero
#> [1] 0
#> 
#> $exclude_zeros
#> [1] TRUE
#> 
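
# A hedged aside, not part of the original example: the values in $measures
# are consistent with the following distances between original (orig) and
# perturbed (pert) unweighted counts:
#   d1 = |pert - orig|
#   d2 = |pert - orig| / orig
#   d3 = |sqrt(pert) - sqrt(orig)|
# `orig` and `pert` are hypothetical names; the values are four cells copied
# from the freqtab output for 'total' (6 -> 8, 13 -> 12, 84 -> 82, 4580 -> 4581).
orig <- c(6, 13, 84, 4580)
pert <- c(8, 12, 82, 4581)
d1 <- abs(pert - orig)
d2 <- d1 / orig
d3 <- abs(sqrt(pert) - sqrt(orig))
round(cbind(d1, d2, d3), 3)
# the first cell (female, age_group6) yields d1 = 2, d2 = 0.333, d3 = 0.379,
# which agrees with the Max row reported above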

# modifications for perturbed count variables
tab$mod_cnts()
#>         sex        age row_nr  pert      ckey  countvar
#>      <char>     <char>  <num> <int>     <num>    <char>
#>   1:  Total      Total     16     1 0.7027340     total
#>   2:  Total age_group1     16     1 0.6973654     total
#>   3:  Total       ag1a     16     1 0.6973654     total
#>   4:  Total age_group2     14    -1 0.1872797     total
#>   5:  Total       ag2a     14    -1 0.1872797     total
#>  ---                                                   
#> 131:      f age_group4     -1     0 0.0000000 cnt_males
#> 132: female age_group5     -1     0 0.0000000 cnt_males
#> 133:      f age_group5     -1     0 0.0000000 cnt_males
#> 134: female age_group6     -1     0 0.0000000 cnt_males
#> 135:      f age_group6     -1     0 0.0000000 cnt_males

# display a summary of the utility measures
tab$summary()
#> ┌──────────────────────────────────────────────┐
#> │Utility measures for perturbed count variables│
#> └──────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#>          countvar   Min   Q10   Q20   Q30   Q40   Mean Median   Q60   Q70   Q80
#>            <char> <num> <num> <num> <num> <num>  <num>  <num> <num> <num> <num>
#> 1:          total    -2    -1  -1.0    -1     0 -0.067      0     0     0     1
#> 2: cnt_highincome    -4    -2  -1.0     0     0  0.289      0     1     1     2
#> 3:      cnt_males    -2    -1  -0.2     0     0 -0.133      0     0     0     0
#>      Q90   Q95   Q99   Max
#>    <num> <num> <num> <num>
#> 1:     1   2.0     2     2
#> 2:     2   2.0     5     5
#> 3:     0   1.6     2     2
#> 
#> ── Distance-based measures ─────────────────────────────────────────────────────
#> ✔ Variable: 'total'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 1.000 0.000 0.011
#>  6:   Mean 0.822 0.025 0.046
#>  7: Median 1.000 0.001 0.016
#>  8:    Q60 1.000 0.001 0.019
#>  9:    Q70 1.000 0.002 0.023
#> 10:    Q80 1.000 0.006 0.053
#> 11:    Q90 2.000 0.056 0.129
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.333 0.379
#> 14:    Max 2.000 0.333 0.379
#> 
#> ✔ Variable: 'cnt_males'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 0.000 0.000 0.000
#>  4:    Q30 0.000 0.000 0.000
#>  5:    Q40 0.000 0.000 0.000
#>  6:   Mean 0.667 0.020 0.043
#>  7: Median 0.000 0.000 0.000
#>  8:    Q60 1.000 0.000 0.010
#>  9:    Q70 1.000 0.010 0.071
#> 10:    Q80 1.800 0.021 0.102
#> 11:    Q90 2.000 0.071 0.144
#> 12:    Q95 2.000 0.143 0.196
#> 13:    Q99 2.000 0.143 0.196
#> 14:    Max 2.000 0.143 0.196
#> 
#> ✔ Variable: 'cnt_highincome'
#> 
#>       what    d1    d2    d3
#>     <char> <num> <num> <num>
#>  1:    Min 0.000 0.000 0.000
#>  2:    Q10 0.000 0.000 0.000
#>  3:    Q20 1.000 0.005 0.034
#>  4:    Q30 1.000 0.009 0.066
#>  5:    Q40 1.000 0.018 0.066
#>  6:   Mean 1.475 0.038 0.101
#>  7: Median 1.000 0.020 0.088
#>  8:    Q60 2.000 0.022 0.102
#>  9:    Q70 2.000 0.025 0.112
#> 10:    Q80 2.000 0.051 0.158
#> 11:    Q90 2.100 0.143 0.183
#> 12:    Q95 4.050 0.143 0.196
#> 13:    Q99 5.000 0.230 0.430
#> 14:    Max 5.000 0.286 0.579
#> 
#> ┌──────────────────────────────────────────────────┐
#> │Utility measures for perturbed numerical variables│
#> └──────────────────────────────────────────────────┘
#> ── Distribution statistics of perturbations ────────────────────────────────────
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      vname        Min        Q10        Q20        Q30       Q40     Mean
#>     <char>      <num>      <num>      <num>      <num>     <num>    <num>
#> 1:  expend        Inf         NA         NA         NA        NA      NaN
#> 2:  income -80322.053 -59913.190 -22237.695 -17759.356 -2733.890 5311.218
#> 3: savings  -8965.189  -4570.259  -3716.747  -2978.984 -1459.402 -895.135
#> 4:   mixed        Inf         NA         NA         NA        NA      NaN
#>      Median       Q60       Q70      Q80       Q90       Q95        Q99
#>       <num>     <num>     <num>    <num>     <num>     <num>      <num>
#> 1:       NA        NA        NA       NA        NA        NA         NA
#> 2: 6181.025 16468.481 26187.802 44243.43 59497.280 65888.031 102794.835
#> 3: -786.321   311.611   791.997  2480.08  3212.818  3212.818   4095.742
#> 4:       NA        NA        NA       NA        NA        NA         NA
#>           Max
#>         <num>
#> 1:       -Inf
#> 2: 102794.835
#> 3:   4659.522
#> 4:       -Inf
# }