Distance-based disclosure risk estimation via standard deviation-based intervals around observations.
dRisk(obj, ...)
a data.frame or object of class sdcMicroObj-class
possible arguments are:
xm: perturbed data
k: percentage of the standard deviation
The disclosure risk and/or the modified sdcMicroObj-class object.
An interval (based on the standard deviation) is built around each value of the perturbed data. We then check whether the original values lie within these intervals. The parameter k enlarges or shrinks the interval.
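The interval check described above can be illustrated with a minimal sketch. The helper `interval_risk_sketch` is hypothetical and not part of sdcMicro. It assumes that the interval half-width is k times the column-wise standard deviation of the perturbed data, that a record counts as at risk only when all of its original values fall inside their intervals, and that k = 0.01 is a reasonable illustrative default. The package's internal rule may differ in these details.

```r
# Hypothetical helper (not part of sdcMicro): sketch of the interval check.
interval_risk_sketch <- function(x, xm, k = 0.01) {
  x <- as.matrix(x)
  xm <- as.matrix(xm)
  stopifnot(all(dim(x) == dim(xm)))

  # Assumption: half-width = k * column standard deviation of the perturbed data.
  half_width <- apply(xm, 2, sd, na.rm = TRUE) * k

  # Interval bounds around every perturbed value.
  lower <- sweep(xm, 2, half_width, "-")
  upper <- sweep(xm, 2, half_width, "+")

  # TRUE where the original value lies inside the interval.
  inside <- x >= lower & x <= upper

  # Assumption: a record is at risk when none of its values falls outside.
  at_risk <- rowSums(!inside, na.rm = TRUE) == 0

  # Share of records at risk.
  mean(at_risk)
}

# Toy data, made up for illustration only.
x <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

interval_risk_sketch(x, x)        # unperturbed data: all records inside
interval_risk_sketch(x, x + 100)  # strongly shifted data: no record inside
```

Under these assumptions, a larger k widens every interval, so more original values fall inside and the estimated risk can only stay the same or increase.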
See method SDID in Mateo-Sanz, Sebe, Domingo-Ferrer. Outlier Protection in Continuous Microdata Masking. International Workshop on Privacy in Statistical Databases. PSD 2004: Privacy in Statistical Databases, pp. 201-215.
Templ, M. Statistical Disclosure Control for Microdata: Methods and Applications in R. Springer International Publishing, 287 pages, 2017. ISBN 978-3-319-50272-4. doi:10.1007/978-3-319-50272-4
data(free1)
free1 <- as.data.frame(free1)
# \donttest{
m1 <- microaggregation(free1[, 31:34], method="onedims", aggr=3)
m2 <- microaggregation(free1[, 31:34], method="pca", aggr=3)
dRisk(obj=free1[, 31:34], xm=m1$mx)
#> [1] 0.9955
dRisk(obj=free1[, 31:34], xm=m2$mx)
#> [1] 0
dUtility(obj=free1[, 31:34], xm=m1$mx)
#> [1] 7.971673
dUtility(obj=free1[, 31:34], xm=m2$mx)
#> [1] 6335.63
## for objects of class sdcMicroObj:
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'),
  numVars=c('expend','income','savings'), w='sampling_weight')
## this is already done internally: sdc <- dRisk(sdc)
## and the result is already stored in sdc
# }