The sample frequencies are assumed to be independent and to follow a Poisson distribution. The parameters of the corresponding Poisson distributions are estimated by a log-linear model including the main effects and possible interactions.
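The idea of a Poisson log-linear model for cell frequencies can be sketched with base R alone. The example below is only an illustration on synthetic data: the data set, the variable levels and the object names (`dat`, `cells`, `fit`) are made up, and `glm()` stands in for whatever fitting routine the package uses internally.

```r
# Synthetic microdata with three categorical key variables (made up for illustration)
set.seed(1)
n <- 500
dat <- data.frame(
  sex   = factor(sample(c("m", "f"), n, replace = TRUE)),
  water = factor(sample(1:3, n, replace = TRUE)),
  roof  = factor(sample(1:4, n, replace = TRUE))
)

# One row per combination of the key variables, including empty cells;
# column f holds the sample frequency of each cell
cells <- as.data.frame(table(dat), responseName = "f")

# Main-effects log-linear model: log E(f_j) = intercept + sex + water + roof
fit <- glm(f ~ sex + water + roof, family = poisson(), data = cells)

# Fitted Poisson means, one per cell
cells$mu_hat <- fitted(fit)
head(cells)
```

Interactions would be added to the formula in the usual way, for example `f ~ sex + water + roof + sex:roof`.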

`modRisk(obj, method = "default", weights, formulaM, bound = Inf, ...)`

- obj
An `sdcMicroObj-class` object or a numeric matrix or data.frame containing all variables required in the specified model.

- method
the method used for model-based risk estimation. Currently, the following methods can be selected:

"default": the standard log-linear model.

"CE": the Clogg-Eliason method, which additionally considers survey weights by using an offset term.

"PML": the pseudo maximum likelihood method.

"weightedLLM": the weighted maximum likelihood method, which considers survey weights by including them as one of the predictors.

"IPF": iterative proportional fitting as used in the deprecated method 'LLmodGlobalRisk'.

- weights
a variable name specifying sampling weights

- formulaM
A formula specifying the model.

- bound
a number specifying a threshold for 'risky' observations in the sample.

- ...
additional parameters passed through, currently ignored.
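The difference between handling survey weights through an offset (as described for "CE") and through an additional predictor (as described for "weightedLLM") can be illustrated with plain `glm()` calls. This is a rough sketch on synthetic data, not the package's internal code: the weight variable `w`, the cell-level quantities `Fw` and `wbar`, and the exact form of the offset are assumptions made for the example.

```r
# Synthetic sample with two key variables and hypothetical sampling weights
set.seed(2)
n <- 400
dat <- data.frame(
  a = factor(sample(c("x", "y"), n, replace = TRUE)),
  b = factor(sample(1:4, n, replace = TRUE)),
  w = runif(n, min = 5, max = 20)
)

# Per cell: sample count f and sum of weights Fw (a weighted population count)
f     <- as.data.frame(xtabs(~ a + b, data = dat), responseName = "f")
Fw    <- as.data.frame(xtabs(w ~ a + b, data = dat), responseName = "Fw")
cells <- merge(f, Fw)

# Offset approach: log(f / Fw) enters the linear predictor with its
# coefficient fixed at 1, so the remaining terms describe weighted counts
fit_offset <- glm(f ~ a + b + offset(log(f / Fw)),
                  family = poisson(), data = cells)

# Predictor approach: the weight information is an ordinary covariate
# whose coefficient is estimated together with the other effects
cells$wbar <- cells$Fw / cells$f
fit_pred <- glm(f ~ a + b + wbar, family = poisson(), data = cells)

coef(fit_offset)
coef(fit_pred)
```

In the offset version no extra parameter is estimated for the weights, whereas the predictor version spends one degree of freedom on them.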

Two global risk measures and some model output given the specified model. If this method
is applied to an `sdcMicroObj-class` object, the slot 'risk' in the object is updated
with the result of the model-based risk calculation.

The two measures aim to (1) estimate the number of sample uniques that are population uniques under a probabilistic Poisson model and (2) estimate the expected number of correct matches for sample uniques.

ad 1) this risk measure is defined over all sample uniques as $$ \tau_1 = \sum\limits_{j:f_j=1} P(F_j=1 | f_j=1) \quad , $$ i.e. the expected number of sample uniques that are population uniques.

ad 2) this risk measure is defined over all sample uniques as $$ \tau_2 = \sum\limits_{j:f_j=1} E(1 / F_j | f_j=1) \quad . $$

Since the population frequencies \(F_j\) are unknown, they need to be estimated.
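As a worked illustration of how such estimates turn into the two global measures, assume that \(F_j\) follows a Poisson distribution with mean \(\lambda_j\) and that each unit is sampled independently with a common sampling fraction \(\pi\). Then, given \(f_j = 1\), the unsampled part \(F_j - 1\) is Poisson with mean \(\mu_j = \lambda_j (1 - \pi)\), which gives \(P(F_j = 1 | f_j = 1) = e^{-\mu_j}\) and an expected value of \(1/F_j\) equal to \((1 - e^{-\mu_j}) / \mu_j\). The sketch below evaluates these expressions; the values of `lambda_hat` and `pi_s` are hypothetical, and the package's own computation may differ in detail.

```r
# Hypothetical fitted population-scale Poisson means for four sample-unique cells
lambda_hat <- c(1.2, 3.5, 0.8, 10)
pi_s <- 0.1   # assumed common sampling fraction

# Mean of the unsampled part F_j - f_j
mu <- lambda_hat * (1 - pi_s)

# Per-cell contributions to the two measures
p_unique  <- exp(-mu)               # P(F_j = 1 | f_j = 1)
e_inverse <- (1 - exp(-mu)) / mu    # expected value of 1 / F_j given f_j = 1

tau1 <- sum(p_unique)
tau2 <- sum(e_inverse)
c(tau1 = tau1, tau2 = tau2)
```

Because \(1/F_j\) equals 1 when \(F_j = 1\) and is still positive otherwise, the second measure is never smaller than the first.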

Iterative proportional fitting is used to estimate the parameters of the Poisson-distributed frequency counts under the specified model. The estimated parameters are then used to compute a global risk as defined in Skinner and Holmes (1998).
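Iterative proportional fitting itself is available in base R through `stats::loglin()`. The following minimal sketch fits a main-effects model to a synthetic three-way table; the data and object names are invented for the example, and it is not meant to reproduce the package's implementation.

```r
# Synthetic three-way contingency table (made up for illustration)
set.seed(3)
n <- 300
dat <- data.frame(
  sex   = sample(c("m", "f"), n, replace = TRUE),
  water = sample(1:3, n, replace = TRUE),
  roof  = sample(1:2, n, replace = TRUE)
)
tab <- table(dat)

# Main-effects model: fit the three one-way margins by
# iterative proportional fitting
ipf <- loglin(tab, margin = list(1, 2, 3), fit = TRUE,
              eps = 1e-8, iter = 50, print = FALSE)

# The fitted cell means reproduce each observed one-way margin
all.equal(as.vector(apply(ipf$fit, 1, sum)), as.vector(apply(tab, 1, sum)))
all.equal(as.vector(apply(ipf$fit, 2, sum)), as.vector(apply(tab, 2, sum)))
all.equal(as.vector(apply(ipf$fit, 3, sum)), as.vector(apply(tab, 3, sum)))
```

Interaction terms correspond to higher-order margins, for example `margin = list(c(1, 2), 3)` for a model with a `sex:water` interaction.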

Skinner, C.J. and Holmes, D.J. (1998). *Estimating the re-identification risk per record in microdata*. Journal of Official Statistics, 14:361-372.

Rinott, Y. and Shlomo, N. (2006). *A Generalized Negative Binomial Smoothing Model for Sample Disclosure Risk Estimation*. Privacy in Statistical Databases. Lecture Notes in Computer Science. Springer-Verlag, 82–93.

Clogg, C.C. and Eliason, S.R. (1987). *Some Common Problems in Log-Linear Analysis*. Sociological Methods and Research, 8-44.

`loglm`, `measure_risk`

```
## data.frame method
data(testdata2)
form <- ~sex+water+roof
w <- "sampling_weight"
(modRisk(testdata2, method = "default", formulaM = form, weights = w))
#> The estimated model (using method 'default') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.244 (24.436 %)
#> Risk-Measure 2: 0.384 (38.400 %)
(modRisk(testdata2, method = "CE", formulaM = form, weights = w))
#> The estimated model (using method 'CE') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.237 (23.740 %)
#> Risk-Measure 2: 0.379 (37.936 %)
(modRisk(testdata2, method = "PML", formulaM = form, weights = w))
#> The estimated model (using method 'PML') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.244 (24.436 %)
#> Risk-Measure 2: 0.384 (38.400 %)
(modRisk(testdata2, method = "weightedLLM", formulaM = form, weights = w))
#> The estimated model (using method 'weightedLLM') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.314 (31.424 %)
#> Risk-Measure 2: 0.442 (44.249 %)
(modRisk(testdata2, method = "IPF", formulaM = form, weights = w))
#> The estimated model (using method 'IPF') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.274 (27.354 %)
#> Risk-Measure 2: 0.410 (41.038 %)
## application to a sdcMicroObj
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars = c("urbrur", "roof", "walls", "electcon", "relat", "sex"),
  numVars = c("expend", "income", "savings"),
  w = "sampling_weight")
sdc <- modRisk(sdc, formulaM = ~sex+water+roof)
slot(sdc, "risk")$model
#> The estimated model (using method 'default') was:
#> ~ sex + water + roof
#> global risk-measures:
#> Risk-Measure 1: 0.244 (24.436 %)
#> Risk-Measure 2: 0.384 (38.400 %)
# an example using data from the laeken-pkg
library(laeken)
data(eusilc)
f <- ~ db040 + hsize + rb090 + age + pb220a +
  age:rb090 + age:hsize + hsize:rb090
w <- "rb050"
(modRisk(eusilc, method = "default", weights = w, formulaM = f, bound = 5))
#> The estimated model (using method 'default') was:
#> ~ db040 + hsize + rb090 + age + pb220a + age:rb090 + age:hsize + hsize:rb090
#> global risk-measures:
#> Risk-Measure 1: 0.296 (29.641 %)
#> Risk-Measure 2: 0.360 (36.043 %)
(modRisk(eusilc, method = "CE", weights = w, formulaM = f, bound = 5))
#> The estimated model (using method 'CE') was:
#> ~ db040 + hsize + rb090 + age + pb220a + age:rb090 + age:hsize + hsize:rb090
#> global risk-measures:
#> Risk-Measure 1: 0.295 (29.526 %)
#> Risk-Measure 2: 0.360 (35.966 %)
(modRisk(eusilc, method = "PML", weights = w, formulaM = f, bound = 5))
#> The estimated model (using method 'PML') was:
#> ~ db040 + hsize + rb090 + age + pb220a + age:rb090 + age:hsize + hsize:rb090
#> global risk-measures:
#> Risk-Measure 1: 0.296 (29.609 %)
#> Risk-Measure 2: 0.360 (35.963 %)
(modRisk(eusilc, method = "weightedLLM", weights = w, formulaM = f, bound = 5))
#> The estimated model (using method 'weightedLLM') was:
#> ~ db040 + hsize + rb090 + age + pb220a + age:rb090 + age:hsize + hsize:rb090
#> global risk-measures:
#> Risk-Measure 1: 0.324 (32.411 %)
#> Risk-Measure 2: 0.383 (38.300 %)
```