This function uses a Large Language Model (LLM) to automatically classify the variables in a dataset into roles such as quasi-identifiers, sensitive variables, and numerical variables, and passes the result to `createSdcObj()`. It optionally accepts a codebook (currently not parsed) and a data-sharing policy context.

KI_createSdcObj(
  dat,
  codebook = NULL,
  policy = c("open", "restricted", "confidential"),
  model = "gpt-4",
  api_key = Sys.getenv("OPENAI_API_KEY"),
  ...
)

Arguments

dat

A data.frame containing the microdata.

codebook

Optional path to a codebook file (currently not parsed; placeholder for future use).

policy

Data sharing policy context: `"open"` (default), `"restricted"`, or `"confidential"`.
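
The vector default in the usage above suggests the common R idiom in which `match.arg()` picks the first element when the argument is omitted. The sketch below illustrates that idiom only; `resolve_policy()` is a hypothetical helper, and the source does not confirm that `KI_createSdcObj()` resolves `policy` this way.

```r
# Hypothetical helper, not part of sdcMicro: resolves a policy argument the
# way match.arg() does for a character-vector default.
resolve_policy <- function(policy = c("open", "restricted", "confidential")) {
  match.arg(policy)
}

resolve_policy()               # no argument: the first choice is used
resolve_policy("restricted")   # an explicit valid choice is returned as-is
```

Under this idiom, a value outside the three listed choices raises an error rather than silently falling back to the default.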

model

The LLM to use (default: `"gpt-4"`).

api_key

OpenAI API key, defaulting to the `OPENAI_API_KEY` environment variable.
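
`Sys.getenv()` returns an empty string when a variable is unset, so the default can be empty without any warning. A minimal pre-flight check might look like the following; `has_api_key()` is a hypothetical helper for illustration and is not part of sdcMicro.

```r
# Hypothetical pre-flight check, not part of sdcMicro.
# Returns TRUE only for a single non-empty character string.
has_api_key <- function(key = Sys.getenv("OPENAI_API_KEY")) {
  is.character(key) && length(key) == 1L && nzchar(key)
}

if (!has_api_key()) {
  message("OPENAI_API_KEY is not set; set it or pass api_key explicitly.")
}
```

The key can be set for the current session with `Sys.setenv(OPENAI_API_KEY = "...")` or persistently in `.Renviron`.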

...

Additional arguments passed to `createSdcObj()`.

Value

An object of class `sdcMicroObj`.

Author

Matthias Templ

Examples

if (FALSE) { # \dontrun{
data(testdata)
sdc <- KI_createSdcObj(dat = testdata, policy = "open")
sdc
} # }
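
For comparison, the same kind of object can be built by assigning the variable roles by hand. The sketch below follows the variable roles commonly used with `testdata` in the sdcMicro documentation; these roles are chosen here for illustration and are not a statement of what the LLM would return. The `requireNamespace()` guard is an addition so the snippet is skipped when sdcMicro is not installed.

```r
# Manual counterpart: roles are supplied explicitly instead of being
# classified by an LLM. No API key or network access is needed.
if (requireNamespace("sdcMicro", quietly = TRUE)) {
  library(sdcMicro)
  data(testdata)

  sdc_manual <- createSdcObj(
    dat = testdata,
    # categorical quasi-identifiers
    keyVars = c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex"),
    # continuous variables
    numVars = c("expend", "income", "savings"),
    # sampling weight
    weightVar = "sampling_weight"
  )
  class(sdc_manual)
}
```

Printing the manual object next to the one returned by `KI_createSdcObj()` is one way to review which roles the LLM assigned.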