This function uses a large language model (LLM) to classify the variables of a dataset into quasi-identifiers, sensitive variables, numerical variables, and other variable roles, and passes the resulting classification to createSdcObj(). A data sharing policy can be supplied as context; the codebook argument is accepted but currently not parsed.

AI_createSdcObj(
  dat,
  codebook = NULL,
  policy = c("open", "restricted", "confidential"),
  model = NULL,
  api_key = NULL,
  provider = c("openai", "anthropic", "custom"),
  base_url = NULL,
  confirm = TRUE,
  info = TRUE,
  ...
)

Arguments

dat

A data.frame containing the microdata.

codebook

Optional path to a codebook file (currently not parsed; placeholder for future use).

policy

Data sharing policy context: "open" (default), "restricted", or "confidential".

model

The name of the LLM to use. If NULL, a default is chosen per provider.

api_key

API key for the selected provider. If NULL, the key is auto-detected from environment variables.

provider

LLM provider: "openai" (default), "anthropic", or "custom" for any OpenAI-compatible endpoint (Ollama, Azure, vLLM, Groq, etc.).

base_url

Base URL for the API endpoint. Required when provider = "custom".

confirm

Logical; if TRUE (default) and the session is interactive, shows the proposed classification and asks for confirmation before creating the sdcMicroObj.

info

Logical; if TRUE (default), prints the LLM classification result and reasoning.

...

Additional arguments passed to createSdcObj().
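
The policy and provider arguments are declared as character vectors of choices, with the first element acting as the default. The sketch below illustrates that idiom in base R with match.arg(), together with the rule that base_url is required for provider = "custom". The helper resolve_choices() is hypothetical and not part of sdcMicro, and whether AI_createSdcObj() uses match.arg() internally is an assumption.

```r
# Hypothetical helper, NOT part of sdcMicro: illustrates how choice-vector
# defaults such as `policy` and `provider` are commonly resolved in R.
resolve_choices <- function(policy = c("open", "restricted", "confidential"),
                            provider = c("openai", "anthropic", "custom"),
                            base_url = NULL) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and raises an error for values outside the allowed set.
  policy <- match.arg(policy)
  provider <- match.arg(provider)

  # Mirrors the documented rule: a custom endpoint needs a base URL.
  if (provider == "custom" && is.null(base_url)) {
    stop("base_url is required when provider = \"custom\"")
  }

  list(policy = policy, provider = provider, base_url = base_url)
}

# Defaults resolve to the first element of each choice vector.
str(resolve_choices())

# Explicit values are validated against the allowed choices.
str(resolve_choices(policy = "restricted",
                    provider = "custom",
                    base_url = "http://localhost:11434/v1"))
```

Under this idiom, a misspelled value such as policy = "public" fails immediately with an error instead of silently falling back to a default.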

Value

An object of class sdcMicroObj.

Author

Matthias Templ

Examples

if (FALSE) { # \dontrun{
data(testdata)
sdc <- AI_createSdcObj(dat = testdata, policy = "open")
sdc
} # }
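
A further sketch of non-default calls, using only the arguments documented above. It is guarded by if (FALSE) because it needs a valid API key or a reachable endpoint. The environment variable name MY_LLM_API_KEY, the model name "llama3", and the exact form of the base URL expected by base_url are assumptions made for illustration; http://localhost:11434/v1 is the OpenAI-compatible address that a local Ollama server exposes by default.

```r
if (FALSE) { # not run: requires an API key or a running endpoint
  library(sdcMicro)
  data(testdata)

  # Anthropic provider with an explicitly supplied key.
  # MY_LLM_API_KEY is a hypothetical variable name chosen for this sketch;
  # with api_key = NULL the function auto-detects the key instead.
  sdc_remote <- AI_createSdcObj(
    dat = testdata,
    policy = "restricted",
    provider = "anthropic",
    api_key = Sys.getenv("MY_LLM_API_KEY")
  )

  # Custom OpenAI-compatible endpoint (here assumed to be a local Ollama
  # server). base_url is required when provider = "custom".
  sdc_local <- AI_createSdcObj(
    dat = testdata,
    policy = "confidential",
    provider = "custom",
    base_url = "http://localhost:11434/v1",
    model = "llama3",   # assumed model name
    confirm = FALSE,    # skip the interactive confirmation step
    info = FALSE        # do not print the classification and reasoning
  )
  sdc_local
}
```

Setting confirm = FALSE creates the sdcMicroObj without review, so it is worth inspecting the returned object before applying any anonymization methods.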