AI_applyAnonymization: Automatically apply an anonymization strategy using an LLM

Uses an agentic loop to explore multiple anonymization strategies. The LLM proposes each strategy as a structured tool call, every proposal is evaluated with a combined utility score, and the best-scoring strategy is selected.
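
As a rough illustration of the evaluate-and-select step, the sketch below scores three made-up candidate strategies and picks the one with the lowest score. It assumes the combined score is a weighted sum of suppression rate, category loss, and IL1, each already scaled to [0, 1], and that lower is better. The exact formula used internally is not stated on this page, and the helper combined_utility and all numbers are hypothetical.

```r
# Hypothetical helper: weighted sum of three loss components. This is an
# assumption for illustration, not the package's internal formula.
# Lower values mean less information loss.
combined_utility <- function(suppression, category_loss, il1,
                             weights = c(1/3, 1/3, 1/3)) {
  stopifnot(length(weights) == 3, all(weights >= 0))
  weights <- weights / sum(weights)  # normalise so the weights sum to 1
  sum(weights * c(suppression, category_loss, il1))
}

# Made-up component values for three candidate strategies
candidates <- data.frame(
  strategy      = c("A", "B", "C"),
  suppression   = c(0.10, 0.02, 0.05),
  category_loss = c(0.20, 0.40, 0.10),
  il1           = c(0.30, 0.10, 0.15),
  stringsAsFactors = FALSE
)

# Score every candidate with the default (equal) weights
candidates$U <- mapply(combined_utility,
                       candidates$suppression,
                       candidates$category_loss,
                       candidates$il1)

# Select the candidate with the smallest combined score
best <- candidates$strategy[which.min(candidates$U)]
print(candidates)
print(best)
```

With equal weights, candidate "C" has the smallest score here. Shifting weight toward one component, for example c(0.8, 0.1, 0.1), penalises strategies with more of that kind of loss (suppression in this case).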

AI_applyAnonymization(
  sdcObj,
  k = 3,
  verbose = TRUE,
  model = NULL,
  api_key = NULL,
  provider = c("openai", "anthropic", "custom"),
  base_url = NULL,
  confirm = TRUE,
  max_iter = 2,
  n_strategies = 3,
  weights = c(1/3, 1/3, 1/3),
  tol = 0.001,
  patience = 1L,
  generateReport = TRUE
)

Arguments

sdcObj: An object of class sdcMicroObj.
k: Desired k-anonymity level (default 3).
verbose: If TRUE, prints progress and scores for each strategy.
model: LLM model identifier. If NULL, a default is chosen per provider.
api_key: API key. If NULL, auto-detected from environment variables.
provider: LLM provider: "openai" (default), "anthropic", or "custom" for any OpenAI-compatible endpoint.
base_url: Base URL for the API endpoint. Required when provider = "custom".
confirm: Logical; if TRUE (default) and the session is interactive, shows the best strategy and asks for confirmation before applying it.
max_iter: Number of refinement iterations after the initial batch (default 2).
n_strategies: Number of strategies in the initial batch (default 3).
weights: Numeric vector of length 3: weights for suppression rate, category loss, and IL1 in the utility score. Default c(1/3, 1/3, 1/3).
tol: Minimum reduction in the combined utility score \(U\) for a refinement iteration to count as an improvement; refinements that lower \(U\) by less than tol are treated as a stall. Default 1e-3.
patience: Number of consecutive refinement iterations without an improvement greater than tol that triggers early stopping of the refinement loop. Default 1. The initial batch phase always runs in full; max_iter remains the upper bound on the number of refinement iterations.
generateReport: If TRUE, generates internal and external reports.
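
The interaction of max_iter, tol, and patience can be sketched as follows. This is a minimal illustration of the stopping rule as described above, not the package's internal code: refine_until_stall is a hypothetical helper, the score sequence is made up, and leaving the best score unchanged on a stalled iteration is an assumption.

```r
# Hypothetical illustration of the tol / patience stopping rule.
# `refinement_scores` stands in for the combined score U obtained in each
# refinement iteration; lower is better.
refine_until_stall <- function(initial_best, refinement_scores,
                               max_iter = 2, tol = 0.001, patience = 1L) {
  best <- initial_best
  stalls <- 0L
  iters_run <- 0L
  for (i in seq_len(min(max_iter, length(refinement_scores)))) {
    iters_run <- i
    improvement <- best - refinement_scores[i]  # positive when U decreases
    if (improvement > tol) {
      best <- refinement_scores[i]
      stalls <- 0L                   # a real improvement resets the counter
    } else {
      stalls <- stalls + 1L          # assumption: best is left unchanged
      if (stalls >= patience) break  # stop after `patience` consecutive stalls
    }
  }
  list(best = best, iterations = iters_run)
}

scores <- c(0.15, 0.1495, 0.10)  # made-up scores for three refinements

# patience = 1: the second refinement improves U by only 0.0005, so it stalls
res1 <- refine_until_stall(0.20, scores, max_iter = 3, patience = 1L)

# patience = 2: one stall is tolerated, so the third refinement still runs
res2 <- refine_until_stall(0.20, scores, max_iter = 3, patience = 2L)

print(res1)
print(res2)
```

In this example, patience = 1 stops after two refinement iterations, while patience = 2 runs all three and reaches the lower score. max_iter caps the loop in both cases.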

Value

The modified sdcMicroObj, with the best anonymization strategy applied.

Author

Matthias Templ

Examples

if (FALSE) { # \dontrun{
if (interactive() && nzchar(Sys.getenv("OPENAI_API_KEY"))) {
  library(sdcMicro)
  data(testdata)
  sdc <- AI_createSdcObj(dat = testdata, policy = "open", confirm = FALSE)
  sdc <- AI_applyAnonymization(sdcObj = sdc, k = 3, verbose = TRUE, confirm = FALSE)
}
} # }
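
A further sketch for an OpenAI-compatible endpoint with provider = "custom". Only arguments documented above are used; the endpoint URL, model name, and environment variable name are hypothetical placeholders, and the tuning values are arbitrary.

```r
# Hypothetical settings for a self-hosted, OpenAI-compatible endpoint.
# URL, model name and environment variable are placeholders, not defaults.
llm_args <- list(
  k        = 3,
  provider = "custom",
  base_url = "http://localhost:8000/v1",    # required when provider = "custom"
  model    = "my-local-model",
  api_key  = Sys.getenv("MY_LLM_API_KEY"),
  weights  = c(0.5, 0.25, 0.25),            # emphasise suppression rate
  max_iter = 4,                             # allow up to four refinements
  tol      = 0.005,
  patience = 2L,                            # tolerate one stalled refinement
  confirm  = FALSE
)

# Only run interactively and when the placeholder key is actually set
if (interactive() && nzchar(Sys.getenv("MY_LLM_API_KEY"))) {
  library(sdcMicro)
  data(testdata)
  sdc <- AI_createSdcObj(dat = testdata, policy = "open", confirm = FALSE)
  sdc <- do.call(AI_applyAnonymization, c(list(sdcObj = sdc), llm_args))
}
```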