Build a gold-standard set with split provenance and a sealed holdout split

Usage

gold_set(
  data,
  text,
  label,
  split = c(dev = 0.6, test = 0.4),
  holdout = "test",
  stratify = TRUE,
  seal_holdout = TRUE,
  coders = NULL,
  id = NULL
)

Arguments

data: A data frame of human-labeled units. Gold labels must be complete: a missing label is not gold, and construction fails on NAs.
text: Name of the text column (character scalar).
label: Name of the (adjudicated) label column.
split: Named numeric vector of proportions, e.g. c(dev = 0.6, test = 0.4). Must sum to 1. Sizes are allocated by largest remainder (exact, no rounding loss); assignment is random within the allocation – set a seed beforehand for a repeatable draw, and keep the saved object either way, since the split is stored, not recomputed. Split names are yours to choose; name the held-out one in holdout when it is not "test".
holdout: Name of the held-out split (character scalar, default "test"). This is the split that validate_protocol() evaluates by default, that gold_correct() audits, that tune_protocol() refuses, and that the seal and the ledger act on. The name is stored on the object, so downstream functions follow it without repetition.
stratify: If TRUE (default), the split is stratified by the label column, so every split carries the same class composition, up to rounding. This is the methodological default for evaluation splits.
seal_holdout: If TRUE (default), the holdout split is sealed: every validate_protocol() or gold_correct() run against it is recorded in the ledger and printed by LLMR::report(). The seal is visibility, not enforcement; save the gold set with the study and archive the LLM call log (see the archive workflow) when tamper evidence is needed.
coders: Optional character vector naming columns holding individual coder labels (pre-adjudication), used by coder_agreement().
id: Optional name of a column holding a stable unit identifier. When supplied, gold_correct() links audit units to the coded corpus by this id, which is the only way to disambiguate units that share identical text. The same column must be present in the corpus passed to code_corpus(). When omitted, linkage falls back to a content hash of the text, and duplicate texts among audited units are refused rather than matched arbitrarily.
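
The linkage rule described for id can be illustrated with a minimal base-R sketch. The helper link_units() is hypothetical and is not part of the package; it compares raw text where the package is described as using a content hash, and it only shows why an id column is needed when two audited units share the same text.

```r
# Hypothetical helper (not the package's implementation): return, for each
# audit unit, the row of the corpus it links to.
link_units <- function(audit, corpus, id = NULL) {
  # Link by the id column when one is named, otherwise by the text itself
  # (standing in here for a content hash of the text).
  key <- if (is.null(id)) "text" else id
  if (anyDuplicated(audit[[key]]) > 0) {
    stop("duplicate '", key, "' values among audited units; cannot link unambiguously")
  }
  match(audit[[key]], corpus[[key]])
}

audit  <- data.frame(uid = c("a", "b"), text = c("same", "same"))
corpus <- data.frame(uid = c("b", "a", "c"), text = c("same", "same", "other"))

linked  <- link_units(audit, corpus, id = "uid")   # rows 2 and 1 of corpus
refused <- tryCatch(link_units(audit, corpus), error = function(e) e)
```

With the id column the two identical texts resolve to distinct corpus rows; without it the sketch stops with an error instead of picking a match arbitrarily.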

Value

A gold_set: the data plus split assignment, the stored holdout split name, seal status, and an evaluation ledger.
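
The split assignment stored on the object follows the rules given for split and stratify. As a rough illustration only, the sketch below shows largest-remainder allocation and a per-class random assignment in base R. Both helpers (largest_remainder() and stratified_split()) are hypothetical names, and the package may reconcile per-class and overall sizes differently.

```r
# Hypothetical helper: turn proportions into integer sizes that sum to n.
largest_remainder <- function(n, props) {
  stopifnot(abs(sum(props) - 1) < 1e-8)
  quota <- n * props
  size  <- floor(quota)
  short <- n - sum(size)               # units left over after flooring
  if (short > 0) {
    # Give the leftover units to the splits with the largest fractional parts.
    bump <- order(quota - size, decreasing = TRUE)[seq_len(short)]
    size[bump] <- size[bump] + 1
  }
  size
}

# Hypothetical helper: allocate within each label class, then shuffle.
stratified_split <- function(labels, props) {
  out <- character(length(labels))
  for (idx in split(seq_along(labels), labels)) {
    sizes <- largest_remainder(length(idx), props)
    out[idx] <- sample(rep(names(props), times = sizes))
  }
  out
}

sizes <- largest_remainder(7, c(dev = 0.6, test = 0.4))   # dev 4, test 3

set.seed(110)
labels   <- rep(c("x", "y"), each = 20)
assigned <- stratified_split(labels, c(dev = 0.5, test = 0.5))
counts   <- table(assigned, labels)
```

Here counts holds 10 units per split and class: which units land in which split depends on the seed, while the sizes do not.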

Examples

set.seed(110)   # split assignment is a random draw; seed it for a repeatable example
g <- gold_set(
  data.frame(text  = paste0("unit", seq_len(40)),
             label = rep(c("x", "y"), each = 20)),
  text = "text", label = "label", split = c(dev = 0.5, test = 0.5)
)
g
table(gold_split(g, "dev")$label)   # stratified: same class mix as test
gold_ledger(g)   # empty until something evaluates on the holdout split
