Negative controls for the audit
A placebo asks whether the pipeline can produce the reported number when, by construction, it should not: when the link between labels and units is broken, or when the measured texts do not contain the construct. A pipeline that "finds" the effect under either condition is measuring its instrument.
Usage
audit_placebo(
  audit,
  type = c("label_permutation", "irrelevant_text"),
  reps = 200L,
  texts = NULL,
  .runner = NULL,
  ...
)
Arguments
- audit
An audit returned by audit_run().
- type
Placebo construction. "label_permutation" permutes labels within each audited cell, with no new model calls. "irrelevant_text" replaces the plan's text column with researcher-supplied construct-free texts and re-runs the grid.
- reps
Number of label permutations per cell for "label_permutation".
- texts
For "irrelevant_text": a character vector of texts in which the construct is absent by design (weather reports for a partisanship estimand, say). Choosing them is a research decision the package cannot make for you. The vector is recycled deterministically (rep_len()) to the number of units.
- .runner
Internal seam for tests, passed to audit_run() for the irrelevant-text rerun.
- ...
Passed to audit_run() for the irrelevant-text rerun.
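The recycling described for texts can be pictured with base R alone. This is a minimal sketch, not the package's internals: the unit count of five and the variable names are made up for illustration, and only rep_len() itself is taken from the description above.

```r
# Hypothetical inputs: two construct-free texts and five units.
texts <- c("Rain is likely this afternoon.", "Winds stay light overnight.")
n_units <- 5L

# rep_len() cycles through the vector in order until it reaches the
# requested length, so the same inputs always give the same assignment.
placebo_text <- rep_len(texts, n_units)
placebo_text
```

With fewer texts than units, every unit still receives a text, but the same few texts repeat, so a short vector gives the placebo little variety to work with.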
Value
An audit_placebo object: a list with type, cells, reps,
and n_units, with a print method. For "label_permutation",
cells has one row per audit cell with the observed estimate, the
centered permutation p, the null interval (null_lo, null_hi,
the 2.5% and 97.5% permutation quantiles), a degenerate flag, and
permutation counts. For "irrelevant_text", cells has the real
estimate, estimate_placebo, and parse_failures_placebo per
cell; the comparison is descriptive, with no p-values.
Details
The permutation placebo holds each cell's label marginal fixed and
shuffles which unit got which label, recomputing the estimator each
time. Estimators that use only the marginal (a share, say) are
permutation-invariant: every permuted estimate equals the observed one,
the cell is flagged degenerate = TRUE with p = NA, and the print
method says the placebo is uninformative for that estimand. For
association estimands the permutation distribution is the no-association
null, and the p-value is centered on its median:
(1 + #(|perm - m| >= |obs - m|)) / (#valid + 1). Estimator errors
inside a permutation become NA, are excluded, and are counted.
Permutations use the current RNG state; the function never sets a seed. Set one beforehand when the draw must be reproducible.
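The permutation logic above can be sketched in base R for a single cell. This is an illustrative reconstruction under stated assumptions, not the package's implementation: centered_perm_p(), the toy data frame, the two estimators, and the 1e-12 tolerance used to detect a degenerate cell are all hypothetical. Only the p-value formula, the 2.5% and 97.5% quantiles, and the handling of estimator errors follow the text.

```r
# Hypothetical helper: centered permutation p-value for one cell.
centered_perm_p <- function(d, estimator, reps = 200L) {
  obs <- estimator(d)
  perm <- vapply(seq_len(reps), function(i) {
    d_perm <- d
    # Shuffle which unit got which label; the label marginal is unchanged.
    d_perm$label <- sample(d$label)
    # An estimator error inside a permutation becomes NA.
    tryCatch(as.numeric(estimator(d_perm)), error = function(e) NA_real_)
  }, numeric(1))
  valid <- perm[!is.na(perm)]
  # Degenerate: every permuted estimate equals the observed one
  # (the tolerance is an assumption of this sketch).
  degenerate <- length(valid) > 0 && all(abs(valid - obs) < 1e-12)
  m <- median(valid)
  p <- if (degenerate) NA_real_ else
    (1 + sum(abs(valid - m) >= abs(obs - m))) / (length(valid) + 1)
  q <- quantile(valid, c(0.025, 0.975), names = FALSE)
  list(estimate = obs, p = p, null_lo = q[1], null_hi = q[2],
       degenerate = degenerate,
       n_valid = length(valid), n_failed = sum(is.na(perm)))
}

# Toy cell: labels perfectly aligned with `half`.
d <- data.frame(
  half  = rep(c("first", "second"), each = 10),
  label = rep(c("conservative", "progressive"), each = 10)
)
diff_est  <- function(d) {
  mean(d$label[d$half == "first"] == "conservative") -
    mean(d$label[d$half == "second"] == "conservative")
}
share_est <- function(d) mean(d$label == "conservative")

set.seed(110)  # the helper, like the function, never sets a seed itself
assoc <- centered_perm_p(d, diff_est, reps = 199L)
share <- centered_perm_p(d, share_est, reps = 199L)
```

In this toy cell the difference estimator depends on which unit carries which label, so shuffling produces a real null distribution and a small p. The share estimator uses only the marginal, so every permuted value equals the observed one and the sketch returns degenerate = TRUE with p = NA.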
The irrelevant-text placebo re-runs the full grid and therefore costs
model calls unless a .runner is injected.
Examples
if (FALSE) { # \dontrun{
speeches <- data.frame(
  text = c("cut taxes now", "deregulate markets",
           "fund the schools", "expand care"),
  half = c("first", "first", "second", "second"))
plan <- audit_plan(
  data = speeches, text = "text",
  estimator = function(d) {
    mean(d$label[d$half == "first"] == "conservative", na.rm = TRUE) -
      mean(d$label[d$half == "second"] == "conservative", na.rm = TRUE)
  },
  labels = c("conservative", "progressive"),
  prompt = "Classify as one of: {labels}.\n\n{text}\n\nLabel:")
plan <- audit_add_models(plan,
  list(oss = LLMR::llm_config("groq", "openai/gpt-oss-20b", temperature = 0)))
audit <- audit_run(plan)
set.seed(110) # the permutation draws locally
audit_placebo(audit, reps = 199L)
audit_placebo(audit, type = "irrelevant_text",
              texts = c("Rain is likely this afternoon.",
                        "Winds stay light overnight."))
} # }