Draws a stratified sample of archived calls, re-issues them, and reports
exact-match rates and any change in the served model_version. Exact
reproduction is expected only for temperature-zero calls on pinned
open-weight backends; for sampled calls disagreement is sampling, not
necessarily drift, and the print method says so.
Usage
archive_verify(
archive,
sample = 0.05,
strata = c("provider", "model"),
.runner = NULL,
...
)Arguments
- archive
An
archive.- sample
Fraction in
(0, 1], or a total count of calls to re-issue.- strata
Manifest columns to stratify by.
- .runner
Internal seam for tests: a function taking an experiments tibble (
configandmessageslist-columns) and returning it with aresponse_textcolumn. DefaultLLMR::call_llm_par().- ...
Passed to the runner.
Value
An archive_drift object: table is a per-stratum tibble that
begins with the chosen strata columns (provider, model by default),
followed by n_eligible, n_sampled, n_exact, exact_rate,
n_temperature0, n_version_changed; details is a per-record
tibble (idx, provider, model, temperature, exact,
archived_version, served_version).
Details
With sample <= 1 the value is a fraction and each stratum contributes
ceiling(sample * n) records. With sample > 1 the value is a total count,
allocated across strata by largest remainder in proportion to stratum size,
with at least one record per nonempty stratum when the total allows. The
draw uses sample(); set a seed beforehand for a reproducible sample.
Examples
log <- tempfile(fileext = ".jsonl")
writeLines(paste0('{"ts":"2026-06-01T10:00:01+0000","schema_version":"1.0",',
'"kind":"call","provider":"openai","model":"gpt-4o-mini","status":200,',
'"request":{"messages":[{"role":"user","content":"Capital of France?"}],',
'"temperature":0},"model_version":"gpt-4o-mini-2026-06-01",',
'"usage":{"sent":5,"rec":1},"response_id":"r-1","text":"Paris"}'), log)
a <- archive_build(log)
# Live use issues real calls; here a runner that echoes the archived text
# stands in through the .runner seam so the example needs no key.
echo <- function(experiments, ...) {
experiments$response_text <- "Paris"
experiments$model_version <- "gpt-4o-mini-2026-06-01"
experiments
}
archive_verify(a, sample = 1, .runner = echo)$table