
Integrity: every record hash is recomputed from the stored verbatim lines and compared to the manifest: against the original record hashes for a full archive, or against the public_record_hash family for a redacted one (whose original hashes attest content held elsewhere; see archive_redact()). For a sealed archive the root is recomputed as well; the root always refers to the original content.

Completeness: given a results data frame that carries a response_id column (as LLMR::call_llm_par() returns), archive_check() reports how many of the paper's results map to a logged call. Numbers without provenance are exactly what this package exists to make visible.

Usage

archive_check(archive, results = NULL)

Arguments

archive

An archive.

results

Optional data frame with a response_id column.

Value

A list of class archive_check: intact, redacted (a logical redaction flag; when TRUE, intact was checked against the public_record_hash family rather than the original record hashes), root_ok, n_records, bad_records (indices), duplicate_response_ids and duplicate_request_hashes (each a character vector of any values that appear on more than one record – a sign of a malformed or merged log), and, when results was given, n_results, n_matched, unmatched_ids.
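The completeness and duplicate fields can be pictured with a few lines of base R. This is only an illustrative sketch, not the package's implementation: logged_ids is a hypothetical stand-in for the response_id values recorded in an archive, and the toy data are invented for the example. The variable names simply mirror the fields listed above.

```r
# Hypothetical stand-in for the response_id of each logged call in an archive.
# "r-2" appears twice on purpose, to show the duplicate report.
logged_ids <- c("r-1", "r-2", "r-2")

# A results frame as an analysis might carry it; "orphan" has no logged call.
results <- data.frame(response_id = c("r-1", "orphan"),
                      stringsAsFactors = FALSE)

# Completeness: which results map to a logged call?
matched       <- results$response_id %in% logged_ids
n_results     <- nrow(results)
n_matched     <- sum(matched)
unmatched_ids <- results$response_id[!matched]

# Duplicates: values that appear on more than one record.
duplicate_response_ids <- unique(logged_ids[duplicated(logged_ids)])
```

In this toy case one of the two results is matched, "orphan" is reported as unmatched, and "r-2" is flagged as a duplicate. How archive_check() computes these fields internally may differ.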

Examples

# Write a one-record JSONL log to a temporary file
log <- tempfile(fileext = ".jsonl")
writeLines(paste0('{"ts":"2026-06-01T10:00:01+0000","schema_version":"1.0",',
  '"kind":"call","provider":"groq","model":"openai/gpt-oss-20b",',
  '"request":{"q":1},"usage":{"sent":5,"rec":2},',
  '"response_id":"r-1","text":"reply"}'), log)

# Build and seal the archive, then check its integrity
a <- archive_seal(archive_build(log))
archive_check(a)

# Completeness: "r-1" maps to the logged call, "orphan" does not
archive_check(a, results = data.frame(response_id = c("r-1", "orphan")))