A single hash convention for the whole LLMR ecosystem: prompts, codebooks,
coding protocols, and archived requests. The object is reduced to a canonical
form (classes stripped, named lists sorted by name, functions replaced by
their deparsed source), rendered as canonical JSON, and hashed with
SHA-256 over the UTF-8 bytes of that string. Two consequences worth
stating: equal content hashes equally regardless of construction order or
S3 class, and the hash does not depend on R's serialization format, so a
value recorded in a paper today is checkable on any machine later.
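
The reduction to canonical form described above can be sketched in base R. This is an illustration under stated assumptions, not LLMR's implementation: `canonical_form` is a hypothetical helper, and sorting only fully named lists (leaving unnamed lists in positional order) is an assumption about how the convention treats them.

```r
# Hypothetical helper, not part of LLMR: reduce an object to a class-free,
# order-independent form, following the three rules in the description.
canonical_form <- function(x) {
  if (is.function(x)) {
    # Functions are replaced by their deparsed source text.
    return(paste(deparse(x), collapse = "\n"))
  }
  if (is.environment(x)) {
    stop("environments are not hashable")
  }
  if (is.list(x)) {
    x <- unclass(x)  # strip the S3 class
    nms <- names(x)
    # Assumption: only fully named lists are sorted; in unnamed lists the
    # position carries meaning. method = "radix" sorts in the C locale, so
    # the order does not depend on the machine's collation settings.
    if (!is.null(nms) && all(nzchar(nms))) {
      x <- x[order(nms, method = "radix")]
    }
    return(lapply(x, canonical_form))
  }
  # Atomic values: drop class and other attributes, keep names if present.
  nms <- names(x)
  x <- as.vector(x)
  if (!is.null(nms)) names(x) <- nms
  x
}

# Same content, different construction order and class:
a <- structure(list(b = 2, a = 1), class = "config")
b <- list(a = 1, b = 2)
identical(canonical_form(a), canonical_form(b))
```

Rendering this form as JSON and hashing the UTF-8 bytes of the resulting string with SHA-256 would complete the pipeline; which tools LLMR uses for those two steps is not stated here.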
Arguments
- x
An R object: list, character, config, codebook – anything whose
canonical JSON form is well defined. Environments are not hashable.
Value
A 64-character lowercase SHA-256 hex string.
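
When a hash is copied from a paper or an archive, it can help to check that it has this shape before comparing it with a recomputed value. A minimal sketch, assuming a hypothetical helper `is_sha256_hex` that is not part of LLMR:

```r
# Hypothetical helper: TRUE only for a single 64-character lowercase hex string.
is_sha256_hex <- function(h) {
  is.character(h) && length(h) == 1L && !is.na(h) &&
    grepl("^[0-9a-f]{64}$", h)
}

# The SHA-256 digest of the empty string, used here as a well-formed sample.
sample_hash <- "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
is_sha256_hex(sample_hash)
is_sha256_hex(toupper(sample_hash))  # uppercase does not match the format
```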
Details
Downstream packages treat these hashes as identifiers of record
(LLMRcontent::protocol_lock(), LLMRcontent::archive_build()); the
convention is versioned by this function's documentation – any future
change would be a new function, not a silent edit.
Examples
llm_hash(list(model = "gpt-oss-20b", temperature = 0))
# construction order does not matter, so this is TRUE:
identical(llm_hash(list(a = 1, b = 2)), llm_hash(list(b = 2, a = 1)))
# any content change alters the hash, so this is FALSE:
identical(llm_hash(list(a = 1)), llm_hash(list(a = 2)))