llm_hash() implements the single hash convention used across the LLMR ecosystem for prompts, codebooks, coding protocols, and archived requests. The object is reduced to a canonical form (classes stripped, named lists sorted by name, functions replaced by their deparsed source), rendered as canonical JSON, and hashed with SHA-256 over the UTF-8 bytes of that string. Two consequences follow: equal content hashes equally regardless of construction order or S3 class, and the hash does not depend on R's serialization format, so a value recorded in a paper today can be checked on any machine later.
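The canonicalization step described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `canonical_form_sketch()` is a hypothetical helper, it covers only the three reductions named in the description, and it stops before the JSON rendering and SHA-256 steps, so it will not reproduce `llm_hash()` values.

```r
# Hypothetical helper, not part of LLMR: reduce an object to a canonical form.
canonical_form_sketch <- function(x) {
  if (is.environment(x)) {
    stop("environments are not hashable")
  }
  if (is.function(x)) {
    # Functions are replaced by their deparsed source.
    return(paste(deparse(x), collapse = "\n"))
  }
  if (is.list(x)) {
    nms <- names(x)
    # Strip the class, then canonicalize every element recursively.
    out <- lapply(unname(unclass(x)), canonical_form_sketch)
    if (!is.null(nms)) {
      names(out) <- nms
      # Fully named lists are sorted by name; radix ordering is
      # locale-independent. Other lists keep their order (an assumption).
      if (all(nzchar(nms))) out <- out[order(nms, method = "radix")]
    }
    return(out)
  }
  # Atomic vectors: strip the class only.
  unclass(x)
}

a <- structure(list(b = 2, a = 1), class = "codebook")
b <- list(a = 1, b = 2)
identical(canonical_form_sketch(a), canonical_form_sketch(b))
```

Because both objects reduce to the same canonical form, any deterministic rendering and hash applied afterwards would agree for them as well.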

Usage

llm_hash(x)

Arguments

x

An R object: list, character, config, codebook – anything whose canonical JSON form is well defined. Environments are not hashable.

Value

A 64-character lowercase SHA-256 hex string.

Details

Downstream packages treat these hashes as identifiers of record (LLMRcontent::protocol_lock(), LLMRcontent::archive_build()); the convention is versioned by this function's documentation – any future change would be a new function, not a silent edit.
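Since a hash serves as an identifier of record, a reader may want to check an object against a value published elsewhere. A minimal sketch of such a check follows; `check_recorded_hash()` is a hypothetical helper, not part of LLMR or LLMRcontent, and the hashing function is passed in explicitly instead of being assumed.

```r
# Hypothetical helper: compare an object's hash against a recorded value.
# `hash_fn` is whatever hashing function the caller supplies, e.g. llm_hash.
check_recorded_hash <- function(x, recorded, hash_fn) {
  stopifnot(is.character(recorded), length(recorded) == 1L)
  # The documented return value is a 64-character lowercase hex string,
  # so a recorded value in any other shape cannot be a valid hash.
  if (!grepl("^[0-9a-f]{64}$", recorded)) {
    stop("recorded value is not a 64-character lowercase hex string")
  }
  identical(hash_fn(x), recorded)
}

# Usage, assuming `my_codebook` and `recorded` exist in the session:
# check_recorded_hash(my_codebook, recorded, hash_fn = llm_hash)
```

Validating the format first separates a mistyped or truncated recorded value from a genuine content mismatch.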

Examples

llm_hash(list(model = "gpt-oss-20b", temperature = 0))
# construction order does not matter, so this is TRUE:
identical(llm_hash(list(a = 1, b = 2)), llm_hash(list(b = 2, a = 1)))
# any change in content changes the hash, so this is FALSE:
identical(llm_hash(list(a = 1)), llm_hash(list(a = 2)))