Calls the model .times times for every row of .data (all replicates run
through the parallel engine in one pass) and appends one column per
replicate: <output>_1, <output>_2, .... Feed the result to
llm_agreement() for per-row majority labels and overall reliability.
Arguments
- .data
A data.frame / tibble.
- output
Unquoted base name for the replicate columns.
- prompt
A glue template string evaluated against the columns of
.data (as in llm_mutate()).
- .config
An llm_config object (generative).
- .times
Number of replicates (default 3).
- .system_prompt
Optional system message.
- ...
Passed to
call_llm_par() (e.g. tries, progress).
Value
.data with .times new character columns,
<output>_1 ... <output>_<.times> (NA where a call failed).
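As a rough picture of that shape, the sketch below builds a mock result by hand in base R. No model is called: the labels, the `sentiment` base name, and the `times` variable are illustrative assumptions, not output from llm_replicate().

```r
# Mock illustration only: labels are hand-written and no model is called.
df <- data.frame(text = c("I loved it", "Meh", "Terrible service"))
times <- 3                                         # stands in for .times
rep_cols <- paste0("sentiment_", seq_len(times))   # <output>_1 ... <output>_<.times>

out <- df
for (col in rep_cols) out[[col]] <- NA_character_  # one character column per replicate
out$sentiment_1 <- c("positive", "neutral", "negative")
out$sentiment_2 <- c("positive", NA, "negative")   # NA marks a failed call
out$sentiment_3 <- c("positive", "negative", "negative")

names(out)
```

The original columns come first, followed by one column per replicate.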
Details
Replication only measures sampling variability if the model can vary: with
temperature = 0 (or a fixed seed) most providers return nearly identical
draws, so agreement is high by construction and says little about
reliability. A low-temperature run is still a useful check in its own
right: high disagreement at low temperature signals a prompt the model
finds genuinely ambiguous.
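To make "agreement" concrete, here is a minimal base R sketch of a per-row majority label and the share of replicates that match it. It is not the implementation of llm_agreement(): the helpers `majority()` and `share()` are hypothetical, and the replicate columns are mock data rather than model output.

```r
# Mock replicate columns; in practice these would come from llm_replicate().
reps <- data.frame(
  sentiment_1 = c("positive", "neutral",  "negative"),
  sentiment_2 = c("positive", "negative", "negative"),
  sentiment_3 = c("positive", "neutral",  NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: most frequent non-missing label in one row.
majority <- function(labels) {
  counts <- table(labels)                  # table() drops NA by default
  if (length(counts) == 0) return(NA_character_)
  names(counts)[which.max(counts)]         # ties go to the first label alphabetically
}

# Hypothetical helper: fraction of non-missing replicates equal to the majority.
share <- function(labels) {
  labels <- labels[!is.na(labels)]
  if (length(labels) == 0) return(NA_real_)
  max(table(labels)) / length(labels)
}

maj <- apply(reps, 1, majority)
shr <- apply(reps, 1, share)
data.frame(majority = maj, share = shr)
```

Rows where the share is well below 1 are the ones worth inspecting. If they persist at low temperature, the prompt itself is the more likely source of the disagreement.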
Examples
if (FALSE) { # \dontrun{
  cfg <- llm_config("groq", "openai/gpt-oss-20b", temperature = 1)
  df <- tibble::tibble(text = c("I loved it", "Meh", "Terrible service"))
  reps <- df |>
    llm_replicate(
      sentiment,
      prompt = "Sentiment of '{text}'. One word: positive, negative, or neutral.",
      .config = cfg,
      .times = 5
    )
  llm_agreement(reps, prefix = "sentiment")
} # }