Computes per-row majority labels and overall reliability for replicate
columns produced by llm_replicate() (or any set of columns holding
repeated codings of the same units, including codings by different models
or by humans). Reliability is reported as average pairwise percent
agreement and Krippendorff's alpha for nominal data, the statistic
reviewers most often ask for; alpha tolerates missing values (failed calls),
so units with some failed replicates can still contribute.
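To make the two reliability statistics concrete, here is a minimal sketch in Python. It is written for illustration only and is not the package's R code. The function names, the use of None for a failed call, and the exact definition of "average pairwise agreement" (averaged over replicate pairs, using only units where both replicates are present) are assumptions. The alpha calculation follows the standard nominal-data formulation, in which units with fewer than two non-missing values are not pairable and are skipped.

```python
from collections import Counter
from itertools import combinations


def mean_pairwise_agreement(rows):
    """Average, over replicate pairs, of the share of units on which
    both replicates are present and carry the same label."""
    n_cols = len(rows[0])
    rates = []
    for i, j in combinations(range(n_cols), 2):
        pairs = [(r[i], r[j]) for r in rows
                 if r[i] is not None and r[j] is not None]
        if pairs:
            rates.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(rates) / len(rates) if rates else None


def krippendorff_alpha_nominal(rows):
    """Nominal-data alpha = 1 - D_o / D_e, with missing values (None) allowed."""
    # Keep only pairable units: at least two non-missing values.
    units = [[v for v in r if v is not None] for r in rows]
    units = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in units)  # total number of pairable values
    if n <= 1:
        return None

    totals = Counter()   # how often each label occurs overall
    disagreement = 0.0   # weighted count of mismatching ordered pairs
    for u in units:
        m = len(u)
        counts = Counter(u)
        totals.update(counts)
        # Ordered pairs with different labels in this unit: m^2 - sum(c^2),
        # each weighted by 1 / (m - 1).
        disagreement += (m * m - sum(c * c for c in counts.values())) / (m - 1)

    d_observed = disagreement / n
    d_expected = (n * n - sum(c * c for c in totals.values())) / (n * (n - 1))
    if d_expected == 0:
        return None  # only one label ever occurs; alpha is undefined
    return 1 - d_observed / d_expected


# Hypothetical data: four units, three replicates, one failed call (None).
codings = [
    ("pos", "pos", "pos"),
    ("neg", "neg", None),
    ("pos", "neg", "pos"),
    ("neg", "neg", "neg"),
]
print(mean_pairwise_agreement(codings))
print(krippendorff_alpha_nominal(codings))
```

Percent agreement does not correct for agreement expected by chance, which is why the chance-corrected alpha is reported alongside it.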
Arguments
- .data
A data frame holding the replicate columns.
- cols
Character vector naming the replicate columns. Alternatively supply prefix.
- prefix
Base name: columns matching <prefix>_1, <prefix>_2, ... are used.
- normalize
If TRUE (default), values are compared after trimming whitespace and lowercasing, so "Positive" and " positive" agree. Set to FALSE for exact string comparison.
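The effect of normalize can be sketched as follows. This is an illustrative Python snippet, not the package's R code; the helper name normalize_label and the use of None for a missing value are assumptions.

```python
def normalize_label(value):
    """Trim surrounding whitespace and lowercase; leave missing values alone."""
    if value is None:
        return None
    return value.strip().lower()


# With normalization, these two codings count as agreeing.
print(normalize_label("Positive") == normalize_label(" positive"))

# Without it, the raw strings are compared exactly and differ.
print("Positive" == " positive")
```

Exact comparison is the safer choice when case or spacing is itself meaningful in the coding scheme.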
Value
An object of class llmr_agreement: a list with
- by_row
a tibble with one row per unit: majority (modal label, NA on ties), share (modal share of non-missing replicates), n_distinct, unanimous, tie, n_missing.
- summary
a one-row tibble: n_units, n_replicates, mean_pairwise_agreement, krippendorff_alpha, n_unanimous, n_ties.
Printing shows the summary.
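The per-unit fields can be illustrated with a short Python sketch. It is not the package's R code: the function name summarize_row, the use of None for NA, and the handling of a unit whose replicates are all missing are assumptions made for the example.

```python
from collections import Counter


def summarize_row(labels):
    """Per-unit summary of replicate labels; None marks a missing replicate."""
    present = [v for v in labels if v is not None]
    n_missing = len(labels) - len(present)
    counts = Counter(present)
    if not counts:
        # Assumption: with no usable replicates there is no majority or share.
        return {"majority": None, "share": None, "n_distinct": 0,
                "unanimous": False, "tie": False, "n_missing": n_missing}

    ranked = counts.most_common()  # labels sorted by descending count
    top_count = ranked[0][1]
    # A tie means at least two labels share the highest count.
    tie = len(ranked) > 1 and ranked[1][1] == top_count
    return {
        "majority": None if tie else ranked[0][0],
        "share": top_count / len(present),  # modal share of non-missing values
        "n_distinct": len(counts),
        "unanimous": len(counts) == 1,
        "tie": tie,
        "n_missing": n_missing,
    }


print(summarize_row(["pos", "pos", "neg"]))
print(summarize_row(["pos", "neg", None]))
```

In the second call the two remaining labels split evenly, so the sketch reports a tie and no majority label, matching the "NA on ties" rule described above.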
References
Krippendorff, K. (2019). Content Analysis: An Introduction to Its Methodology (4th ed.), chapter 12. The alpha implemented here is the nominal-data form with missing values allowed.