Creates a tibble of experiments for factorial designs where you want to test all combinations of configs, messages, and repetitions with automatic metadata.
Usage
build_factorial_experiments(
configs,
user_prompts,
system_prompts = NULL,
repetitions = 1,
config_labels = NULL,
user_prompt_labels = NULL,
system_prompt_labels = NULL
)
Arguments
- configs
List of llm_config objects to test.
- user_prompts
Character vector (or list) of user‑turn prompts.
- system_prompts
Optional character vector of system messages (recycled against user prompts). Missing/NA values are ignored; messages are user-only.
- repetitions
Integer. Number of repetitions per combination. Default is 1.
- config_labels
Character vector of labels for configs. If NULL, uses "provider_model".
- user_prompt_labels
Optional labels for the user prompts.
- system_prompt_labels
Optional labels for the system prompts.
Value
A tibble with columns: config (list-column), messages (list-column), config_label, user_prompt_label, system_prompt_label, and repetition. Ready for use with call_llm_par().
Examples
if (FALSE) { # \dontrun{
# Factorial design: 3 configs x 2 user prompts x 10 reps = 60 experiments
configs <- list(gpt4_config, claude_config, llama_config)
user_prompts <- c("Control prompt", "Treatment prompt")
experiments <- build_factorial_experiments(
configs = configs,
user_prompts = user_prompts,
repetitions = 10,
config_labels = c("gpt4", "claude", "llama"),
user_prompt_labels = c("control", "treatment")
)
# Use with call_llm_par
results <- call_llm_par(experiments, progress = TRUE)
} # }