Skip to contents

call_llm() dispatches to the correct provider implementation based on config$provider. It supports both generative chat/completions and embeddings, plus a simple multimodal shortcut for local files.

Usage

call_llm(config, messages, verbose = FALSE)

Arguments

config

An llm_config object.

messages

One of:

  • Plain character vector — each element becomes a "user" message.

  • Named character vector — names are roles ("system", "user", "assistant"). Multimodal shortcut: include one or more elements named "file" whose values are local paths; consecutive {user | file} entries are combined into one user turn and files are inlined (base64) for capable providers.

  • List of message objects: list(role=..., content=...). For multimodal content, set content to a list of parts like list(list(type="text", text="..."), list(type="file", path="...")).

verbose

Logical. If TRUE, prints the full parsed API response.

Value

  • Generative mode: an llmr_response object. Use as.character(x) to get just the text; print(x) shows text plus a status line; use helpers finish_reason(x) and tokens(x).

  • Embedding mode: provider-native list with an element data; convert with parse_embeddings().

Provider notes

  • OpenAI-compatible: On a server 400 that identifies the bad parameter as max_tokens, LLMR will, unless no_change=TRUE, retry once replacing max_tokens with max_completion_tokens (and inform via a cli_alert_info). The former experimental “uncapped retry on empty content” is disabled by default to avoid unexpected costs.

  • Anthropic: max_tokens is required; if omitted LLMR uses 2048 and warns. Multimodal images are inlined as base64. Extended thinking is supported: provide thinking_budget and include_thoughts = TRUE to include a content block of type "thinking" in the response; LLMR sets the beta header automatically.

  • Gemini (REST): systemInstruction is supported; user parts use text/inlineData(mimeType,data); responses are set to responseMimeType = "text/plain".

  • Error handling: HTTP errors raise structured conditions with classes like llmr_api_param_error, llmr_api_rate_limit_error, llmr_api_server_error; see the condition fields for status, code, request id, and (where supplied) the offending parameter.

Message normalization

See the “multimodal shortcut” described under messages. Internally, LLMR expands these into the provider’s native request shape and tilde-expands local file paths.

Examples

if (FALSE) { # \dontrun{
## 1) Basic generative call
cfg <- llm_config("openai", "gpt-4o-mini")
call_llm(cfg, "Say hello in Greek.")

## 2) Generative with rich return
r <- call_llm(cfg, "Say hello in Greek.")
r
as.character(r)
finish_reason(r); tokens(r)

## 3) Anthropic extended thinking (single example)
a_cfg <- llm_config("anthropic", "claude-sonnet-4-20250514",
                    max_tokens = 5000,
                    thinking_budget = 16000,
                    include_thoughts = TRUE)
r2 <- call_llm(a_cfg, "Compute 87*93 in your head. Give only the final number.")
# thinking (if present): r2$raw$content[[1]]$thinking
# final text:            r2$raw$content[[2]]$text

## 4) Multimodal (named-vector shortcut)
msg <- c(
  system = "Answer briefly.",
  user   = "Describe this image in one sentence.",
  file   = "~/Pictures/example.png"
)
call_llm(cfg, msg)

## 4) Embeddings
e_cfg <- llm_config("voyage", "voyage-large-2",
                    embedding = TRUE)
emb_raw <- call_llm(e_cfg, c("first", "second"))
emb_mat <- parse_embeddings(emb_raw)

## 5) With a chat session
ch <- chat_session(cfg)
ch$send("Say hello in Greek.")   # prints the same status line as `print.llmr_response`
ch$history()
} # }