```r
knitr::opts_chunk$set(
  collapse = TRUE, comment = "#>",
  eval = identical(tolower(Sys.getenv("LLMR_RUN_VIGNETTES", "false")), "true")
)
```
## TL;DR

- JSON mode: ask the model for “a JSON object.” Lower friction, weak guarantees.
- Schema output: give a JSON Schema and request strict validation. Higher reliability when the provider enforces it.
- Reality: enforcement and request shapes differ across providers. Use defensive parsing and local validation.
## What the major providers actually support

### OpenAI-compatible (OpenAI, Groq, Together, x.ai, DeepSeek)

Chat Completions accept a `response_format` (e.g., `{"type":"json_object"}` or a JSON-Schema payload). Enforcement varies by provider, but the interface is OpenAI-shaped.
See OpenAI API overview, Groq API (OpenAI-compatible), Together: OpenAI compatibility, x.ai: OpenAI API schema, DeepSeek: OpenAI-compatible endpoint.

### Anthropic (Claude)

No global “JSON mode.” Instead, you define a tool with an `input_schema` (JSON Schema) and force it via `tool_choice`, so the model must return a JSON object that validates against the schema.
See Anthropic Messages API: tools & `input_schema`.

### Google Gemini (REST)

Set `responseMimeType = "application/json"` in `generationConfig` to request JSON. Some models also accept `responseSchema` for constrained JSON (model-dependent).
See the Gemini documentation.
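To make the differences concrete, here is a rough sketch of the three request shapes, written as R lists that mirror the documented JSON bodies. Only the structured-output pieces are shown, and the tool name `emit_json` is just an illustration; in practice `enable_structured_output()` (below) builds the right shape for you.

```r
# OpenAI-compatible: response_format in the chat completion request
openai_style <- list(response_format = list(type = "json_object"))

# Anthropic: a forced tool whose input_schema constrains the output
anthropic_style <- list(
  tools = list(list(
    name = "emit_json",                  # arbitrary tool name (illustration only)
    input_schema = list(type = "object") # your JSON Schema goes here
  )),
  tool_choice = list(type = "tool", name = "emit_json")
)

# Gemini: generationConfig fields
gemini_style <- list(
  generationConfig = list(
    responseMimeType = "application/json"
    # responseSchema = <your JSON Schema>   # on models that support it
  )
)
```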
## Why prefer schema output?

- Deterministic downstream code: predictable keys/types enable typed transforms.
- Safer integrations: strict mode avoids extra keys, missing fields, or textual preambles.
- Faster failure: invalid generations fail early, where retry/backoff is easy to manage.
## Why JSON-only still matters

- Broadest support across models/providers/proxies.
- Low ceremony for exploration, labeling, and quick prototypes.
## Quirks you will hit in practice

- Models often wrap JSON in code fences or add pre/post text.
- Arrays/objects appear where you expected scalars; ints vs doubles vary by provider/sample.
- Safety/length caps can truncate output; detect and handle “finish_reason = length/filter” (see the retry sketch after this list).
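A minimal, hedged sketch of that last point: when the output cannot be parsed (often because it was truncated or filtered), retry a bounded number of times with backoff. It assumes a `cfg` and `prompt` are already defined and that a failed parse either errors or returns an empty result; `llm_fn_structured()`'s `tries` argument (shown later) handles this for you.

```r
# Retry loop for unparseable / truncated output.
# Assumes `cfg` and `prompt` exist; the exact failure behavior of
# llm_parse_structured() may differ (error vs. empty result), so both are handled.
parsed <- NULL
for (attempt in 1:3) {
  res    <- call_llm(cfg, prompt)
  parsed <- tryCatch(llm_parse_structured(res), error = function(e) NULL)
  if (length(parsed) > 0) break   # got something usable
  Sys.sleep(2^attempt)            # simple backoff before retrying
}
```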
## LLMR helpers to blunt those edges

- `llm_parse_structured()` strips fences and extracts the largest balanced `{...}` or `[...]` before parsing.
- `llm_parse_structured_col()` hoists fields (supports dot/bracket paths and JSON Pointer) and keeps non-scalars as list-columns.
- `llm_validate_structured_col()` validates locally via jsonvalidate (AJV); see the sketch after this list for what that local check looks like.
- `enable_structured_output()` flips the right provider switch (OpenAI-compat `response_format`, Anthropic tool + `input_schema`, Gemini `responseMimeType`/`responseSchema`).
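For context, the local validation step is the same kind of check you can run yourself with jsonvalidate directly; a minimal sketch (the helper's actual internals may differ):

```r
# Validate a JSON string against a JSON Schema locally with AJV,
# independent of whatever the provider claims to enforce.
library(jsonvalidate)

schema_json <- '{
  "type": "object",
  "properties": { "title": {"type": "string"}, "year": {"type": "integer"} },
  "required": ["title", "year"],
  "additionalProperties": false
}'

json_validate('{"title": "BERT", "year": 2018}', schema_json, engine = "ajv")
#> TRUE
json_validate('{"title": "BERT"}', schema_json, engine = "ajv")   # missing "year"
#> FALSE
```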
## Minimal patterns (guarded code)

All chunks below wrap their code in a tiny `safe()` helper so the document knits even when no API keys are set.
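The helper itself is not shown in this excerpt; here is a minimal sketch of what it might look like (the vignette's actual definition, and the environment-variable names it checks, are assumptions here):

```r
# Hypothetical guard: evaluate the chunk only if some API key is configured,
# and never let an API error stop the knit.
safe <- function(expr) {
  keys <- c("OPENAI_API_KEY", "GROQ_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")
  if (!any(nzchar(Sys.getenv(keys)))) {
    message("No API key found; skipping this chunk.")
    return(invisible(NULL))
  }
  tryCatch(expr, error = function(e) message("Skipped: ", conditionMessage(e)))
}
```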
### 1) JSON mode, no schema (works across OpenAI-compatible providers)
```r
safe({
  library(LLMR)

  cfg <- llm_config(
    provider    = "openai",       # try "groq" or "together" too
    model       = "gpt-4.1-nano",
    temperature = 0
  )

  # Flip JSON mode on (OpenAI-compat shape)
  cfg_json <- enable_structured_output(cfg, schema = NULL)

  res    <- call_llm(cfg_json, 'Give me a JSON object {"ok": true, "n": 3}.')
  parsed <- llm_parse_structured(res)

  cat("Raw text:\n", as.character(res), "\n\n")
  str(parsed)
})
```
**What could still fail?** Proxies labeled “OpenAI-compatible” sometimes accept `response_format` but don’t strictly enforce it; LLMR’s parser recovers from fences or pre/post text.
### 2) Schema mode that actually works (Groq + Qwen, open-weights / non-commercial friendly)

Groq serves Qwen 2.5 Instruct models with OpenAI-compatible APIs. Their Structured Outputs feature enforces JSON Schema and (notably) expects all properties to be listed under `required`.
```r
safe({
  library(LLMR); library(dplyr)

  # Schema: make every property required to satisfy Groq's stricter check
  schema <- list(
    type = "object",
    additionalProperties = FALSE,
    properties = list(
      title = list(type = "string"),
      year  = list(type = "integer"),
      tags  = list(type = "array", items = list(type = "string"))
    ),
    required = list("title", "year", "tags")
  )

  cfg <- llm_config(
    provider    = "groq",
    model       = "qwen-2.5-72b-instruct",   # a Qwen Instruct model on Groq
    temperature = 0
  )
  cfg_strict <- enable_structured_output(cfg, schema = schema, strict = TRUE)

  df <- tibble(x = c("BERT paper", "Vision Transformers"))

  out <- llm_fn_structured(
    df,
    prompt          = "Return JSON about '{x}' with fields title, year, tags.",
    .config         = cfg_strict,
    .schema         = schema,                 # send schema to provider
    .fields         = c("title", "year", "tags"),
    .validate_local = TRUE
  )

  out %>% select(structured_ok, structured_valid, title, year, tags) %>% print(n = Inf)
})
```
If your key is set, you should see `structured_ok = TRUE` and `structured_valid = TRUE`, plus parsed columns.

**Common gotcha:** if Groq returns a 400 error complaining about `required`, ensure all properties are listed in the `required` array. Groq’s structured-output implementation is stricter than OpenAI’s.
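If you build schemas programmatically, a small (hypothetical) helper can enforce that convention before the schema is sent:

```r
# Hypothetical helper: mark every top-level property as required,
# which satisfies Groq's stricter structured-output check.
require_all <- function(schema) {
  schema$required <- as.list(names(schema$properties))
  schema
}

# e.g. schema <- require_all(schema) before enable_structured_output(cfg, schema = schema)
```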
### 3) Anthropic: force a schema via a tool (may require `max_tokens`)
```r
safe({
  library(LLMR)

  schema <- list(
    type = "object",
    properties = list(
      answer     = list(type = "string"),
      confidence = list(type = "number")
    ),
    required = list("answer", "confidence"),
    additionalProperties = FALSE
  )

  cfg <- llm_config("anthropic", "claude-3-5-haiku-latest", temperature = 0)
  cfg <- enable_structured_output(cfg, schema = schema, name = "llmr_schema")

  res <- call_llm(cfg, c(
    system = "Return only the tool result that matches the schema.",
    user   = "Answer: capital of Japan; include confidence in [0,1]."
  ))

  parsed <- llm_parse_structured(res)
  str(parsed)
})
```
Anthropic requires `max_tokens`; LLMR warns and defaults if you omit it.
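If you prefer to set it explicitly (assuming extra provider parameters such as `max_tokens` are passed through `llm_config()`):

```r
# Set max_tokens explicitly instead of relying on LLMR's default
cfg <- llm_config("anthropic", "claude-3-5-haiku-latest",
                  temperature = 0, max_tokens = 1024)
```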
### 4) Gemini: JSON response (plus optional response schema on supported models)
```r
safe({
  library(LLMR)

  cfg <- llm_config(
    "gemini", "gemini-2.5-flash-lite",
    response_mime_type = "application/json"   # ask for JSON back
    # Optionally: gemini_enable_response_schema = TRUE, response_schema = <your JSON Schema>
  )

  res <- call_llm(cfg, c(
    system = "Reply as JSON only.",
    user   = "Produce fields name and score about 'MNIST'."
  ))

  str(llm_parse_structured(res))
})
```
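Alternatively, since `enable_structured_output()` also knows the Gemini switches (`responseMimeType`/`responseSchema`), you can hand it a schema instead of setting the MIME type yourself; a sketch (schema enforcement is model-dependent):

```r
# Sketch: let enable_structured_output() set the Gemini response options.
schema <- list(
  type = "object",
  properties = list(
    name  = list(type = "string"),
    score = list(type = "number")
  ),
  required = list("name", "score")
)
cfg_schema <- enable_structured_output(
  llm_config("gemini", "gemini-2.5-flash-lite"),
  schema = schema
)
```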
## Defensive patterns (no API calls)
```r
safe({
  library(LLMR); library(tibble)

  messy <- c(
    '```json\n{"x": 1, "y": [1,2,3]}\n```',
    'Sure! Here is JSON: {"x":"1","y":"oops"} trailing words',
    '{"x":1, "y":[2,3,4]}'
  )

  tibble(response_text = messy) |>
    llm_parse_structured_col(
      fields = c(x = "x", y = "/y/0")   # dot/bracket or JSON Pointer
    ) |>
    print(n = Inf)
})
```
**Why this helps:** it works when outputs arrive fenced, with pre/post text, or when arrays sneak in. Non-scalars become list-columns (set `allow_list = FALSE` to force scalars only).
## Pro tip: Combine with parallel execution

For production ETL workflows, combine schema validation with parallelization:
```r
library(LLMR); library(dplyr)

cfg_with_schema <- llm_config('openai', 'gpt-4.1-nano')

setup_llm_parallel(workers = 10)

# Assuming a large data frame `large_df` with a `text` column,
# and a `schema` like the one defined in the Groq example above
large_df |>
  llm_mutate_structured(
    result,
    prompt  = "Extract: {text}",
    .config = cfg_with_schema,
    .schema = schema,
    .fields = c("label", "score"),
    tries   = 3                     # auto-retry failures
  )

reset_llm_parallel()
```
This processes thousands of rows efficiently with automatic retries and validation.
## Choosing the mode

- Reporting / ETL / metrics: schema mode; fail fast and retry.
- Exploration / ad-hoc: JSON mode + recovery parser.
- Cross-provider code: always wrap provider toggles with `enable_structured_output()` and run `llm_parse_structured()` + local validation.
## References

- OpenAI: Structured Outputs: https://platform.openai.com/docs/guides/structured-outputs
- Groq: Structured Outputs: https://console.groq.com/docs/structured-outputs
- Together: Structured Output: https://docs.together.ai/docs/json-mode
- x.ai: Structured Output: https://docs.x.ai/docs/guides/structured-outputs
- DeepSeek: JSON Mode: https://api-docs.deepseek.com/guides/json_mode
- Anthropic: Messages API, tools & `input_schema`: https://docs.anthropic.com/en/api/messages#body-tool-choice
- Google Gemini: Structured Output: https://ai.google.dev/gemini-api/docs/structured-output