The brain — providers
How agents think — the Claude Code binary, the OpenAI Codex CLI, local models via Ollama and LM Studio, and per-agent brains.
Every agent’s intelligence comes from a provider — the brain. The coordination layer around it (scheduling, review, budget) is the same no matter which brain an agent runs on, which is why you can mix expensive and cheap brains in one org.
The Claude Code binary
The default provider spawns the claude binary headless:
claude --print --output-format json # prompt on stdin
The binary owns its own agentic loop, so an agent given a task actually acts: it runs
Bash, writes and executes code, and edits files in its working directory
(agents/<id>/). The single JSON result is parsed for the answer plus
total_cost_usd / usage, which is metered to the cost ledger.
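The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not Quorum's actual code: `total_cost_usd` and `usage` come from the text, while the `result` key for the answer and the `ParsedTurn` container are assumptions made for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParsedTurn:
    """Hypothetical container for what the cost ledger needs from one call."""
    answer: str
    cost_usd: float
    usage: dict = field(default_factory=dict)


def parse_claude_result(stdout: str) -> ParsedTurn:
    """Parse the single JSON object printed by the headless call."""
    data = json.loads(stdout)
    return ParsedTurn(
        answer=data.get("result", ""),  # assumed key for the answer text
        cost_usd=float(data.get("total_cost_usd") or 0.0),
        usage=data.get("usage") or {},
    )
```

A missing cost field is treated as 0 here so that a malformed result never crashes metering; a stricter harness might prefer to raise instead.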
This is the same mechanism the Claude Agent SDK uses internally; Quorum shells
out to the binary directly so the project has no runtime dependencies. Auth
comes from the binary’s own login or ANTHROPIC_API_KEY. Set CLAUDE_BIN (or
--claude-bin) if claude isn’t on your PATH.
Under the hood a brain is used in two ways:
- A plain text turn — used for meetings, evaluation, and decomposition.
- A full agentic turn — real work in the agent’s working directory, with tools gated by the agent’s capabilities.
The OpenAI Codex CLI
Give an agent the codex binary with { "type": "codex" }. Quorum drives it
headless the same way — one spawn, prompt on stdin:
codex exec --json # prompt on stdin, JSONL events on stdout
Codex also owns its own agentic loop, so an agent working a task runs shell commands and edits files in its working directory. Two differences from the Claude binary:
- Sandbox, not an allowlist. Where Claude takes a tool allowlist, Codex takes
  a sandbox policy. Quorum translates the agent’s granted capabilities
  automatically: an agent with the code capability runs workspace-write (it can
  edit files and run commands in its folder); without it, read-only. Override
  per agent with sandbox.
- Tokens, priced by you. Codex reports token usage but not a dollar figure.
  Supply a pricing rate to get a USD cost on the ledger; without one, tokens
  are tracked and cost counts as $0.
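Both differences reduce to small pure functions. The sketch below is illustrative only: the capability-to-sandbox rule follows the text, but the shape of the pricing object (`inputPerMTok` / `outputPerMTok`, USD per million tokens) and the usage keys are hypothetical, since the source does not specify them.

```python
def sandbox_for(capabilities, override=None):
    """Pick a Codex sandbox policy from an agent's granted capabilities."""
    if override:  # a per-agent "sandbox" setting wins
        return override
    return "workspace-write" if "code" in capabilities else "read-only"


def cost_usd(usage, pricing=None):
    """Convert token usage to dollars; without pricing the cost is $0.

    The pricing keys are hypothetical: USD per million input/output tokens.
    """
    if not pricing:
        return 0.0
    total = (
        usage.get("input_tokens", 0) * pricing.get("inputPerMTok", 0.0)
        + usage.get("output_tokens", 0) * pricing.get("outputPerMTok", 0.0)
    )
    return total / 1_000_000
```

With no pricing supplied, tokens would still be recorded by the caller; only the dollar figure falls back to zero.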
Auth comes from the binary’s own login (codex login) or OPENAI_API_KEY. Set
CODEX_BIN if codex isn’t on your PATH.
### Engineer
- provider: { "type": "codex", "model": "gpt-5-codex" }
Local models — Ollama & LM Studio
A brain doesn’t have to be in the cloud. Because a provider is just a CLI that owns its loop, the same harness — with its tools, sandbox, and metering — can run on a local model, so an org (or a few cheap roles in it) costs nothing to run. There are two routes:
Codex, natively. Codex has built-in local support — point it at Ollama or LM Studio and pick a model you’ve pulled:
### Researcher
- provider: { "type": "codex", "oss": true, "localProvider": "ollama", "model": "gpt-oss:20b" }
### Analyst
- provider: { "type": "codex", "oss": true, "localProvider": "lmstudio", "model": "qwen2.5-coder" }
Claude, via ollama launch. Ollama can launch the Claude Code harness backed
by a local model. Set bin to ollama and a launch prefix; Quorum runs
ollama launch claude … -- in front of the standard headless call, and the
launcher’s --model selects the brain:
### Writer
- provider: { "type": "claude", "bin": "ollama", "launch": "launch claude -y --model gemma4 --" }
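One way to picture how the launch prefix composes with the headless call is shown below. This is a sketch under assumptions: the helper name `build_argv` is hypothetical, and the exact order in which Quorum joins the prefix, the headless flags, and any extra args is inferred from the description rather than confirmed.

```python
import shlex

# The standard headless flags quoted earlier in this page.
HEADLESS_ARGS = ["--print", "--output-format", "json"]


def build_argv(provider):
    """Compose the command line: binary, optional launch prefix, headless flags."""
    binary = provider.get("bin") or "claude"
    prefix = shlex.split(provider.get("launch") or "")  # e.g. "launch claude ... --"
    extra = list(provider.get("args") or [])
    return [binary, *prefix, *HEADLESS_ARGS, *extra]
```

For the Writer config above this yields `ollama launch claude -y --model gemma4 -- --print --output-format json`, with the prompt still supplied on stdin.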
Local models are free, so their calls show tokens with a $0 cost. One honest
caveat: small local models are weaker at multi-step tool use — they may talk
through a task instead of running the commands to do it. Use them for the
deliberative roles (research, review, meetings) and a cloud brain for the heavy
building, or size up the local model. Mixing the two in one org is the point:
### Strategist # local, free — does the thinking
- provider: { "type": "codex", "oss": true, "localProvider": "ollama", "model": "gpt-oss:20b" }
### Engineer # cloud — does the building
- provider: { "type": "claude", "model": "claude-opus-4-8" }
More agent CLIs
Beyond Claude and Codex, Quorum can drive the wider ecosystem of headless coding agents as brains. Each owns its own tool loop; Quorum spawns it, feeds the prompt, and reads its result:
| type | binary | reports |
|---|---|---|
| opencode | opencode | text + tokens + cost |
| cline | cline | text + tokens + cost |
| copilot | GitHub Copilot CLI | text + output tokens (bills “premium requests”, no USD) |
| hermes | Hermes Agent | text only |
| kimi | Kimi Code CLI | text + tokens (stream-json) |
| droid | Factory Droid | text + tokens |
| gemini | Gemini CLI | text + tokens (JSON) |
| llm | llm (Simon Willison) | text only (a lightweight text brain) |
llm is the odd one out: it has no tool loop, just text in → text out. That
makes it a cheap, reliable pick for the deliberative roles — meetings,
evaluation, routing, writing — where a full agent is overkill. Give it a local
model with the llm-ollama plugin (llm install llm-ollama, then
"model": "gpt-oss:20b"); its token counts print to stderr, so calls are still metered, at a $0 cost.
The rest configure exactly like any other brain. Point one at its own cloud backend with a model and its own auth:
### Reviewer
- provider: { "type": "opencode", "model": "anthropic/claude-sonnet-4-6" }
…or run any of them on a local model through the same ollama launch
prefix as Claude — bin: "ollama" + a launch string; the launcher selects the
model, so no inner model flag is emitted:
### Analyst
- provider: { "type": "opencode", "bin": "ollama", "launch": "launch opencode -y --model gpt-oss:20b --" }
### Scout
- provider: { "type": "hermes", "bin": "ollama", "launch": "launch hermes -y --model gpt-oss:20b --" }
Set <TOOL>_BIN (e.g. OPENCODE_BIN, CLINE_BIN) if a binary isn’t on your
PATH. As with Codex, a tool that reports tokens but not dollars gets a USD cost
only if you supply pricing; local runs are free.
Quorum maps the agent’s capabilities to each tool’s
auto-approval: an agent with the code capability runs with the tool’s
write/auto-approve flag; without it, read-only.
The fake brain
The fake provider (config { "type": "fake" }) is a deterministic, no-API
stub for tests and offline dry-runs. It never emits the org’s control markers
(a JSON plan, SCORE: n, RESOLVED: yes), so control flows fall back
predictably. Give one agent — or defaults.provider — this type to exercise a
whole QUORUM.md without spending a token or needing the binary.
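The fallback behaviour can be illustrated with a toy stub and two marker parsers. Everything here is a hypothetical sketch: the function names, the stub's reply text, and the default values are assumptions, and only the marker spellings (SCORE: n, RESOLVED: yes) come from the text.

```python
import re


def fake_brain(prompt: str) -> str:
    """Deterministic stub: the same prompt always yields the same marker-free text."""
    return f"[fake] acknowledged {len(prompt)} chars"


def parse_score(text: str, default: int = 0) -> int:
    """Read a 'SCORE: n' marker, falling back to a default when it is absent."""
    match = re.search(r"SCORE:\s*(\d+)", text)
    return int(match.group(1)) if match else default


def parse_resolved(text: str) -> bool:
    """True only when the reply carries an explicit 'RESOLVED: yes' marker."""
    return re.search(r"RESOLVED:\s*yes", text, re.IGNORECASE) is not None
```

Because the stub never emits a marker, every parser takes its fallback branch, which is what makes a dry run predictable.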
Configuring a brain
A provider is configured per agent (overriding the org default) via the
provider object. Every field:
| field | meaning |
|---|---|
| type | "claude", "codex", "opencode", "cline", "copilot", "hermes", "kimi", "droid", "gemini", "llm", or "fake" |
| model | model id, e.g. claude-opus-4-8, gpt-5-codex, gpt-oss:20b |
| effort | reasoning effort: low, medium, high, xhigh, or max (claude) |
| maxBudgetUsd | hard per-session spend ceiling enforced by the binary (claude) |
| bin | path to the binary (default $CLAUDE_BIN / $CODEX_BIN or the binary name) |
| timeoutMs | per-call timeout (default 10 minutes) |
| retries | retry count on failure (default 3) |
| args | extra raw CLI args appended to every call |
| launch | launcher prefix, e.g. "launch claude -y --model gemma4 --" (claude, and the CLIs listed under More agent CLIs) |
| oss | run against a local open-source model (codex) |
| localProvider | which local server oss targets: ollama or lmstudio (codex) |
| sandbox | read-only, workspace-write, or danger-full-access (codex; else derived from tools) |
| mcp | path(s) to MCP server config JSON the org owns; null disables (claude) |
| strictMcp | isolate from the operator’s own MCP servers (default true) (claude) |
| settingSources | --setting-sources value, e.g. "project,local" (claude) |
| pricing | cost-per-token override for the ledger (Codex needs this for a USD cost) |
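Resolving an agent's effective provider from the org default and a per-agent override might look like the sketch below. It assumes a field-by-field merge, which the source does not confirm (a whole-object replacement is equally plausible); the helper name is hypothetical, while the default timeout and retry count are the ones listed in the table.

```python
# Defaults documented in the table above: 10-minute timeout, 3 retries.
DOCUMENTED_DEFAULTS = {"timeoutMs": 10 * 60 * 1000, "retries": 3}

KNOWN_TYPES = {
    "claude", "codex", "opencode", "cline", "copilot",
    "hermes", "kimi", "droid", "gemini", "llm", "fake",
}


def resolve_provider(org_default, agent_override=None):
    """Merge defaults, org-level provider, and per-agent override (later wins)."""
    merged = {**DOCUMENTED_DEFAULTS, **(org_default or {}), **(agent_override or {})}
    if merged.get("type") not in KNOWN_TYPES:
        raise ValueError(f"unknown provider type: {merged.get('type')!r}")
    return merged
```

Rejecting an unknown type up front keeps a typo in QUORUM.md from surfacing later as a confusing spawn failure.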
Per-agent brains
Any agent — seeded or hired — may run on its own brain. This is one of the most useful levers in Quorum: put a top model on the roles that need judgment, and a cheaper one at lower effort on routine execution.
### CTO
- provider: { "type": "claude", "model": "claude-opus-4-8", "effort": "high" }
### Engineer
- provider: { "type": "claude", "model": "claude-sonnet-4-6", "effort": "medium", "maxBudgetUsd": 2 }
A hired agent’s brain is persisted on the roster, so it survives a restart. The hiring directive can also choose the brain at hire time; see Hiring & delegation:
HIRE[Engineer | model=claude-sonnet-4-6 effort=low]: build and verify the API
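A directive of this shape is simple to parse. The sketch below infers the grammar from the single example above (a role, an optional `|` followed by space-separated `key=value` pairs, then a task after the colon); the real syntax may allow more, and the function name is hypothetical.

```python
import re

# Grammar inferred from one example; treat it as an assumption.
HIRE_RE = re.compile(
    r"^HIRE\[(?P<role>[^|\]]+?)\s*(?:\|\s*(?P<opts>[^\]]*))?\]:\s*(?P<task>.+)$"
)


def parse_hire(line):
    """Split a HIRE directive into role, provider overrides, and task."""
    match = HIRE_RE.match(line.strip())
    if not match:
        return None
    pairs = (match.group("opts") or "").split()
    provider = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    return {
        "role": match.group("role").strip(),
        "provider": provider,
        "task": match.group("task").strip(),
    }
```

The parsed `provider` dict could then be layered over the org default in the same way as a provider object written in QUORUM.md.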
Isolation & MCP
By default (strictMcp: true) an agent is isolated from the operator’s own MCP
servers, so an org can’t silently reach into your personal tool configuration.
Point mcp at a config the org owns to give its agents MCP tools on purpose,
or set it to null to disable. settingSources similarly limits which settings
files the binary loads.
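As a rough illustration, the three isolation fields could translate into CLI flags as sketched below. The mapping is an assumption: `--setting-sources` is named in the field table, while `--mcp-config` and `--strict-mcp-config` are flags of the claude binary that this sketch presumes Quorum uses. Check `claude --help` before relying on it; the helper name is hypothetical.

```python
def isolation_args(provider):
    """Translate mcp / strictMcp / settingSources into claude CLI flags (assumed mapping)."""
    args = []
    mcp = provider.get("mcp")
    if mcp:  # None (null) means no org-owned MCP config
        paths = [mcp] if isinstance(mcp, str) else list(mcp)
        args += ["--mcp-config", *paths]
    if provider.get("strictMcp", True):  # isolation is on by default
        args.append("--strict-mcp-config")
    sources = provider.get("settingSources")
    if sources:
        args += ["--setting-sources", sources]
    return args
```

Under this mapping an agent with no MCP settings at all still gets the strict flag, matching the isolated-by-default behaviour described above.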