Config: provider, agent & routing
Provider/model keys, named provider profiles, agent loop settings, pacing, routing, and reliability sections.
This page is the field reference for the parts of ~/.revka/config.toml that decide which model answers, how the agent loop runs, and how requests are routed and retried. Every key here lists its type, default, and meaning so you can edit config.toml directly and know exactly what each value does.
Use this page when you are hand-tuning a deployment: choosing a provider, capping the tool loop, taming a slow local model, setting up fallbacks, or pinning named provider profiles. For the conceptual walkthrough of how routing and reliability behave at runtime, read Routing, reliability & tuning. For the file-location and precedence rules, see the Configuration overview.
Core provider & model keys
These top-level keys (no section header; they live at the root of config.toml) are the first things you edit. They select the default provider, model, key, and request behavior.

    default_provider = "openrouter"
    default_model = "anthropic/claude-sonnet-4-6"
    api_key = "sk-or-..."
    default_temperature = 0.7
    provider_timeout_secs = 120

| Key | Type | Default | Meaning |
|---|---|---|---|
| `api_key` | string? | none | Provider API key. Overridden by REVKA_API_KEY or API_KEY, or a provider-specific env var (e.g. OPENROUTER_API_KEY). |
| `api_url` | string? | none | Provider base URL override (e.g. Ollama at http://10.0.0.1:11434). |
| `api_path` | string? | none | Custom API path suffix for unusual endpoints (e.g. /v2/generate). |
| `default_provider` | string | "openrouter" | Default provider ID. Alias model_provider; env REVKA_PROVIDER (or legacy PROVIDER). |
| `default_model` | string | "anthropic/claude-sonnet-4-6" | Default model. Alias model; env REVKA_MODEL. |
| `default_temperature` | f64 | 0.7 | Sampling temperature, valid range 0.0–2.0; env REVKA_TEMPERATURE. |
| `provider_timeout_secs` | u64 | 120 | HTTP timeout for LLM provider calls, in seconds. |
| `provider_max_tokens` | u32? | unset | Caps output tokens. Set this on OpenRouter models that return HTTP 402 at the default cap. |
| `extra_headers` | map | {} | Extra HTTP headers for provider calls; env REVKA_EXTRA_HEADERS in Key:Value,Key2:Value2 format. |
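As an illustration of the keys that the sample above does not exercise, here is a minimal sketch of a root-level block pointed at a local Ollama server. The provider ID `ollama`, the model name, and the header are assumptions made for the example; only the key names and the example URL come from the table.

```toml
# Sketch only: values marked "assumed" or "hypothetical" are not from the reference.
default_provider = "ollama"            # assumed provider ID for a local Ollama back-end
default_model = "llama3.1:8b"          # hypothetical model name
api_url = "http://10.0.0.1:11434"      # base URL override (example URL from the table)
provider_timeout_secs = 300            # longer than the 120 s default, for slow local inference
provider_max_tokens = 2048             # illustrative output-token cap
extra_headers = { "X-Request-Source" = "revka" }  # hypothetical header; a map is a TOML inline table
```

Because these are root keys, TOML requires them to appear before the first `[section]` header in the file.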
[agent] — agent orchestration
The [agent] section controls the tool-call loop, context budget, compression, and per-turn tool filtering. This is where you tune how many tool turns the agent may take and how aggressively it manages its context window.

    [agent]
    max_tool_iterations = 60
    parallel_tools = true
    compact_context = true
    max_context_tokens = 1050000

    [[agent.tool_filter_groups]]
    mode = "always"
    tools = ["mcp_vikunja_*"]

    [[agent.tool_filter_groups]]
    mode = "dynamic"
    tools = ["mcp_browser_*"]
    keywords = ["browse", "website", "url"]

| Key | Type | Default | Meaning |
|---|---|---|---|
| `compact_context` | bool | true | Use a compact bootstrap suited to 13B-or-smaller models. |
| `max_tool_iterations` | u32 | 60 | Max tool-call turns per message; 0 falls back to a safe internal default of 10 (in the channel loop). |
| `max_history_messages` | u32 | 1000 | Max conversation history messages retained. |
| `max_context_tokens` | u32 | 1050000 | Token budget before compression triggers. |
| `parallel_tools` | bool | true | Execute independent tool calls concurrently. |
| `tool_dispatcher` | string | "auto" | Tool dispatch strategy. |
| `tool_call_dedup_exempt` | array | [] | Tools exempt from duplicate-call suppression. |
| `tool_filter_groups` | array | [] | Per-turn MCP tool schema filters (see below). |
| `max_system_prompt_chars` | u32 | 0 (unlimited) | Truncate the system prompt to N chars for small-context models. |
| `max_tool_result_chars` | u32 | 50000 | Max chars per tool result before middle truncation. |
| `keep_tool_context_turns` | u32 | 2 | Recent turns that preserve full tool-call/result messages. |
| `context_window_safety_ratio` | f64 | 0.95 | Fraction of the context window before a hard fail. Clamped to 1.0; values ≤ 0 fall back to 0.95. |
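The context-related keys in the table tend to be adjusted together when the model has a small window. The following is a sketch for a hypothetical 8K-context local model; every number is illustrative rather than a recommended setting.

```toml
# Sketch for a hypothetical 8K-context model; all values are illustrative.
[agent]
compact_context = true              # compact bootstrap for 13B-or-smaller models
max_context_tokens = 8192           # compression triggers past this budget
max_system_prompt_chars = 8000      # 0 (the default) would leave the prompt untruncated
max_tool_result_chars = 8000        # default is 50000
keep_tool_context_turns = 1         # default is 2
context_window_safety_ratio = 0.9   # default is 0.95
```

TOML does not allow the same table to be declared twice, so merge these keys into your existing `[agent]` table instead of adding a second one.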
Tool filter groups
`tool_filter_groups` is the biggest token-cost lever for agents wired to many MCP tools. Each `[[agent.tool_filter_groups]]` table controls whether a set of tool schemas is injected into a turn:
| Field | Meaning |
|---|---|
| `mode` | "always" (always inject) or "dynamic" (inject only when a keyword matches the message). |
| `tools` | Glob patterns of tool names, e.g. "mcp_browser_*". |
| `keywords` | For mode = "dynamic": substrings that, when present in the message, enable the group. |
Sub-sections
The [agent] section nests several tuning blocks:
- `[agent.context_compression]`: compression behavior. Keys: `enabled` (true), `threshold_ratio` (0.5), `protect_first_n` (3), `protect_last_n` (4), `max_passes` (3), `compact_tool_schemas` (true).
- `[agent.thinking]`: controls model reasoning depth.
- `[agent.history_pruning]`: token-efficiency pruning of older history.
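Written out as a table, the compression block would look roughly like the sketch below. It assumes the values shown in parentheses above are the defaults. The comments are inferences from the key names, not documented semantics.

```toml
# Sketch: values copied from the list above; comments are assumptions.
[agent.context_compression]
enabled = true
threshold_ratio = 0.5         # assumed: fraction of the token budget at which compression starts
protect_first_n = 3           # assumed: earliest messages left uncompressed
protect_last_n = 4            # assumed: most recent messages left uncompressed
max_passes = 3
compact_tool_schemas = true
```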
[pacing] — slow & local LLM controls
The [pacing] section extends timeouts and loop-detection behavior for slow or local back-ends (Ollama, llama.cpp, vLLM, SGLang). Cloud users rarely need it.

    [pacing]
    step_timeout_secs = 120
    loop_detection_min_elapsed_secs = 60
    loop_ignore_tools = ["browser_screenshot", "browser_navigate"]
    message_timeout_scale_max = 8

| Key | Type | Default | Meaning |
|---|---|---|---|
| `step_timeout_secs` | u64? | unset | Per-step LLM inference timeout, independent of the total message budget. Firing it does not consume the overall budget. |
| `loop_detection_min_elapsed_secs` | u64? | unset | Grace period before loop detection starts counting. |
| `loop_ignore_tools` | array | [] | Tools excluded from identical-output loop detection. |
| `message_timeout_scale_max` | u32 | 4 | Cap for message-timeout budget scaling (the base channels_config.message_timeout_secs scales with tool-loop depth). |
| `loop_detection_enabled` | bool | true | Master switch for pattern-based loop detection. |
| `loop_detection_window_size` | u32 | 20 | Sliding-window size for the loop detector. |
| `loop_detection_max_repeats` | u32 | 3 | Consecutive identical tool+args calls before a warning. |
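The three `loop_detection_*` keys do not appear in the sample above. The sketch below shows them loosened for an agent that legitimately repeats calls, such as polling a page. The numbers are illustrative, not recommendations.

```toml
# Sketch: a more tolerant loop detector; values are illustrative.
[pacing]
loop_detection_enabled = true
loop_detection_window_size = 40              # default is 20
loop_detection_max_repeats = 5               # default is 3 identical tool+args calls
loop_ignore_tools = ["browser_screenshot"]   # excluded from identical-output detection
```

Setting `loop_detection_enabled = false` turns the pattern-based detector off entirely, per the table.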
[reliability] — retries & fallbacks
The [reliability] section wraps your primary provider with a three-tier resilience strategy: a model fallback chain, a provider fallback chain, then an inner retry loop with exponential backoff. These keys are validated at startup and by revka doctor, and are hot-reloaded on the next inbound channel message.

    [reliability]
    provider_retries = 2       # attempts per provider before failover
    provider_backoff_ms = 500  # base backoff ms; doubles each retry, capped at 10000
    fallback_providers = ["anthropic", "openai"]
    api_keys = ["sk-second-key", "sk-third-key"]  # round-robin on 429

    [reliability.model_fallbacks]
    "claude-opus-4-20250514" = ["claude-sonnet-4-6-20250514", "gpt-4o"]

| Key | Type | Default | Meaning |
|---|---|---|---|
| `provider_retries` | u32 | 2 | Attempts per provider before failing over to the next. |
| `provider_backoff_ms` | u64 | 500 | Base backoff in ms; doubles each retry, capped at 10000. |
| `fallback_providers` | array | unset | Ordered provider names tried after the primary exhausts retries. |
| `api_keys` | array | unset | Extra keys for round-robin rotation on 429 errors. |
| `model_fallbacks` | map | unset | Map of model → [fallback_model, …] under [reliability.model_fallbacks]. |
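To make the backoff rule concrete, the sketch below works the arithmetic in comments. It assumes the first wait equals `provider_backoff_ms` and that each later wait doubles until it reaches the 10000 ms cap. The exact point at which the runtime sleeps is not specified here.

```toml
# Sketch: illustrative retry settings with the implied wait schedule.
[reliability]
provider_retries = 5
provider_backoff_ms = 500
# Assumed waits between successive attempts on one provider:
#   500 ms, 1000 ms, 2000 ms, 4000 ms, 8000 ms
# A further doubling would give 16000 ms, which the cap reduces to 10000 ms.
fallback_providers = ["anthropic"]   # tried once the primary exhausts its retries
```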
The conceptual model (error classification, Retry-After handling, and the fallback notification that channels surface) is documented in Routing, reliability & tuning.
[model_providers] — named provider profiles
Named provider profiles support Codex app-server-style configuration and let you define multiple endpoints (most commonly an Azure OpenAI resource/deployment) under stable names.

    [model_providers.azure-gpt4]
    name = "openai"
    base_url = "https://my-resource.openai.azure.com"
    azure_openai_resource = "my-resource"
    azure_openai_deployment = "gpt-4o"
    azure_openai_api_version = "2024-08-01-preview"

| Field | Meaning |
|---|---|
| `name` | Underlying provider implementation to use (e.g. openai). |
| `base_url` | Endpoint base URL. |
| `api_path` | Custom request path suffix. |
| `wire_api` | Wire format: responses or chat_completions. |
| `requires_openai_auth` | Whether OpenAI-style auth is required. |
| `azure_openai_resource` | Azure resource name. |
| `azure_openai_deployment` | Azure deployment name (acts as the model). |
| `azure_openai_api_version` | Azure API version. |
| `max_tokens` | Per-profile output token cap. |
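The non-Azure fields can be combined in the same way. The sketch below defines a profile for a self-hosted OpenAI-compatible server. The profile name and URL are hypothetical, and treating `requires_openai_auth` as a boolean is an assumption.

```toml
# Sketch: hypothetical profile for a self-hosted OpenAI-compatible endpoint.
[model_providers.local-vllm]            # "local-vllm" is a made-up profile name
name = "openai"                         # reuse the OpenAI provider implementation
base_url = "http://localhost:8000/v1"   # hypothetical endpoint
wire_api = "chat_completions"           # the other documented value is "responses"
requires_openai_auth = false            # assumed to be a boolean
max_tokens = 4096                       # per-profile output token cap (illustrative)
```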
[[model_routes]] — model routing hints
Routes map a symbolic `hint:` name to a concrete (provider, model) pair. Keep call sites on stable hints like `hint:reasoning` and upgrade models by editing only the route.

    [[model_routes]]
    hint = "reasoning"
    provider = "openrouter"
    model = "anthropic/claude-opus-4-5"

    [[model_routes]]
    hint = "fast"
    provider = "groq"
    model = "llama-3.1-8b-instant"

| Field | Required | Meaning |
|---|---|---|
| `hint` | yes | Unique symbolic name (the part after hint:). |
| `provider` | yes | A known provider ID. |
| `model` | yes | Model ID to use with that provider. |
| `api_key` | no | Per-route key override. |
Two special hints — hint:cost-optimized and hint:cheapest — score routes against [model_pricing] data and pick the cheapest qualifying one. Unknown hints fall back to the default provider with the hint passed through as-is. The agent can edit routes for you via the model_routing_config tool. revka doctor validates that every route points at a known provider.
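The optional `api_key` field is easiest to see in a route of its own. The sketch below adds a third route with a dedicated key. The hint name, provider ID, model, and key are placeholders chosen for the example.

```toml
# Sketch: a route that carries its own key; all values are placeholders.
[[model_routes]]
hint = "vision"          # hypothetical hint, referenced as hint:vision
provider = "openai"      # must be a provider ID that revka doctor recognises
model = "gpt-4o"
api_key = "sk-..."       # per-route override; other routes keep the default key
```

Swapping the model later means editing only the `model` line; anything that refers to `hint:vision` stays unchanged.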
[[embedding_routes]] — embedding routing
Embedding routes are the same hint mechanism, applied to the memory embedding pipeline and configured independently from inference routing. Activate them by pointing [memory] embedding_model at a hint.

    [memory]
    embedding_model = "hint:semantic"

    [[embedding_routes]]
    hint = "semantic"
    provider = "openai"
    model = "text-embedding-3-small"
    dimensions = 1536

| Field | Required | Meaning |
|---|---|---|
| `hint` | yes | Symbolic name referenced by embedding_model = "hint:<name>". |
| `provider` | yes | One of none, openai, or custom:<url>. |
| `model` | yes | Embedding model ID. |
| `dimensions` | no | Override when the API default differs from your storage schema. |
| `api_key` | no | Per-route key override. |
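The `custom:<url>` provider form is not shown in the sample above. The sketch below routes embeddings to a self-hosted endpoint. The URL and hint name are hypothetical, and the model is used only as an example of one with 768-dimensional output.

```toml
# Sketch: embedding route to a hypothetical self-hosted endpoint.
[memory]
embedding_model = "hint:local-embed"

[[embedding_routes]]
hint = "local-embed"                          # hypothetical hint name
provider = "custom:http://localhost:8080/v1"  # custom:<url> form; URL is made up
model = "nomic-embed-text"                    # example model with 768-dimensional vectors
dimensions = 768                              # should match your storage schema
```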
[query_classification] — automatic hint routing
Query classification picks a `hint:` automatically from the content of the incoming message. No LLM call is made; matching is pure string comparison. Rules are evaluated in priority order.

    [query_classification]
    enabled = true

    [[query_classification.rules]]
    hint = "reasoning"
    keywords = ["explain", "analyze", "why"]
    min_length = 200
    priority = 10

    [[query_classification.rules]]
    hint = "fast"
    keywords = ["hi", "hello", "thanks"]
    max_length = 50

| Key | Type | Default | Meaning |
|---|---|---|---|
| `enabled` | bool | false | Master switch for classification. |
| `rules` | array | [] | Rules, evaluated highest-priority first. |

Each rule under `[[query_classification.rules]]`:
| Key | Type | Default | Meaning |
|---|---|---|---|
| `hint` | string | required | Must match a configured [[model_routes]] hint. |
| `keywords` | array | [] | Case-insensitive substring matches. |
| `patterns` | array | [] | Case-sensitive literal matches (e.g. code fences). |
| `min_length` | int | unset | Match only when the message is at least N characters. |
| `max_length` | int | unset | Match only when the message is at most N characters. |
| `priority` | int | 0 | Higher priority is checked first. |
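The `patterns` key has no example above, so here is a sketch that pairs a pattern rule with the route it needs. The `code` hint and the literal patterns are hypothetical. The note after the block assumes that a rule's length bounds and its keyword or pattern match must all hold, which the reference does not state explicitly.

```toml
# Sketch: a case-sensitive pattern rule plus the route its hint must match.
[[model_routes]]
hint = "code"                            # hypothetical hint
provider = "openrouter"
model = "anthropic/claude-sonnet-4-6"

[query_classification]
enabled = true

[[query_classification.rules]]
hint = "code"
patterns = ["def ", "fn ", "#include"]   # literal, case-sensitive
priority = 20                            # checked before lower-priority rules

[[query_classification.rules]]
hint = "fast"
keywords = ["hi", "hello", "thanks"]     # case-insensitive substrings
max_length = 50                          # priority defaults to 0
```

Under those assumptions, a message such as `fn main() {}` would hit the `code` rule first, while `Hi there` would skip it (no pattern matches) and land on `fast`. A message containing only `FN` would not match the `code` rule, because patterns are case-sensitive.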
[runtime] — reasoning & runtime adapter
The [runtime] section controls execution mode and the model reasoning/thinking toggle. The reasoning fields feed every provider as part of its per-request options.

    [runtime]
    reasoning_enabled = true   # or false; unset = provider defaults
    reasoning_effort = "high"  # minimal | low | medium | high | xhigh

| Key | Type | Default | Meaning |
|---|---|---|---|
| `reasoning_enabled` | bool? | unset | Toggle extended thinking. Env REVKA_REASONING_ENABLED / REASONING_ENABLED. For Ollama this sends think: true/false. |
| `reasoning_effort` | string? | unset | Reasoning effort: minimal, low, medium, high, xhigh. Used by Codex and other effort-aware models. |
[identity] — agent identity format
The [identity] section selects the identity-document format injected into system prompts: OpenClaw (the default style) or AIEOS.

    [identity]
    format = "aieos"
    aieos_path = "identity.json"  # workspace-relative
    # or inline:
    # aieos_inline = '{"name": "Revka", ...}'

| Key | Meaning |
|---|---|
| `format` | Identity document format (openclaw or aieos). |
| `aieos_path` | Workspace-relative path to an AIEOS identity JSON file. |
| `aieos_inline` | Inline AIEOS identity JSON, as an alternative to aieos_path. |
Related delegate-agent settings
The primary agent can hand work to named sub-agents and swarms. These live in their own sections but interact with the loop settings above:
- `[agents.<name>]`: delegate sub-agent configs (provider, model, `agentic`, `allowed_tools`, `max_iterations`, timeouts). `[operator]` can override `agent.max_tool_iterations` for operator sessions. See Agents, teams & swarms.
- `[swarms.<name>]`: coordinated groups of agents with `sequential`, `parallel`, or `router` strategy.
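For orientation, a delegate entry might look like the sketch below. Only the key names come from the list above. The agent name, the value types, and the values themselves are assumptions, and the timeout keys are omitted because their names are not given here.

```toml
# Sketch: hypothetical delegate sub-agent; value types are assumed.
[agents.researcher]                      # "researcher" is a made-up name
provider = "openrouter"
model = "anthropic/claude-sonnet-4-6"
agentic = true                           # assumed boolean
allowed_tools = ["browser_navigate", "browser_screenshot"]  # assumed list of tool names
max_iterations = 20                      # assumed per-delegate loop cap
```

Check Agents, teams & swarms for the authoritative field list before relying on any of these types.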