Config: provider, agent & routing

Provider/model keys, named provider profiles, agent loop settings, pacing, routing, and reliability sections.

This page is the field reference for the parts of ~/.revka/config.toml that decide which model answers, how the agent loop runs, and how requests are routed and retried. Every key here lists its type, default, and meaning so you can edit config.toml directly and know exactly what each value does.

Use this page when you are hand-tuning a deployment: choosing a provider, capping the tool loop, taming a slow local model, setting up fallbacks, or pinning named provider profiles. For the conceptual walkthrough of how routing and reliability behave at runtime, read Routing, reliability & tuning. For the file-location and precedence rules, see the Configuration overview.

These top-level keys (no section header — they live at the root of config.toml) are the first things you edit. They select the default provider, model, key, and request behavior.

default_provider = "openrouter"
default_model = "anthropic/claude-sonnet-4-6"
api_key = "sk-or-..."
default_temperature = 0.7
provider_timeout_secs = 120

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| api_key | string? | none | Provider API key. Overridden by REVKA_API_KEY or API_KEY, or a provider-specific env var (e.g. OPENROUTER_API_KEY). |
| api_url | string? | none | Provider base URL override (e.g. Ollama at http://10.0.0.1:11434). |
| api_path | string? | none | Custom API path suffix for unusual endpoints (e.g. /v2/generate). |
| default_provider | string | "openrouter" | Default provider ID. Alias model_provider; env REVKA_PROVIDER (or legacy PROVIDER). |
| default_model | string | "anthropic/claude-sonnet-4-6" | Default model. Alias model; env REVKA_MODEL. |
| default_temperature | f64 | 0.7 | Sampling temperature, valid range 0.0–2.0; env REVKA_TEMPERATURE. |
| provider_timeout_secs | u64 | 120 | HTTP timeout for LLM provider calls, in seconds. |
| provider_max_tokens | u32? | unset | Caps output tokens. Set this on OpenRouter models that 402 at the default cap. |
| extra_headers | map | {} | Extra HTTP headers for provider calls; env REVKA_EXTRA_HEADERS in Key:Value,Key2:Value2 format. |
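
The Key:Value,Key2:Value2 format used by REVKA_EXTRA_HEADERS can be illustrated with a small parser. This is a minimal sketch, not Revka's actual parser: the function name parse_extra_headers is hypothetical, and splitting each entry on its first colon is an assumption made so that values such as URLs keep their own colons.

```python
def parse_extra_headers(raw: str) -> dict[str, str]:
    """Parse 'Key:Value,Key2:Value2' into a dict (illustrative only)."""
    headers: dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing commas and empty segments
        # Assumption: split on the first colon only, so URL values survive.
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"malformed header entry: {pair!r}")
        headers[key.strip()] = value.strip()
    return headers


print(parse_extra_headers("X-Title:Revka,HTTP-Referer:https://example.com"))
```

One consequence of a comma-separated format is that a header value containing a comma cannot be expressed this way; the extra_headers map in config.toml would be the place for such values.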

The [agent] section controls the tool-call loop, context budget, compression, and per-turn tool filtering. This is where you tune how many tool turns the agent may take and how aggressively it manages its context window.

[agent]
max_tool_iterations = 60
parallel_tools = true
compact_context = true
max_context_tokens = 1050000
[[agent.tool_filter_groups]]
mode = "always"
tools = ["mcp_vikunja_*"]
[[agent.tool_filter_groups]]
mode = "dynamic"
tools = ["mcp_browser_*"]
keywords = ["browse", "website", "url"]

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| compact_context | bool | true | Use a compact bootstrap suited to 13B-or-smaller models. |
| max_tool_iterations | u32 | 60 | Max tool-call turns per message; 0 falls back to a safe internal default of 10 (in the channel loop). |
| max_history_messages | u32 | 1000 | Max conversation history messages retained. |
| max_context_tokens | u32 | 1050000 | Token budget before compression triggers. |
| parallel_tools | bool | true | Execute independent tool calls concurrently. |
| tool_dispatcher | string | "auto" | Tool dispatch strategy. |
| tool_call_dedup_exempt | array | [] | Tools exempt from duplicate-call suppression. |
| tool_filter_groups | array | [] | Per-turn MCP tool schema filters (see below). |
| max_system_prompt_chars | u32 | 0 (unlimited) | Truncate the system prompt to N chars for small-context models. |
| max_tool_result_chars | u32 | 50000 | Max chars per tool result before middle truncation. |
| keep_tool_context_turns | u32 | 2 | Recent turns that preserve full tool-call/result messages. |
| context_window_safety_ratio | f64 | 0.95 | Fraction of the context window before a hard fail. Clamped to 1.0; values ≤ 0 fall back to 0.95. |
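
Two of the rules in this table are easier to see as code. The sketch below normalizes context_window_safety_ratio as described, and shows one plausible form of middle truncation for max_tool_result_chars. Both function names and the truncation marker text are assumptions for illustration; the real marker and the head/tail split may differ.

```python
def effective_safety_ratio(value: float) -> float:
    """Values <= 0 fall back to 0.95; values above 1.0 are clamped to 1.0."""
    if value <= 0:
        return 0.95
    return min(value, 1.0)


def truncate_middle(text: str, max_chars: int,
                    marker: str = "\n...[truncated]...\n") -> str:
    """Keep the head and tail of an oversized result and drop the middle.

    The marker string is hypothetical. If max_chars is smaller than the
    marker itself, only the marker is returned.
    """
    if len(text) <= max_chars:
        return text
    keep = max(max_chars - len(marker), 0)
    tail = keep // 2
    head = keep - tail  # the head gets the extra character when keep is odd
    return text[:head] + marker + (text[-tail:] if tail else "")


print(effective_safety_ratio(1.5), effective_safety_ratio(-1.0))
print(len(truncate_middle("a" * 100, 50)))
```

Middle truncation keeps the start of a tool result (usually headers or the command echo) and the end (usually the final status), which tend to matter more to the model than the bulk in between.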

tool_filter_groups is the biggest token-cost lever for agents wired to many MCP tools. Each [[agent.tool_filter_groups]] table controls whether a set of tool schemas is injected into a turn:

| Field | Meaning |
| --- | --- |
| mode | "always" (always inject) or "dynamic" (inject only when a keyword matches the message). |
| tools | Glob patterns of tool names, e.g. "mcp_browser_*". |
| keywords | For mode = "dynamic": substrings that, when present in the message, enable the group. |
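
The per-turn decision described by these fields can be sketched as follows. This is an illustration under stated assumptions, not Revka's implementation: the function name injected_tools is hypothetical, keyword matching is assumed to be case-insensitive, and the sketch only reports tools covered by at least one group (how tools outside every group are treated is not modeled here).

```python
from fnmatch import fnmatchcase


def injected_tools(all_tools: list[str], groups: list[dict], message: str) -> list[str]:
    """Return the tools whose schemas an active filter group would inject."""
    text = message.lower()  # assumption: keywords match case-insensitively
    selected = []
    for tool in all_tools:
        for group in groups:
            if not any(fnmatchcase(tool, pattern) for pattern in group["tools"]):
                continue  # this group does not cover the tool
            always = group["mode"] == "always"
            keyword_hit = any(kw.lower() in text for kw in group.get("keywords", []))
            if always or keyword_hit:
                selected.append(tool)
                break
    return selected


groups = [
    {"mode": "always", "tools": ["mcp_vikunja_*"]},
    {"mode": "dynamic", "tools": ["mcp_browser_*"],
     "keywords": ["browse", "website", "url"]},
]
tools = ["mcp_vikunja_list_tasks", "mcp_browser_open"]

print(injected_tools(tools, groups, "add a task for tomorrow"))
print(injected_tools(tools, groups, "open this URL and summarize it"))
```

With the groups from the example above, a message that never mentions browsing skips the browser schemas entirely, which is where the token saving comes from.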

The [agent] section nests several tuning blocks:

  • [agent.context_compression] — compression behavior: enabled (true), threshold_ratio (0.5), protect_first_n (3), protect_last_n (4), max_passes (3), compact_tool_schemas (true).
  • [agent.thinking] — controls model reasoning depth.
  • [agent.history_pruning] — token-efficiency pruning of older history.

The [pacing] section extends timeouts and loop-detection behavior for slow or local back-ends (Ollama, llama.cpp, vLLM, SGLang). Cloud users rarely need it.

[pacing]
step_timeout_secs = 120
loop_detection_min_elapsed_secs = 60
loop_ignore_tools = ["browser_screenshot", "browser_navigate"]
message_timeout_scale_max = 8

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| step_timeout_secs | u64? | unset | Per-step LLM inference timeout, independent of the total message budget. Firing it does not consume the overall budget. |
| loop_detection_min_elapsed_secs | u64? | unset | Grace period before loop detection starts counting. |
| loop_ignore_tools | array | [] | Tools excluded from identical-output loop detection. |
| message_timeout_scale_max | u32 | 4 | Cap for message-timeout budget scaling (the base channels_config.message_timeout_secs scales with tool-loop depth). |
| loop_detection_enabled | bool | true | Master switch for pattern-based loop detection. |
| loop_detection_window_size | u32 | 20 | Sliding-window size for the loop detector. |
| loop_detection_max_repeats | u32 | 3 | Consecutive identical tool+args calls before a warning. |
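
To make the interaction between loop_detection_window_size, loop_detection_max_repeats, and loop_ignore_tools more concrete, here is a simplified sketch of a repeat detector. The LoopDetector class is hypothetical, and it makes two simplifying assumptions: calls are compared by tool name plus JSON-serialized arguments, and calls to ignored tools are skipped entirely, so they neither count toward nor interrupt a run. The grace period (loop_detection_min_elapsed_secs) and output comparison are not modeled.

```python
import json
from collections import deque


class LoopDetector:
    def __init__(self, window_size: int = 20, max_repeats: int = 3, ignore_tools=()):
        self.window = deque(maxlen=window_size)  # sliding window of recent calls
        self.max_repeats = max_repeats
        self.ignore_tools = set(ignore_tools)

    def record(self, tool: str, args: dict) -> bool:
        """Record a call; return True when a loop warning should be raised."""
        if tool in self.ignore_tools:
            return False  # assumption: ignored tools are invisible to the detector
        self.window.append((tool, json.dumps(args, sort_keys=True)))
        latest = self.window[-1]
        run = 0
        for entry in reversed(self.window):  # count the trailing identical run
            if entry != latest:
                break
            run += 1
        return run >= self.max_repeats


detector = LoopDetector(ignore_tools=["browser_screenshot"])
for _ in range(3):
    warned = detector.record("read_file", {"path": "notes.md"})
print(warned)
```

Raising loop_detection_max_repeats makes the detector more tolerant of legitimate polling; adding a tool to loop_ignore_tools is the narrower fix when only one tool is expected to repeat.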

The [reliability] section wraps your primary provider with a three-tier resilience strategy — model fallback chain, provider fallback chain, then an inner retry loop with exponential backoff. These keys are validated at startup and by revka doctor, and are hot-reloaded on the next inbound channel message.

[reliability]
provider_retries = 2 # attempts per provider before failover
provider_backoff_ms = 500 # base backoff ms; doubles each retry, capped at 10000
fallback_providers = ["anthropic", "openai"]
api_keys = ["sk-second-key", "sk-third-key"] # round-robin on 429
[reliability.model_fallbacks]
"claude-opus-4-20250514" = ["claude-sonnet-4-6-20250514", "gpt-4o"]

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| provider_retries | u32 | 2 | Attempts per provider before failing over to the next. |
| provider_backoff_ms | u64 | 500 | Base backoff in ms; doubles each retry, capped at 10000. |
| fallback_providers | array | unset | Ordered provider names tried after the primary exhausts retries. |
| api_keys | array | unset | Extra keys for round-robin rotation on 429 errors. |
| model_fallbacks | map | unset | Map of model → [fallback_model, …] under [reliability.model_fallbacks]. |
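
The backoff rule (base delay, doubling on each retry, capped at 10000 ms) works out to the schedule below. The function name backoff_schedule is hypothetical, and the sketch assumes the first wait equals the base value; jitter, if any is applied at runtime, is not modeled.

```python
def backoff_schedule(retries: int, base_ms: int = 500, cap_ms: int = 10_000) -> list[int]:
    """Delay before each retry: base * 2^n, never exceeding the cap."""
    return [min(base_ms * 2 ** attempt, cap_ms) for attempt in range(retries)]


print(backoff_schedule(2))  # the default provider_retries = 2
print(backoff_schedule(6))  # long enough to reach the cap
```

With the defaults, the two waits are 500 ms and 1000 ms. The cap only comes into play from the sixth retry onward (500, 1000, 2000, 4000, 8000, then 10000), so raising provider_retries past 5 adds 10 seconds per extra attempt.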

The conceptual model (error classification, Retry-After handling, and the fallback notifications that channels surface) is documented in Routing, reliability & tuning.

[model_providers] — named provider profiles

Named provider profiles support Codex app-server-style configuration and let you define multiple endpoints — most commonly an Azure OpenAI resource/deployment — under stable names.

[model_providers.azure-gpt4]
name = "openai"
base_url = "https://my-resource.openai.azure.com"
azure_openai_resource = "my-resource"
azure_openai_deployment = "gpt-4o"
azure_openai_api_version = "2024-08-01-preview"

| Field | Meaning |
| --- | --- |
| name | Underlying provider implementation to use (e.g. openai). |
| base_url | Endpoint base URL. |
| api_path | Custom request path suffix. |
| wire_api | Wire format: responses or chat_completions. |
| requires_openai_auth | Whether OpenAI-style auth is required. |
| azure_openai_resource | Azure resource name. |
| azure_openai_deployment | Azure deployment name (acts as the model). |
| azure_openai_api_version | Azure API version. |
| max_tokens | Per-profile output token cap. |
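
For orientation, the three azure_openai_* fields correspond to the parts of a standard Azure OpenAI chat-completions URL. The sketch below shows that mapping using Azure's documented URL shape; the function name azure_chat_url is hypothetical, and whether Revka assembles the request URL in exactly this way is an assumption.

```python
from urllib.parse import urlencode


def azure_chat_url(resource: str, deployment: str, api_version: str) -> str:
    """Build an Azure OpenAI chat-completions URL from its three parts."""
    base = f"https://{resource}.openai.azure.com"
    path = f"/openai/deployments/{deployment}/chat/completions"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


print(azure_chat_url("my-resource", "gpt-4o", "2024-08-01-preview"))
```

This is why the deployment name "acts as the model": on Azure, the model is selected by the deployment segment of the path rather than by a model field in the request body.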

Routes map a symbolic hint: name to a concrete (provider, model) pair. Keep call sites on stable hints like hint:reasoning and upgrade models by editing only the route.

[[model_routes]]
hint = "reasoning"
provider = "openrouter"
model = "anthropic/claude-opus-4-5"
[[model_routes]]
hint = "fast"
provider = "groq"
model = "llama-3.1-8b-instant"

| Field | Required | Meaning |
| --- | --- | --- |
| hint | yes | Unique symbolic name (the part after hint:). |
| provider | yes | A known provider ID. |
| model | yes | Model ID to use with that provider. |
| api_key | no | Per-route key override. |

Two special hints — hint:cost-optimized and hint:cheapest — score routes against [model_pricing] data and pick the cheapest qualifying one. Unknown hints fall back to the default provider with the hint passed through as-is. The agent can edit routes for you via the model_routing_config tool. revka doctor validates that every route points at a known provider.
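
The basic lookup, including the fallback for unknown hints, can be sketched like this. The function name resolve_route is hypothetical, and the sketch assumes "passed through as-is" means the original string (including the hint: prefix) is sent to the default provider unchanged. The cost-based hints (hint:cost-optimized and hint:cheapest) are not modeled.

```python
def resolve_route(model: str, routes: list[dict], default_provider: str) -> tuple[str, str]:
    """Map 'hint:<name>' to (provider, model); otherwise use the default provider."""
    if model.startswith("hint:"):
        hint = model[len("hint:"):]
        for route in routes:
            if route["hint"] == hint:
                return route["provider"], route["model"]
    # Plain model IDs and unknown hints both go to the default provider unchanged.
    return default_provider, model


routes = [
    {"hint": "reasoning", "provider": "openrouter", "model": "anthropic/claude-opus-4-5"},
    {"hint": "fast", "provider": "groq", "model": "llama-3.1-8b-instant"},
]

print(resolve_route("hint:fast", routes, "openrouter"))
print(resolve_route("hint:unknown", routes, "openrouter"))
```

Because an unknown hint is forwarded rather than rejected, a typo in a hint name typically surfaces as a provider-side "model not found" error; running revka doctor after editing routes helps catch configuration problems earlier.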

[[embedding_routes]] — embedding routing

Embedding routes are the same hint mechanism, applied to the memory embedding pipeline and configured independently from inference routing. Activate them by pointing [memory] embedding_model at a hint.

[memory]
embedding_model = "hint:semantic"
[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

| Field | Required | Meaning |
| --- | --- | --- |
| hint | yes | Symbolic name referenced by embedding_model = "hint:<name>". |
| provider | yes | One of none, openai, or custom:<url>. |
| model | yes | Embedding model ID. |
| dimensions | no | Override when the API default differs from your storage schema. |
| api_key | no | Per-route key override. |

[query_classification] — automatic hint routing

Query classification picks a hint: automatically from the content of the incoming message — no LLM call is made; matching is pure string comparison. Rules are evaluated in priority order.

[query_classification]
enabled = true
[[query_classification.rules]]
hint = "reasoning"
keywords = ["explain", "analyze", "why"]
min_length = 200
priority = 10
[[query_classification.rules]]
hint = "fast"
keywords = ["hi", "hello", "thanks"]
max_length = 50

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| enabled | bool | false | Master switch for classification. |
| rules | array | [] | Rules, evaluated highest-priority first. |

Each rule under [[query_classification.rules]]:

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| hint | string | required | Must match a configured [[model_routes]] hint. |
| keywords | array | [] | Case-insensitive substring matches. |
| patterns | array | [] | Case-sensitive literal matches (e.g. code fences). |
| min_length | int | unset | Match only when the message is at least N characters. |
| max_length | int | unset | Match only when the message is at most N characters. |
| priority | int | 0 | Higher priority is checked first. |
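
The rule fields combine roughly as in the sketch below. This is an illustration, not the actual classifier: the function name classify is hypothetical, and it assumes a rule fires when its length bounds hold and at least one keyword or pattern matches, with ties in priority resolved in file order.

```python
from typing import Optional


def classify(message: str, rules: list[dict]) -> Optional[str]:
    """Return the hint of the first matching rule, highest priority first."""
    text = message.lower()
    # sorted() is stable, so rules with equal priority keep their file order.
    ordered = sorted(rules, key=lambda r: r.get("priority", 0), reverse=True)
    for rule in ordered:
        if "min_length" in rule and len(message) < rule["min_length"]:
            continue
        if "max_length" in rule and len(message) > rule["max_length"]:
            continue
        keyword_hit = any(k.lower() in text for k in rule.get("keywords", []))
        pattern_hit = any(p in message for p in rule.get("patterns", []))
        if keyword_hit or pattern_hit:
            return rule["hint"]
    return None  # no rule matched; the default model applies


rules = [
    {"hint": "reasoning", "keywords": ["explain", "analyze", "why"],
     "min_length": 200, "priority": 10},
    {"hint": "fast", "keywords": ["hi", "hello", "thanks"], "max_length": 50},
]

print(classify("hello there", rules))
print(classify("What is this?", rules))
```

Note that the second call also returns the fast hint: because keywords are substring matches, "hi" matches inside "this". Short keywords are therefore best paired with a tight max_length, as in the example rules.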

The [runtime] section controls execution mode and the model reasoning/thinking toggle. The reasoning fields feed every provider as part of its per-request options.

[runtime]
reasoning_enabled = true # or false; unset = provider defaults
reasoning_effort = "high" # minimal | low | medium | high | xhigh

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| reasoning_enabled | bool? | unset | Toggle extended thinking. Env REVKA_REASONING_ENABLED / REASONING_ENABLED. For Ollama this sends think: true/false. |
| reasoning_effort | string? | unset | Reasoning effort: minimal, low, medium, high, xhigh. Used by Codex and other effort-aware models. |

The [identity] section selects the identity-document format injected into system prompts: OpenClaw (the default style) or AIEOS.

[identity]
format = "aieos"
aieos_path = "identity.json" # workspace-relative
# or inline:
# aieos_inline = '{"name": "Revka", ...}'

| Key | Meaning |
| --- | --- |
| format | Identity document format (openclaw or aieos). |
| aieos_path | Workspace-relative path to an AIEOS identity JSON file. |
| aieos_inline | Inline AIEOS identity JSON, as an alternative to aieos_path. |

The primary agent can hand work to named sub-agents and swarms. These live in their own sections but interact with the loop settings above:

  • [agents.<name>] — delegate sub-agent configs (provider, model, agentic, allowed_tools, max_iterations, timeouts). [operator] can override agent.max_tool_iterations for operator sessions. See Agents, teams & swarms.
  • [swarms.<name>] — coordinated groups of agents with sequential, parallel, or router strategy.