The agent loop
How Revka runs an agentic turn: tool iterations, parallel tools, context compression, and history management.
Every message Revka handles — from a channel, the dashboard chat, the /webhook
endpoint, a cron job, or revka agent — runs through the same tool-using agent
loop. The LLM reasons, issues tool calls, reads the results, and reasons again,
repeating until the task is done or a limit is reached. This page explains how
that loop is bounded and tuned: how many iterations it can take, when tools run
in parallel, how Revka keeps the context window from overflowing, and how
conversation history is pruned over time.
Read this when you need to tune agent behaviour — extending long-running tasks,
speeding up multi-tool turns, cutting token cost on big tool outputs, or running
on a small-context local model. Every knob below lives in the [agent] section
of ~/.revka/config.toml. For the surrounding architecture, see
How Revka works; for the policy checks applied to
each tool call, see Autonomy levels & approvals.
How a turn runs
Section titled “How a turn runs”-
Ingress. The message enters the loop from a channel, WebSocket (
/ws/chat), SSE, a webhook, or the CLI. -
Reason and call tools. The LLM produces reasoning and zero or more tool calls. Each tool call is checked against the security policy before it executes.
-
Execute. Independent tool calls run — concurrently when
parallel_tools = trueand no call needs approval gating. Results are returned in stable order regardless of completion order. -
Iterate. Tool results feed back into the prompt and the LLM reasons again. Each cycle counts as one tool iteration.
-
Stop. The loop ends when the LLM returns a final answer with no tool calls, or when
max_tool_iterationsis hit. The reply streams back to the originating surface.
Between iterations, Revka transparently manages the context window: it compacts tool schemas, trims oversized tool results, and — when the running token total crosses a threshold — compresses older history. None of this requires action from the agent or the user; the defaults are tuned for a 1M-token context model.
Orchestration settings ([agent])
Section titled “Orchestration settings ([agent])”The core loop knobs. All keys are optional and fall back to the defaults shown.
[agent]max_tool_iterations = 60parallel_tools = truemax_context_tokens = 1050000max_history_messages = 1000keep_tool_context_turns = 2max_tool_result_chars = 50000context_window_safety_ratio = 0.95| Key | Type | Default | Meaning |
|---|---|---|---|
max_tool_iterations | int | 60 | Maximum tool-call loop turns per user message, across CLI, gateway, and channels. 0 falls back to 60. |
parallel_tools | bool | true | Execute independent tool calls within one iteration concurrently. |
max_context_tokens | int | 1050000 | Token budget used by loop-level context trimming and compression. |
max_history_messages | int | 1000 | Maximum conversation history messages retained per session. |
keep_tool_context_turns | int | 2 | Recent turns whose full tool-call/result messages are preserved in channel history. |
max_tool_result_chars | int | 50000 | Maximum characters retained for a single tool result before middle truncation. |
context_window_safety_ratio | float | 0.95 | Fraction of the model context window allowed before Revka fails loud. Clamped to 1.0; values <= 0 fall back to 0.95. |
tool_call_dedup_exempt | [string] | [] | Exact tool names allowed to be called repeatedly with identical arguments in one turn, bypassing duplicate-call suppression. |
compact_context | bool | true | Use a compact bootstrap prompt (smaller RAG and bootstrap budgets); intended for 13B or smaller models. |
Tool iterations
Section titled “Tool iterations”max_tool_iterations caps how many reasoning-then-tool-call cycles a single
message may take. A simple Q&A might use one iteration; a multi-step task that
reads files, runs a build, and reports results uses several. When the cap is
exceeded on a channel message, the runtime returns:
Agent exceeded maximum tool iterations (60)Raise it for deep autonomous tasks; lower it to fail fast and contain cost on
untrusted input. Note that the channel message timeout budget scales with this
value: it is message_timeout_secs * min(max_tool_iterations, message_timeout_scale_max),
so a higher iteration cap also grants more wall-clock time (up to the
[pacing] message_timeout_scale_max cap, default 4).
Parallel tools
Section titled “Parallel tools”When parallel_tools = true (the default), Revka dispatches independent tool
calls from a single iteration concurrently instead of one at a time — for
example, reading three files at once, or fetching two URLs in parallel. Results
are reassembled in their original order, so the LLM sees a stable, deterministic
result list.
Calls that require approval gating are not parallelised; they run through the
normal supervised-approval path. Setting parallel_tools = false forces strictly
sequential execution, which is occasionally useful when tools share fragile
external state.
Context compression
Section titled “Context compression”Revka keeps the prompt within the model’s context window using a deterministic,
zero-LLM compression layer plus an optional summarisation pass for older history.
Configure it under [agent.context_compression].
[agent.context_compression]enabled = truethreshold_ratio = 0.5protect_first_n = 3protect_last_n = 4max_passes = 3compact_tool_schemas = trueterse_internal_outputs = true| Key | Default | Meaning |
|---|---|---|
enabled | true | Enable automatic context compression. |
threshold_ratio | 0.5 | Fraction of the context window that triggers a compression pass. |
protect_first_n | 3 | Messages protected at the start of history (the framing of the task). |
protect_last_n | 4 | Recent messages protected from compression. |
max_passes | 3 | Maximum compression passes before failing loud. |
summary_max_chars | 4000 | Maximum characters retained in stored compaction summaries. |
source_max_chars | 50000 | Safety cap on transcript text passed to the summariser. |
timeout_secs | 60 | Timeout for the summarisation provider call. |
live_tool_result_max_chars | 12000 | Max characters retained for a live tool result before content-aware compression. |
tool_result_retrim_chars | 2000 | Max characters retained for older tool results during fast trim. |
input_max_chars | 24000 | Max characters retained for a single large user input before content-aware compression. |
compact_tool_schemas | true | Shorten native tool descriptions and JSON-schema metadata before each LLM call. |
compact_system_tool_docs | true | Render compact tool docs in the system prompt when schemas are sent separately. |
tool_description_max_chars | 180 | Max characters per tool description after schema compaction. |
schema_description_max_chars | 120 | Max characters per JSON-schema description after compaction. |
terse_internal_outputs | true | Use concise output contracts for internal operator/agent handoffs. |
tool_result_trim_exempt | [string] | [] — tool names exempt from tool-result trimming. |
The four content-aware axes
Section titled “The four content-aware axes”The content-aware layer is deterministic — no LLM call, no token cost — and applies type-specific reduction on four token-heavy axes:
| Axis | Source | Reduced to |
|---|---|---|
| Large pasted input | Big user messages | Schema-and-samples for structured data; bounded text otherwise |
| CLI / shell output | shell and command tool results | Failure lines plus a tail of the output |
| General tool output | Any large tool result (JSON, diffs, …) | JSON → schema + samples; diffs → file/hunk/change summaries |
| Code-search output | content_search, semantic_code_search | Grouped file hits |
For semantic code search, semantic_code_search uses Semble when installed,
falls back to bounded local ripgrep, and finally to a built-in literal scan, so
code search stays available in zero-install environments. See the
Tools overview for the full tool catalog.
Schema and handoff compaction
Section titled “Schema and handoff compaction”Two further reductions shrink the base context that every call carries:
- Tool-schema compaction (
compact_tool_schemas) trims native tool descriptions and JSON-schema metadata before each provider call, and the system prompt omits parameter schemas that are already supplied through the provider’s tool interface. - Terse internal outputs (
terse_internal_outputs) apply a concise handoff contract to operator and sub-agent prompts. SetREVKA_TERSE_INTERNAL_OUTPUTS=0to disable the Python operator side.
Other environment overrides for compression budgets:
REVKA_AGENT_RESULT_MAX_CHARS (operator-side agent last_message budget),
REVKA_WORKFLOW_SKILL_MAX_CHARS, and REVKA_WORKFLOW_SKILL_CONTEXT_MODE
(pointer to send only krefs/paths, full to restore legacy full-inline skill
context).
History pruning
Section titled “History pruning”Distinct from per-turn compression, [agent.history_pruning] is a token-efficiency
pass over the message list itself. It is disabled by default — turn it on
for long-lived sessions or small-context models.
[agent.history_pruning]enabled = falsemax_tokens = 8192keep_recent = 4collapse_tool_results = true| Key | Default | Meaning |
|---|---|---|
enabled | false | Enable history pruning. |
max_tokens | 8192 | Maximum estimated tokens for message history. |
keep_recent | 4 | Keep the N most recent messages untouched. |
collapse_tool_results | true | Collapse old assistant tool-call / tool-result pairs into short summaries. |
System messages and the keep_recent most recent messages are always protected.
When enabled, pruning first collapses old tool-call/result pairs, then drops
older messages until the estimated token total is under max_tokens.
The related keep_tool_context_turns (default 2, in the top-level [agent]
table) controls how many recent turns keep their full tool-call and
tool-result messages in channel history — older turns keep the conversational
text but shed the verbose tool payloads.
Tool filtering (tool_filter_groups)
Section titled “Tool filtering (tool_filter_groups)”When you connect many external MCP servers, sending every
tool schema on every turn is expensive. tool_filter_groups limits which MCP
tool schemas are sent to the LLM per turn. Built-in (non-MCP) tools always pass
through unchanged, and when the list is empty the feature is inactive — all tools
pass through (the backward-compatible default).
Each group is a table:
| Field | Type | Purpose |
|---|---|---|
mode | "always" | "dynamic" | always: include the tool unconditionally. dynamic: include it only when the last user message contains a keyword. |
tools | [string] | Tool name patterns. A single * wildcard is supported (prefix, suffix, or infix), e.g. "mcp_vikunja_*". |
keywords | [string] | Dynamic mode only. Case-insensitive substrings matched against the last user message. |
[agent]# Vikunja task-management MCP tools are always available.[[agent.tool_filter_groups]]mode = "always"tools = ["mcp_vikunja_*"]
# Browser MCP tools are only included when the user mentions browsing.[[agent.tool_filter_groups]]mode = "dynamic"tools = ["mcp_browser_*"]keywords = ["browse", "navigate", "open url", "screenshot"]Tuning recipes
Section titled “Tuning recipes”| Goal | Change |
|---|---|
| Longer autonomous tasks | Raise max_tool_iterations; raise [pacing] message_timeout_scale_max. |
| Faster multi-tool turns | Keep parallel_tools = true (default). |
| Lower token cost on big outputs | Lower max_tool_result_chars and live_tool_result_max_chars; enable tool_filter_groups. |
| Small-context / local model | Set compact_context = true, enable [agent.history_pruning], set [agent.model_context_windows], and tune [pacing]. |
| Tame a runaway loop | Lower max_tool_iterations; see emergency stop. |
For slow or local LLM deployments (Ollama, llama.cpp, vLLM), pair these with the
[pacing] controls — step timeouts and loop detection — described in
Custom providers & local LLMs.
Related pages
Section titled “Related pages”- Autonomy levels & approvals — the policy checks applied to every tool call.
- Sessions & conversation state — how history is scoped and persisted.
- Agents, teams & swarms — multi-agent orchestration above the loop.
- Config: provider, agent & routing — the full
[agent]config reference. - Tools overview — the catalog of callable tools.