Response cache, hardware RAG & isolation
The two-tier response cache, the local hardware datasheet RAG index, namespace isolation, and GDPR export.
Beyond the Kumiho graph, Revka keeps a few local, per-workspace memory facilities that have nothing to do with the cloud control plane: an opt-in LLM response cache that avoids re-billing for identical prompts, a hardware-datasheet RAG index for board pin lookups, and the policy/isolation machinery that keeps one agent’s (or one client’s) memory from bleeding into another’s. This page covers all three, plus the Memory::export path used for GDPR data portability.
Read Memory overview first for the [memory] section as a whole and why durable memory lives in Kumiho. Everything here lives under [memory] (and [memory.policy]) in ~/.revka/config.toml, except where noted.
LLM response cache (ResponseCache)
The response cache is a two-tier cache that deduplicates identical LLM prompt-plus-model combinations so you aren’t billed twice for the same query. It is off by default, so you must opt in.
- Hot tier: an in-memory LRU of the most recently used entries.
- Warm tier: a WAL-mode SQLite database at <workspace_dir>/memory/response_cache.db.
Each entry is keyed by the SHA-256 of (model || system_prompt || user_prompt). On a hot-tier miss, Revka looks in SQLite and promotes any match into the in-memory tier; on a hit, it increments the entry’s hit_count and updates accessed_at.
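The lookup-and-promote flow described above can be sketched roughly as follows. This is a minimal illustration, not Revka's ResponseCache: the type and method names are hypothetical, a HashMap stands in for the SQLite warm tier, keys are treated as opaque strings (the SHA-256 derivation is left out), and accessed_at is omitted.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical warm-tier row; a HashMap stands in for SQLite in this sketch.
pub struct WarmEntry {
    pub response: String,
    pub hit_count: u64,
}

/// Hypothetical two-tier cache: a small hot tier in front of a larger warm tier.
pub struct TwoTierCache {
    hot_capacity: usize,
    hot: HashMap<String, String>,
    hot_order: VecDeque<String>, // front = least recently used
    warm: HashMap<String, WarmEntry>,
}

impl TwoTierCache {
    pub fn new(hot_capacity: usize) -> Self {
        Self {
            hot_capacity,
            hot: HashMap::new(),
            hot_order: VecDeque::new(),
            warm: HashMap::new(),
        }
    }

    /// Store a response in the warm tier only; it becomes hot on first read.
    pub fn put(&mut self, key: &str, response: &str) {
        let entry = WarmEntry { response: response.to_string(), hit_count: 0 };
        self.warm.insert(key.to_string(), entry);
    }

    /// Check the hot tier first, then fall back to the warm tier and promote.
    pub fn get(&mut self, key: &str) -> Option<String> {
        if let Some(resp) = self.hot.get(key).cloned() {
            self.touch(key);
            self.record_hit(key);
            return Some(resp);
        }
        let resp = self.warm.get(key)?.response.clone();
        self.record_hit(key);
        self.promote(key, &resp);
        Some(resp)
    }

    fn record_hit(&mut self, key: &str) {
        if let Some(entry) = self.warm.get_mut(key) {
            entry.hit_count += 1;
        }
    }

    /// Mark a key as most recently used.
    fn touch(&mut self, key: &str) {
        self.hot_order.retain(|k| k.as_str() != key);
        self.hot_order.push_back(key.to_string());
    }

    /// Insert into the hot tier, evicting the least recently used key if full.
    fn promote(&mut self, key: &str, resp: &str) {
        if self.hot_capacity == 0 {
            return;
        }
        if self.hot.len() >= self.hot_capacity {
            if let Some(oldest) = self.hot_order.pop_front() {
                self.hot.remove(&oldest);
            }
        }
        self.hot.insert(key.to_string(), resp.to_string());
        self.touch(key);
    }

    pub fn is_hot(&self, key: &str) -> bool {
        self.hot.contains_key(key)
    }

    pub fn hit_count(&self, key: &str) -> u64 {
        self.warm.get(key).map_or(0, |e| e.hit_count)
    }
}
```

With a hot capacity of 1, reading a second key evicts the first from the hot tier while both remain in the warm tier.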
Enable and tune it
```toml
[memory]
response_cache_enabled = true      # default: false (opt-in)
response_cache_ttl_minutes = 60    # entry expiry
response_cache_max_entries = 5000  # SQLite warm-tier cap before LRU eviction
response_cache_hot_entries = 256   # in-memory hot-tier size
```

| Key | Type | Default | Meaning |
|---|---|---|---|
| response_cache_enabled | bool | false | Master switch. Must be true for any caching. |
| response_cache_ttl_minutes | int | 60 | Time-to-live per entry, in minutes. |
| response_cache_max_entries | int | 5000 | Warm-tier (SQLite) cap. Beyond this, LRU eviction applies. |
| response_cache_hot_entries | int | 256 | In-memory hot-tier size. |
The cache tracks statistics internally — total entries, total hits, and tokens saved — though those figures are not yet surfaced through the CLI.
Because the database path is under <workspace_dir>, the cache is per-workspace: with workspace isolation enabled, each client gets its own response_cache.db and cannot read another’s cached responses.
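One way to read the TTL and warm-tier cap settings is as a pruning pass: drop expired rows, then keep only the most recently accessed rows up to the cap. The sketch below assumes the TTL is measured from creation time and that eviction order is based on accessed_at; neither detail is confirmed here, and CachedRow, is_expired, and prune are hypothetical names.

```rust
/// Hypothetical row shape; the real SQLite schema is not shown on this page.
#[derive(Debug, Clone)]
pub struct CachedRow {
    pub key: String,
    pub created_at_secs: u64,
    pub accessed_at_secs: u64,
}

/// Assumption: an entry expires once its age exceeds the TTL, counted from creation.
pub fn is_expired(row: &CachedRow, now_secs: u64, ttl_minutes: u64) -> bool {
    now_secs.saturating_sub(row.created_at_secs) > ttl_minutes.saturating_mul(60)
}

/// Drop expired rows, then evict least recently accessed rows beyond the cap.
pub fn prune(rows: &mut Vec<CachedRow>, now_secs: u64, ttl_minutes: u64, max_entries: usize) {
    rows.retain(|r| !is_expired(r, now_secs, ttl_minutes));
    if rows.len() > max_entries {
        // Most recently accessed first, so truncation removes the oldest.
        rows.sort_by_key(|r| std::cmp::Reverse(r.accessed_at_secs));
        rows.truncate(max_entries);
    }
}
```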
Hardware RAG (HardwareRag)
HardwareRag is a local retrieval index over your hardware datasheets. When the agent answers a hardware question (a pin lookup, a board capability), Revka retrieves matching datasheet content and pin aliases and injects them into the agent’s context. It is entirely local: no datasheet text leaves the machine.
It loads .md and .txt files (and .pdf with the rag-pdf feature) from a configured directory, scores them by keyword overlap, and boosts chunks tagged for the board you’re asking about.
Set up the datasheet directory
Point a peripheral at a datasheet directory relative to your workspace:

```toml
[[peripherals]]
datasheet_dir = "datasheets"   # relative to the workspace dir
```

Lay out one file per board. The filename (minus extension) becomes the board tag:

```
workspace/
  datasheets/
    nucleo-f401re.md   # board tag = "nucleo-f401re"
    rpi-gpio.txt       # board tag = "rpi-gpio"
    generic.md         # no board tag: matches all queries
```

A query scoped to one or more boards (boards: &[String]) gives matching chunks a +2.0 score boost. Files named generic* or placed in a _generic/ subdirectory carry no board tag and match every query, so use them for cross-board context.
If datasheet_dir is absent or the directory doesn’t exist, RAG returns empty results silently — it never errors.
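The board-tag and scoring rules in this section can be illustrated with a small sketch. It is not the HardwareRag implementation: board_tag, score_chunk, and tokenize are hypothetical helpers, the tokenizer is a guess, and the sketch assumes chunks from non-matching boards are simply left unboosted rather than filtered out.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Derive a board tag from a datasheet path: the file stem, or None for
/// files named generic* or stored under a _generic/ directory.
pub fn board_tag(path: &Path) -> Option<String> {
    let in_generic_dir = path
        .components()
        .any(|c| c.as_os_str().to_str() == Some("_generic"));
    let stem = path.file_stem()?.to_str()?.to_lowercase();
    if in_generic_dir || stem.starts_with("generic") {
        None
    } else {
        Some(stem)
    }
}

/// Split text into lowercase word tokens (assumed tokenizer, for illustration).
fn tokenize(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Score = number of shared keywords, plus 2.0 if the chunk's board tag
/// is one of the boards the query is scoped to.
pub fn score_chunk(query: &str, chunk: &str, chunk_board: Option<&str>, boards: &[String]) -> f32 {
    let query_terms = tokenize(query);
    let chunk_terms = tokenize(chunk);
    let overlap = query_terms.intersection(&chunk_terms).count() as f32;
    let boosted = match chunk_board {
        Some(tag) => boards.iter().any(|b| b.eq_ignore_ascii_case(tag)),
        None => false,
    };
    if boosted { overlap + 2.0 } else { overlap }
}
```

For the query "red led pin" against the chunk "The red LED is on pin 13", three keywords overlap, so a chunk tagged for the queried board scores 5.0 and the same text under another board tag scores 3.0.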
Pin-alias tables
A datasheet can declare named pin aliases so the agent can resolve red_led to a pin number without guessing. Pin-alias context is built separately from chunk retrieval, so aliases are always available for matching boards. Two formats are accepted.
As key-value lines under a Pin Aliases heading:
```markdown
## Pin Aliases
red_led: 13
builtin_led: 13
```

Or as a Markdown table:

```markdown
## Pin Aliases
| alias    | pin |
|----------|-----|
| red_led  | 13  |
```

The rag-pdf feature
PDF ingestion is gated behind the rag-pdf Cargo feature, which pulls in the pdf_extract crate:

```sh
cargo build --features hardware,rag-pdf
```

Without rag-pdf, .pdf datasheets are skipped by the index, and the datasheet tool’s read action returns the file path for manual reference instead of extracted text. The companion config flag [hardware].workspace_datasheets = true indexes workspace PDFs for these RAG-based pin lookups.
compact_context for small models
If you run a small local model, set compact_context in [agent]. Among other context-shrinking effects, it caps the RAG chunk limit at 2 so datasheet retrieval doesn’t crowd out a tight context window:

```toml
[agent]
compact_context = true   # recommended for models ≤13B
```

See Custom providers & local LLMs for the broader small-model tuning picture, and Aardvark I2C/SPI/GPIO & datasheets for the datasheet download workflow that feeds this index.
Namespace isolation
Every memory entry carries a namespace field that isolates entries between agents and contexts. Entries that don’t specify one fall into default_namespace. The memory policy can then cap how many entries a namespace or category may hold, mark namespaces read-only, and override retention per category.

```toml
[memory]
default_namespace = "default"

[memory.policy]
max_entries_per_namespace = 1000
max_entries_per_category = 0          # 0 = unlimited
read_only_namespaces = ["system_facts"]
retention_days_by_category = { core = 365, daily = 30, conversation = 7 }
```

| Key | Type | Default | Meaning |
|---|---|---|---|
| default_namespace | string | "default" | Namespace assigned to entries that don’t specify one. |
| max_entries_per_namespace | int | 0 | Cap per namespace. 0 = unlimited. |
| max_entries_per_category | int | 0 | Cap per category. 0 = unlimited. |
| read_only_namespaces | array | [] | Writes to these namespaces are rejected. |
| retention_days_by_category | table | unset | Per-category retention override (keyed by category name). |
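As a rough illustration of how these policy keys could gate a write, the sketch below resolves the namespace, rejects read-only namespaces, and applies the per-namespace cap with 0 meaning unlimited. MemoryPolicy, WriteError, and check_write are hypothetical names, and rejecting a write at the cap (rather than evicting an older entry) is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum WriteError {
    ReadOnlyNamespace(String),
    NamespaceFull { namespace: String, limit: usize },
}

/// Hypothetical mirror of the [memory] / [memory.policy] keys used here.
pub struct MemoryPolicy {
    pub default_namespace: String,
    pub max_entries_per_namespace: usize, // 0 = unlimited
    pub read_only_namespaces: Vec<String>,
}

impl MemoryPolicy {
    /// Resolve the target namespace and decide whether a write may proceed.
    /// `counts` holds the current number of entries per namespace.
    pub fn check_write(
        &self,
        namespace: Option<&str>,
        counts: &HashMap<String, usize>,
    ) -> Result<String, WriteError> {
        // Entries without a namespace fall into default_namespace.
        let ns = namespace
            .unwrap_or(self.default_namespace.as_str())
            .to_string();
        if self.read_only_namespaces.iter().any(|r| r == &ns) {
            return Err(WriteError::ReadOnlyNamespace(ns));
        }
        let limit = self.max_entries_per_namespace;
        let current = counts.get(&ns).copied().unwrap_or(0);
        // Assumption: a full namespace rejects the write instead of evicting.
        if limit != 0 && current >= limit {
            return Err(WriteError::NamespaceFull { namespace: ns, limit });
        }
        Ok(ns)
    }
}
```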
Multi-client workspace memory isolation
When you run one Revka instance for multiple clients, workspace isolation gives each client engagement a separate memory database under <workspaces_dir>/<client>/memory/. A request scoped to one workspace cannot reach another’s entries; this is the hard boundary that namespaces alone don’t provide.

```toml
[workspace]
enabled = true
isolate_memory = true           # default: true when workspaces are enabled
cross_workspace_search = false  # security default: no cross-tenant reads
```

| Key | Type | Default | Meaning |
|---|---|---|---|
| isolate_memory | bool | true | Separate memory database per workspace. |
| cross_workspace_search | bool | false | Allow reads across workspaces. Leave false to prevent memory bleed between clients. |
Because isolation works at the database-file level, it also separates the response cache: each workspace gets its own memory/response_cache.db. See Config: gateway, memory, security & platform for the full [workspace] schema and per-profile settings.
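The file-level separation can be pictured as plain path resolution. In this sketch, memory_dir, response_cache_path, and can_read are hypothetical helpers; rejecting client names that are not a single path component is an added safety assumption, not documented Revka behavior.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve <workspaces_dir>/<client>/memory, refusing client names that are
/// not a single plain path component (for example "../other" or "a/b").
pub fn memory_dir(workspaces_dir: &Path, client: &str) -> Option<PathBuf> {
    let mut parts = Path::new(client).components();
    match (parts.next(), parts.next()) {
        (Some(Component::Normal(_)), None) => Some(workspaces_dir.join(client).join("memory")),
        _ => None,
    }
}

/// Each workspace gets its own response cache file under its memory directory.
pub fn response_cache_path(workspaces_dir: &Path, client: &str) -> Option<PathBuf> {
    memory_dir(workspaces_dir, client).map(|dir| dir.join("response_cache.db"))
}

/// A request may read its own workspace; reading another one requires
/// cross_workspace_search to be enabled.
pub fn can_read(requesting: &str, target: &str, cross_workspace_search: bool) -> bool {
    requesting == target || cross_workspace_search
}
```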
GDPR data portability export
The Memory trait’s export method supports a bulk, filtered export of memory entries for GDPR Article 20 data portability. It returns entries filtered by namespace, session, category, and time range, ordered by creation time (ascending), with embeddings excluded.
The filter shape (ExportFilter):
| Field | Type | Meaning |
|---|---|---|
| namespace | Option<String> | Restrict to one namespace. |
| session_id | Option<String> | Restrict to one session. |
| category | Option<MemoryCategory> | Restrict to core, daily, conversation, or a custom label. |
| since | Option<String> | RFC 3339 lower bound (inclusive on timestamp). |
| until | Option<String> | RFC 3339 upper bound (inclusive on timestamp). |
```rust
let filter = ExportFilter {
    namespace: Some("default".to_string()),
    session_id: None,
    category: Some(MemoryCategory::Core),
    since: Some("2026-01-01T00:00:00Z".to_string()),
    until: Some("2026-12-31T23:59:59Z".to_string()),
};
let entries = memory.export(&filter).await?;
```

The default trait implementation delegates to list() plus client-side filtering; backends with native query support override it for efficiency.
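To make the "list() plus client-side filtering" idea concrete, here is a standalone sketch of such a filter pass. The Entry type and export_entries function are hypothetical stand-ins, not Revka's types, and the timestamp comparison is a plain string comparison that only behaves correctly if every timestamp uses the same UTC "Z" format and precision.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum MemoryCategory {
    Core,
    Daily,
    Conversation,
    Custom(String),
}

/// Hypothetical entry shape for this sketch only.
#[derive(Clone, Debug)]
pub struct Entry {
    pub namespace: String,
    pub session_id: Option<String>,
    pub category: MemoryCategory,
    pub timestamp: String, // RFC 3339, assumed to be UTC with a trailing "Z"
    pub embedding: Option<Vec<f32>>,
}

#[derive(Default)]
pub struct ExportFilter {
    pub namespace: Option<String>,
    pub session_id: Option<String>,
    pub category: Option<MemoryCategory>,
    pub since: Option<String>,
    pub until: Option<String>,
}

/// A None field in the filter places no restriction on that dimension.
fn entry_matches(e: &Entry, f: &ExportFilter) -> bool {
    f.namespace.as_ref().map_or(true, |ns| &e.namespace == ns)
        && f.session_id.as_ref().map_or(true, |s| e.session_id.as_ref() == Some(s))
        && f.category.as_ref().map_or(true, |c| &e.category == c)
        // Both bounds are inclusive.
        && f.since.as_ref().map_or(true, |t| e.timestamp.as_str() >= t.as_str())
        && f.until.as_ref().map_or(true, |t| e.timestamp.as_str() <= t.as_str())
}

/// Filter a full listing, strip embeddings, and sort by timestamp ascending.
pub fn export_entries(all: &[Entry], filter: &ExportFilter) -> Vec<Entry> {
    let mut out: Vec<Entry> = all
        .iter()
        .filter(|e| entry_matches(e, filter))
        .map(|e| Entry { embedding: None, ..e.clone() })
        .collect();
    out.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
    out
}
```

A backend with native query support would push these predicates into its query instead of materializing the full list first.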
Related pages
- Memory overview - the full [memory] section, NoneMemory binding, categories, and decay.
- Kumiho setup - install the sidecar and choose cloud vs Community Edition.
- Graph model: spaces, items & provenance - the Kumiho data model behind durable memory.
- Aardvark I2C/SPI/GPIO & datasheets - the datasheet tool that feeds the hardware RAG index.
- Config: gateway, memory, security & platform - full [memory], [memory.policy], and [workspace] schema.
- revka memory & estop - inspect and manage memory from the CLI.