01 Write-admission gate
Admit, downgrade or refuse each write by source entailment. No competitor here ships one.
The problem with agent memory
Verimem is a persistent memory layer whose add() routes every write through an anti-confabulation admission gate — does the source actually entail the fact? — and whose search() returns provenance on every read. A hippocampus with a notary at the door.
✓ 100% free & open source — MIT license. Self-hosted, local-first: no account, no API key, no billing.
Other memory layers store whatever their extractor emits. Verimem doesn't.
On write, a candidate fact is admitted, downgraded, or refused — decided by whether its cited source actually entails it. A cheap, no-LLM lexical screen first downgrades unsupported "it works / verified / done" claims; the strongest mode adds a source⊢fact entailment check. Measured on SNLI it reaches AUROC 0.971, and that number is judge-independent.
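The admit / downgrade / refuse flow can be pictured with a minimal sketch. This is not Verimem's implementation: `gate`, `Verdict`, `overlap_score`, the marker list and both thresholds are hypothetical names and values chosen for illustration, and the token-overlap scorer is only a toy stand-in for a real source⊢fact entailment model.

```python
import re
from enum import Enum


class Verdict(Enum):
    ADMITTED = "ADMITTED"
    DOWNGRADED = "DOWNGRADED"
    REFUSED = "REFUSED"


# Hypothetical marker list for "it works / verified / done" style claims.
SUCCESS_CLAIM = re.compile(r"\b(it works|verified|done)\b", re.IGNORECASE)


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(source: str, fact: str) -> float:
    """Toy stand-in for an entailment scorer: share of fact tokens found in the source."""
    fact_tokens = _tokens(fact)
    if not fact_tokens:
        return 0.0
    return len(fact_tokens & _tokens(source)) / len(fact_tokens)


def gate(fact, source, entail=overlap_score, admit_at=0.9, refuse_below=0.5):
    """Return (verdict, score) for a candidate fact and its cited source."""
    # Stage 1: cheap lexical screen, no LLM. A success claim that the source
    # never states is downgraded before any scoring happens.
    if SUCCESS_CLAIM.search(fact) and not SUCCESS_CLAIM.search(source):
        return Verdict.DOWNGRADED, 0.0
    # Stage 2: entailment check (assumed mapping of score bands to verdicts).
    score = entail(source, fact)
    if score >= admit_at:
        return Verdict.ADMITTED, score
    if score >= refuse_below:
        return Verdict.DOWNGRADED, score
    return Verdict.REFUSED, score


admitted = gate("The deployment uses PostgreSQL 16.",
                "The deployment uses PostgreSQL 16 on port 5432.")
screened = gate("Migration verified, it works.",
                "Ran the migration script; two tests still failing.")
refused = gate("The deployment uses PostgreSQL 16.", "The cache uses Redis.")
```

Passing the scorer in as a parameter keeps the cheap screen and the expensive check separable, which mirrors the two modes described above; how the real gate maps scores to verdicts is not specified here.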
A structured review of mem0, Zep, Letta, Cognee and MemOS found that none of them ship a write-admission gate. That gate — plus the provenance every read carries back — is the whole point.
Illustrative — the gate in three verdicts
Reads don't just return text — they return each fact's status and write-time grounding_score, so your code can trust-condition instead of trusting blindly. And update() never destroys the old fact: it supersedes it, leaving an auditable history() trail.
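Trust-conditioning on read might look like the sketch below. It works on plain dicts carrying the `text`, `status` and `grounding_score` keys named above; the helper `trusted`, the 0.9 cutoff and the `"DOWNGRADED"` status string are assumptions for illustration, and the hits are hand-written rather than real `search()` output.

```python
from typing import Iterable


def trusted(hits: Iterable[dict], min_score: float = 0.9) -> list[dict]:
    """Keep only hits admitted at write time with a high enough grounding score."""
    return [
        h for h in hits
        if h.get("status") == "ADMITTED"
        and h.get("grounding_score", 0.0) >= min_score
    ]


# Hand-written example hits; the second status value is an assumed label.
hits = [
    {"text": "The deployment uses PostgreSQL 16.",
     "status": "ADMITTED", "grounding_score": 0.97},
    {"text": "The migration is done.",
     "status": "DOWNGRADED", "grounding_score": 0.41},
]

for hit in trusted(hits):
    print(hit["text"])
```

The caller, not the memory layer, picks the cutoff, so a summarizer can accept weaker facts than a tool that executes commands.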
memory.py · the write goes through the gate, the read returns provenance
```python
from engram import Memory

mem = Memory()  # local SQLite, offline
mem.add("The deployment uses PostgreSQL 16.")  # write goes THROUGH the gate

for hit in mem.search("which database?"):  # read returns provenance
    print(hit["text"], hit["status"], hit["grounding_score"])
# -> The deployment uses PostgreSQL 16. ADMITTED 0.97
```

| Metric | Result | On the record |
|---|---|---|
| Write-gate entailment | AUROC 0.971 | Source⊢fact, judge-independent (SNLI). The write-path moat. |
| Downstream hallucination | 95.9% → 12.2% | −83.7 pp; McNemar p≈6e-44, replicated on 2 seeds. It works by converting confabulation into abstention (omission 3%→85%), not by raising correctness — and this arm is judge-coupled. |
| Clean-fact over-rejection | 30–39% | The honest cost at the strict threshold. The distilled local gate (v2) admits 0.924 of clean facts, largely closing this. |
| Retrieval recall@5 · LongMemEval-s | 0.8745 | Full 500, judge-free, same e5 embedder, zero external APIs. Fusion ON vs 0.8525 OFF (+2.2 pp). An earlier n=300 read 0.909 — optimistic; the full-500 number is the honest one. This is recall@k, not end-to-end QA accuracy. |
| Italian retrieval | MRR +52% | 0.466 → 0.710 on the e5 embedder flip, at zero English regression. |
| Contradiction detection · HaluMem | TPR 0.66 / FPR 0.0125 | After the temporal-supersession fix. |
| Test suite | 5,830 passing | 764 test files, ~84k LOC. Self-run, reproducible from the repo. |
Every figure here is self-run and reproducible from the repository — it is not a third-party leaderboard placement. Retrieval numbers are recall@k, not the end-to-end QA accuracy that Mem0 and Zep headline, so they are not directly comparable.
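For readers unfamiliar with the retrieval metrics, here is a minimal sketch of recall@k, MRR and a relative gain. It assumes recall@k means the fraction of relevant items found in the top k; the benchmark harness in the repository may use a different variant, such as a hit rate, and all function names here are illustrative.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear among the top-k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def relative_gain(before, after):
    """Relative change, e.g. the Italian MRR move from 0.466 to 0.710."""
    return (after - before) / before


italian_gain = relative_gain(0.466, 0.710)  # roughly 0.52, i.e. about +52%
```

Neither metric looks at the generated answer, which is why these figures cannot be set against end-to-end QA accuracy.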
Admit, downgrade or refuse each write by source entailment. No competitor here ships one.
Every fact returns status + grounding_score; update() supersedes, history() stays auditable.
valid_until hard-expire — a memory that knows when a fact stopped being true.
Hosted MCP mode: no API key, no per-token billing — the host's LLM does the work.
A write-time prompt-injection screen — Italian included, not just English.
228 memory tools at session start in Claude Code, Cursor, Cline, Continue, Zed.
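Supersession and hard expiry can be illustrated with a small in-memory sketch. `FactLog`, `Fact`, `live` and the `superseded_by` field are hypothetical names, not Verimem's schema; only the ideas of `update()` superseding, `history()` staying auditable, and `valid_until` expiring a fact come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    text: str
    valid_until: Optional[datetime] = None
    superseded_by: Optional[int] = None  # id of the newer version, if any


class FactLog:
    def __init__(self):
        self._facts: list[Fact] = []

    def add(self, text, valid_until=None) -> int:
        self._facts.append(Fact(text, valid_until))
        return len(self._facts) - 1

    def update(self, fact_id, new_text, valid_until=None) -> int:
        """Append a new version and link the old one to it; nothing is deleted."""
        new_id = self.add(new_text, valid_until)
        self._facts[fact_id].superseded_by = new_id
        return new_id

    def history(self, fact_id) -> list[str]:
        """Follow the supersession chain from fact_id to the newest version."""
        trail, current = [], fact_id
        while current is not None:
            fact = self._facts[current]
            trail.append(fact.text)
            current = fact.superseded_by
        return trail

    def live(self, now) -> list[str]:
        """Facts that are neither superseded nor past their valid_until."""
        return [
            f.text for f in self._facts
            if f.superseded_by is None
            and (f.valid_until is None or now < f.valid_until)
        ]


log = FactLog()
old_id = log.add("The deployment uses PostgreSQL 15.")
log.update(old_id, "The deployment uses PostgreSQL 16.")
log.add("Deploy freeze in effect.",
        valid_until=datetime(2025, 1, 1, tzinfo=timezone.utc))
```

Keeping superseded rows and filtering them at read time is what makes the trail auditable; an in-place overwrite would lose it.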
| System | Write gate | Provenance on read | Approach | Maturity |
|---|---|---|---|---|
| Mem0 | ✗ | ✗ | Flat vector + LLM summary | Established, widely adopted |
| Zep / Graphiti | ✗ | partial (temporal) | Temporal knowledge graph | Commercial, mature |
| HippoRAG | ✗ | ✗ | OpenIE + PageRank | Research |
| Verimem | ✓ | ✓ | Fusion recall + gate + sleep consolidation | Brand-new · 0 adoption yet |
Competitor scores and adoption drift by source and over time; the two columns we stand on — a write gate and provenance on read — come from a structured competitor review, not a marketing table.
A brand-new public release. Nobody depends on it yet — including us, beyond the maintainer's daily driver.
Install from source / git for now. An engram-memory package is a post-v0.4.0 target.
Every number is reproducible from the repo, but none is third-party audited. Treat them as reproducible, not certified.
Single-user and local. Multi-tenant scoping exists on the fact surface; a hosted / distributed store does not.
The entity+PageRank engine is live on real data but the extractor is deterministic regex, run as a backfill — not yet wired into the live write path.
An adversarial review found that the first answer-conditioning demo was inflated. Only the write-path moat stands on its own evidence, so that is all we claim.
① Python — from source
```shell
pip install "git+https://github.com/aureliocpr-ctrl/verimem.git"
```

Provides the engram command and the from engram import Memory SDK. Local SQLite, offline, gate on by default.
github.com/aureliocpr-ctrl/verimem ↗ · MIT · README · BENCHMARKS.md
② As an MCP server — Claude Code, Cursor, Cline, Zed
```json
{
  "mcpServers": {
    "engram": {
      "command": "engram",
      "args": ["mcp"],
      "env": { "ENGRAM_HOSTED": "1" }
    }
  }
}
```

Restart your host; the memory tools become callable with zero API key. The engine currently ships as engram; the rename to Verimem is in progress.