The problem with agent memory

Most agent memory will store anything. True or not.

Verimem is a persistent memory layer whose add() routes every write through an anti-confabulation admission gate — does the source actually entail the fact? — and whose search() returns provenance on every read. A hippocampus with a notary at the door.


✓ 100% free & open source — MIT license. Self-hosted, local-first: no account, no API key, no billing.

Admitted · grounding 0.97

Candidate write

"The deployment uses PostgreSQL 16."

source ⊢ fact — the cited release note entails the claim.
status ADMITTED · stored with provenance.

§01 The admission gate

Other memory layers store whatever their extractor emits. Verimem doesn't.

On write, a candidate fact is admitted, downgraded, or refused — decided by whether its cited source actually entails it. A cheap, no-LLM lexical screen first downgrades unsupported "it works / verified / done" claims; the strongest mode adds a source⊢fact entailment check. Measured on SNLI it reaches AUROC 0.971, and that number is judge-independent.
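
As a rough illustration of the cheap lexical screen described above, the sketch below downgrades a bare success claim that arrives without any source text. The pattern list, the `lexical_screen` name, and the returned labels are assumptions made for this example; they are not Verimem's actual rules or API.

```python
import re
from typing import Optional

# Hypothetical patterns: bare success claims that carry no evidence of their own.
UNSUPPORTED_CLAIM = re.compile(
    r"\b(it works|everything works|verified|done|fixed)\b",
    re.IGNORECASE,
)

def lexical_screen(fact: str, source: Optional[str]) -> str:
    """Return "DOWNGRADED" for a success claim with no source, else "PASS"."""
    has_source = bool(source and source.strip())
    if UNSUPPORTED_CLAIM.search(fact) and not has_source:
        return "DOWNGRADED"
    # Anything else would be handed on to the stronger entailment check.
    return "PASS"

print(lexical_screen("The migration is done and everything works.", None))
print(lexical_screen("The deployment uses PostgreSQL 16.", "Release note: PostgreSQL 16"))
```

A screen like this needs no model call, which is why it can run on every write; it only catches wording, so it complements an entailment check and does not replace one.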

A structured review of mem0, Zep, Letta, Cognee and MemOS found that none of them ship a write-admission gate. That gate — plus the provenance every read carries back — is the whole point.

Illustrative — the gate in three verdicts

Admitted · 0.97

Write · entailed

"The deployment uses PostgreSQL 16."

ADMITTED · stored with provenance

Downgraded · 0.41

Write · unsupported

"The migration is done and everything works."

DOWNGRADED · kept, flagged low-trust

Refused · 0.08

Write · contradicted

"The API rate limit is 10,000 req/s."

REFUSED · source says 1,000 — not stored
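
The three verdicts above can be pictured as a two-threshold decision on the grounding score. This is a minimal sketch under stated assumptions: the cut-offs `ADMIT_AT` and `REFUSE_BELOW`, the `Verdict` type, and the `decide` function are hypothetical, chosen only so that the illustrative scores 0.97, 0.41 and 0.08 land in the verdicts shown. The real gate may weigh contradiction separately from a low score.

```python
from dataclasses import dataclass

ADMIT_AT = 0.80      # hypothetical threshold, not Verimem's
REFUSE_BELOW = 0.20  # hypothetical threshold, not Verimem's

@dataclass(frozen=True)
class Verdict:
    status: str
    grounding_score: float
    stored: bool

def decide(grounding_score: float) -> Verdict:
    """Map a grounding score in [0, 1] to one of the three verdicts."""
    if not 0.0 <= grounding_score <= 1.0:
        raise ValueError("grounding_score must be in [0, 1]")
    if grounding_score >= ADMIT_AT:
        return Verdict("ADMITTED", grounding_score, stored=True)
    if grounding_score < REFUSE_BELOW:
        return Verdict("REFUSED", grounding_score, stored=False)
    # Middle band: keep the fact, but flag it as low-trust.
    return Verdict("DOWNGRADED", grounding_score, stored=True)

for score in (0.97, 0.41, 0.08):
    print(decide(score))
```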

Provenance on every read

Reads don't just return text — they return each fact's status and write-time grounding_score, so your code can trust-condition instead of trusting blindly. And update() never destroys the old fact: it supersedes it, leaving an auditable history() trail.

memory.py · the write goes through the gate, the read returns provenance

from engram import Memory

mem = Memory()                                  # local SQLite, offline
mem.add("The deployment uses PostgreSQL 16.")   # write goes THROUGH the gate

for hit in mem.search("which database?"):       # read returns provenance
    print(hit["text"], hit["status"], hit["grounding_score"])
# -> The deployment uses PostgreSQL 16.  ADMITTED  0.97
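
One way to trust-condition on those fields is to filter hits before they reach a prompt. The sketch below runs on stand-in dictionaries so it works without the package installed; the `trusted` helper and the `MIN_SCORE` cut-off are assumptions for illustration, not part of the SDK.

```python
from typing import Dict, Iterable, List

MIN_SCORE = 0.80  # hypothetical cut-off; tune it to your own risk tolerance

def trusted(hits: Iterable[Dict], min_score: float = MIN_SCORE) -> List[Dict]:
    """Keep only hits admitted at write time with a high enough grounding score."""
    return [
        h for h in hits
        if h["status"] == "ADMITTED" and h["grounding_score"] >= min_score
    ]

# Stand-in for the result of mem.search(...), using the fields shown above.
hits = [
    {"text": "The deployment uses PostgreSQL 16.",
     "status": "ADMITTED", "grounding_score": 0.97},
    {"text": "The migration is done and everything works.",
     "status": "DOWNGRADED", "grounding_score": 0.41},
]

for h in trusted(hits):
    print(h["text"])
# prints: The deployment uses PostgreSQL 16.
```

Downgraded facts stay in memory, so a caller could also surface them with a visible low-trust label instead of dropping them.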

§02 Evidence — measured, not claimed

Metric	Result	On the record
Write-gate entailment	AUROC 0.971	Source⊢fact, judge-independent (SNLI). The write-path moat.
Downstream hallucination	95.9% → 12.2%	−83.7 pp; McNemar p≈6e-44, replicated on 2 seeds. It works by converting confabulation into abstention (omission 3%→85%), not by raising correctness — and this arm is judge-coupled.
Clean-fact over-rejection	30–39%	The honest cost at the strict threshold. The distilled local gate (v2) admits 0.924 of clean facts, largely closing this.
Retrieval recall@5 · LongMemEval-s	0.8745	Full 500, judge-free, same e5 embedder, zero external APIs. Fusion ON vs 0.8525 OFF (+2.2 pp). An earlier n=300 read 0.909 — optimistic; the full-500 number is the honest one. This is recall@k, not end-to-end QA accuracy.
Italian retrieval	MRR +52%	0.466 → 0.710 on the e5 embedder flip, at zero English regression.
Contradiction detection · HaluMem	TPR 0.66 / FPR 0.0125	After the temporal-supersession fix.
Test suite	5,830 passing	764 test files, ~84k LOC. Self-run, reproducible from the repo.

Every figure here is self-run and reproducible from the repository — it is not a third-party leaderboard placement. Retrieval numbers are recall@k, not the end-to-end QA accuracy that Mem0 and Zep headline, so they are not directly comparable.
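
For readers unfamiliar with the retrieval metrics in the table, here is a small sketch of recall@k and reciprocal rank on toy data, followed by the difference between a percentage-point change and a relative change. It uses one common definition of each metric; the repository's benchmark scripts may define them differently, and the document IDs below are made up.

```python
from typing import Sequence, Set

def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)

def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Toy queries: (ranked results, set of relevant ids). Purely illustrative.
queries = [
    (["d3", "d1", "d7"], {"d1"}),
    (["d2", "d9", "d4"], {"d5"}),
]

mean_recall = sum(recall_at_k(r, rel, k=2) for r, rel in queries) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
print(mean_recall, mrr)

# Percentage points vs relative change, using figures from the table above.
fusion_gain_pp = (0.8745 - 0.8525) * 100          # absolute difference, in pp
italian_mrr_relative = (0.710 - 0.466) / 0.466    # relative improvement
print(round(fusion_gain_pp, 1), round(italian_mrr_relative * 100))
```

Neither metric says whether the final answer was right, which is why recall@k cannot be compared directly with end-to-end QA accuracy.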

§03 The difference — an honest combination

01 Write-admission gate

Admit, downgrade or refuse each write by source entailment. No competitor here ships one.

02 Provenance on read

Every fact returns status + grounding_score; update() supersedes, history() stays auditable.

03 Bi-temporal valid-time

valid_until hard-expire — a memory that knows when a fact stopped being true.

04 Runs on your subscription

Hosted MCP mode: no API key, no per-token billing — the host's LLM does the work.

05 Multilingual injection screen

A write-time prompt-injection screen — Italian included, not just English.

06 MCP-native

228 memory tools at session start in Claude Code, Cursor, Cline, Continue, Zed.
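
To make items 02 and 03 concrete, here is a toy in-memory model of supersession and hard expiry. It is a sketch of the idea only: `ToyStore`, `Fact`, the `key` field and every method signature are hypothetical, and the example facts are invented. Verimem's real storage is SQLite and its API may look quite different.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class Fact:
    key: str
    text: str
    valid_until: Optional[datetime] = None  # hard expiry of the fact's valid time
    superseded_by: Optional[int] = None     # index of the fact that replaced it

class ToyStore:
    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def add(self, key: str, text: str, valid_until: Optional[datetime] = None) -> int:
        self._facts.append(Fact(key, text, valid_until))
        return len(self._facts) - 1

    def update(self, old_id: int, text: str,
               valid_until: Optional[datetime] = None) -> int:
        """Supersede the old fact instead of overwriting or deleting it."""
        old = self._facts[old_id]
        new_id = self.add(old.key, text, valid_until)
        old.superseded_by = new_id
        return new_id

    def current(self, key: str, now: datetime) -> List[str]:
        """Facts that are neither superseded nor past their valid_until."""
        return [
            f.text for f in self._facts
            if f.key == key
            and f.superseded_by is None
            and (f.valid_until is None or now < f.valid_until)
        ]

    def history(self, key: str) -> List[str]:
        """Every version ever written for the key, oldest first."""
        return [f.text for f in self._facts if f.key == key]

store = ToyStore()
v1 = store.add("db", "The deployment uses PostgreSQL 15.")
store.update(v1, "The deployment uses PostgreSQL 16.",
             valid_until=datetime(2030, 1, 1, tzinfo=timezone.utc))

print(store.current("db", datetime(2025, 1, 1, tzinfo=timezone.utc)))
print(store.current("db", datetime(2031, 1, 1, tzinfo=timezone.utc)))
print(store.history("db"))
```

The point of the design is that reads of the current state and reads of the audit trail are separate queries over the same append-only list.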

System	Write gate	Provenance on read	Approach	Maturity
Mem0	✗	✗	Flat vector + LLM summary	Established, widely adopted
Zep / Graphiti	✗	partial (temporal)	Temporal knowledge graph	Commercial, mature
HippoRAG	✗	✗	OpenIE + PageRank	Research
Verimem	✓	✓	Fusion recall + gate + sleep consolidation	Brand-new · 0 adoption yet

Competitor scores and adoption drift by source and over time; the two columns we stand on — a write gate and provenance on read — come from a structured competitor review, not a marketing table.

§04 What Verimem is not

on the record

0 adoption.
A brand-new public release. Nobody depends on it yet — including us, beyond the maintainer's daily driver.
Not on PyPI.
Install from source / git for now. An engram-memory package is a post-v0.4.0 target.
Self-run benchmarks.
Every number is reproducible from the repo, but none is third-party audited. Treat them as reproducible, not certified.
Single-node SQLite.
Single-user and local. Multi-tenant scoping exists on the fact surface; a hosted / distributed store does not.
Regex-tier entity graph.
The entity+PageRank engine is live on real data but the extractor is deterministic regex, run as a backfill — not yet wired into the live write path.
Answer-path unproven.
An adversarial review found the first answer-conditioning demo inflated. Only the write-path moat stands on its own evidence — so that is all we claim.

§05 Install in 2 minutes

① Python — from source

pip install "git+https://github.com/aureliocpr-ctrl/verimem.git"

Provides the engram command and the from engram import Memory SDK. Local SQLite, offline, gate on by default.

github.com/aureliocpr-ctrl/verimem ↗ · MIT · README · BENCHMARKS.md

② As an MCP server — Claude Code, Cursor, Cline, Zed

{
  "mcpServers": {
    "engram": {
      "command": "engram",
      "args": ["mcp"],
      "env": { "ENGRAM_HOSTED": "1" }
    }
  }
}

Restart your host; the memory tools become callable with zero API key. The engine currently ships as engram — the rename to Verimem is in progress.
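
If your host already has other MCP servers configured, the entry above should be merged into the existing file, not pasted over it. The sketch below shows one way to do that with the standard library. The `merge_mcp_server` helper is hypothetical, the config file location differs per host (check your host's documentation), and the demo writes to a temporary directory so it never touches a real config.

```python
import json
import tempfile
from pathlib import Path

# The server entry from the snippet above.
ENGRAM_SERVER = {
    "command": "engram",
    "args": ["mcp"],
    "env": {"ENGRAM_HOSTED": "1"},
}

def merge_mcp_server(config_path: Path, name: str = "engram",
                     server: dict = ENGRAM_SERVER) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other entries."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = server
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config

# Demo against a throwaway file; replace the path with your host's config file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"  # hypothetical file name
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}',
                    encoding="utf-8")
    merged = merge_mcp_server(path)
    print(sorted(merged["mcpServers"]))
```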