The ledger of record for agent memory
v0.3.0 · pre-release · MIT

The problem with agent memory

Most agent memory will store anything. True or not.

Verimem is a persistent memory layer whose add() routes every write through an anti-confabulation admission gate — does the source actually entail the fact? — and whose search() returns provenance on every read. A hippocampus with a notary at the door.

100% free & open source — MIT license. Self-hosted, local-first: no account, no API key, no billing.

Admitted · grounding 0.97
Candidate write
"The deployment uses PostgreSQL 16."
source ⊢ fact — the cited release note entails the claim.
status ADMITTED · stored with provenance.
§01 The admission gate

Other memory layers store whatever their extractor emits. Verimem doesn't.

On write, a candidate fact is admitted, downgraded, or refused — decided by whether its cited source actually entails it. A cheap, no-LLM lexical screen first downgrades unsupported "it works / verified / done" claims; the strongest mode adds a source⊢fact entailment check. Measured on SNLI it reaches AUROC 0.971, and that number is judge-independent.
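
A minimal sketch of what a no-LLM lexical screen of this kind might look like, assuming "unsupported" means a bare completion claim that arrives with no cited source. The pattern list, the `lexical_screen` name and the `"PASS"` label are illustrative assumptions, not Verimem's actual implementation.

```python
import re
from typing import Optional

# Hypothetical pattern list: bare completion claims of the "it works / verified / done" kind.
COMPLETION_CLAIM = re.compile(
    r"\b(it works|everything works|verified|done)\b",
    re.IGNORECASE,
)


def lexical_screen(fact: str, source: Optional[str]) -> str:
    """Return "DOWNGRADED" for a completion claim with no cited source, else "PASS".

    "PASS" only means this cheap screen found nothing to flag; a stricter
    entailment check could still downgrade or refuse the fact afterwards.
    """
    has_source = bool(source and source.strip())
    if COMPLETION_CLAIM.search(fact) and not has_source:
        return "DOWNGRADED"
    return "PASS"


print(lexical_screen("The migration is done and everything works.", None))
print(lexical_screen("The deployment uses PostgreSQL 16.", "Release note: upgraded to PostgreSQL 16."))
```

A screen like this is cheap because it never compares the fact against the source text; it only asks whether a claim of success comes with any evidence attached at all.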

A structured review of mem0, Zep, Letta, Cognee and MemOS found that none of them ship a write-admission gate. That gate — plus the provenance every read carries back — is the whole point.

Illustrative — the gate in three verdicts

Admitted · 0.97
Write · entailed
"The deployment uses PostgreSQL 16."
ADMITTED · stored with provenance
Downgraded · 0.41
Write · unsupported
"The migration is done and everything works."
DOWNGRADED · kept, flagged low-trust
Refused · 0.08
Write · contradicted
"The API rate limit is 10,000 req/s."
REFUSED · source says 1,000 — not stored
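
The three verdicts above can be read as a decision over the grounding score. The sketch below assumes a plain two-threshold rule; `ADMIT_AT`, `REFUSE_BELOW`, `Verdict` and `decide` are hypothetical names, and the cut-offs were picked only so that the three illustrative scores land in their stated verdicts. The real gate may also use an explicit contradiction signal rather than the score alone.

```python
from dataclasses import dataclass

# Hypothetical thresholds, not Verimem's real cut-offs.
ADMIT_AT = 0.80
REFUSE_BELOW = 0.20


@dataclass(frozen=True)
class Verdict:
    status: str             # "ADMITTED", "DOWNGRADED" or "REFUSED"
    grounding_score: float  # the source-entails-fact score the verdict is based on
    stored: bool            # refused facts are not written at all


def decide(grounding_score: float) -> Verdict:
    """Map a source-entails-fact score in [0, 1] to one of three verdicts."""
    if not 0.0 <= grounding_score <= 1.0:
        raise ValueError("grounding_score must be in [0, 1]")
    if grounding_score >= ADMIT_AT:
        return Verdict("ADMITTED", grounding_score, stored=True)
    if grounding_score < REFUSE_BELOW:
        return Verdict("REFUSED", grounding_score, stored=False)
    # Kept, but flagged low-trust so that readers can condition on it.
    return Verdict("DOWNGRADED", grounding_score, stored=True)


for score in (0.97, 0.41, 0.08):
    print(score, decide(score).status)
```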

Provenance on every read

Reads don't just return text — they return each fact's status and write-time grounding_score, so your code can trust-condition instead of trusting blindly. And update() never destroys the old fact: it supersedes it, leaving an auditable history() trail.

memory.py · the write goes through the gate, the read returns provenance

from engram import Memory

mem = Memory()                                  # local SQLite, offline
mem.add("The deployment uses PostgreSQL 16.")   # write goes THROUGH the gate

for hit in mem.search("which database?"):       # read returns provenance
    print(hit["text"], hit["status"], hit["grounding_score"])
# -> The deployment uses PostgreSQL 16.  ADMITTED  0.97
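
Trust-conditioning on the read side could look like the sketch below. It works on plain dictionaries with the same three keys the example above prints (`text`, `status`, `grounding_score`), so it runs without Verimem installed; the `trusted_facts` helper and its `min_score` default are assumptions made for illustration.

```python
from typing import Iterable


def trusted_facts(hits: Iterable[dict], min_score: float = 0.8) -> list[str]:
    """Keep only admitted hits whose write-time grounding score clears min_score."""
    return [
        hit["text"]
        for hit in hits
        if hit["status"] == "ADMITTED" and hit["grounding_score"] >= min_score
    ]


# Hand-written stand-ins for search() results, using the keys shown above.
hits = [
    {"text": "The deployment uses PostgreSQL 16.", "status": "ADMITTED", "grounding_score": 0.97},
    {"text": "The migration is done and everything works.", "status": "DOWNGRADED", "grounding_score": 0.41},
]
print(trusted_facts(hits))
```

Only the admitted fact survives the filter; the downgraded one is still in memory, but the caller chose not to build an answer on it.
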
§02 Evidence — measured, not claimed
Metric | Result | On the record
Write-gate entailment AUROC | 0.971 | Source⊢fact, judge-independent (SNLI). The write-path moat.
Downstream hallucination | 95.9% → 12.2% | −83.7 pp; McNemar p≈6e-44, replicated on 2 seeds. It works by converting confabulation into abstention (omission 3%→85%), not by raising correctness, and this arm is judge-coupled.
Clean-fact over-rejection | 30–39% | The honest cost at the strict threshold. The distilled local gate (v2) admits 0.924 of clean facts, largely closing this.
Retrieval recall@5 · LongMemEval-s | 0.8745 | Full 500, judge-free, same e5 embedder, zero external APIs. Fusion ON vs 0.8525 OFF (+2.2 pp). An earlier n=300 run read 0.909, which was optimistic; the full-500 number is the honest one. This is recall@k, not end-to-end QA accuracy.
Italian retrieval MRR | +52% | 0.466 → 0.710 on the e5 embedder flip, at zero English regression.
Contradiction detection · HaluMem | TPR 0.66 / FPR 0.0125 | After the temporal-supersession fix.
Test suite | 5,830 passing | 764 test files, ~84k LOC. Self-run, reproducible from the repo.

Every figure here is self-run and reproducible from the repository — it is not a third-party leaderboard placement. Retrieval numbers are recall@k, not the end-to-end QA accuracy that Mem0 and Zep headline, so they are not directly comparable.
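
To make the distinction concrete, here is a small sketch of the two retrieval metrics named in the table. It uses one common definition of each (fraction of relevant items found in the top k, and reciprocal rank of the first relevant item); the repository's evaluation scripts may define them differently, and the toy ids are invented for illustration.

```python
def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 5) -> float:
    """Fraction of the relevant ids that appear among the top-k retrieved ids."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def reciprocal_rank(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


# Toy query: two relevant sessions, retrieved at ranks 2 and 6.
ranked = ["s7", "s3", "s9", "s1", "s4", "s8"]
relevant = {"s3", "s8"}
print(recall_at_k(ranked, relevant, k=5))  # s8 sits at rank 6, outside the top 5
print(reciprocal_rank(ranked, relevant))   # averaged over queries, this gives MRR

# The relative MRR gain quoted in the table, recomputed from its two endpoints.
print(round((0.710 - 0.466) / 0.466 * 100))
```

Neither function looks at a generated answer: both score only whether the right memory was retrieved, which is why these numbers cannot be set beside an end-to-end QA accuracy.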

§03 The difference — an honest combination

01 Write-admission gate

Admit, downgrade or refuse each write by source entailment. No competitor here ships one.

02 Provenance on read

Every fact returns status + grounding_score; update() supersedes, history() stays auditable.

03 Bi-temporal valid-time

valid_until hard-expire — a memory that knows when a fact stopped being true.

04 Runs on your subscription

Hosted MCP mode: no API key, no per-token billing — the host's LLM does the work.

05 Multilingual injection screen

A write-time prompt-injection screen — Italian included, not just English.

06 MCP-native

228 memory tools at session start in Claude Code, Cursor, Cline, Continue, Zed.
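
Two of the properties above, supersession on update and valid_until hard-expiry, can be sketched with a toy append-only store. `ToyLedger`, `Fact` and `live` are hypothetical names, the facts are invented examples, and only the valid-time half of a bi-temporal model is shown; this is not Verimem's storage code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    text: str
    valid_until: Optional[datetime] = None  # valid-time: when the fact stops being true
    superseded_by: Optional[int] = None     # id of the fact that replaced this one


class ToyLedger:
    """Hypothetical append-only store: nothing is ever deleted or overwritten."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def add(self, text: str, valid_until: Optional[datetime] = None) -> int:
        self._facts.append(Fact(text, valid_until))
        return len(self._facts) - 1

    def update(self, fact_id: int, new_text: str) -> int:
        new_id = self.add(new_text)
        self._facts[fact_id].superseded_by = new_id  # the old fact stays in the ledger
        return new_id

    def history(self, fact_id: int) -> list[str]:
        """Follow the supersession chain from fact_id to the current version."""
        chain = []
        current: Optional[int] = fact_id
        while current is not None:
            chain.append(self._facts[current].text)
            current = self._facts[current].superseded_by
        return chain

    def live(self, now: datetime) -> list[str]:
        """Facts that are neither superseded nor past their valid_until."""
        return [
            f.text for f in self._facts
            if f.superseded_by is None and (f.valid_until is None or now < f.valid_until)
        ]


ledger = ToyLedger()
pg = ledger.add("The deployment uses PostgreSQL 15.")
ledger.update(pg, "The deployment uses PostgreSQL 16.")
ledger.add("Code freeze is in effect.", valid_until=datetime(2025, 1, 1, tzinfo=timezone.utc))

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(ledger.history(pg))  # both versions, oldest first
print(ledger.live(now))    # only the current, unexpired fact
```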

System | Write gate | Provenance on read | Approach | Maturity
Mem0 | - | - | Flat vector + LLM summary | Established, widely adopted
Zep / Graphiti | - | partial (temporal) | Temporal knowledge graph | Commercial, mature
HippoRAG | - | - | OpenIE + PageRank | Research
Verimem | yes | yes | Fusion recall + gate + sleep consolidation | Brand-new · 0 adoption yet

Competitor scores and adoption drift by source and over time; the two columns we stand on — a write gate and provenance on read — come from a structured competitor review, not a marketing table.

§04 What Verimem is not
on the record
  • 0 adoption.

    A brand-new public release. Nobody depends on it yet — including us, beyond the maintainer's daily driver.

  • Not on PyPI.

    Install from source / git for now. An engram-memory package is a post-v0.4.0 target.

  • Self-run benchmarks.

    Every number is reproducible from the repo, but none is third-party audited. Treat them as reproducible, not certified.

  • Single-node SQLite.

    Single-user and local. Multi-tenant scoping exists on the fact surface; a hosted / distributed store does not.

  • Regex-tier entity graph.

    The entity+PageRank engine is live on real data but the extractor is deterministic regex, run as a backfill — not yet wired into the live write path.

  • Answer-path unproven.

    An adversarial review found the first answer-conditioning demo inflated. Only the write-path moat stands on its own evidence — so that is all we claim.

§05 Install in 2 minutes

① Python — from source

pip install "git+https://github.com/aureliocpr-ctrl/verimem.git"

Installs the engram command and the Python SDK (from engram import Memory). Local SQLite, offline, gate on by default.

github.com/aureliocpr-ctrl/verimem ↗ · MIT · README · BENCHMARKS.md

② As an MCP server — Claude Code, Cursor, Cline, Zed

{
  "mcpServers": {
    "engram": {
      "command": "engram",
      "args": ["mcp"],
      "env": { "ENGRAM_HOSTED": "1" }
    }
  }
}

Restart your host; the memory tools become callable with zero API key. The engine currently ships as engram — the rename to Verimem is in progress.
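
If your host already has other MCP servers configured, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the `add_engram_server` helper is hypothetical, and the config file location differs per host, so the path used here is a throwaway temporary file rather than any real host's path.

```python
import json
import tempfile
from pathlib import Path

# The server entry from the snippet above, unchanged.
ENGRAM_ENTRY = {
    "command": "engram",
    "args": ["mcp"],
    "env": {"ENGRAM_HOSTED": "1"},
}


def add_engram_server(config_path: Path) -> dict:
    """Merge the engram entry into an MCP config file, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["engram"] = ENGRAM_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"  # hypothetical path; check your host's documentation
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}')
    merged = add_engram_server(path)
    print(sorted(merged["mcpServers"]))
```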