Valentin Tablan · April 17, 2026 · 4 min read

Your Agent's Memory Is a Markdown File. That's a Problem.

Files are the most popular form of agent memory. Here are four things they structurally cannot do.

Spark · Memory · Agents


The most popular form of AI agent memory in 2026 is a markdown file checked into a git repository. CLAUDE.md, .cursorrules, or AGENTS.md: the convention varies, but the pattern is the same. A text file that tells the agent what it needs to know. Version-controlled, human-readable, easy to edit. Hundreds of thousands of projects use this pattern. And it works ... until it doesn't.

The File Works (At First)

For a single developer working with a single agent, a curated rules file is genuinely useful. You write down your architectural preferences, your naming conventions, the gotchas specific to your codebase, and the agent picks them up. The feedback loop is tight: you notice the agent doing something wrong, you add a rule, the problem goes away. The system is transparent and the person writing the file already knows what matters.

The trouble starts when this approach needs to scale beyond one person and one agent. The properties that make files attractive (simplicity, directness, manual control) are the same properties that make them break down. What follows are four capabilities that a genuine memory system needs, and that files structurally cannot provide.

The Curation Bottleneck

A file only knows what someone explicitly put into it. That means someone has to decide what goes in, write it clearly enough for an agent to act on, and keep it current as things change. For a single developer, that's fine. For a team of twenty, it becomes a governance problem.

Conventions evolve, but the file doesn't update itself. Contradictions accumulate: one section says to use connection pooling; another, added six months later by a different engineer, says to use individual connections for a specific service. The person who wrote the original rule has moved to another team. Nobody remembers why the rule was there. The agent follows whichever entry it encounters first, or tries to reconcile both, or ignores one silently. None of these outcomes is good.

The deeper issue is that the curation bottleneck gets worse with scale. The more people contribute, the more knowledge exists, and the harder it becomes for any individual to maintain a coherent, non-contradictory, up-to-date file. Manual curation works when the volume of knowledge is modest. At organisational scale, it becomes a full-time job that nobody has time for.
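Even without a full memory system, the contradiction problem above can be made visible mechanically. The sketch below (all rule text, author names, and the naive substring-based topic matching are illustrative assumptions, not how any real tool works) flags topics that more than one rule touches, so a human can reconcile them before an agent has to guess:

```python
from collections import defaultdict

def flag_review_candidates(rules: list[dict], topics: list[str]) -> dict:
    """Group rules by the topics they mention; any topic touched by more
    than one rule is a candidate contradiction for human review.
    Topic matching here is naive substring search; a real system would
    need semantic matching to catch rules that disagree in different words."""
    by_topic = defaultdict(list)
    for rule in rules:
        for topic in topics:
            if topic in rule["text"].lower():
                by_topic[topic].append(rule)
    return {t: rs for t, rs in by_topic.items() if len(rs) > 1}

# Hypothetical rules echoing the connection-pooling example above.
rules = [
    {"author": "maya", "text": "Use connection pooling for all database access."},
    {"author": "theo", "text": "Use individual connections for the reporting service."},
]
flagged = flag_review_candidates(rules, topics=["connection"])
```

This doesn't resolve the contradiction, but it turns a silent failure mode into an explicit review queue.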

No Trust Signal

A markdown file treats every line as equally authoritative. A rule added two years ago by the team's most experienced architect sits at the same level as a rule added last week by an intern who misunderstood a stack trace. Both are plain text. Both will be followed with equal confidence.

A memory system should be able to distinguish between knowledge that has been validated through repeated use and knowledge that was added speculatively and never tested. Advice that has been followed a hundred times and produced good outcomes is qualitatively different from advice that was added once and never revisited. Files have no mechanism for tracking this. Everything in the file has the same implicit confidence level: "someone thought this was true at some point."

When an agent pulls guidance from a file, it has no way to prioritise. It cannot know that one piece of advice has been battle-tested across dozens of projects while another has never actually been applied. A well-designed memory system maintains a trust signal for each piece of knowledge, based on the evidence available, so that the most reliable knowledge surfaces first.
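One minimal way to represent such a trust signal is to track outcomes per entry and derive a smoothed confidence from them. This is a sketch, not Spark's actual mechanism: the entry shape, counts, and Laplace-smoothed score are all illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    """A piece of guidance plus the evidence behind it."""
    text: str
    successes: int = 0  # times the advice was applied and worked
    failures: int = 0   # times it was applied and backfired

    def trust(self) -> float:
        # Laplace-smoothed success rate: untested advice starts at 0.5
        # and moves toward its observed outcome rate as evidence accumulates.
        return (self.successes + 1) / (self.successes + self.failures + 2)

def ranked(entries: list[MemoryEntry]) -> list[MemoryEntry]:
    """Surface the most battle-tested guidance first."""
    return sorted(entries, key=lambda e: e.trust(), reverse=True)

entries = [
    MemoryEntry("Disable retries on the billing client"),  # never tested: 0.5
    MemoryEntry("Use connection pooling for the orders DB",
                successes=100, failures=2),  # battle-tested: ~0.97
]
best = ranked(entries)[0]
```

The point is not the particular formula; it's that a file has nowhere to store the counts at all, so every entry is stuck at an implicit, unearned 0.5.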

Passive Knowledge vs. Active Knowledge

This is, I think, the most important limitation, and the one least discussed.

Files are inert. They contain exactly what someone put into them, nothing more. They are a passive store of knowledge that was already articulated by a human being.

A genuine memory system can be active. Consider a concrete scenario: three engineers on different teams, working on unrelated features, independently discover that a particular API behaves unexpectedly under high concurrency. Each encounter is logged as a separate piece of evidence. Individually, each one looks like a minor edge case: annoying, but not worth a rule change. Taken together, though, they constitute a pattern: this API has a concurrency problem that the documentation doesn't mention. An active memory system can recognise that these three separate observations, from three different people who never spoke to each other, point to the same underlying issue, and surface that pattern as a new piece of organisational knowledge that none of them individually articulated.

No file can do this. Files store what you already know. An active memory system can discover what your organisation collectively knows but hasn't yet realised. The knowledge was latent in the accumulated experience; the memory system makes it explicit. This capability becomes more valuable as the team grows and the volume of evidence increases, precisely the conditions under which files fall apart.
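The three-engineers scenario can be reduced to a toy sketch. The observation shape, reporter names, and exact-match grouping below are assumptions for illustration; a real system would match observations by semantic similarity rather than identical keys:

```python
from collections import defaultdict

def surface_patterns(observations: list[dict], threshold: int = 3) -> list[dict]:
    """Group independent observations by (component, symptom).
    Any group reported by at least `threshold` distinct people is
    promoted to a candidate piece of organisational knowledge."""
    reporters = defaultdict(set)
    for obs in observations:
        reporters[(obs["component"], obs["symptom"])].add(obs["reporter"])
    return [
        {"component": c, "symptom": s, "reporters": sorted(rs)}
        for (c, s), rs in reporters.items()
        if len(rs) >= threshold
    ]

# Hypothetical evidence log: three people hit the same issue independently.
logged = [
    {"reporter": "ana",  "component": "payments-api", "symptom": "timeout under concurrency"},
    {"reporter": "ben",  "component": "payments-api", "symptom": "timeout under concurrency"},
    {"reporter": "cara", "component": "payments-api", "symptom": "timeout under concurrency"},
    {"reporter": "ana",  "component": "search",       "symptom": "stale index"},
]
patterns = surface_patterns(logged)
```

No individual observation crosses the threshold on its own; the knowledge only exists in aggregate, which is exactly what a static file cannot compute.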

The Noise Problem

Files don't get better with age. They get noisier.

Every addition makes the file larger, which means more tokens consumed from the agent's context window. But there is no mechanism for pruning outdated entries, no relevance decay, no conflict resolution between contradictory rules. The file from three years ago still contains advice about a framework version you stopped using eighteen months ago. An agent doesn't know that; it just sees text, and it follows it.

A well-designed memory system should get more useful as it accumulates experience. Files do the opposite: they accumulate noise at the same rate as signal, and the ratio only gets worse over time.

What Memory Needs to Be

There is a consistent thread running through these four limitations. Files are static, undifferentiated, passive, and they degrade over time. The memory that teams of agents actually need is dynamic, trust-aware, capable of generating new knowledge, and self-maintaining.

As agentic systems mature, and teams move toward workflows where multiple agents collaborate alongside humans, memory becomes infrastructure. I don't think anyone believes that infrastructure should be a text file that someone maintains by hand.

What we need is a living system that learns and evolves alongside the team. This is the problem we have built Spark to solve.
