Smarter Together: Our New Paper Shows How Shared Memory Lifts All Models
In a previous post, I talked about how the next frontier for AI agents is collective continual learning. Today, I'm excited to share that we've published our first evaluation paper, "Smarter Together: Creating Agentic Communities of Practice through Shared Experiential Learning," which is now available on arXiv.
This paper provides the first comprehensive data on our active memory layer, Spark, when used as an aid to AI coding agents. The results are incredibly validating: we demonstrated that access to a shared, active memory benefits models of all sizes and capability tiers.
Even the largest, most capable models, like GPT-5 Codex, which already have extensive knowledge of the Python data science domain, see a statistically significant lift in code quality when given access to Spark's experiential memory. This confirms that no matter how powerful a base model is, it can't know everything—and it benefits from learning from a curated repository of past experiences.
Small Models, Big Performance
In my opinion, the most interesting result from the paper is how Spark transforms the performance of smaller models.
We found that with Spark's help, a relatively small 30-billion-parameter open-weights model (Qwen3-Coder-30B-A3B-Instruct) achieved a code quality score on par with much larger, state-of-the-art commercial models.
This finding is a powerful validation of the "cognitive core" concept I've discussed before. It proves that we can externalise the job of remembering from the LLM, allowing it to focus on its core cognitive and reasoning skills. The base model doesn't need to be a massive library that has memorised every possible solution; it just needs to be a smart processor that can leverage an external, active memory of curated experiences.
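To make the pattern concrete, here is a minimal sketch with a hard-coded store standing in for the shared memory. Everything in it (the Experience type, the keyword retrieval, the lessons themselves) is an illustrative assumption, not Spark's actual API:

```python
# A minimal sketch of the "cognitive core" pattern. The Experience store,
# retrieval logic, and lessons below are all illustrative placeholders;
# Spark's actual curation and API are not shown here.
from dataclasses import dataclass

@dataclass
class Experience:
    """One curated lesson distilled from a past agent run."""
    topic: str
    lesson: str

# Stand-in for a shared, continually updated memory of curated experiences.
MEMORY = [
    Experience("pandas", "Prefer vectorised operations over df.iterrows()."),
    Experience("matplotlib", "Call plt.tight_layout() before saving figures."),
]

def retrieve(task: str, memory: list[Experience]) -> list[Experience]:
    """Naive keyword match; a real memory layer would rank and filter."""
    return [e for e in memory if e.topic in task.lower()]

def build_prompt(task: str, memory: list[Experience]) -> str:
    """Inject retrieved lessons so the base model can reason over them
    rather than having to have memorised them."""
    lessons = "\n".join(f"- {e.lesson}" for e in retrieve(task, memory))
    return f"Lessons from past runs:\n{lessons}\n\nTask: {task}"

print(build_prompt("Clean this pandas DataFrame and plot it with matplotlib", MEMORY))
```

The point is the division of labour: the store remembers, and the model reasons over whatever is retrieved.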
This has profound implications for how agentic AI will be deployed:

- For the Enterprise: Many companies prefer to run agentic workflows entirely within their own infrastructure for security and privacy, which usually means relying on open-weights models. The historical trade-off has been sacrificing SOTA performance for that control. Our research shows that Spark effectively closes that performance gap: enterprises can run smaller, manageable open-weights models on-prem and still achieve state-of-the-art results.
- For the Developer (and the Edge): This aligns perfectly with the "value at the edges" thesis. By increasing the utility of smaller models, we make them far more practical for edge deployment. An even more immediate use case is individual developers: you can run a smaller, open-weights model locally on your own laptop, connect it to Spark, and get world-class coding assistance, all while saving significant token costs and avoiding network latency (see the sketch after this list).
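For the local-laptop scenario, here is a sketch of what driving a locally served open-weights model looks like, assuming a server that exposes an OpenAI-compatible endpoint (Ollama-style, on port 11434). The model tag and the injected lesson are placeholders, and the Spark connection itself is elided since the alpha API is not public:

```python
# Sketch: an OpenAI-compatible client pointed at a locally served
# open-weights model. Assumes a local server (e.g. Ollama) on port 11434;
# the model tag and the lesson string are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

lessons = "- Prefer vectorised pandas operations over df.iterrows()."
response = client.chat.completions.create(
    model="qwen3-coder:30b",  # hypothetical tag for a local Qwen3-Coder-30B
    messages=[
        {"role": "system", "content": f"Lessons from shared memory:\n{lessons}"},
        {"role": "user", "content": "Write a function to deduplicate a DataFrame."},
    ],
)
print(response.choices[0].message.content)
```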
Join the Spark Alpha
This paper validates our core thesis: the future of AI isn't just about building bigger models; it's about building smarter systems that learn. Active, shared memory is the foundational layer for that future.
And on that note, we are officially opening access to the Spark alpha. If you're building with coding agents and want to give them a memory that learns, we want to hear from you.
Join the waitlist on our website
We're just getting started.