Academic Research

Research Behind Agentverse Memory

Agentverse Memory is built on original research in stigmergic retrieval, graph-native agent memory, and multi-agent benchmark design. Papers are in preparation for ICLR 2027, AAMAS 2027, and ACL/EMNLP 2027.

🧬
Novel Contribution
First stigmergic pheromone dynamics in LLM knowledge graphs
📊
Primary Benchmark
Aiming to be the first graph-based system to publish BEAM results (target: >75% on BEAM 1M)
🎯
Target Venue
ICLR 2027 (submit Sep/Oct 2026) + AAMAS 2027

Papers in Preparation

Original research under active development. Benchmark results pending (BEAM suite ~$52 to run).

Paper 4 (Primary) | Outline Complete, Running Benchmarks

PheromGraph: Stigmergic Retrieval for Persistent Agent Memory

Agentverse Memory Research Team

ICLR 2027 (primary) | NeurIPS 2027 (fallback) — Submit Sep/Oct 2026

Long-term memory for LLM agents requires systems that remain precise at million-token scale — a challenge no existing graph-based system has publicly met. We present PheromGraph, a persistent agent memory system that applies stigmergic pheromone dynamics to a knowledge graph, enabling retrieval quality to evolve through use. PheromGraph deposits five typed pheromone traces (episodic, semantic, procedural, working, meta) on nodes and edges during every retrieval, uses exponential decay with type-specific time constants τ, and traverses the graph using A* with a composite cost function combining semantic similarity, pheromone weight, and structural proximity.

  • First formalization of stigmergic pheromone dynamics in LLM agent knowledge graphs; no prior art in production systems
  • 5-type pheromone taxonomy with type-differentiated exponential decay (τ per memory type)
  • A* semantic pathfinding with pheromone-weighted composite cost: semantic (40%) + pheromone (30%) + structural (30%)
  • Closest prior work (arXiv:2512.10166, Dec 2025) uses grid simulations only: no LLMs, no semantics, no knowledge graphs
  • Primary benchmark: BEAM (would be the first graph-based system to publish results); targets are >75% on BEAM 1M and >64.1% on BEAM 10M (vs. Hindsight SOTA of 73.9% / 64.1%)
  • Supporting: LongMemEval_S target ≥94%; LME-V2 LAFS metric rewards our <20ms retrieval directly
  • Code to be open-sourced on submission: github.com/fetchai/agentverse-memory
[PheromGraph 2026 — citation pending]
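The mechanics described in the abstract and bullets above (type-specific exponential decay and a 40/30/30 composite cost driving A* traversal) can be sketched roughly as follows. This is a minimal illustration, not the PheromGraph implementation: the `TAU` values, the adjacency-list graph format, the assumption that all three signals are normalized to [0, 1], and the conversion of each signal to a cost via `1 - value` are all hypothetical choices made for the example. With the default zero heuristic the search reduces to uniform-cost search; an admissible heuristic can be passed in.

```python
import heapq
import math

# Hypothetical time constants in seconds. The source names the five pheromone
# types but does not publish values for tau.
TAU = {
    "episodic": 3600.0,
    "semantic": 86400.0,
    "procedural": 43200.0,
    "working": 300.0,
    "meta": 604800.0,
}


def decayed(weight, elapsed, kind):
    """Exponential decay: w(t) = w0 * exp(-elapsed / tau_kind)."""
    return weight * math.exp(-elapsed / TAU[kind])


def edge_cost(similarity, pheromone, proximity):
    """Composite cost with the 40/30/30 weighting from the text.

    Assumes every input is in [0, 1] with higher meaning better, so each
    signal is turned into a cost as (1 - value). That mapping is an assumption.
    """
    return 0.4 * (1 - similarity) + 0.3 * (1 - pheromone) + 0.3 * (1 - proximity)


def a_star(graph, start, goal, heuristic=lambda node: 0.0):
    """A* over graph[node] = [(neighbor, similarity, pheromone, proximity), ...]."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if cost > best.get(node, math.inf):
            continue  # stale queue entry; a cheaper route was already found
        for nxt, sim, pher, prox in graph.get(node, []):
            new_cost = cost + edge_cost(sim, pher, prox)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(
                    frontier, (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt])
                )
    return None, math.inf


# Toy graph: the route through "a" carries stronger pheromone than the one through "b".
graph = {
    "q": [("a", 0.9, 0.8, 0.5), ("b", 0.5, 0.1, 0.5)],
    "a": [("goal", 0.8, 0.9, 0.5)],
    "b": [("goal", 0.9, 0.1, 0.5)],
}
path, cost = a_star(graph, "q", "goal")
print(path, round(cost, 2))
```

In this toy graph the well-reinforced route through "a" wins even though the last hop via "b" has slightly higher semantic similarity, which is the kind of use-driven bias the pheromone term is meant to introduce.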
Paper 5 | Planned, Q3 2026

Memory as Inference: Free Energy Minimization for Agent Memory Systems

Agentverse Memory Research Team

AAMAS 2027 | October/November 2026 submission

We reformulate agent memory as an active inference problem under the Free Energy Principle: storing a memory minimizes surprise about future queries; retrieving a memory minimizes surprise about the current context. This framing unifies episodic, semantic, procedural, and working memory under a single variational objective, explains pheromone dynamics as a precision-weighting mechanism, and yields novel predictions about optimal memory consolidation schedules.

  • Free Energy Principle (Friston 2010) applied to LLM agent memory: a novel theoretical framing
  • Unifies all 4 memory types under a single variational objective
  • Explains pheromone dynamics as precision-weighting in predictive coding terms
  • Companion paper to PheromGraph; shares experimental infrastructure
[EFE-Memory 2026 — citation pending]
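For readers unfamiliar with the Free Energy Principle, the standard variational free energy is recalled below. This is the textbook form, not necessarily the objective the paper will adopt. The reading of the symbols is an assumption made for illustration: o stands for observations (for example, queries or the current context), s for hidden states (for example, stored memory contents), and q(s) for the agent's approximate posterior.

```latex
F[q] = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
     = D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o)
```

Because the KL term is non-negative, F is an upper bound on the surprise -ln p(o), so minimizing F with respect to q tightens that bound. This is the sense in which storing and retrieving memories can be described as minimizing surprise.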
Paper 6 (Benchmark) | Planned, 2027

MAS-MemEval: A Benchmark for Multi-Agent Shared Memory Systems

Agentverse Memory Research Team

ACL/EMNLP 2027 (first multi-agent shared memory benchmark)

No benchmark currently measures the correctness and efficiency of shared memory between cooperating LLM agents. MAS-MemEval fills this gap: 500 multi-agent scenarios across 5 task types (collaborative research, orchestrator-worker task delegation, adversarial privacy, temporal consistency across agents, and knowledge synthesis). We establish baselines for Agentverse Memory shared spaces, vector-only systems, and no-shared-memory ablations.

  • First benchmark for multi-agent shared memory: gap confirmed, no prior work exists
  • 500 scenarios × 5 task types = 2,500 evaluation instances
  • Includes adversarial privacy scenarios (agent A should not access agent B's private memories)
  • Temporal consistency task: does shared memory stay coherent when two agents update the same fact concurrently?
  • Agentverse Memory is the reference implementation (shared spaces feature, Builder plan+)
[MAS-MemEval 2027 — citation pending]

Related Preprints

Earlier work in the Agentverse research program that underpins Agentverse Memory.

Taxonomy

Agentverse-2030: A Gap Taxonomy for Large-Scale Multi-Agent Systems

Preprint — Submission Pending

Systematic taxonomy of 14 capability gaps blocking production deployment of multi-agent systems at scale. Memory and coordination are the top-ranked gaps. Forms the motivating background for the Agentverse Memory product.

Architecture

MemPalace: Hierarchical Memory Architecture for Persistent AI Agents

Preprint — Submission Pending

Introduces the Palace–Wing–Room–Closet memory hierarchy. Each agent has a Memory Palace organized into semantic domains (Wings), topic clusters (Rooms), and fine-grained memory items (Closets). Provides the conceptual architecture implemented in Agentverse Memory.
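As a rough illustration of the hierarchy just described, the nesting can be modeled with a few dataclasses. This is a sketch only: the class fields and the `store` and `recall` helpers are hypothetical names chosen for the example and are not taken from the MemPalace preprint or the Agentverse Memory API.

```python
from dataclasses import dataclass, field


@dataclass
class Closet:
    """A fine-grained memory item."""
    content: str


@dataclass
class Room:
    """A topic cluster holding closets."""
    topic: str
    closets: list[Closet] = field(default_factory=list)


@dataclass
class Wing:
    """A semantic domain holding rooms, keyed by topic."""
    domain: str
    rooms: dict[str, Room] = field(default_factory=dict)


@dataclass
class Palace:
    """One agent's memory palace, holding wings keyed by domain."""
    agent_id: str
    wings: dict[str, Wing] = field(default_factory=dict)

    def store(self, domain, topic, content):
        # Create the wing and room on first use, then append the new item.
        wing = self.wings.setdefault(domain, Wing(domain))
        room = wing.rooms.setdefault(topic, Room(topic))
        closet = Closet(content)
        room.closets.append(closet)
        return closet

    def recall(self, domain, topic):
        # Return the stored contents for a domain/topic pair, or an empty list.
        wing = self.wings.get(domain)
        room = wing.rooms.get(topic) if wing else None
        return [c.content for c in room.closets] if room else []


palace = Palace("agent-1")
palace.store("work", "deadlines", "Report due Friday")
print(palace.recall("work", "deadlines"))
```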

Engine

GraphPalace: A LadybugDB-Native Graph Engine for Agent Memory

Preprint — Submission Pending

Technical description of the GraphPalace engine: LadybugDB embedded graph DB with temporal validity (valid_at/invalid_at), BM25+HNSW hybrid retrieval merged via Reciprocal Rank Fusion, and pheromone-weighted A* graph traversal. The production implementation behind Agentverse Memory.
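To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists, such as one from BM25 and one from an HNSW vector index. It illustrates the general technique rather than the GraphPalace code: the function name, the memory IDs, and the constant `k = 60` (a commonly used default for RRF) are assumptions, and the engine's actual parameters are not stated in the source.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs: score(d) = sum over lists of 1 / (k + rank_d).

    Ranks start at 1. An ID that is absent from a list contributes nothing
    for that list. Ties are broken by ID so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the lexical and vector retrievers.
bm25_hits = ["m3", "m1", "m2"]
hnsw_hits = ["m1", "m4", "m3"]
fused = reciprocal_rank_fusion([bm25_hits, hnsw_hits])
print(fused)
```

Because RRF uses only ranks, it avoids having to calibrate BM25 scores against vector similarities, which live on different scales.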

Benchmark Strategy

BEAM (ICLR 2026) is the primary benchmark target. No graph-based memory system has ever published BEAM results. This is a significant white-space opportunity.

Benchmark | Our Target | SOTA | Status
BEAM 1M | >75% | 73.9% (Hindsight) | Pending ($15)
BEAM 10M | >64.1% | 64.1% (Hindsight) | Pending ($30)
LongMemEval_S | ≥94% | 95.4% (OMEGA) | Pending ($5)
LME-V2 (LAFS) | Top-tier | New benchmark (2026) | Dataset ready
MemoryArena | TBD | Baseline only | Planned

The BEAM dataset is hosted on Hugging Face as Mohammadta/BEAM and Mohammadta/BEAM-10M: 100 conversations, 2,000 questions per scale, scored with LLM-as-judge evaluation.

Interested in Collaborating?

We're looking for research collaborators for MAS-MemEval and the EFE memory paper. If you work on agent memory, knowledge graphs, or multi-agent systems, reach out.