Performance

Numbers, not claims.

We ran Agentmatic against LangGraph on the same graphs, same LLM mocks, same hardware. Here's what came out.

Summary. Across ReAct, Supervisor, RAG, and Map fan-out workloads at 100–10,000 nodes:

  • 10–15× faster graph traversal
  • 70–80× faster channel message throughput
  • 50–80% lower memory footprint
  • 78% shorter cold start (Python SDK includes the prebuilt PyO3 wheel)

  Metric                              agentmatic   LangGraph
  Graph traversal (ops/ms)                 100.0         8.0
  Channel throughput (msgs/ms)             100.0         1.3
  Memory footprint (% of LangGraph)         38.0       100.0
  Cold start (% of LangGraph)               22.0       100.0

Numbers from the internal benchmark suite (graphs ranging from 100 to 10,000 nodes, mixed ReAct + Supervisor workloads). Higher is better for graph traversal and channel throughput; lower is better for memory footprint and cold start.
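
The headline multipliers in the summary can be derived from the four measurements above. A minimal sketch of that arithmetic follows; the values are copied from this page, and the variable and function names are illustrative only, not part of any Agentmatic API.

```python
# Measurements copied from the benchmark figures on this page.
traversal_ops_per_ms = {"agentmatic": 100.0, "LangGraph": 8.0}
channel_msgs_per_ms = {"agentmatic": 100.0, "LangGraph": 1.3}
memory_pct_of_langgraph = 38.0
cold_start_pct_of_langgraph = 22.0


def speedup(metric):
    """Throughput ratio of agentmatic over LangGraph (higher is better)."""
    return metric["agentmatic"] / metric["LangGraph"]


def reduction_pct(pct_of_baseline):
    """Percent saved relative to a LangGraph baseline of 100%."""
    return 100.0 - pct_of_baseline


print(f"graph traversal:    {speedup(traversal_ops_per_ms):.1f}x")    # 12.5x
print(f"channel throughput: {speedup(channel_msgs_per_ms):.1f}x")     # 76.9x
print(f"memory footprint:   {reduction_pct(memory_pct_of_langgraph):.0f}% lower")      # 62% lower
print(f"cold start:         {reduction_pct(cold_start_pct_of_langgraph):.0f}% lower")  # 78% lower
```

Each derived value falls inside the corresponding range quoted in the summary.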

Methodology

Why the gap

Three sources, in order of contribution:

  1. Rust scheduler. Lock-free SPSC channels, a work-stealing scheduler, and zero-copy state diffs. The runtime does not run under the Python GIL; only the actual tool call enters Python.
  2. Pregel-style supersteps. Channel messages are batched across each barrier, so there are fewer trips through the FFI boundary.
  3. Memory layout. State snapshots are Arc<Frame> with copy-on-write semantics. LangGraph deep-copies on each step.
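
To make the third point concrete, here is a small Python sketch contrasting a deep copy per step with a snapshot that shares untouched values. It only illustrates the general idea. It is not Agentmatic's Rust implementation (which the list above describes as Arc<Frame> with copy-on-write) and not LangGraph's actual code, and the function names and state shape are assumptions made up for the example.

```python
import copy


def step_deepcopy(state, key, value):
    """Deep-copy the whole state, then apply one update."""
    snapshot = copy.deepcopy(state)
    snapshot[key] = value
    return snapshot


def step_shared(state, key, value):
    """Copy only the top-level mapping; untouched values stay shared."""
    snapshot = dict(state)
    snapshot[key] = value
    return snapshot


# Hypothetical agent state: one large field and one small, frequently updated field.
state = {"messages": ["hello"] * 1000, "scratch": {"attempt": 1}}

deep = step_deepcopy(state, "scratch", {"attempt": 2})
shared = step_shared(state, "scratch", {"attempt": 2})

# The previous snapshot is unchanged in both cases.
assert state["scratch"] == {"attempt": 1}
# The deep copy duplicated the large, untouched message list ...
assert deep["messages"] is not state["messages"]
# ... while the shared snapshot reuses the same list object.
assert shared["messages"] is state["messages"]
```

The shared version is only safe if existing values are treated as immutable once a snapshot is taken, which is the guarantee copy-on-write provides. The cost per step then scales with the size of the update rather than the size of the whole state.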

What it doesn't measure

These benchmarks measure framework overhead. The dominant cost in any real agent is still the LLM call. Your end-to-end latency improvement depends on how many graph steps you take per LLM call; high-graph-density workloads (Supervisor patterns, Map fan-out, retry loops) see the biggest wins — typically 8–12× in production.
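
A rough way to estimate the effect on your own workload is to treat each LLM call as a fixed cost plus per-step framework overhead, and to speed up only the overhead. The sketch below assumes that simple additive model; the function name and every input number are hypothetical placeholders, not measurements from the benchmark suite.

```python
def end_to_end_speedup(llm_ms, steps_per_call, overhead_ms_per_step, framework_speedup):
    """Estimate the end-to-end speedup when only framework overhead gets faster.

    llm_ms:               time spent in one LLM call (unchanged by the framework)
    steps_per_call:       graph steps executed per LLM call
    overhead_ms_per_step: baseline framework overhead per graph step
    framework_speedup:    how much faster the framework executes a step
    """
    before = llm_ms + steps_per_call * overhead_ms_per_step
    after = llm_ms + steps_per_call * overhead_ms_per_step / framework_speedup
    return before / after


# Hypothetical low-density workload: a few graph steps around a slow LLM call.
low = end_to_end_speedup(llm_ms=500, steps_per_call=5,
                         overhead_ms_per_step=1.0, framework_speedup=12.5)

# Hypothetical high-density workload: many graph steps per (fast) LLM call.
high = end_to_end_speedup(llm_ms=50, steps_per_call=200,
                          overhead_ms_per_step=1.0, framework_speedup=12.5)

print(f"low density:  {low:.2f}x")
print(f"high density: {high:.2f}x")
```

Under this model the end-to-end gain approaches the raw framework speedup only as the LLM share of total time shrinks, which is why step-heavy patterns benefit the most.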

Reproducing the numbers

git clone https://github.com/neul-labs/agentmatic
cd agentmatic/benchmarks
python bench.py --all --output bench.json
python bench.py --report bench.json
