LangGraph nailed the abstraction. State graphs are the right primitive for AI agents: explicit, checkpointable, debuggable, deterministic enough to test. The team behind it deserves enormous credit for landing on a paradigm that the rest of the industry is now catching up to.
What they didn’t do — couldn’t do, really, without a major rewrite — is ship a runtime that isn’t Python. The graph executor, the channels, the scheduler, the state diffing all run on the Python event loop, under the GIL. For most workloads that’s fine. For multi-agent workloads with high graph density (Supervisor patterns, Map fan-out, retry loops), the runtime overhead starts to dominate.
So we rebuilt the runtime in Rust. Same API. Same semantics. Same checkpoint formats. 10–15× faster.
Agentmatic in one paragraph: An open-source AI agent framework, drop-in compatible with LangGraph, that runs on a Rust engine. Install the Python SDK with pip install agentmatic; native SDKs also exist for TypeScript, Rust, Go, and Java. Circuit breakers, retry, dead-letter queues, and distributed clusters are in the open-source core, with no platform tier required. MIT licensed.
What “drop-in” actually means
Every public method in langgraph.graph.StateGraph exists in agentmatic.StateGraph with the same signature and semantics. add_node, add_edge, add_conditional_edges, compile, invoke, stream, astream, astream_events, get_state, update_state, get_state_history — all there.
# Before
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
# After
from agentmatic import StateGraph, START, END
from agentmatic.checkpoint import MemorySaver
TypedDict state with Annotated[list, operator.add] reducers works. interrupt_before and interrupt_after work. Time travel via get_state_history() works. The Memory and SQLite checkpoint formats are wire-compatible: you can read an existing LangGraph checkpoint in Agentmatic with no migration.
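For readers who have not used reducers, here is a minimal, library-free sketch of what an Annotated reducer does when a node returns a partial update. The apply_update helper is hypothetical and written for illustration only; it is not the LangGraph or Agentmatic implementation, just the merge rule the annotation describes.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class State(TypedDict):
    messages: Annotated[list, operator.add]  # reducer: concatenate lists
    step: int                                # no reducer: last write wins


def apply_update(schema: type, state: dict, update: dict) -> dict:
    """Merge a node's partial update into state, honoring Annotated reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] hints expose their extra arguments via __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = next((m for m in metadata if callable(m)), None)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["hi"], "step": 0}
state = apply_update(State, state, {"messages": ["hello"], "step": 1})
print(state)  # {'messages': ['hi', 'hello'], 'step': 1}
```

The messages key accumulates because it carries a reducer; step is simply overwritten.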
Why it’s faster
Three sources, in order of contribution:
1. The Rust scheduler. Lock-free SPSC channels between graph nodes. Work-stealing executor. Zero-copy Arc<Frame> state snapshots with copy-on-write mutation. The Python event loop is no longer the bottleneck — Python only runs when a tool actually executes Python code.
2. Pregel supersteps. Channel messages batch across each barrier. Fewer trips through the FFI boundary. Better cache locality.
3. Memory layout. LangGraph deep-copies state on each step. We don’t.
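To make the superstep idea in point 2 concrete, here is a toy bulk-synchronous loop in plain Python. It is a sketch of the general Pregel pattern only, not the Agentmatic scheduler: run_supersteps and the example graph are hypothetical, and merging writes with last-write-wins at the barrier is a simplifying assumption.

```python
from typing import Callable

# A node reads the current state and returns a partial update (possibly empty).
Node = Callable[[dict], dict]


def run_supersteps(nodes: dict[str, Node], edges: dict[str, list[str]],
                   state: dict, start: list[str], max_steps: int = 10) -> tuple[dict, int]:
    """Run nodes in synchronized rounds; writes become visible only at the barrier."""
    active = list(start)
    steps = 0
    while active and steps < max_steps:
        # Every active node sees the same snapshot from the previous barrier.
        pending = [(name, nodes[name](state)) for name in active]
        # Barrier: apply all buffered writes in one batch.
        new_state = dict(state)
        for _, update in pending:
            new_state.update(update)
        state = new_state
        # Next round: successors of the nodes that just ran, de-duplicated in order.
        active = list(dict.fromkeys(n for name, _ in pending for n in edges.get(name, [])))
        steps += 1
    return state, steps


nodes = {
    "plan": lambda s: {"plan": "done"},
    "search": lambda s: {"search": s["plan"] + ":s"},
    "calc": lambda s: {"calc": s["plan"] + ":c"},
    "merge": lambda s: {"answer": [s["search"], s["calc"]]},
}
edges = {"plan": ["search", "calc"], "search": ["merge"], "calc": ["merge"]}

final, steps = run_supersteps(nodes, edges, {}, ["plan"])
print(final["answer"], steps)  # ['done:s', 'done:c'] 3
```

The fan-out to search and calc costs one round rather than two, and their writes cross the barrier together. That batching is what the list above credits for fewer boundary crossings.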
The benchmark numbers (see /benchmarks for methodology):
| | LangGraph | Agentmatic | Ratio |
|---|---|---|---|
| Graph traversal | baseline | 10–15× faster | ~12× |
| Channel throughput | baseline | 70–80× higher | ~75× |
| Memory footprint | baseline | 20–50% of baseline | ~3× lower |
| Cold start | baseline | 22% of baseline | ~4.5× faster |
End-to-end agent latency depends on graph density. Single-LLM-call agents see modest wins (the LLM call dominates). Multi-agent supervisors with retry loops see 8–12× p95 drops. The fintech support-bot case study is a concrete example.
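One way to reason about why graph density matters is Amdahl's law: only the share of latency spent in the runtime gets faster. The sketch below applies it using the ~12× traversal ratio quoted above. The runtime fractions are hypothetical placeholders, not measurements, and the model describes average latency rather than p95.

```python
def end_to_end_speedup(runtime_fraction: float, runtime_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the runtime share gets faster."""
    if not 0.0 <= runtime_fraction <= 1.0:
        raise ValueError("runtime_fraction must be between 0 and 1")
    if runtime_speedup <= 0:
        raise ValueError("runtime_speedup must be positive")
    return 1.0 / ((1.0 - runtime_fraction) + runtime_fraction / runtime_speedup)


# Hypothetical splits of latency between runtime overhead and LLM/tool time.
for label, fraction in [("single LLM call", 0.05), ("dense supervisor graph", 0.95)]:
    print(f"{label}: {end_to_end_speedup(fraction, 12.0):.2f}x")
```

With these assumed splits, the single-call agent barely moves while the dense graph approaches the raw runtime ratio. Measure the fraction on your own workload before relying on either number.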
What you get that LangGraph reserves for the paid Platform
This is the part that surprises people. LangGraph the OSS library is great. But circuit breakers, retry with backoff, dead-letter queues, and distributed execution are all in the LangGraph Platform tier — the hosted SaaS. If you want those primitives self-hosted, you write them yourself or pay.
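For a sense of what writing them yourself involves, here is a minimal retry wrapper with exponential backoff, jitter, and a dead-letter list. Everything in it (backoff_delay, call_with_retry, the in-memory list standing in for a queue) is a hypothetical illustration, not code from LangGraph or Agentmatic.

```python
import random
import time
from typing import Any, Callable, Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  jitter: bool = True, rng: Optional[random.Random] = None) -> float:
    """Delay before retry number `attempt` (1-based): base * 2^(attempt-1), capped."""
    delay = min(cap, base * 2 ** (attempt - 1))
    if jitter:
        # "Full jitter": pick uniformly in [0, delay] so retries do not synchronize.
        delay = (rng or random).uniform(0.0, delay)
    return delay


def call_with_retry(fn: Callable[[], Any], max_attempts: int = 3,
                    dead_letters: Optional[list] = None,
                    sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call fn up to max_attempts times; record the final failure in dead_letters."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                if dead_letters is not None:
                    dead_letters.append({"error": repr(exc), "attempts": attempt})
                raise
            sleep(backoff_delay(attempt))
```

A production version would also need to decide which exceptions are retryable and to persist dead letters somewhere durable, which is where the real work starts.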
We put them in the open-source core:
agent = (Agent.builder("production")
    .llm(OpenAI())
    .tools([search, calculator])
    .checkpoint(S3Saver(bucket="agents"))
    .circuit_breaker("openai", failure_threshold=5, cooldown_seconds=30)
    .retry_policy(RetryPolicy.exponential(max_attempts=3, jitter=True))
    .dead_letter_queue(DeadLetterQueue.postgres(POSTGRES_URL))
    .interrupt_before(["transfer_funds"])
    .build())
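If the failure_threshold and cooldown_seconds parameters are unfamiliar, the standard circuit-breaker state machine they usually refer to can be sketched in a few lines. This is a generic illustration under that assumption, with a hypothetical CircuitBreaker class; it does not show how the Rust engine implements it.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, cooldown_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], Any]) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown_seconds:
                raise CircuitOpenError("circuit open; call rejected")
            # Cooldown elapsed: half-open, so one trial call is allowed through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # A success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

After failure_threshold consecutive failures the breaker rejects calls immediately for cooldown_seconds, then lets a single trial call decide whether to close again.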
Distributed clusters too:
config = ClusterConfig(
    topology="coordinator-worker",
    transport="grpc",
    workers=["worker-1:9090", "worker-2:9090"],
    load_balancing="least-loaded",
)
agent = Agent.builder("distributed").cluster(config).build()
All MIT. All self-hosted. All in the open-source core.
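As a rough picture of what a "least-loaded" policy means, the sketch below routes each task to the worker with the fewest in-flight tasks. LeastLoadedBalancer is a hypothetical stand-in written for this post's example addresses; the real coordinator presumably tracks load over gRPC rather than in a local dict.

```python
from dataclasses import dataclass, field


@dataclass
class LeastLoadedBalancer:
    """Route each task to the worker with the fewest in-flight tasks."""
    workers: list[str]
    in_flight: dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for worker in self.workers:
            self.in_flight.setdefault(worker, 0)

    def acquire(self) -> str:
        # min() returns the first minimum, so ties go to the earliest-listed worker.
        worker = min(self.workers, key=lambda w: self.in_flight[w])
        self.in_flight[worker] += 1
        return worker

    def release(self, worker: str) -> None:
        self.in_flight[worker] = max(0, self.in_flight[worker] - 1)


balancer = LeastLoadedBalancer(["worker-1:9090", "worker-2:9090"])
first = balancer.acquire()   # worker-1:9090 (tie, first listed)
second = balancer.acquire()  # worker-2:9090
balancer.release(second)
third = balancer.acquire()   # worker-2:9090 again, since it is now idle
print(first, second, third)
```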
Five languages, same engine
The Python SDK is a PyO3 wheel. The TypeScript SDK is napi-rs bindings. Go uses CGO over the agentmatic-ffi C ABI. Java uses JNI over the same. Rust is direct. One engine. Five SDKs. Same correctness.
// Rust
let agent = Agent::builder("math").llm(OpenAI::from_env()?).tools(vec![calc()]).build()?;
# Python
agent = Agent.build("math", llm=OpenAI(), tools=[calculator])
// TypeScript
const agent = Agent.builder('math').llm(new OpenAI()).tools([calculator]).build();
// Go
agent, _ := agentmatic.NewAgent("math", agentmatic.WithLLM(openai.FromEnv()), agentmatic.WithTools(calc))
// Java
Agent agent = Agent.builder("math").llm(new OpenAI()).tools(List.of(calc)).build();
This matters more than it sounds. Polyglot teams stop arguing about “which framework should we standardize on.” Same engine in every language.
Where we don’t compete with LangGraph
- LangSmith. It’s an excellent product. Use it. Wrap an Agentmatic agent with as_langchain_runnable() and your traces continue to land.
- LangChain integrations. 1,000+ tools, embeddings, vector stores. Bridge them all via from_langchain_tools().
- LCEL. Same: as_langchain_runnable() makes Agentmatic agents composable with LCEL chains.
- LangServe. Same. Wrap and deploy.
We don’t try to replace LangChain. We replace the runtime and give you primitives the runtime should have had.
How to actually try this
- pip install agentmatic.
- Change one import. Re-run your tests. They should pass.
- Read /benchmarks for the exact numbers on your workload shape.
- If you have a multi-agent supervisor, expect 5–10× p95 latency drops.
- Star the repo if you like what you see.
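If you would rather run the same test suite against both runtimes before committing to the import change, one option is a small backend selector. This is a sketch under assumptions: pick_backend and the AGENT_BACKEND environment variable are hypothetical names invented here, and the demo line uses stdlib module names so it runs anywhere.

```python
import importlib.util
import os
from typing import Optional, Sequence


def pick_backend(candidates: Sequence[str], override: Optional[str] = None) -> str:
    """Return the first installed top-level package, or the override if installed."""
    if override:
        candidates = [override]
    for name in candidates:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    raise ImportError(f"none of {list(candidates)} is installed")


# Demo with stdlib names; the first candidate is deliberately not installed.
print(pick_backend(["no_such_package_xyz", "json"]))  # json

# In a project you might write (AGENT_BACKEND is a hypothetical variable):
# backend = pick_backend(["agentmatic", "langgraph"], os.environ.get("AGENT_BACKEND"))
```

Your test setup can then import StateGraph from whichever package name comes back, so a single environment variable flips the whole suite between runtimes.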
The 5-minute migration guide has the full step-by-step.