Consumer fintech · 120 engineers

Cut support p95 latency from 9.2 s to 1.4 s — same prompts, same tools, same model.

A US consumer fintech replaced their LangGraph supervisor with Agentmatic. The graph, prompts, and tools stayed identical; only the runtime changed. Latency dropped 6.5×.

Customer support agent · Published May 15, 2026

Outcomes

Support p95 latency 9.2 s → 1.4 s on the same OpenAI model.
Compute spend on the support service down 38% (smaller event-loop, fewer worker pods).
Zero customer-facing regressions across 4,200 conversations in the first week.
Migration took one engineer 90 minutes total.

The setup

A growing fintech with a customer support agent built on LangGraph: supervisor routes to docs / billing / escalation specialists; each specialist is a ReAct agent with 4–8 tools; PostgresSaver for HITL on refunds.

The graph was already good — clean separation, well-tuned prompts, refunds gated behind a human approval interrupt. The problem was responsiveness. Average p95 was 9.2 seconds; users churned mid-conversation.

What they changed

One import.

# Before
from langgraph.graph import StateGraph, START, END

# After
from agentmatic import StateGraph, START, END

Then pip install agentmatic, replaced langgraph in requirements.txt, ran their existing test suite — all 312 tests passed without modification. Deployed behind a 5% canary.

What they saw

Within a week the canary was at 100%. The numbers from the first 4,200 conversations:

Metric	Before (LangGraph)	After (Agentmatic)	Δ
p50 latency	2.8 s	0.6 s	4.7×
p95 latency	9.2 s	1.4 s	6.5×
p99 latency	14.1 s	2.7 s	5.2×
Worker pods	18	11	-39%
Daily compute spend	$1,420	$880	-38%

The end-to-end latency wins come from the multi-agent shape — a supervisor + 3 specialists is high-graph-density, so the Rust runtime’s framework-overhead reduction shows up directly.

What they didn’t change

Same OpenAI model (gpt-4o).
Same prompts (zero edits).
Same tools (zero edits).
Same Postgres checkpoint store (wire-compatible).
Same LangSmith tracing (wrapped via as_langchain_runnable()).

Their take

“It felt suspicious. We’ve never had a perf optimization that didn’t require a redesign. We expected to debug subtle behavior differences for a week — but the eval suite passed and the canary numbers were clean within hours.”

— Staff engineer, platform team

Why they’re featured

Two reasons: (1) the migration was a single file change, (2) the win was load-bearing for the business — sub-2-second support response is a churn-prevention threshold.

This story is anonymized at the customer's request. Industry, scale, and workflow details are accurate; identifying details have been changed.