Consumer fintech · 120 engineers

Cut support p95 latency from 9.2 s to 1.4 s — same prompts, same tools, same model.

A US consumer fintech replaced their LangGraph supervisor with Agentmatic. The graph, prompts, and tools stayed identical; only the runtime changed. Latency dropped 6.5×.

Customer support agent · Published May 15, 2026

Outcomes

  • Support p95 latency 9.2 s → 1.4 s on the same OpenAI model.
  • Compute spend on the support service down 38% (smaller event-loop, fewer worker pods).
  • Zero customer-facing regressions across 4,200 conversations in the first week.
  • Migration took one engineer 90 minutes total.

The setup

A growing fintech with a customer support agent built on LangGraph: supervisor routes to docs / billing / escalation specialists; each specialist is a ReAct agent with 4–8 tools; PostgresSaver for HITL on refunds.

The graph was already good — clean separation, well-tuned prompts, refunds gated behind a human approval interrupt. The problem was responsiveness. Average p95 was 9.2 seconds; users churned mid-conversation.

What they changed

One import.

# Before
from langgraph.graph import StateGraph, START, END

# After
from agentmatic import StateGraph, START, END

Then pip install agentmatic, replaced langgraph in requirements.txt, ran their existing test suite — all 312 tests passed without modification. Deployed behind a 5% canary.

What they saw

Within a week the canary was at 100%. The numbers from the first 4,200 conversations:

MetricBefore (LangGraph)After (Agentmatic)Δ
p50 latency2.8 s0.6 s4.7×
p95 latency9.2 s1.4 s6.5×
p99 latency14.1 s2.7 s5.2×
Worker pods1811-39%
Daily compute spend$1,420$880-38%

The end-to-end latency wins come from the multi-agent shape — a supervisor + 3 specialists is high-graph-density, so the Rust runtime’s framework-overhead reduction shows up directly.

What they didn’t change

  • Same OpenAI model (gpt-4o).
  • Same prompts (zero edits).
  • Same tools (zero edits).
  • Same Postgres checkpoint store (wire-compatible).
  • Same LangSmith tracing (wrapped via as_langchain_runnable()).

Their take

“It felt suspicious. We’ve never had a perf optimization that didn’t require a redesign. We expected to debug subtle behavior differences for a week — but the eval suite passed and the canary numbers were clean within hours.”

— Staff engineer, platform team

Two reasons: (1) the migration was a single file change, (2) the win was load-bearing for the business — sub-2-second support response is a churn-prevention threshold.

This story is anonymized at the customer's request. Industry, scale, and workflow details are accurate; identifying details have been changed.

Ship your next agent in minutes, not weeks.

MIT licensed. Drop-in for LangGraph. Native SDKs in 5 languages. Battle-tested resilience primitives in the box.