pregel pattern

Doc pipeline — custom Pregel graph with map fan-out

A high-throughput document processing pipeline that fans out to N parallel workers, with no LLM. Shows raw Pregel API in Rust — extract, dedup, embed, persist.

Highlights

  • Custom StateGraph with a Map fan-out node that processes 100s of docs in parallel.
  • No LLM — proves the Pregel runtime is useful beyond agents.
  • Tokio-async, channel-driven, runs at ~5,000 docs/sec on an M3 Max.
  • Demonstrates SPSC channels, conditional edges, and Pregel supersteps.

What this shows

Not every graph needs an LLM. Sometimes you just want a typed, parallel, checkpointable pipeline. This example uses the raw Pregel API to build a document-processing pipeline that fans out to N workers, dedupes by content hash, embeds, and persists — without ever calling a model.

Architecture

   docs (stream)

   extract ──┐
              ├──▶ map (1 → N parallel) ──▶ join ──▶ dedup ──▶ persist
   stream  ──┘

Why Pregel

Pregel (the bulk-synchronous parallel model) gives you:

  • Deterministic parallelism — every superstep is a barrier; the next step sees all messages from the previous one.
  • Free checkpointing — every superstep boundary is a checkpoint.
  • Composable channels — typed message passing between nodes, with reducers for aggregation.

If you’ve ever wanted Apache Beam’s semantics in a single Rust binary, this is that.

Key snippet

use agentmatic::prelude::*;

let graph = StateGraph::<DocState>::new()
    .add_node("extract", extract_node)
    .add_node("map", map_node.with_parallelism(16))
    .add_node("dedup", dedup_node)
    .add_node("persist", persist_node)
    .add_edge("extract", "map")
    .add_edge("map", "dedup")
    .add_edge("dedup", "persist")
    .compile()?;

let result = graph.invoke(initial_state).await?;

Benchmarks

On a single M3 Max (12 cores):

MetricValue
Throughput~5,000 docs/sec
Memory~120 MB steady-state
Cold start42 ms

Ship your next agent in minutes, not weeks.

MIT licensed. Drop-in for LangGraph. Native SDKs in 5 languages. Battle-tested resilience primitives in the box.