
Human-in-the-loop approval before risky tool calls

Pause the graph before a sensitive tool fires, surface the proposed call to a human, then resume with their decision. Typical uses include refunds, production database writes, and fund transfers.

4 min read · Published May 5, 2026 · Languages: python, typescript, rust

The pattern

When an LLM is about to call a tool that’s hard to undo (refund money, drop a table, send an email), pause. Surface the proposed call to a human. Let them approve, modify, or reject. Resume.

HITL in one line: interrupt_before(["dangerous_tool"]) pauses the graph at the named tool; the run resumes when you reinvoke with the same thread_id.
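To make the pause-and-resume semantics concrete, here is a toy, framework-free sketch. ToyRunner, its invoke method, and the in-memory _paused dict are hypothetical names made up for illustration. They are not the library's API and only mimic the behavior described above: a call to a named tool is recorded instead of executed, and a later invoke with no input and the same thread_id picks it up.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToyRunner:
    """Toy stand-in for a graph run; NOT the library's API."""

    tools: dict[str, Callable[..., Any]]
    interrupt_before: set[str]
    # thread_id -> pending (tool_name, args); a hypothetical in-memory checkpoint
    _paused: dict[str, tuple[str, dict]] = field(default_factory=dict)

    def invoke(self, call: Optional[tuple[str, dict]], thread_id: str) -> dict:
        if call is None:
            # Resume: pick up the call that was pending for this thread.
            name, args = self._paused.pop(thread_id)
        else:
            name, args = call
            if name in self.interrupt_before:
                # Pause: record the proposed call instead of executing it.
                self._paused[thread_id] = (name, args)
                return {"next": (name,), "result": None}
        return {"next": (), "result": self.tools[name](**args)}


runner = ToyRunner(
    tools={"transfer_funds": lambda to, amount: f"sent {amount} to {to}"},
    interrupt_before={"transfer_funds"},
)

# First invoke pauses before the tool runs.
paused = runner.invoke(("transfer_funds", {"to": "acct-1", "amount": 250}), "case-42")
# Re-invoking with None and the same thread_id resumes the pending call.
resumed = runner.invoke(None, "case-42")
```

In this sketch, resuming a thread_id that has nothing pending raises KeyError; a real checkpointer would presumably report that case more gracefully.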

Setup

agent = (Agent.builder("payments")
    .llm(OpenAI())
    .tools([read_account, transfer_funds])
    .checkpoint(PostgresSaver.from_env())
    .interrupt_before(["transfer_funds"])
    .build())

Run, inspect, approve

# Run pauses at the first transfer_funds call.
state = await agent.ainvoke(
    {"messages": [HumanMessage("Refund order #42 for $250.")]},
    config={"configurable": {"thread_id": "case-42"}},
)

print(state.next)  # ('transfer_funds',)
proposed = state.checkpoint.next_tool_call
print(proposed.args)  # {"to": "...", "amount": 250}

# Human approves → resume.
await agent.ainvoke(None, config={"configurable": {"thread_id": "case-42"}})

# Or the human asks for a change: append their instruction to the thread state, then resume.
await agent.update_state(
    {"configurable": {"thread_id": "case-42"}},
    {"messages": [HumanMessage("Approved, but amount should be 245.")]},
)
await agent.ainvoke(None, config={"configurable": {"thread_id": "case-42"}})
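The snippet above shows approve and modify; the pattern also allows reject. One way to keep the three outcomes explicit in your own review code is a small pure function that maps a reviewer decision to the args that should actually run. This is a minimal sketch: Decision and resolve_call are hypothetical helpers, not part of the library, and how a rejection is fed back into the run is left out because the source does not specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """Hypothetical reviewer decision; not part of the library."""

    action: str                      # "approve", "modify", or "reject"
    new_args: Optional[dict] = None  # only used with "modify"


def resolve_call(proposed_args: dict, decision: Decision) -> Optional[dict]:
    """Return the args to execute with, or None if the call is rejected."""
    if decision.action == "approve":
        return dict(proposed_args)
    if decision.action == "modify":
        # Overlay the reviewer's changes on the proposed args.
        return {**proposed_args, **(decision.new_args or {})}
    if decision.action == "reject":
        return None
    raise ValueError(f"unknown action: {decision.action!r}")


proposed = {"to": "acct-1", "amount": 250}
approved = resolve_call(proposed, Decision("approve"))
modified = resolve_call(proposed, Decision("modify", {"amount": 245}))
rejected = resolve_call(proposed, Decision("reject"))
```

Returning a new dict rather than mutating the proposed args keeps the original proposal available for an audit trail.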

Conditional interrupts

Only pause when the args meet a threshold:

agent = (Agent.builder("payments")
    # .llm(...), .tools(...), and .checkpoint(...) as in Setup
    .interrupt_before_when(
        tool="transfer_funds",
        predicate=lambda args: args["amount"] >= 100,
    )
    .build())
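The lambda above raises KeyError if the model omits amount, and it may misbehave if the value arrives as a string. A named predicate that fails closed (pauses when in doubt) could look like the sketch below. needs_review and THRESHOLD are hypothetical names, and whether the predicate receives a plain mapping of tool args is an assumption based on the lambda shown above.

```python
from decimal import Decimal, InvalidOperation
from typing import Any, Mapping

THRESHOLD = Decimal("100")  # mirrors the >= 100 rule in the lambda


def needs_review(args: Mapping[str, Any]) -> bool:
    """Return True when the call should pause for a human."""
    try:
        return Decimal(str(args["amount"])) >= THRESHOLD
    except (KeyError, InvalidOperation):
        # Missing, unparseable, or NaN amount: fail closed and ask a human.
        return True
```

Under the stated assumption it would be passed as predicate=needs_review in place of the lambda.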

After-tool interrupts

Pause after a tool fires (useful for showing the user what changed):

.interrupt_after(["create_user"])

Persistence

HITL requires a checkpointer: each interrupt persists the run state to the backend, so a different process can resume the run, even days later. The in-memory backend works for tests, but its state is lost when the process exits; use SQLite, Postgres, Redis, or S3 in production.
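To illustrate why durability matters, here is a toy store built on Python's standard sqlite3 module. It is not the library's checkpointer; ToyCheckpointStore and its save and load methods are hypothetical, and the stored state shape is made up. The point is only that state written by one connection can be read back by a fresh one, as a later process would do.

```python
import json
import os
import sqlite3
import tempfile
from typing import Optional


class ToyCheckpointStore:
    """Toy durable store keyed by thread_id; NOT the library's checkpointer."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self.conn.commit()

    def save(self, thread_id: str, state: dict) -> None:
        # Upsert so each thread keeps only its latest paused state.
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def load(self, thread_id: str) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


path = os.path.join(tempfile.mkdtemp(), "checkpoints.db")

# "Process A" pauses and persists the pending call, then goes away.
writer = ToyCheckpointStore(path)
writer.save("case-42", {"next": ["transfer_funds"], "args": {"amount": 250}})
writer.conn.close()

# "Process B" opens the same file later and finds the paused state.
reader = ToyCheckpointStore(path)
restored = reader.load("case-42")
```

An in-memory dict would pass the same calls within a single process but would have nothing to return once that process exits.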