ReAct pattern

Research agent — sandboxed Rhai scripts as LLM tools

An LLM-driven research agent that writes and executes Rhai scripts in a deterministic sandbox, so the model can use code without being handed the keys to the kingdom.

Highlights

  • LLM writes Rhai (embeddable Rust scripting language) to compute, transform, plot.
  • Sandboxed: no filesystem, no network, no subprocess — only the functions you whitelist.
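  • Deterministic: same input → same output → cacheable, provided no whitelisted function reads a clock, a random source, or other external state.
  • Demonstrates a safer alternative to running LLM output through Python eval() or exec().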
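
The "deterministic, therefore cacheable" point can be made concrete with a small memoizing wrapper. This is a minimal sketch using only the Rust standard library; `ScriptCache` and `get_or_eval` are hypothetical names, not part of Rhai or Agentmatic, and the evaluator is passed in as a closure so the cache stays independent of the engine. It assumes the evaluator really is deterministic.

```rust
use std::collections::HashMap;

/// Memoizes script results by script text. Only sound when evaluation is
/// deterministic, i.e. the same script always yields the same result.
/// (Hypothetical helper for illustration.)
pub struct ScriptCache {
    results: HashMap<String, String>,
    /// Number of times the evaluator actually had to run.
    pub misses: usize,
}

impl ScriptCache {
    pub fn new() -> Self {
        Self { results: HashMap::new(), misses: 0 }
    }

    /// Returns the cached result for `script`, calling `eval` only on a miss.
    /// Errors are not cached, so a failing script is re-evaluated next time.
    pub fn get_or_eval<F>(&mut self, script: &str, eval: F) -> Result<String, String>
    where
        F: FnOnce(&str) -> Result<String, String>,
    {
        if let Some(hit) = self.results.get(script) {
            return Ok(hit.clone());
        }
        self.misses += 1;
        let value = eval(script)?;
        self.results.insert(script.to_owned(), value.clone());
        Ok(value)
    }
}
```

In an agent, the closure would wrap the sandboxed evaluator, so a repeated tool call with an identical script returns immediately. Keying on the raw script text is the simplest choice; if scripts also receive input data, that data would need to be part of the key.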

What this shows

LLMs love writing code. The unsafe pattern is exec(llm_output) in Python — you just gave the model arbitrary code execution on your machine. The safe pattern is to give it a sandboxed scripting language with a whitelisted standard library. This example uses Rhai (an embeddable Rust scripting language) wrapped as an Agentmatic tool.

Architecture

   Agent (ReAct)
        ↓
   tool: rhai_eval(script: String) -> Result<String>
        ↓
   Rhai engine with whitelisted std (no fs, no net, no exec)
        ↓
   return value, rendered as text, back to the model

Key snippet

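use agentmatic::prelude::*;
use rhai::{module_resolvers::DummyModuleResolver, Dynamic, Engine};

#[tool]
fn rhai_eval(script: String) -> anyhow::Result<String> {
    let mut engine = Engine::new();

    // Rhai has no file, network, or process APIs of its own, but the default
    // engine can still load other script files through `import`. Block that.
    engine.set_module_resolver(DummyModuleResolver::new());

    // Resource limits so a runaway script cannot hang the host.
    engine.set_max_operations(100_000);
    engine.set_max_call_levels(8);

    // Whitelist host functions here with `engine.register_fn(...)`.

    // Rhai errors are not `Send + Sync` unless the `sync` feature is enabled,
    // so convert them to a message instead of relying on a bare `?`.
    let result = engine
        .eval::<Dynamic>(&script)
        .map_err(|e| anyhow::anyhow!("Rhai error: {e}"))?;
    Ok(result.to_string())
}

// Inside a function that returns `anyhow::Result<_>`:
let agent = Agent::builder("researcher")
    .llm(OpenAI::from_env()?)
    .tools(vec![rhai_eval_tool()])
    .build()?;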

Why not Python eval

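  • eval(str) in Python is arbitrary code execution. Restricting globals does not fix it: even with empty builtins, object introspection can reach dangerous classes.
  • subprocess.run is not a sandbox either: the child process has the same filesystem and network access as its parent.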
  • A sandboxed scripting language has a hard, auditable surface area. You decide what’s callable.
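
The "you decide what's callable" idea can be shown without any scripting engine at all. The sketch below is a language-agnostic illustration in plain Rust, not Rhai's API: `HostFn`, `allowlist`, and `call` are hypothetical names, and the two registered functions are arbitrary examples. The point is that the whole callable surface lives in one table, and any name outside it is rejected.

```rust
use std::collections::HashMap;

/// A host function the model may call: numbers in, one number out.
/// (Hypothetical signature chosen to keep the example small.)
type HostFn = fn(&[f64]) -> Result<f64, String>;

fn sum(args: &[f64]) -> Result<f64, String> {
    Ok(args.iter().sum())
}

fn mean(args: &[f64]) -> Result<f64, String> {
    if args.is_empty() {
        return Err("mean of an empty list".to_string());
    }
    Ok(args.iter().sum::<f64>() / args.len() as f64)
}

/// The entire callable surface: auditing the sandbox means reading this table.
pub fn allowlist() -> HashMap<&'static str, HostFn> {
    let mut table: HashMap<&'static str, HostFn> = HashMap::new();
    table.insert("sum", sum);
    table.insert("mean", mean);
    table
}

/// Dispatches a call by name; anything not in the table is refused.
pub fn call(
    table: &HashMap<&'static str, HostFn>,
    name: &str,
    args: &[f64],
) -> Result<f64, String> {
    match table.get(name) {
        Some(f) => f(args),
        None => Err(format!("function `{name}` is not whitelisted")),
    }
}
```

With an embedded engine the table is replaced by the engine's function registry, but the review question stays the same: what is registered, and does any registered function touch the filesystem, network, or process table?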

Use cases

  • Math, statistics, transformations on data the model needs to manipulate.
  • Plotting (whitelist a chart() builder that returns a serialized chart spec).
  • DSLs — let the model emit a config / policy / template in your DSL.
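
For the plotting use case, one possible shape for a whitelisted chart() builder is sketched below. Everything here is an assumption for illustration: the `ChartSpec` fields, the accepted chart kinds, and the JSON layout are invented, and the JSON is written by hand only to keep the example free of dependencies. The script builds a description of the chart; the host decides how, and whether, to render it.

```rust
/// A chart described as plain data. (Hypothetical type.)
#[derive(Debug, Clone, PartialEq)]
pub struct ChartSpec {
    pub kind: String,
    pub title: String,
    pub points: Vec<(f64, f64)>,
}

/// Hypothetical `chart()` builder a host could expose to scripts.
/// It validates its input instead of trusting the model.
pub fn chart(kind: &str, title: &str, points: &[(f64, f64)]) -> Result<ChartSpec, String> {
    match kind {
        "line" | "bar" | "scatter" => {}
        other => return Err(format!("unsupported chart kind: {other}")),
    }
    if points.iter().any(|(x, y)| !x.is_finite() || !y.is_finite()) {
        return Err("chart points must be finite numbers".to_string());
    }
    Ok(ChartSpec {
        kind: kind.to_string(),
        title: title.to_string(),
        points: points.to_vec(),
    })
}

/// Escapes quotes, backslashes, and control characters for a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl ChartSpec {
    /// Serializes the spec to a compact JSON object.
    pub fn to_json(&self) -> String {
        let points: Vec<String> = self
            .points
            .iter()
            .map(|(x, y)| format!("[{x},{y}]"))
            .collect();
        format!(
            "{{\"kind\":\"{}\",\"title\":\"{}\",\"points\":[{}]}}",
            json_escape(&self.kind),
            json_escape(&self.title),
            points.join(",")
        )
    }
}
```

Because the builder returns data rather than drawing anything, it needs no filesystem or display access, which keeps it compatible with the sandbox constraints above.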
