Inputs, run_script, and Sub-Execution

This guide covers three closely-related features:

  1. inputs: — inject per-call Python variables before code runs.
  2. run_script — call another script from Python and receive its last-expression value.
  3. HostContext.subExecute — drive sub-executions from your own host functions, safely.

All three were introduced together because they solve the same problem from different angles: how to compose multiple Monty executions when the bridge serialises one run at a time.

Prerequisites: Host Functions — Intermediate.


inputs: — per-call variable injection

MontyRuntime.execute takes an optional Map<String, Object?> inputs. Each entry is converted to a Python literal and prepended to the code as an assignment statement, so the value lands as a top-level Python name before user code runs:

final r = await runtime
    .execute('f"{greeting}, {name}!"', inputs: {
      'greeting': 'hello',
      'name': 'Alice',
    })
    .result;
// r.value.dartValue == 'hello, Alice!'
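
The prepend step can be pictured with a small plain-Python model. This is a sketch of the idea only, not the library's encoder: to_literal and prepend_inputs are hypothetical names, and the Dart-side distinction between null and MontyNone() has no equivalent here, so Python None simply renders as None.

```python
import math


def to_literal(value):
    """Render a supported value as Python source text (illustrative encoder)."""
    if value is None or isinstance(value, (bool, int, str)):
        return repr(value)
    if isinstance(value, float):
        # nan and inf have no literal form, so spell them as float(...) calls.
        if math.isnan(value):
            return 'float("nan")'
        if math.isinf(value):
            return 'float("inf")' if value > 0 else 'float("-inf")'
        return repr(value)
    if isinstance(value, list):
        return "[" + ", ".join(to_literal(v) for v in value) + "]"
    if isinstance(value, dict):
        items = (f"{to_literal(k)}: {to_literal(v)}" for k, v in value.items())
        return "{" + ", ".join(items) + "}"
    raise TypeError(f"unsupported input type: {type(value).__name__}")


def prepend_inputs(code, inputs):
    """Put one assignment per input in front of the user code."""
    lines = [f"{name} = {to_literal(value)}" for name, value in inputs.items()]
    return "\n".join(lines + [code])


program = prepend_inputs(
    'f"{greeting}, {name}!"',
    {"greeting": "hello", "name": "Alice"},
)
print(program)
```

The printed program has two assignment lines followed by the original f-string, which is why the names are already bound when user code starts.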

The same parameter rides through buildRunScriptFunction's sub-script invocation and through HostContext.subExecute.

Convertible types

bool, int, double (incl. nan / inf), String, List, Map, and MontyNone(). Lists and maps are converted recursively.

Two distinct error mechanisms

Both are synchronous — the script never starts when the encoder rejects an input.

| Bad input | Throws | Why |
|---|---|---|
| Dart null value | MontyInternalError | Use MontyNone() for Python None. MontyInternalError extends Error (not Exception), so it can't be silently swallowed by on Exception handlers. |
| Unsupported type (DateTime, custom class, etc.) | ArgumentError | Convert to a supported type first. |

runtime.execute('x', inputs: {'x': null});           // → MontyInternalError
runtime.execute('x', inputs: {'x': DateTime.now()}); // → ArgumentError
runtime.execute('x is None', inputs: {'x': const MontyNone()}); // OK

Shared mode persistence

In shared mode (the default), injected names land in the Python interpreter's globals — so they persist into subsequent calls on the same runtime. Sandbox mode (MontyRuntime(sandbox: true)) starts each call from a fresh interpreter, so injected names disappear after the call returns.
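
The difference between the two modes can be modelled in plain Python with namespaces. This is a sketch under the assumption that shared mode behaves like one reused globals dict and sandbox mode like a new dict per call; the execute helper below is hypothetical and is not MontyRuntime.execute.

```python
def execute(code, inputs, interpreter_globals):
    """Inject inputs as top-level names, then evaluate one expression."""
    interpreter_globals.update(inputs)
    return eval(code, interpreter_globals)


# Shared mode: one namespace is reused across calls, so names persist.
shared = {}
first = execute("n * 2", {"n": 21}, shared)
second = execute("n + 1", {}, shared)  # n is still defined from the first call

# Sandbox mode: each call gets a fresh namespace, so names vanish.
execute("n * 2", {"n": 21}, {})
try:
    execute("n + 1", {}, {})
    leaked = True
except NameError:
    leaked = False

print(first, second, leaked)
```

In shared mode, a later call can therefore read or accidentally depend on a name injected by an earlier one, so it is worth re-injecting every input a script needs.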


run_script — call another script from Python

buildRunScriptFunction(readFile) builds a HostFunction that lets Python execute another script file by path:

final vfs = <String, String>{
  'greet.py': 'f"hello, {name}!"',
  'double.py': 'n * 2',
};

final runtime = MontyRuntime()
  ..register(buildRunScriptFunction((path) async {
    final code = vfs[path];
    if (code == null) throw Exception('file not found: $path');
    return code;
  }));

From Python:

greeting = run_script('greet.py', inputs={'name': 'Alice'})
# 'hello, Alice!'

doubled = run_script('double.py', inputs={'n': 21})
# 42

The readFile callback is your wiring — point it at a dart:io read, a VFS, an HTTP fetch, or whatever script source you have. It runs synchronously from Python's perspective; awaitable Dart work in readFile is fine.

Return-value semantics

run_script returns whatever the sub-script's last expression evaluates to:

| Sub-script content | run_script(...) returns |
|---|---|
| n * 2 (last expression) | the value (42) |
| x = 99 (statement only) | None |
| x = 7\nx * 6 (statement, then expression) | the expression value (42) |
| return 42 at module level | the return value (42); pydantic-monty allows it |
| Sub-script raises | the host function raises in Python; result.isError == true |
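
The first three rows can be reproduced in standard CPython with the ast module. This is a rough model, not how Monty evaluates scripts, and run_last_expression is a hypothetical name. It does not cover module-level return, which CPython rejects as a syntax error.

```python
import ast


def run_last_expression(source, inputs=None):
    """Run source and return its final expression's value, or None."""
    namespace = dict(inputs or {})
    tree = ast.parse(source, mode="exec")

    last_expr = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so it can be evaluated for a value.
        last_expr = tree.body.pop()

    # Run every remaining statement for its side effects.
    exec(compile(tree, "<sub-script>", "exec"), namespace)
    if last_expr is None:
        return None

    expression = ast.Expression(body=last_expr.value)
    return eval(compile(expression, "<sub-script>", "eval"), namespace)


print(run_last_expression("n * 2", {"n": 21}))
print(run_last_expression("x = 99"))
print(run_last_expression("x = 7\nx * 6"))
```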

Argument shape

run_script itself is just a host function with two params:

| Param | Type | Required |
|---|---|---|
| path | string | yes |
| inputs | map | no |

Both kwargs and positional forms work — run_script('foo.py', inputs={'k':'v'}) and run_script('foo.py', {'k':'v'}) are equivalent because pydantic-monty maps positional args to schema params by index.

Invalid arguments surface as Python errors (result.isError == true):

| Misuse | Surfaces as |
|---|---|
| run_script(123) (wrong type) | Python error |
| run_script("f.py", inputs="oops") (wrong type for inputs) | Python error |
| run_script() (missing required path) | Python error |
| run_script("f.py", typo={…}) (unknown kwarg) | Python error |
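
The binding and validation rules can be sketched in plain Python. This is a model of the behaviour described here, not pydantic-monty's implementation: SCHEMA, bind_arguments, and is_rejected are hypothetical names, and TypeError stands in for whatever Python error the real bridge raises.

```python
# Hypothetical mirror of the two-parameter schema described above.
SCHEMA = [
    ("path", str, True),      # (name, expected type, required)
    ("inputs", dict, False),
]


def bind_arguments(args, kwargs, schema=SCHEMA):
    """Bind positional args by index and kwargs by name, validating both."""
    names = [name for name, _, _ in schema]
    if len(args) > len(schema):
        raise TypeError(f"expected at most {len(schema)} arguments, got {len(args)}")
    bound = dict(zip(names, args))  # positional arg i fills schema param i
    for key, value in kwargs.items():
        if key not in names:
            raise TypeError(f"unexpected keyword argument: {key!r}")
        if key in bound:
            raise TypeError(f"multiple values for argument: {key!r}")
        bound[key] = value
    for name, expected, required in schema:
        if name not in bound:
            if required:
                raise TypeError(f"missing required argument: {name!r}")
            continue
        if not isinstance(bound[name], expected):
            raise TypeError(f"{name!r} must be {expected.__name__}")
    return bound


def is_rejected(args, kwargs):
    """Return True when the call shape fails validation."""
    try:
        bind_arguments(args, kwargs)
    except TypeError:
        return True
    return False


positional = bind_arguments(("foo.py", {"k": "v"}), {})
keyword = bind_arguments(("foo.py",), {"inputs": {"k": "v"}})
print(positional == keyword)
```

Each of the four misuse rows corresponds to one rejected call in this model: is_rejected((123,), {}), is_rejected(("f.py",), {"inputs": "oops"}), is_rejected((), {}), and is_rejected(("f.py",), {"typo": {}}).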

HostContext.subExecute — sub-execution from your own host function

Every host function handler receives a HostContext. Two of its fields are the wiring for sub-execution:

class HostContext {
  // … emit, executionId, cancelToken, os …

  /// Narrow view of the owning runtime — event forwarding + schema
  /// introspection only. Deliberately does NOT expose `execute()`.
  final HostParentRef? parent;

  /// Runs a sub-script in a fresh Monty interpreter. Safe to call from
  /// inside a handler — uses an independent execution context, so it
  /// doesn't contend with the caller's bridge lock.
  final HostSubExecutor? subExecute;
}

Why two fields?

The bridge serialises executions: while a script is running, _isExecuting stays true through every host function dispatch. Calling runtime.execute() from inside a handler would always throw StateError('Bridge is already executing').

So HostContext exposes only the safe surface:

  • parent (HostParentRef?) exposes emitChildEvent (for child-spawning extensions like SandboxExtension to forward events to the parent's stream) and schemas (for tool introspection like the requires() host function). Crucially, it has no execute(), so the misuse is impossible at compile time.
  • subExecute — runs sub-scripts via Monty(code).run() in a fresh interpreter. No shared state with the caller, no bridge contention, returns a MontyResult.
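
The reasoning behind the split can be seen in a toy re-entrancy guard. The Bridge class below is a hypothetical plain-Python stand-in, not the real bridge: it assumes only what is stated above, namely that a flag stays set for the whole run and that a sub-execution uses an independent context.

```python
class Bridge:
    """Toy bridge that allows one execution at a time."""

    def __init__(self):
        self._is_executing = False

    def execute(self, script):
        # `script` is a callable standing in for Python code that
        # dispatches to a host function handler mid-run.
        if self._is_executing:
            raise RuntimeError("Bridge is already executing")
        self._is_executing = True
        try:
            return script(self)
        finally:
            self._is_executing = False

    def sub_execute(self, code, inputs=None):
        # Fresh namespace per call: no shared state, no flag to contend for.
        return eval(code, dict(inputs or {}))


def reentrant_handler(bridge):
    # Misuse: calls back into the bridge that is still running this handler.
    return bridge.execute(lambda b: "never reached")


def safe_handler(bridge):
    # Safe: runs the snippet in an independent context.
    return bridge.sub_execute("n * 2", {"n": 21})


bridge = Bridge()
try:
    bridge.execute(reentrant_handler)
    outcome = "ran"
except RuntimeError as err:
    outcome = str(err)

print(outcome)
print(bridge.execute(safe_handler))
```

Because the guard trips on every nested call, withholding execute() from the handler's view removes a call that could never succeed.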

Building your own run_script-style function

HostFunction myEvalFn() {
  return HostFunction(
    schema: const HostFunctionSchema(
      name: 'evaluate',
      description: 'Run a snippet and return its last expression.',
      params: [
        HostParam(name: 'code', type: HostParamType.string),
        HostParam(
          name: 'inputs',
          type: HostParamType.map,
          isRequired: false,
        ),
      ],
    ),
    handler: (args, ctx) async {
      final subExecute = ctx.subExecute;
      if (subExecute == null) {
        throw StateError('evaluate: no subExecute wired into HostContext');
      }
      final code = args['code']! as String;
      final rawInputs = args['inputs'];
      final inputs = rawInputs is Map
          ? rawInputs.map((k, v) => MapEntry(k.toString(), v as Object?))
          : null;

      final result = await subExecute(code, inputs: inputs);
      if (result.isError) {
        throw Exception('evaluate failed: ${result.error?.message}');
      }
      return result.value.dartValue;
    },
  );
}

What you trade for safety

subExecute runs in a fresh interpreter: variables, functions, and imports from the caller are not visible to the sub-script. If the sub-script needs the caller's state, inject it explicitly via inputs:. Host functions are the one thing the two share, since they are wired through every Monty execution.

This is by design: nested execution on the same Python interpreter is the deadlock scenario, so the API doesn't offer it.


Putting it together

A typical pattern: the LLM generates a small helper script that needs to call back into a curated set of Dart functions. The IDE wires that pattern with three pieces:

final runtime = MontyRuntime()
  // 1. Register the helpers — visible in both the main script and any
  //    script run via run_script.
  ..register(myEvalFn())
  ..register(buildRunScriptFunction((path) => vfs.readFile(path)));

// 2. Inject context into the main script before it runs.
final r = await runtime
    .execute(mainScript, inputs: {'session_id': sessionId})
    .result;

The third piece lives in Python: inside mainScript, the code can call run_script('helper.py', inputs={…}) to dispatch into a separate file, with the helper's own injected inputs. Each helper runs in its own fresh interpreter via subExecute, so there is no deadlock and no state leaking between scripts.

Sync vs async handlers

A HostFunctionHandler always returns Future<Object?>, but how the bridge dispatches it depends on MontyRuntime(useFutures: …):

  • MontyRuntime() (default, useFutures: false): handlers are awaited inline before the bridge resumes Python, so Python sees the resolved value as a plain object. This is the simplest mental model: one host call → one round-trip → one Python resume. Limitation: writing await ext() in Python for a host function ext raises TypeError: 'str' object can't be awaited, because by the time Python evaluates await, the value is already a plain str/int, not a future.

  • MontyRuntime(useFutures: true) — handlers are launched as unawaited futures and the bridge replies with resumeAsFuture(). Python keeps running, can fire more host calls in parallel (asyncio.gather parallelises across externals), and the bridge batch-resolves them when Python actually suspends on a value. Concurrent dispatch is the whole point. Trade-off: handlers can race over shared state, so be deliberate about flipping the flag for any runtime whose handlers are stateful.

// Default (serial, simple): bare calls only — `await fetch(x)` would fail.
await MontyRuntime().execute('result = fetch(7)\nresult').result;

// Futures mode: Python `await fetch(x)` works, gather parallelises.
await MontyRuntime(useFutures: true).execute('''
import asyncio
results = await asyncio.gather(fetch(1), fetch(2), fetch(3))
results
''').result;
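
What concurrent dispatch buys can be seen in standard CPython asyncio, outside Monty. In this sketch, fetch is a hypothetical coroutine standing in for a host call. It illustrates the ordering difference between serial and gathered calls, not how the bridge resolves futures.

```python
import asyncio

log = []


async def fetch(n):
    """Stand-in for a host call that takes a moment to resolve."""
    log.append(f"start {n}")
    await asyncio.sleep(0.01)
    log.append(f"end {n}")
    return n * 2


async def serial():
    # Each call finishes before the next one begins.
    return [await fetch(1), await fetch(2), await fetch(3)]


async def parallel():
    # All three calls are in flight before any of them resolves.
    return await asyncio.gather(fetch(1), fetch(2), fetch(3))


serial_results = asyncio.run(serial())
serial_log = list(log)
log.clear()

parallel_results = asyncio.run(parallel())
parallel_log = list(log)

print(serial_log)
print(parallel_log)
```

Both runs return the same values in the same order. Only the log differs: the serial run alternates start and end entries, while the gathered run records all three starts first. That overlap is also where stateful handlers can begin to race.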

For the cell-by-cell contract — every combination of (Dart sync vs async) × (Python bare call vs await) × API layer × backend — see dart_monty_core's async-matrix deep dive and the matrix test bodies (Layer 4 lives at test/integration/_runtime_async_matrix_body.dart in this repo).