ExecutionResult (capturing stdout, stderr, success, and optional static
analysis output) and three concrete ExecutionEnvironment implementations:
StaticAnalysisEnvironment (parse and import-check only, no execution),
UnsafeEnvironment (subprocess execution in the current Python environment), and
LLMSandboxEnvironment (Docker-isolated execution via llm-sandbox). All
environments support an optional allowed_imports allowlist. The top-level
code_interpreter and local_code_interpreter functions are ready to be wrapped
as [MelleaTool](../../backends/tools#class-melleatool) instances for ReACT or other agentic loops.
Functions
FUNC code_interpreter
Args:
- code: The Python code to execute.

Returns:
- An ExecutionResult with stdout, stderr, and a success flag.
FUNC local_code_interpreter
Args:
- code: The Python code to execute.

Returns:
- An ExecutionResult with stdout, stderr, and a success flag.
Classes
CLASS ExecutionResult
Result of code execution.
Code execution can be aborted prior to spinning up an interpreter (e.g., if prohibited imports are used).
In these cases, the success flag is set to False and the skipped flag is set to True.
If code is executed, then success is set to True if and only if the exit code is 0, and the stdout and stderr outputs
are set to non-None values.
We also use the ExecutionResult object to communicate the result of static and dynamic analyses. Those are passed back
using the analysis_result field.
TODO: should we also be trying to pass back the value of the final expression evaluated, or the value of locals() and globals()?
Args:
- success: True if execution succeeded (exit code 0 or static-analysis passed); False otherwise.
- stdout: Captured standard output, or None if execution was skipped.
- stderr: Captured standard error, or None if execution was skipped.
- skipped: True when execution was not attempted.
- skip_message: Explanation of why execution was skipped.
- analysis_result: Optional payload from static-analysis environments.
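
The field semantics above can be sketched with a small dataclass. This is an illustrative stand-in, not the library's class: `ExecutionResultSketch`, `from_exit_code`, and `aborted_result` are hypothetical names, and the defaults are assumptions chosen to match the described behavior.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExecutionResultSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    success: bool
    stdout: Optional[str] = None
    stderr: Optional[str] = None
    skipped: bool = False
    skip_message: Optional[str] = None
    analysis_result: Optional[Any] = None


def from_exit_code(exit_code: int, stdout: str, stderr: str) -> ExecutionResultSketch:
    # The code actually ran: success tracks the exit code, and both
    # output streams are non-None.
    return ExecutionResultSketch(success=exit_code == 0, stdout=stdout, stderr=stderr)


def aborted_result(reason: str) -> ExecutionResultSketch:
    # The code never reached an interpreter (for example, a prohibited
    # import): success is False, skipped is True, outputs stay None.
    return ExecutionResultSketch(success=False, skipped=True, skip_message=reason)
```

A caller can branch on `skipped` first to tell "never ran" apart from "ran and failed", then inspect `stderr` only in the second case.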
FUNC to_validationresult_reason
ExecutionResult to implement the ValidationResult interface.
CLASS ExecutionEnvironment
Abstract environment for executing Python code.
Args:
- allowed_imports: Allowlist of top-level module names that generated code may import. None disables the import check.
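
One way an allowlist of top-level module names could be enforced is by walking the parsed AST before anything runs. The sketch below is an assumption about the general technique, not the library's implementation; `unauthorized_imports` is a hypothetical helper, and relative imports are deliberately ignored.

```python
import ast
from typing import Iterable, List, Optional


def unauthorized_imports(code: str, allowed_imports: Optional[Iterable[str]]) -> List[str]:
    """Return top-level module names imported by `code` that are not allowed.

    Raises SyntaxError if `code` does not parse.
    """
    if allowed_imports is None:
        return []  # None disables the import check
    allowed = set(allowed_imports)
    found = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import json.decoder" counts as the top-level name "json"
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z"; relative imports are skipped here
            found.add(node.module.split(".")[0])
    return sorted(found - allowed)
```

Note that a static check like this only sees literal import statements; dynamic imports through `importlib` or `__import__` would pass unnoticed, which is one reason isolation still matters for untrusted code.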
FUNC execute
Args:
- code: The Python source code to execute.
- timeout: Maximum number of seconds to allow the code to run.

Returns:
- Execution outcome including stdout, stderr, and success flag.
CLASS StaticAnalysisEnvironment
Safe environment that validates but does not execute code.
Methods:
FUNC execute
Args:
- code: The Python source code to validate.
- timeout: Ignored for static analysis; present for interface compatibility.

Returns:
- Result with skipped=True and the parsed AST in analysis_result on success, or a syntax-error description on failure.
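
A parse-only check of this kind can be sketched with the standard `ast` module. The function name `static_check`, the plain-dict return value, and the choice to put the syntax-error description in `skip_message` are all assumptions made for illustration; the library's own result object may carry the error elsewhere.

```python
import ast
from typing import Any, Dict


def static_check(code: str) -> Dict[str, Any]:
    """Parse `code` without executing it and report the outcome."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        # Assumption: the error description travels in skip_message.
        return {
            "success": False,
            "skipped": True,
            "skip_message": f"Syntax error: {exc}",
            "analysis_result": None,
        }
    # Nothing is executed, so skipped stays True even though parsing passed.
    return {
        "success": True,
        "skipped": True,
        "skip_message": "Static analysis only; code was not executed.",
        "analysis_result": tree,
    }
```

Because `ast.parse` never runs the code, this check is safe for untrusted input, but it only proves the code is syntactically valid, not that it behaves correctly.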
CLASS UnsafeEnvironment
Unsafe environment that executes code directly with subprocess.
Methods:
FUNC execute
Args:
- code: The Python source code to execute.
- timeout: Maximum number of seconds before the subprocess is killed and a timeout result is returned.

Returns:
- Execution outcome with captured stdout/stderr and success flag, or a skipped result if imports are unauthorized or an unexpected error occurs.
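
Subprocess execution with a timeout can be sketched using only the standard library. This is a minimal illustration under stated assumptions, not the library's code: `run_in_subprocess` is a hypothetical name, the code is passed with `-c` rather than through whatever mechanism the library uses, and reporting a timeout as a skipped result is an assumption.

```python
import subprocess
import sys
from typing import Any, Dict


def run_in_subprocess(code: str, timeout: float = 60) -> Dict[str, Any]:
    """Run `code` with the current interpreter and capture its output."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # Assumption: a timeout is surfaced as a skipped result.
        return {
            "success": False,
            "stdout": None,
            "stderr": None,
            "skipped": True,
            "skip_message": f"Execution timed out after {timeout} seconds.",
        }
    return {
        "success": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "skipped": False,
        "skip_message": None,
    }
```

The child process inherits the caller's environment, file system, and network access, which is what makes this style of execution unsafe for untrusted code.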
CLASS LLMSandboxEnvironment
Environment using llm-sandbox for secure Docker-based execution.
Methods:
FUNC execute
SandboxSession
from the llm-sandbox package. Returns a skipped result if
llm-sandbox is not installed.
Args:
- code: The Python source code to execute.
- timeout: Maximum number of seconds to allow the sandboxed process to run.

Returns:
- Execution outcome with stdout/stderr and success flag, or a skipped result on import violation or sandbox error.
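
The "skipped result if llm-sandbox is not installed" behavior follows a common optional-dependency pattern, sketched below. The helper `missing_dependency_result` and the dict shape are hypothetical, and the default import name `llm_sandbox` is an assumption about how the llm-sandbox package is imported; the sketch does not call the sandbox itself.

```python
import importlib.util
from typing import Any, Dict, Optional


def missing_dependency_result(module_name: str = "llm_sandbox") -> Optional[Dict[str, Any]]:
    """Return a skipped result if `module_name` cannot be imported, else None."""
    if importlib.util.find_spec(module_name) is not None:
        return None  # dependency is available; the caller may proceed
    return {
        "success": False,
        "stdout": None,
        "stderr": None,
        "skipped": True,
        "skip_message": f"{module_name} is not installed.",
    }
```

Checking availability up front lets the environment report a clear skip reason instead of raising an ImportError in the middle of an agentic loop.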