Code interpreter tool and execution environments for agentic workflows. Provides ExecutionResult (capturing stdout, stderr, success, and optional static analysis output) and three concrete ExecutionEnvironment implementations: StaticAnalysisEnvironment (parse and import-check only, no execution), UnsafeEnvironment (subprocess execution in the current Python environment), and LLMSandboxEnvironment (Docker-isolated execution via llm-sandbox). All environments support an optional allowed_imports allowlist. The top-level code_interpreter and local_code_interpreter functions are ready to be wrapped as [MelleaTool](../../backends/tools#class-melleatool) instances for ReAct or other agentic loops.

Functions

FUNC code_interpreter

code_interpreter(code: str) -> ExecutionResult
Executes Python code.
Args:
  • code: The Python code to execute.
Returns:
  • An ExecutionResult with stdout, stderr, and a success flag.

FUNC local_code_interpreter

local_code_interpreter(code: str) -> ExecutionResult
Executes Python code in the current working directory.
Args:
  • code: The Python code to execute.
Returns:
  • An ExecutionResult with stdout, stderr, and a success flag.

Classes

CLASS ExecutionResult

Result of code execution. Execution can be aborted before an interpreter is started (e.g., if prohibited imports are used). In that case, success is set to False and skipped is set to True. If the code is executed, success is True if and only if the exit code is 0, and stdout and stderr are set to non-None values.
ExecutionResult is also used to communicate the results of static and dynamic analyses, which are passed back in the analysis_result field.
TODO: should we also pass back the value of the final expression evaluated, or the values of locals() and globals()?
Args:
  • success: True if execution succeeded (exit code 0) or static analysis passed; False otherwise.
  • stdout: Captured standard output, or None if execution was skipped.
  • stderr: Captured standard error, or None if execution was skipped.
  • skipped: True when execution was not attempted.
  • skip_message: Explanation of why execution was skipped.
  • analysis_result: Optional payload from static-analysis environments.
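As an illustration of how these fields fit together, here is a minimal sketch using a stand-in dataclass. ExecutionResultSketch is a hypothetical name, and the default values are assumptions; only the field names and their meanings come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExecutionResultSketch:
    """Hypothetical stand-in mirroring the documented fields (not the real class)."""

    success: bool
    stdout: Optional[str] = None      # None when execution was skipped
    stderr: Optional[str] = None      # None when execution was skipped
    skipped: bool = False
    skip_message: Optional[str] = None
    analysis_result: Optional[Any] = None


# Code was executed and the interpreter exited with status 0.
ran_ok = ExecutionResultSketch(success=True, stdout="42\n", stderr="")

# Execution was aborted before an interpreter was started.
blocked = ExecutionResultSketch(
    success=False,
    skipped=True,
    skip_message="import of 'os' is not allowed",
)
```

In the skipped case stdout and stderr stay None, which lets a caller tell "ran and printed nothing" apart from "never ran".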
Methods:

FUNC to_validationresult_reason

to_validationresult_reason(self) -> str
Maps an ExecutionResult to a ValidationResult reason.
TODO: Downstream use of this method is hacky. A better solution is for ExecutionResult to implement the ValidationResult interface.

CLASS ExecutionEnvironment

Abstract environment for executing Python code.
Args:
  • allowed_imports: Allowlist of top-level module names that generated code may import. None disables the import check.
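One way such an allowlist check could work is to walk the parsed source and compare the top-level module name of every import statement against the list. The sketch below uses only the standard-library ast module; find_disallowed_imports is a hypothetical helper and is not claimed to match the library's actual check.

```python
import ast
from typing import List, Optional


def find_disallowed_imports(code: str, allowed_imports: Optional[List[str]]) -> List[str]:
    """Return top-level module names imported by `code` that are not allowed.

    Hypothetical helper: None disables the check, as described above.
    """
    if allowed_imports is None:
        return []
    allowed = set(allowed_imports)
    found = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # `import os.path` counts as the top-level module `os`.
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Absolute `from json import loads`; relative imports are ignored here.
            found.add(node.module.split(".")[0])
    return sorted(found - allowed)


source = "import os.path\nfrom json import loads\nimport math\n"
violations = find_disallowed_imports(source, ["json", "math"])
```

A purely syntactic check like this only sees import statements. It does not catch dynamic imports such as `__import__("os")` or `importlib.import_module(...)`, so it should not be treated as a security boundary on its own.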
Methods:

FUNC execute

execute(self, code: str, timeout: int) -> ExecutionResult
Execute the given code and return the result.
Args:
  • code: The Python source code to execute.
  • timeout: Maximum number of seconds to allow the code to run.
Returns:
  • Execution outcome including stdout, stderr, and success flag.
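The shape of this interface can be sketched with the standard-library abc module. EnvironmentSketch and DryRunEnvironment are hypothetical names, and results are plain dicts here to keep the example self-contained; the real classes return ExecutionResult objects.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class EnvironmentSketch(ABC):
    """Hypothetical stand-in for the abstract environment described above."""

    def __init__(self, allowed_imports: Optional[List[str]] = None):
        # None means the import check is disabled.
        self.allowed_imports = allowed_imports

    @abstractmethod
    def execute(self, code: str, timeout: int) -> dict:
        """Run or validate `code`, allowing at most `timeout` seconds."""


class DryRunEnvironment(EnvironmentSketch):
    """Toy subclass that never runs anything and reports a skipped result."""

    def execute(self, code: str, timeout: int) -> dict:
        return {
            "success": False,
            "skipped": True,
            "skip_message": f"dry run: {len(code)} characters, timeout {timeout}s",
        }


env = DryRunEnvironment(allowed_imports=["math"])
result = env.execute("print(1)", timeout=5)
```

Because execute is abstract, instantiating EnvironmentSketch directly raises TypeError; each concrete environment supplies its own execution strategy behind the same signature.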

CLASS StaticAnalysisEnvironment

Safe environment that validates but does not execute code.
Methods:

FUNC execute

execute(self, code: str, timeout: int) -> ExecutionResult
Validate code syntax and imports without executing.
Args:
  • code: The Python source code to validate.
  • timeout: Ignored for static analysis; present for interface compatibility.
Returns:
  • Result with skipped=True and the parsed AST in analysis_result on success, or a syntax-error description on failure.
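The syntax half of this validation can be approximated with ast.parse, as in the sketch below. static_check is a hypothetical function returning a plain dict; placing the syntax-error description in skip_message and setting skipped=True on failure are assumptions, since the text above does not say which field carries the description.

```python
import ast


def static_check(code: str) -> dict:
    """Parse `code` without running it (hypothetical sketch, not the library's code)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return {
            "success": False,
            "skipped": True,  # assumption: nothing was executed in either case
            "skip_message": f"Syntax error: {exc.msg} (line {exc.lineno})",
            "analysis_result": None,
        }
    return {
        "success": True,
        "skipped": True,  # validation only; the code is never executed
        "skip_message": "static analysis only",
        "analysis_result": tree,
    }


ok = static_check("x = 1 + 2")
bad = static_check("def f(:")
```

Note that success=True together with skipped=True is the expected combination here: the code passed validation but was never run.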

CLASS UnsafeEnvironment

Unsafe environment that executes code directly with subprocess.
Methods:

FUNC execute

execute(self, code: str, timeout: int) -> ExecutionResult
Execute code with subprocess after checking imports.
Args:
  • code: The Python source code to execute.
  • timeout: Maximum number of seconds before the subprocess is killed and a timeout result is returned.
Returns:
  • Execution outcome with captured stdout/stderr and success flag, or a skipped result if imports are unauthorized or an unexpected error occurs.
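A minimal sketch of subprocess execution with a timeout, using only the standard library. run_in_subprocess is a hypothetical function; passing the code via `-c` and the shape of the timeout result (a timed_out key) are assumptions, not details confirmed by the text above.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: int) -> dict:
    """Run `code` with the current interpreter (hypothetical sketch)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # Assumed shape for a timeout result.
        return {"success": False, "stdout": None, "stderr": None, "timed_out": True}
    return {
        "success": proc.returncode == 0,  # success if and only if exit code is 0
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "timed_out": False,
    }


hello = run_in_subprocess("print('hi')", timeout=10)
failed = run_in_subprocess("import sys; sys.exit(3)", timeout=10)
```

The child process runs with the same privileges, filesystem, and installed packages as the caller, which is why this style of execution is labelled unsafe for untrusted code.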

CLASS LLMSandboxEnvironment

Environment using llm-sandbox for secure Docker-based execution.
Methods:

FUNC execute

execute(self, code: str, timeout: int) -> ExecutionResult
Execute code using llm-sandbox in an isolated Docker container. Checks the import allowlist first, then delegates to a SandboxSession from the llm-sandbox package. Returns a skipped result if llm-sandbox is not installed.
Args:
  • code: The Python source code to execute.
  • timeout: Maximum number of seconds to allow the sandboxed process to run.
Returns:
  • Execution outcome with stdout/stderr and success flag, or a skipped result on import violation or sandbox error.
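The "skipped result if llm-sandbox is not installed" behaviour follows a common optional-dependency pattern, sketched below with importlib. skipped_if_missing is a hypothetical helper, the dict shape is an assumption, and the sketch deliberately stops before any sandbox call so that it runs without Docker.

```python
import importlib.util
from typing import Optional


def skipped_if_missing(module_name: str) -> Optional[dict]:
    """Return a skipped-style result if `module_name` cannot be imported.

    Hypothetical helper: returns None when the module is available, so the
    caller can continue on to the real sandboxed execution.
    """
    if importlib.util.find_spec(module_name) is None:
        return {
            "success": False,
            "skipped": True,
            "skip_message": f"{module_name} is not installed",
        }
    return None


# A module name that should not exist, and one from the standard library.
missing = skipped_if_missing("surely_not_an_installed_module_xyz")
present = skipped_if_missing("json")
```

Checking availability up front means a missing optional dependency surfaces as an ordinary skipped result rather than an ImportError raised inside an agentic loop.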