A generic LiteLLM-compatible backend that wraps the OpenAI Python SDK.

Classes

CLASS LiteLLMBackend

A generic LiteLLM-compatible backend.
Args:
  • model_id: The LiteLLM model identifier string; typically "<provider>/<model_creator>/<model_name>".
  • formatter: Formatter for rendering components. Defaults to [TemplateFormatter](../formatters/template_formatter#class-templateformatter).
  • base_url: Base URL for the LLM API endpoint; defaults to the Ollama local endpoint.
  • model_options: Default model options for generation requests.
Attributes:
  • to_mellea_model_opts_map: Mapping from backend-specific option names to Mellea [ModelOption](model_options#class-modeloption) sentinel keys.
  • from_mellea_model_opts_map: Mapping from Mellea [ModelOption](model_options#class-modeloption) sentinel keys to backend-specific option names.
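The two mapping attributes let the backend translate option dictionaries in either direction. A minimal sketch of how such a translation can work (the key names below are illustrative assumptions, not the actual Mellea mappings):

```python
# Illustrative option-name tables; on the real backend these live in
# to_mellea_model_opts_map / from_mellea_model_opts_map.
TO_MELLEA = {"max_tokens": "@max_new_tokens", "temperature": "@temperature"}
FROM_MELLEA = {v: k for k, v in TO_MELLEA.items()}

def to_backend_opts(mellea_opts: dict) -> dict:
    """Rename Mellea sentinel keys to backend-native names, passing
    unknown keys through unchanged."""
    return {FROM_MELLEA.get(k, k): v for k, v in mellea_opts.items()}

print(to_backend_opts({"@max_new_tokens": 256, "top_p": 0.9}))
# -> {'max_tokens': 256, 'top_p': 0.9}
```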
Methods:

FUNC processing

processing(self, mot: ModelOutputThunk, chunk: litellm.ModelResponse | litellm.ModelResponseStream)
Accumulate content and thinking tokens from a single LiteLLM response chunk. Called during generation for each ModelResponse (non-streaming) or ModelResponseStream chunk (streaming). Tool call parsing is deferred to post_processing.
Args:
  • mot: The output thunk being populated.
  • chunk: A single response object or streaming chunk from LiteLLM.

FUNC post_processing

post_processing(self, mot: ModelOutputThunk, conversation: list[dict], tools: dict[str, AbstractMelleaTool], thinking, _format)
Finalize the model output thunk after LiteLLM generation completes. Reconstructs a merged chat response from streaming chunks if applicable, extracts tool call requests, records token usage metrics, emits telemetry, and attaches the generate log to the output thunk.
Args:
  • mot: The output thunk to finalize.
  • conversation: The chat conversation sent to the model, used for logging.
  • tools: Available tools, keyed by name.
  • thinking: The thinking/reasoning effort level passed to the model, or None if reasoning mode was not enabled.
  • _format: The structured output format class used during generation, if any.
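One piece of that finalization, extracting tool call requests, can be sketched as follows. The message layout mirrors the OpenAI-style chat format; the function and field names here are illustrative, not Mellea's actual internals:

```python
import json

def extract_tool_requests(message: dict, tools: dict) -> dict:
    """Return parsed arguments for each requested tool that is actually
    registered; calls to unknown tool names are skipped."""
    requests = {}
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        if fn["name"] in tools:
            # Tool arguments arrive as a JSON-encoded string.
            requests[fn["name"]] = json.loads(fn["arguments"])
    return requests

msg = {
    "tool_calls": [
        {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
        {"function": {"name": "unregistered", "arguments": "{}"}},
    ]
}
print(extract_tool_requests(msg, {"get_weather": object()}))
# -> {'get_weather': {'city': 'Paris'}}
```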

FUNC generate_from_raw

generate_from_raw(self, actions: list[Component[C]], ctx: Context) -> list[ModelOutputThunk[C]]

FUNC generate_from_raw

generate_from_raw(self, actions: list[Component[C] | CBlock], ctx: Context) -> list[ModelOutputThunk[C | str]]

FUNC generate_from_raw

generate_from_raw(self, actions: Sequence[Component[C] | CBlock], ctx: Context) -> list[ModelOutputThunk]
Generate completions for multiple actions without chat templating via LiteLLM. Passes formatted prompt strings directly to LiteLLM's text completion endpoint. Tool calling is not supported on this endpoint.
Args:
  • actions: Actions to generate completions for.
  • ctx: The current generation context.
  • format: Optional Pydantic model for structured output; passed as guided_json in the request body.
  • model_options: Per-call model options.
  • tool_calls: Ignored; tool calling is not supported on this endpoint.
Returns:
  • list[ModelOutputThunk]: A list of model output thunks, one per action.
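The Args above suggest how a request for the text-completion endpoint might be assembled. This is a hedged sketch only: the keyword names (`api_base`, `extra_body`) follow LiteLLM's completion API as commonly documented, the model id is a placeholder, and the real backend first renders each action with its Formatter and merges default with per-call model options:

```python
def build_completion_kwargs(model_id, prompts, base_url, model_opts, schema=None):
    """Assemble keyword arguments for a LiteLLM text-completion call
    (illustrative; not the backend's actual request-building code)."""
    kwargs = {"model": model_id, "prompt": prompts,
              "api_base": base_url, **model_opts}
    if schema is not None:
        # Structured output is requested by passing the JSON schema as
        # guided_json in the extra request body (per the Args above).
        kwargs["extra_body"] = {"guided_json": schema}
    return kwargs

kwargs = build_completion_kwargs(
    "ollama/granite3.3:8b",              # placeholder model id
    ["Complete this sentence: ..."],
    "http://localhost:11434",            # default Ollama local endpoint
    {"temperature": 0.0},
    schema={"type": "object"},
)
print(kwargs["extra_body"])  # -> {'guided_json': {'type': 'object'}}
```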