mellea.plugins.hooks.generation
Generation pipeline hook payloads.
Classes
CLASS GenerationPreCallPayload
Payload for generation_pre_call: fires before the LLM backend call.
Attributes:
action: The Component or CBlock about to be sent to the backend.
context: The Context being used for this generation call.
model_options: Dict of model options (writable; plugins may adjust temperature, etc.).
format: Optional BaseModel subclass for constrained decoding (writable).
tool_calls: Whether tool calls are enabled for this generation (writable).
generation_id: Mellea-side hook correlation ID, distinct from the provider-assigned GenerationMetadata.response_id. None when the firing site does not generate one.
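As a rough illustration of the writable fields, the sketch below caps the temperature option before the call goes out. It does not use the real mellea classes or hook registration, which this page does not show: FakePreCallPayload is a hypothetical stand-in that only mirrors the documented attribute names, clamp_temperature is a hypothetical handler, and the "temperature" key is an assumption about how model options are keyed.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakePreCallPayload:
    """Hypothetical stand-in mirroring the documented attributes (not the real class)."""
    action: Any
    context: Any
    model_options: dict = field(default_factory=dict)
    format: Optional[type] = None
    tool_calls: bool = False
    generation_id: Optional[str] = None


def clamp_temperature(payload: FakePreCallPayload, max_temp: float = 0.7) -> FakePreCallPayload:
    """Cap the (assumed) 'temperature' option; leave every other option untouched."""
    temp = payload.model_options.get("temperature")
    if temp is not None and temp > max_temp:
        payload.model_options["temperature"] = max_temp
    return payload


payload = FakePreCallPayload(
    action="Summarize this.",
    context=None,
    model_options={"temperature": 1.2, "seed": 7},
)
clamp_temperature(payload)
print(payload.model_options)
```

The stand-in is mutated in place for brevity; whether a real plugin edits the payload directly or returns a modified copy depends on the hook framework and is not specified here.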
CLASS GenerationPostCallPayload
Payload for generation_post_call — fires once the model output is fully computed.
For lazy ModelOutputThunk objects this hook fires inside
ModelOutputThunk.astream after post_process completes, so
model_output.value is guaranteed to be available. For already-computed
thunks (e.g. cached responses) it fires before generate_from_context
returns.
Attributes:
prompt: The formatted prompt sent to the backend (str or list of message dicts).
model_output: The fully-computed ModelOutputThunk.
latency_ms: Elapsed milliseconds from the generate_from_context call to when the value was fully materialized.
generation_id: Mellea-side hook correlation ID matching the corresponding pre_call payload, distinct from the provider-assigned GenerationMetadata.response_id. None when the firing site did not generate one.
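One possible use of latency_ms and generation_id is a small latency log keyed by correlation ID. The sketch below is an assumption-laden illustration: FakePostCallPayload and LatencyLog are hypothetical names, and the payload is a stand-in with the documented fields rather than the real class.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakePostCallPayload:
    """Hypothetical stand-in mirroring the documented attributes (not the real class)."""
    prompt: Any
    model_output: Any
    latency_ms: float
    generation_id: Optional[str] = None


class LatencyLog:
    """Collects latencies, keeping calls without a correlation ID in a separate list."""

    def __init__(self) -> None:
        self.by_id: dict = {}
        self.uncorrelated: list = []

    def on_post_call(self, payload: FakePostCallPayload) -> None:
        # generation_id may be None when the firing site did not generate one.
        if payload.generation_id is None:
            self.uncorrelated.append(payload.latency_ms)
        else:
            self.by_id[payload.generation_id] = payload.latency_ms


log = LatencyLog()
log.on_post_call(FakePostCallPayload("hi", "hello", 125.0, generation_id="gen-1"))
log.on_post_call(FakePostCallPayload("hi", "hello", 80.0))
print(log.by_id, log.uncorrelated)
```

Handling the None case explicitly avoids collapsing every uncorrelated call onto a single dictionary key.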
CLASS GenerationErrorPayload
Payload for generation_error — fires when the LLM backend raises an exception.
This hook fires inside ModelOutputThunk.astream just before the exception
is re-raised, giving plugins a chance to observe (but not suppress) the error.
Attributes:
exception: The exception raised by the backend.
model_output: The ModelOutputThunk at the time of the error. model and provider are set when the backend set them early (before the async task); otherwise they are None.
generation_id: Mellea-side hook correlation ID matching the corresponding pre_call payload, distinct from the provider-assigned GenerationMetadata.response_id. None when the firing site did not generate one.
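The observe-but-not-suppress behavior can be pictured with a small simulation. Everything here is hypothetical scaffolding: FakeErrorPayload stands in for the real payload, record_error is an illustrative handler, and simulated_firing_site only imitates the documented order of events (hook first, then re-raise); it is not the actual ModelOutputThunk.astream code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FakeErrorPayload:
    """Hypothetical stand-in mirroring the documented attributes (not the real class)."""
    exception: BaseException
    model_output: Any
    generation_id: Optional[str] = None


seen: list = []


def record_error(payload: FakeErrorPayload) -> None:
    """Observe the failure; returning normally does not swallow the exception."""
    seen.append((payload.generation_id, type(payload.exception).__name__))


def simulated_firing_site(hook: Callable[[FakeErrorPayload], None]) -> None:
    """Imitates the documented behavior: call the hook, then re-raise."""
    try:
        raise TimeoutError("backend timed out")
    except Exception as exc:
        hook(FakeErrorPayload(exception=exc, model_output=None, generation_id="gen-1"))
        raise


try:
    simulated_firing_site(record_error)
except TimeoutError:
    propagated = True
else:
    propagated = False

print(seen, propagated)
```

The caller still sees the TimeoutError even though the handler ran, which is the contract described above.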
CLASS GenerationBatchPreCallPayload
Payload for generation_batch_pre_call — fires once before a batch generation request.
Carries the action sequence being sent in the batch alongside batch-level
fields (model, provider, num_actions) describing the single
API call.
Attributes:
actions: The action sequence being sent in the batch.
generation_id: Correlation identifier set by the firing backend; matches the corresponding generation_batch_post_call / generation_batch_error payloads for the same call.
format: Optional structured-output format applied to the batch.
tool_calls: Whether tool calling is enabled (typically False for raw).
num_actions: Convenience copy of len(actions).
model: Model identifier the backend is calling.
provider: Provider name (e.g. "openai", "ollama").
CLASS GenerationBatchPostCallPayload
Payload for generation_batch_post_call — fires once after a batch generation succeeds.
Carries the list of ModelOutputThunk instances produced by the batch
alongside batch-level fields (usage, model, provider,
latency_ms) describing the single API call.
Attributes:
generation_id: Correlation identifier from the matching pre_call.
model_outputs: The list of ModelOutputThunk instances built from the API response, in batch order.
usage: Aggregate token-usage dict (OpenAI-shape) for the whole batch.
model: Model identifier from the call.
provider: Provider name.
latency_ms: Wall-clock duration of the API call in milliseconds.
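Because usage and latency_ms describe the whole batch rather than individual outputs, a handler that wants per-output figures has to derive them. The sketch below assumes the usual OpenAI usage keys (prompt_tokens, completion_tokens, total_tokens) and that usage may be missing; FakeBatchPostCallPayload and summarize_batch are hypothetical names, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeBatchPostCallPayload:
    """Hypothetical stand-in mirroring the documented attributes (not the real class)."""
    generation_id: Optional[str]
    model_outputs: list = field(default_factory=list)
    usage: Optional[dict] = None
    model: Optional[str] = None
    provider: Optional[str] = None
    latency_ms: float = 0.0


def summarize_batch(payload: FakeBatchPostCallPayload) -> dict:
    """Derive simple per-output averages from the batch-level fields."""
    n = len(payload.model_outputs)
    usage = payload.usage or {}  # assumption: usage may be absent
    return {
        "outputs": n,
        "total_tokens": usage.get("total_tokens", 0),
        "ms_per_output": payload.latency_ms / n if n else None,
    }


payload = FakeBatchPostCallPayload(
    generation_id="batch-1",
    model_outputs=["a", "b", "c", "d"],  # placeholders for ModelOutputThunk instances
    usage={"prompt_tokens": 80, "completion_tokens": 40, "total_tokens": 120},
    model="example-model",
    provider="openai",
    latency_ms=200.0,
)
summary = summarize_batch(payload)
print(summary)
```

The per-output latency is only an average: the batch is a single API call, so individual outputs do not have separately measured durations.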
CLASS GenerationBatchErrorPayload
Payload for generation_batch_error — fires once when a batch generation request fails.
Carries the exception alongside batch-level fields (model,
provider, latency_ms) describing the failed API call. No
ModelOutputThunk instances are present.
Attributes:
generation_id: Correlation identifier from the matching pre_call.
exception: The exception raised by the backend.
model: Model identifier from the call.
provider: Provider name.
latency_ms: Wall-clock time-until-error in milliseconds.
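Since each batch call produces one pre_call payload followed by either a post_call or an error payload with the same generation_id, a plugin could track batches in flight. The following sketch is illustrative only: BatchTracker is a hypothetical class, and its methods take plain arguments copied from the documented payload fields instead of real payload objects.

```python
from typing import Optional


class BatchTracker:
    """Pairs each batch pre_call with its post_call or error via generation_id."""

    def __init__(self) -> None:
        self.in_flight: dict = {}   # generation_id -> num_actions
        self.finished: list = []    # (generation_id, num_actions, status, latency_ms)

    def on_pre_call(self, generation_id: str, num_actions: int) -> None:
        self.in_flight[generation_id] = num_actions

    def on_post_call(self, generation_id: str, latency_ms: float) -> None:
        self._close(generation_id, "ok", latency_ms)

    def on_error(self, generation_id: str, exception: BaseException, latency_ms: float) -> None:
        self._close(generation_id, type(exception).__name__, latency_ms)

    def _close(self, generation_id: str, status: str, latency_ms: float) -> None:
        # pop() with a default tolerates a post/error whose pre_call was never seen.
        num_actions: Optional[int] = self.in_flight.pop(generation_id, None)
        self.finished.append((generation_id, num_actions, status, latency_ms))


tracker = BatchTracker()
tracker.on_pre_call("batch-1", num_actions=3)
tracker.on_pre_call("batch-2", num_actions=5)
tracker.on_post_call("batch-1", latency_ms=210.0)
tracker.on_error("batch-2", ConnectionError("reset"), latency_ms=35.0)
print(tracker.finished, tracker.in_flight)
```

After both calls complete, in_flight is empty and finished holds one record per batch, so a failed batch is distinguishable from a successful one by its status field.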