mellea.core.requirement
Requirement interface for constrained and validated generation.
A Requirement pairs a human-readable description with a validation function that
inspects a Context (and optionally a backend) to determine whether a model output
meets a constraint. ValidationResult carries the pass/fail verdict along with an
optional reason, score, and the ModelOutputThunk produced during validation.
PartialValidationResult provides a tri-state variant ("pass", "fail",
"unknown") for per-chunk streaming validation.
Helper functions such as default_output_to_bool make it easy to build requirements
without boilerplate.
Functions
FUNC default_output_to_bool
default_output_to_bool(x: CBlock | str) -> bool
Convert a model output string to a boolean by checking for a "yes" answer.
Checks if the output is exactly equal to "yes" or "y" (case-insensitive). If not, also checks if any of the words in the output are "yes" (case-insensitive).
Args:
x: The model output to evaluate, as a CBlock or plain string.
Returns:
True if the output indicates a "yes" answer, False otherwise.
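The two-step check described above can be sketched in plain Python. This is an illustration of the documented behavior, not the library's source: the name yes_heuristic is hypothetical, it accepts only a plain string (the CBlock case is omitted), and the whitespace handling is an assumption.

```python
def yes_heuristic(text: str) -> bool:
    """Hypothetical re-creation of the documented "yes"-detection steps."""
    # Assumption: surrounding whitespace is ignored before comparing.
    normalized = text.strip().lower()

    # Step 1: the whole output is exactly "yes" or "y" (case-insensitive).
    if normalized in ("yes", "y"):
        return True

    # Step 2: any whitespace-separated word equals "yes".
    return any(word == "yes" for word in normalized.split())
```

In this sketch "yesterday" does not count, because whole words are compared, and trailing punctuation is not stripped, so "Yes." would not match. The real function may treat those cases differently.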
Classes
CLASS ValidationResult
ValidationResults store the output of a Requirement's validation. They can be used to return additional info from validation functions, which is useful for sampling/repairing.
Args:
result: Boolean indicating whether the requirement passed.
reason: Optional human-readable explanation for the verdict.
score: Optional numeric score returned by the validator.
thunk: The ModelOutputThunk produced during LLM-as-a-Judge validation, if applicable.
context: The context associated with the validation backend call, if applicable.
Methods:
FUNC reason
reason(self) -> str | None
Reason for the validation result.
FUNC score
score(self) -> float | None
An optional score for the validation result.
FUNC thunk
thunk(self) -> ModelOutputThunk | None
The ModelOutputThunk associated with the validation function, if an LLM was used to generate the final result.
FUNC context
context(self) -> Context | None
The context associated with validation if a backend was used to generate the final result.
FUNC as_bool
as_bool(self) -> bool
Return a boolean value based on the validation result.
Returns:
True if the requirement passed, False otherwise.
CLASS PartialValidationResult
Tri-state result from per-chunk streaming validation.
Unlike ValidationResult, which stores its verdict as a private
_result: bool, this class exposes success as a public property.
The asymmetry is intentional: the tri-state value cannot be recovered from
a bool, so a public property is the only way to distinguish "fail"
from "unknown" after construction.
Args:
success: Validation state: "pass" (constraint satisfied so far), "fail" (constraint violated, stop streaming), or "unknown" (insufficient data yet, continue streaming).
reason: Optional human-readable explanation.
score: Optional numeric confidence score.
thunk: Optional ModelOutputThunk from the validation call.
context: Optional context associated with the validation call.
Methods:
FUNC success
success(self) -> Literal['pass', 'fail', 'unknown']
The tri-state validation result.
FUNC reason
reason(self) -> str | None
Reason for the validation result.
FUNC score
score(self) -> float | None
An optional score for the validation result.
FUNC thunk
thunk(self) -> ModelOutputThunk | None
The ModelOutputThunk associated with the validation call, if any.
FUNC context
context(self) -> Context | None
The context associated with the validation call, if any.
FUNC as_bool
as_bool(self) -> bool
Return True for "pass", False for "fail" or "unknown".
"unknown" maps to False intentionally. In streaming contexts,
check pvr.success == "unknown" before treating False as a definitive
failure — "unknown" means insufficient data so far, not a constraint
violation.
Returns:
True if the partial result is "pass", False otherwise.
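The reason for checking success before relying on as_bool() can be sketched with a small stand-in. PartialResultSketch and next_step are hypothetical names used only for illustration; the stand-in mirrors the documented success property and as_bool mapping and is not the mellea class.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class PartialResultSketch:
    """Hypothetical stand-in mirroring the documented tri-state result."""

    success: Literal["pass", "fail", "unknown"]
    reason: Optional[str] = None

    def as_bool(self) -> bool:
        # Only "pass" is truthy; "fail" and "unknown" both map to False.
        return self.success == "pass"


def next_step(pvr: PartialResultSketch) -> str:
    """Decide what a streaming loop might do with a partial result."""
    if pvr.success == "unknown":
        # Not enough data yet: this is not a constraint violation.
        return "continue streaming"
    if pvr.as_bool():
        return "satisfied so far"
    return "stop streaming"


unknown = PartialResultSketch("unknown")
failed = PartialResultSketch("fail", reason="forbidden word seen")
```

Both unknown.as_bool() and failed.as_bool() are False here, yet next_step treats them differently, which is the distinction the public success property exists to preserve.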
CLASS Requirement
Requirements are a special type of Component used as input to the Validate step in Instruct/Validate/Repair patterns.
Args:
description: A natural-language description of the requirement. Sometimes included in Instruction prompts; use check_only=True to suppress this.
validation_fn: If provided, this function is executed instead of LLM-as-a-Judge. The bool() of its return value defines pass/fail.
output_to_bool: Translates LLM-as-a-Judge output to a boolean. Defaults to a "yes"-detection heuristic.
check_only: When True, the requirement description is excluded from Instruction prompts.
Attributes:
description: A natural-language description of the requirement.
output_to_bool: Function used to convert LLM-as-a-Judge output into a boolean pass/fail result.
validation_fn: Optional custom validation function that bypasses the LLM-as-a-Judge strategy entirely.
check_only: When True, the requirement description is excluded from Instruction prompts to avoid influencing model output.
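The effect of check_only can be illustrated with a minimal stand-in. RequirementSketch and descriptions_for_prompt are hypothetical names, and the filtering shown is an assumption about how a prompt builder might honor the flag, not the library's implementation.

```python
from dataclasses import dataclass


@dataclass
class RequirementSketch:
    """Hypothetical stand-in with only the two fields needed here."""

    description: str
    check_only: bool = False


def descriptions_for_prompt(requirements):
    """Return the descriptions that may be shown to the generating model."""
    # Requirements flagged check_only are kept out of the prompt text.
    return [r.description for r in requirements if not r.check_only]


reqs = [
    RequirementSketch("Use a formal tone."),
    RequirementSketch("Do not mention the word 'refund'.", check_only=True),
]
visible = descriptions_for_prompt(reqs)
```

In this sketch both requirements remain in reqs and could still be validated; only the second one's description is withheld from the prompt.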
Methods:
FUNC validate
validate(self, backend: Backend, ctx: Context) -> ValidationResult
Chooses the appropriate validation strategy and applies it to the given context.
Uses validation_fn if one was provided, otherwise falls back to LLM-as-a-Judge
by generating a judgement response with the backend.
Args:
backend: The inference backend used when the LLM-as-a-Judge strategy is selected.
ctx: The context to validate, which must contain a ModelOutputThunk as its last output.
format: Optional structured output format for the judgement call.
model_options: Optional model options to pass to the backend during the judgement call.
Returns:
The result of the validation, including a boolean pass/fail and optional metadata.
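The strategy selection described for validate can be sketched as a plain function. Everything here is an assumption made for illustration: validate_sketch and judge are hypothetical names, judge stands in for the backend judgement call, ctx is treated as an opaque object, and the sketch returns a bare bool rather than a ValidationResult.

```python
from typing import Callable, Optional


def validate_sketch(
    ctx: object,
    validation_fn: Optional[Callable[[object], object]] = None,
    judge: Optional[Callable[[object], str]] = None,
    output_to_bool: Callable[[str], bool] = lambda s: s.strip().lower() == "yes",
) -> bool:
    """Pick a validation strategy in the order the text above describes."""
    if validation_fn is not None:
        # A custom function takes precedence; bool() of its return value
        # is the verdict.
        return bool(validation_fn(ctx))
    if judge is None:
        raise ValueError("no validation_fn and no judge available")
    # Fallback: ask the (hypothetical) judge, then convert its text to a bool.
    return output_to_bool(judge(ctx))


by_function = validate_sketch("short text", validation_fn=lambda c: len(c) < 20)
by_judge = validate_sketch("short text", judge=lambda c: "Yes")
```

When both a validation function and a judge are supplied, the sketch never calls the judge, matching the documented fallback order.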
FUNC stream_validate
stream_validate(self, chunk: str) -> PartialValidationResult
Hook for per-chunk streaming validation.
The default implementation returns PartialValidationResult("unknown")
— meaning insufficient data to decide yet. Subclasses override this method
to inspect the current chunk and return "pass" or "fail" early.
Implementations may accumulate state on self across calls within a
single attempt. The orchestrator clones the requirement (copy(req))
before each attempt, so state does not bleed across retries.
Shallow-copy caveat: mutable container fields (e.g. self._buffer = [])
are shared by reference under copy(). Reassign rather than mutate in
place (self._buffer = self._buffer + [chunk], not
self._buffer.append(chunk)), or override __copy__ for proper
isolation. If an override raises, the enclosing
mellea.stdlib.streaming.stream_with_chunking call aborts before
any backend generation starts and the exception propagates unchanged.
Overrides with externally visible side effects (file writes, network
calls) should perform them only after any logic that could raise, since
the framework cannot roll them back.
Implementations must not call mot.astream() or otherwise read the
underlying stream; the orchestrator is the single consumer of the MOT
stream (see ModelOutputThunk.astream). Requirements that need access
to the text seen so far should accumulate it themselves from the
chunk values they receive.
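The accumulation pattern and the shallow-copy caveat can be sketched with the standard copy module. NoForbiddenWordSketch is a hypothetical stand-in, not a Requirement subclass, and its stream_validate returns a bare string instead of a PartialValidationResult to keep the example self-contained.

```python
import copy


class NoForbiddenWordSketch:
    """Hypothetical per-chunk check that fails once a word appears."""

    def __init__(self, forbidden):
        self.forbidden = forbidden.lower()
        self._seen = []

    def stream_validate(self, chunk):
        # Reassign rather than mutate, so shallow copies stay isolated.
        self._seen = self._seen + [chunk]
        text = " ".join(self._seen).lower()
        return "fail" if self.forbidden in text else "unknown"


template = NoForbiddenWordSketch("secret")

# Each attempt works on its own shallow copy of the template.
attempt_1 = copy.copy(template)
first = attempt_1.stream_validate("Nothing to see here.")
second = attempt_1.stream_validate("The secret is out.")

attempt_2 = copy.copy(template)
third = attempt_2.stream_validate("A fresh start.")
template_clean = template._seen == []

# Pitfall for contrast: in-place mutation leaks through a shallow copy,
# because the copy and the template still share the same list object.
leaky = copy.copy(template)
leaky._seen.append("leaked chunk")
leaked_into_template = template._seen == ["leaked chunk"]
```

Because stream_validate reassigns self._seen, the second attempt starts from an empty history; the final three lines show how append() on a shared list would instead alter the template.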
Args:
chunk: A single complete semantic chunk produced by the chunking strategy (e.g. one sentence for SentenceChunker). This is the delta since the previous stream_validate call for this attempt, not the accumulated output. Requirements that need earlier context should retain it on self across calls.
backend: The inference backend, available for backend-assisted checks.
ctx: The current generation context. During streaming the MOT is not yet computed, so ctx does not contain the generated output; use chunk (and any state accumulated on self) instead.
Returns:
"unknown" by default. Subclasses may return "pass" (constraint satisfied so far) or "fail" (constraint violated, streaming should be aborted).
"pass" does not short-circuit the final validate() call; the orchestrator decides whether to skip it.
FUNC parts
parts(self) -> list[Component | CBlock]
Returns all of the constituent parts of a Requirement.
Returns:
List of constituent components. Empty by default; subclasses override
to expose their internal structure.
FUNC format_for_llm
format_for_llm(self) -> TemplateRepresentation | str
Returns a TemplateRepresentation for LLM-as-a-Judge evaluation of this requirement.
Populates the template with the requirement's description and the stored model
_output. Must only be called from within a validate call for this same requirement,
after _output has been set.
Returns:
TemplateRepresentation | str: A TemplateRepresentation containing the description
and the model output to be judged.