Classes
CLASS Message
Schema for a message in the test data.
Attributes:
role: The role of the message sender (e.g."user"or"assistant").content: The text content of the message.
CLASS Example
Schema for an example in the test data.
Attributes:
input: The input messages for this example.targets: The expected target messages for scoring.input_id: An optional identifier for this input example.
CLASS TestData
Schema for test data loaded from json.
Attributes:
source: Origin identifier for this test dataset.name: Human-readable name for this test dataset.instructions: Evaluation guidelines used by the judge model.examples: The individual input/target example pairs.id: Unique identifier for this test dataset.
FUNC validate_examples
v: The value of theexamplesfield being validated.
- list[Example]: The validated examples list, unchanged.
ValueError: If the examples list is empty.
CLASS TestBasedEval
Each TestBasedEval represents a single unit test.
Args:
source: Origin identifier for this test dataset.name: Human-readable name for this test.instructions: Evaluation guidelines used by the judge model.inputs: The input texts for each example.targets: Expected target strings for each input.Noneis treated as an empty list.test_id: Optional unique identifier for this test.input_ids: Optional identifiers for each input.
FUNC parts
- list[Component | CBlock]: Always an empty list; the component
- renders entirely via
format_for_llm.
FUNC format_for_llm
- A template representation containing the
- judge context (input, prediction, target, guidelines) set by
set_judge_context, or an empty args dict if no context has- been set yet.
FUNC set_judge_context
input_text: The original input text shown to the model.prediction: The model’s generated output to evaluate.targets_for_input: Reference target strings for this input. An empty list results in"N/A"as the target text.
FUNC from_json_file
TestBasedEval per unit test.
Args:
filepath: Path to a JSON file containing one test-data object or a JSON array of test-data objects.
- list[TestBasedEval]: A list of
TestBasedEvalinstances, one for - each object found in the file.
ValueError: If any test-data object in the file does not conform to theTestDataschema.