Classes
CLASS BudgetForcingSamplingStrategy
Sampling strategy that enforces a token budget for chain-of-thought reasoning.
Extends RejectionSamplingStrategy with explicit control over the <think>
block size and the answer block size. On each loop iteration,
think_budget_forcing interleaves forced-thinking and final-answer generation,
after which the standard rejection-sampling validation pass determines whether to
accept or retry.
Currently only supports the OllamaModelBackend.
Args:
- think_max_tokens: Tokens allocated for the thinking block. Defaults to 4096.
- answer_max_tokens: Tokens allocated for the answer block. None means unbounded.
- start_think_token: Token opening the thinking block. Defaults to "<think>".
- end_think_token: Token closing the thinking block. Defaults to "</think>".
- begin_response_token: Optional token opening the response block. Defaults to "".
- end_response_token: Token closing the response block.
- think_more_suffix: Suffix to force continued thinking. Empty string disables forcing.
- answer_suffix: Suffix to elicit a final answer.
- loop_budget: Rejection-sampling loop count. Must be > 0.
- requirements: Requirements to validate against.
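The interplay of the thinking budget, the forcing suffix, and the answer suffix can be illustrated with a small, self-contained sketch. This is not the library's think_budget_forcing implementation: `budget_forcing_sketch`, `fake_generate`, and `count_tokens` are hypothetical names, the "backend" is a toy function, and tokens are approximated by whitespace-separated words. Only the parameter names are taken from the argument list above.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a backend call: (prompt, max_tokens) -> generated text.
GenerateFn = Callable[[str, Optional[int]], str]


def count_tokens(text: str) -> int:
    """Crude token count (whitespace split); a real backend uses its tokenizer."""
    return len(text.split())


def budget_forcing_sketch(
    generate: GenerateFn,
    prompt: str,
    think_max_tokens: int = 4096,
    answer_max_tokens: Optional[int] = None,
    start_think_token: str = "<think>",
    end_think_token: str = "</think>",
    think_more_suffix: str = "",
    answer_suffix: str = "",
) -> tuple[str, str]:
    """Return (thinking, answer) generated under a thinking-token budget."""
    thinking = ""
    remaining = think_max_tokens  # assumption: only generated tokens count
    while remaining > 0:
        chunk = generate(prompt + start_think_token + thinking, remaining)
        # Drop the closing token (and anything after it) so thinking can continue.
        chunk = chunk.split(end_think_token)[0].strip()
        used = count_tokens(chunk)
        if used == 0:
            break  # nothing generated; stop instead of looping forever
        thinking += chunk
        remaining -= used
        if not think_more_suffix or remaining <= 0:
            break  # forcing disabled, or budget spent
        thinking += think_more_suffix  # nudge the model to keep thinking
    # Close the thinking block and elicit the final answer.
    answer_prompt = (
        prompt + start_think_token + thinking + end_think_token + answer_suffix
    )
    answer = generate(answer_prompt, answer_max_tokens)
    return thinking, answer


def fake_generate(prompt: str, max_tokens: Optional[int]) -> str:
    """Toy model: thinks for three words, then tries to close the block."""
    if prompt.endswith("Final answer:"):
        return "42"
    words = ["a", "b", "c", "</think>"]
    return " ".join(words if max_tokens is None else words[:max_tokens])


thinking, answer = budget_forcing_sketch(
    fake_generate,
    "Q: 6*7? ",
    think_max_tokens=8,
    think_more_suffix=" Wait, ",
    answer_suffix=" Final answer:",
)
```

In this toy run the model tries to stop thinking after three words, so the suffix is appended twice before the eight-word budget is spent. With `think_more_suffix=""` the loop exits after the first thinking chunk, matching the "empty string disables forcing" rule.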
FUNC sample
Args:
- action: The action object to be sampled.
- context: The context to be passed to the sampling strategy.
- backend: The backend used for generating samples.
- requirements: List of requirements to test against (merged with global requirements).
- validation_ctx: Optional context to use for validation. If None, validation_ctx = ctx.
- format: Output format for structured outputs.
- model_options: Model options to pass to the backend during generation / validation.
- tool_calls: True if tool calls should be used during this sampling strategy.
- show_progress: If true, a tqdm progress bar is used. Otherwise, messages will still be sent to flog.
Returns:
- A result object indicating the success or failure of the sampling process.
Raises:
- AssertionError: Raised if any required component (repair, select_from_failure, validate, or generate) is missing when sampling begins.
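The accept-or-retry behaviour governed by loop_budget and requirements can be sketched as a plain loop. This is an illustration under stated assumptions, not the library's sample method: `SketchResult` and `rejection_sample_sketch` are hypothetical names, requirements are modelled as simple predicates, and the repair and select_from_failure steps are reduced to "retry" and "return the last attempt".

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SketchResult:
    """Hypothetical result object: outcome, chosen value, and all attempts."""
    success: bool
    value: str
    attempts: list[str] = field(default_factory=list)


def rejection_sample_sketch(
    generate: Callable[[], str],
    requirements: list[Callable[[str], bool]],
    loop_budget: int = 1,
) -> SketchResult:
    # Mirrors the documented constraint that loop_budget must be > 0.
    assert loop_budget > 0, "loop_budget must be > 0"
    attempts: list[str] = []
    for _ in range(loop_budget):
        candidate = generate()
        attempts.append(candidate)
        # Accept the first candidate that satisfies every requirement.
        if all(req(candidate) for req in requirements):
            return SketchResult(True, candidate, attempts)
    # Budget exhausted: fall back to the last attempt (a simplification).
    return SketchResult(False, attempts[-1], attempts)


outputs = iter(["no", "maybe", "yes"])
result = rejection_sample_sketch(
    lambda: next(outputs),
    requirements=[lambda s: s == "yes"],
    loop_budget=3,
)
```

Here the third candidate passes validation, so `result.success` is true and `result.attempts` records all three tries. With a smaller budget the loop ends unsuccessfully and the caller still receives a candidate to inspect.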