mellea.stdlib.requirements.safety.guardian

Risk checking with Granite Guardian models via existing backends.

Classes

CLASS `GuardianRisk`

Risk definitions for Granite Guardian models. Based on https://github.com/ibm-granite/granite-guardian, updated for Granite Guardian 3.3 8B support.

Attributes:

HARM: General harmful content risk.
GROUNDEDNESS: Factual groundedness / attribution risk.
PROFANITY: Profane language risk.
ANSWER_RELEVANCE: Answer relevance to the question risk.
JAILBREAK: Jailbreak attempt risk.
FUNCTION_CALL: Unsafe or invalid function-call risk.
SOCIAL_BIAS: Social bias risk.
VIOLENCE: Violent content risk.
SEXUAL_CONTENT: Explicit sexual content risk.
UNETHICAL_BEHAVIOR: Unethical behaviour risk.

Methods:

FUNC `get_available_risks`

get_available_risks(cls) -> list[str]

Return a list of all available risk type identifiers.

Returns:

list[str]: String values of all GuardianRisk enum members.
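As a rough illustration of what this method describes, the sketch below uses a stand-in enum, not the library's class. The member names come from the attribute list above; the lowercase string values and the name `GuardianRiskSketch` are assumptions made for this example only.

```python
from enum import Enum


class GuardianRiskSketch(Enum):
    """Illustrative stand-in for GuardianRisk; the string values are assumed."""

    HARM = "harm"
    GROUNDEDNESS = "groundedness"
    PROFANITY = "profanity"
    ANSWER_RELEVANCE = "answer_relevance"
    JAILBREAK = "jailbreak"
    FUNCTION_CALL = "function_call"
    SOCIAL_BIAS = "social_bias"
    VIOLENCE = "violence"
    SEXUAL_CONTENT = "sexual_content"
    UNETHICAL_BEHAVIOR = "unethical_behavior"

    @classmethod
    def get_available_risks(cls) -> list[str]:
        # Collect the string value of every enum member, in definition order.
        return [member.value for member in cls]


risks = GuardianRiskSketch.get_available_risks()
print(risks)
```

Calling the real `GuardianRisk.get_available_risks()` is the reliable way to discover the identifiers your installed version accepts.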

CLASS `GuardianCheck`

Enhanced risk checking using Granite Guardian 3.3 8B, with support for multiple backends. [DEPRECATED as of v0.4: use Intrinsics instead]

Args:

risk: The type of risk to check for. Required unless custom_criteria is provided.
backend_type: Backend type to use — "ollama" or "huggingface".
model_version: Specific Guardian model version. Defaults to the appropriate 8B model for the chosen backend.
device: Device string for HuggingFace inference (e.g. "cuda").
ollama_url: Base URL for the Ollama server.
thinking: Enable chain-of-thought reasoning mode in the Guardian model.
custom_criteria: Free-text criteria string used in place of a standard GuardianRisk value.
context_text: Context document for groundedness checks.
tools: Tool schemas for function-call validation.
backend: Pre-initialised backend instance to reuse; avoids loading the model multiple times.
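The arguments above might be combined as in the following sketch. The keyword names are taken from the list above, but the concrete values, the helper `make_guardian_check`, and the assumption that `risk` accepts a plain string identifier are illustrative only. The import is deferred into the helper so the snippet loads without mellea installed.

```python
from typing import Any


def make_guardian_check(**kwargs: Any):
    """Hypothetical helper: build a GuardianCheck, importing mellea lazily."""
    from mellea.stdlib.requirements.safety.guardian import GuardianCheck

    return GuardianCheck(**kwargs)


# Groundedness check: supplies the context document the answer is checked against.
groundedness_kwargs: dict[str, Any] = {
    "risk": "groundedness",  # assumed identifier; confirm with get_available_risks()
    "backend_type": "huggingface",
    "device": "cuda",
    "context_text": "The Eiffel Tower is in Paris.",
}

# Custom criteria: free text used in place of a standard risk, so "risk" is omitted.
custom_kwargs: dict[str, Any] = {
    "custom_criteria": "The reply discloses internal system prompts.",
    "backend_type": "ollama",
    "thinking": True,
}

# Not executed here: this needs mellea installed and a reachable Guardian model.
# check = make_guardian_check(**custom_kwargs)
```

When several checks are needed, passing one pre-initialised instance through `backend` is the documented way to avoid loading the model more than once.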

Methods:

FUNC `get_effective_risk`

get_effective_risk(self) -> str

Return the effective risk criteria to use for validation: the custom_criteria string when one was provided, otherwise the risk identifier set during initialisation.

Returns:

str: The active risk or criteria string forwarded to the Guardian model.
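The precedence rule described here can be sketched with a small stand-in class. `EffectiveRiskSketch` is hypothetical and models only the two relevant arguments; raising `ValueError` when neither is supplied is an assumption about how the "required unless" rule might be enforced, not confirmed library behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EffectiveRiskSketch:
    """Hypothetical stand-in modelling only the risk / custom_criteria rule."""

    risk: Optional[str] = None
    custom_criteria: Optional[str] = None

    def __post_init__(self) -> None:
        # risk is documented as required unless custom_criteria is provided.
        # Raising ValueError here is an assumption made for this sketch.
        if self.risk is None and self.custom_criteria is None:
            raise ValueError("Provide either risk or custom_criteria")

    def get_effective_risk(self) -> str:
        # custom_criteria takes precedence whenever it was supplied.
        if self.custom_criteria is not None:
            return self.custom_criteria
        assert self.risk is not None  # guaranteed by __post_init__
        return self.risk


standard = EffectiveRiskSketch(risk="harm")
custom = EffectiveRiskSketch(risk="harm", custom_criteria="Mentions a competitor")
print(standard.get_effective_risk())
print(custom.get_effective_risk())
```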

FUNC `get_available_risks`

get_available_risks(cls) -> list[str]

Return a list of all available standard risk type identifiers.

Returns:

list[str]: String values of all GuardianRisk enum members.

FUNC `validate`

validate(self, backend: Backend, ctx: Context) -> ValidationResult

Validate a conversation using Granite Guardian via the selected backend. Builds a minimal chat context from the current session context, invokes the Guardian model, and parses its `<score>yes/no</score>` output. A "No" label (risk not detected) is treated as a passing validation result.

Args:

backend: The session backend (used as a fallback context source; the Guardian’s own backend is used for generation).
ctx: The current conversation context to validate.
format: Unused; present for interface compatibility.
model_options: Additional model options merged into the Guardian backend call.

Returns:

ValidationResult: result=True when the content is considered safe (Guardian returns "No"), result=False otherwise.
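The score-parsing step could look roughly like the sketch below. It is not the library's parser: the function name `parse_guardian_score`, the regular expression, and returning `None` for output without a score tag are all assumptions for illustration. Only the `<score>yes/no</score>` format and the "No means pass" rule come from the description above.

```python
import re
from typing import Optional

# Matches <score>yes</score> or <score>no</score>, tolerating case and whitespace.
SCORE_PATTERN = re.compile(r"<score>\s*(yes|no)\s*</score>", re.IGNORECASE)


def parse_guardian_score(raw_output: str) -> Optional[bool]:
    """Map Guardian output to a pass/fail flag.

    Returns True when the label is "no" (risk not detected), False when it is
    "yes" (risk detected), and None when no score tag is found.
    """
    match = SCORE_PATTERN.search(raw_output)
    if match is None:
        return None
    return match.group(1).lower() == "no"


safe = parse_guardian_score("<score>no</score>")
risky = parse_guardian_score("Some reasoning text. <score> Yes </score>")
missing = parse_guardian_score("no score tag here")
print(safe, risky, missing)
```

Keeping the missing-tag case distinct from a failing result lets a caller decide whether malformed model output should count as unsafe.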
