Classes
CLASS GuardianRisk
Risk definitions for Granite Guardian models.
Based on https://github.com/ibm-granite/granite-guardian but updated for 3.3 8B support.
Attributes:
- HARM: General harmful content risk.
- GROUNDEDNESS: Factual groundedness / attribution risk.
- PROFANITY: Profane language risk.
- ANSWER_RELEVANCE: Answer relevance to the question risk.
- JAILBREAK: Jailbreak attempt risk.
- FUNCTION_CALL: Unsafe or invalid function-call risk.
- SOCIAL_BIAS: Social bias risk.
- VIOLENCE: Violent content risk.
- SEXUAL_CONTENT: Explicit sexual content risk.
- UNETHICAL_BEHAVIOR: Unethical behaviour risk.
FUNC get_available_risks
- list[str]: String values of all GuardianRisk enum members.
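As a rough illustration of how a string-valued enum can expose its values this way, here is a minimal sketch. The class name `GuardianRiskSketch` is hypothetical, and the lower-case member values are an assumption for illustration; the actual string values used by the library are not stated here.

```python
from enum import Enum


class GuardianRiskSketch(str, Enum):
    """Hypothetical stand-in for GuardianRisk; the values below are assumed."""

    HARM = "harm"
    GROUNDEDNESS = "groundedness"
    PROFANITY = "profanity"
    ANSWER_RELEVANCE = "answer_relevance"
    JAILBREAK = "jailbreak"
    FUNCTION_CALL = "function_call"
    SOCIAL_BIAS = "social_bias"
    VIOLENCE = "violence"
    SEXUAL_CONTENT = "sexual_content"
    UNETHICAL_BEHAVIOR = "unethical_behavior"

    @classmethod
    def get_available_risks(cls) -> list[str]:
        # Collect the string value of every enum member, in definition order.
        return [member.value for member in cls]
```

Calling `GuardianRiskSketch.get_available_risks()` returns one string per member, which is convenient for validating user-supplied risk names before constructing a check.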
CLASS GuardianCheck
Enhanced risk checking using Granite Guardian 3.3 8B with multiple backend support.
[DEPRECATED as of V 0.4 — Use Intrinsics instead]
Args:
- risk: The type of risk to check for. Required unless custom_criteria is provided.
- backend_type: Backend type to use: "ollama" or "huggingface".
- model_version: Specific Guardian model version. Defaults to the appropriate 8B model for the chosen backend.
- device: Device string for HuggingFace inference (e.g. "cuda").
- ollama_url: Base URL for the Ollama server.
- thinking: Enable chain-of-thought reasoning mode in the Guardian model.
- custom_criteria: Free-text criteria string used in place of a standard GuardianRisk value.
- context_text: Context document for groundedness checks.
- tools: Tool schemas for function-call validation.
- backend: Pre-initialised backend instance to reuse; avoids loading the model multiple times.
FUNC get_effective_risk
Returns the custom_criteria string when one was provided; otherwise
returns the risk identifier set during initialisation.
Returns:
- The active risk/criteria string forwarded to the Guardian model.
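The fallback described above can be sketched as a small standalone function. The name `effective_risk` and the `ValueError` raised when neither value is supplied are assumptions for illustration, not the library's actual implementation.

```python
from typing import Optional


def effective_risk(risk: Optional[str], custom_criteria: Optional[str]) -> str:
    """Hypothetical helper: prefer custom criteria, else fall back to the risk."""
    # Custom criteria, when provided, take precedence over the standard risk.
    if custom_criteria is not None:
        return custom_criteria
    # Assumed behaviour: with no custom criteria, a risk identifier is required.
    if risk is None:
        raise ValueError("either risk or custom_criteria must be provided")
    return risk
```

For example, `effective_risk("harm", None)` yields the risk identifier, while `effective_risk("harm", "Mentions a competitor")` yields the free-text criteria.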
FUNC get_available_risks
- list[str]: String values of all GuardianRisk enum members.
FUNC validate
Parses the Guardian model's <score>yes/no</score> output. A "No"
label (risk not detected) is treated as a passing validation result.
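The label handling described above might look roughly like the following sketch. The helper names `parse_guardian_label` and `is_safe` are hypothetical, and treating output with no score tag as unsafe is an assumed fail-closed choice rather than confirmed library behaviour.

```python
import re
from typing import Optional

# Matches a <score>yes</score> or <score>no</score> tag, ignoring case and padding.
_SCORE_RE = re.compile(r"<score>\s*(yes|no)\s*</score>", re.IGNORECASE)


def parse_guardian_label(output: str) -> Optional[str]:
    """Return "Yes" or "No" from the model output, or None if no tag is found."""
    match = _SCORE_RE.search(output)
    return match.group(1).capitalize() if match else None


def is_safe(output: str) -> bool:
    """Hypothetical check: only an explicit "No" (risk not detected) passes."""
    # Assumption: missing or malformed output fails closed (treated as unsafe).
    return parse_guardian_label(output) == "No"
```

Failing closed on unparseable output keeps a truncated or malformed generation from being mistaken for a safe verdict.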
Args:
- backend: The session backend (used as a fallback context source; the Guardian's own backend is used for generation).
- ctx: The current conversation context to validate.
- format: Unused; present for interface compatibility.
- model_options: Additional model options merged into the Guardian backend call.
Returns:
- result=True when the content is considered safe (Guardian returns "No"), result=False otherwise.