Risk checking with Granite Guardian models via existing backends.

Classes

CLASS GuardianRisk

Risk definitions for Granite Guardian models. Based on the definitions in https://github.com/ibm-granite/granite-guardian, updated to support Granite Guardian 3.3 8B. Attributes:
  • HARM: General harmful content risk.
  • GROUNDEDNESS: Factual groundedness / attribution risk.
  • PROFANITY: Profane language risk.
  • ANSWER_RELEVANCE: Risk that the answer is not relevant to the question.
  • JAILBREAK: Jailbreak attempt risk.
  • FUNCTION_CALL: Unsafe or invalid function-call risk.
  • SOCIAL_BIAS: Social bias risk.
  • VIOLENCE: Violent content risk.
  • SEXUAL_CONTENT: Explicit sexual content risk.
  • UNETHICAL_BEHAVIOR: Unethical behaviour risk.
Methods:

FUNC get_available_risks

get_available_risks(cls) -> list[str]
Return a list of all available risk type identifiers. Returns:
  • list[str]: String values of all GuardianRisk enum members.
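
As a rough illustration of what "string values of all enum members" means, the sketch below uses a hypothetical stand-in enum, RiskSketch, rather than the real GuardianRisk. The lowercase member values and the reduced member list are assumptions made for the example; the actual identifiers may differ.

```python
from enum import Enum


class RiskSketch(str, Enum):
    """Hypothetical stand-in for GuardianRisk; member values are assumed."""

    HARM = "harm"
    GROUNDEDNESS = "groundedness"
    JAILBREAK = "jailbreak"

    @classmethod
    def get_available_risks(cls) -> list[str]:
        # Collect the string value of every member, in definition order.
        return [member.value for member in cls]


available = RiskSketch.get_available_risks()
print(available)
```

Returning plain strings rather than enum members lets callers display or validate the identifiers without importing the enum itself.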

CLASS GuardianCheck

Enhanced risk checking using Granite Guardian 3.3 8B with multiple backend support. [DEPRECATED as of V 0.4 — Use Intrinsics instead] Args:
  • risk: The type of risk to check for. Required unless custom_criteria is provided.
  • backend_type: Backend type to use — "ollama" or "huggingface".
  • model_version: Specific Guardian model version. Defaults to the appropriate 8B model for the chosen backend.
  • device: Device string for HuggingFace inference (e.g. "cuda").
  • ollama_url: Base URL for the Ollama server.
  • thinking: Enable chain-of-thought reasoning mode in the Guardian model.
  • custom_criteria: Free-text criteria string used in place of a standard GuardianRisk value.
  • context_text: Context document for groundedness checks.
  • tools: Tool schemas for function-call validation.
  • backend: Pre-initialised backend instance to reuse; avoids loading the model multiple times.
Methods:

FUNC get_effective_risk

get_effective_risk(self) -> str
Return the effective risk criteria to use for validation. Returns the custom_criteria string when one was provided, otherwise returns the risk identifier set during initialisation. Returns:
  • The active risk/criteria string forwarded to the Guardian model.
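
The fallback described above can be sketched as a small standalone function. The name effective_risk is hypothetical, and raising ValueError when neither argument is supplied is an assumption; the source only states that risk is required unless custom_criteria is provided.

```python
from typing import Optional


def effective_risk(risk: Optional[str], custom_criteria: Optional[str]) -> str:
    """Hypothetical helper mirroring the documented fallback order."""
    # Custom criteria, when provided, take precedence over the standard risk.
    if custom_criteria is not None:
        return custom_criteria
    # Assumption: fail loudly if neither value was supplied.
    if risk is None:
        raise ValueError("either risk or custom_criteria must be provided")
    return risk


print(effective_risk("harm", None))
print(effective_risk("harm", "Mentions a competitor by name"))
```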

FUNC get_available_risks

get_available_risks(cls) -> list[str]
Return a list of all available standard risk type identifiers. Returns:
  • list[str]: String values of all GuardianRisk enum members.

FUNC validate

validate(self, backend: Backend, ctx: Context) -> ValidationResult
Validate a conversation using Granite Guardian via the selected backend. Builds a minimal chat context from the current session context, invokes the Guardian model, and parses its <score>yes/no</score> output. A "No" label (risk not detected) is treated as a passing validation result. Args:
  • backend: The session backend (used as a fallback context source; the Guardian’s own backend is used for generation).
  • ctx: The current conversation context to validate.
  • format: Unused; present for interface compatibility.
  • model_options: Additional model options merged into the Guardian backend call.
Returns:
  • ValidationResult: result=True when the content is considered safe (Guardian returns "No"), result=False otherwise.
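
To make the score-parsing step concrete, here is a minimal sketch of how a <score>yes/no</score> tag could be mapped to a pass/fail boolean. The function name passes_guardian, the regular expression, and the choice to treat unparseable output as a failure are all assumptions for illustration, not the library's actual implementation.

```python
import re

# Matches the <score> tag, tolerating whitespace and letter case.
_SCORE_RE = re.compile(r"<score>\s*(yes|no)\s*</score>", re.IGNORECASE)


def passes_guardian(raw_output: str) -> bool:
    """Hypothetical parser: True when the Guardian reports no risk."""
    match = _SCORE_RE.search(raw_output)
    if match is None:
        # Assumption: fail closed when no score tag can be found.
        return False
    # "no" means the risk was not detected, so validation passes.
    return match.group(1).lower() == "no"


print(passes_guardian("<score> no </score>"))
print(passes_guardian("<think>reasoning</think><score>Yes</score>"))
```

Failing closed on malformed output is the conservative choice for a safety check; a caller that prefers to surface parse errors separately could raise an exception instead.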