Risk checking with Granite Guardian models via existing backends.

Classes

CLASS GuardianRisk

Risk definitions for Granite Guardian models. Based on https://github.com/ibm-granite/granite-guardian but updated to support Granite Guardian 3.3 8B.
Attributes:
  • HARM: General harmful content risk.
  • GROUNDEDNESS: Factual groundedness / attribution risk.
  • PROFANITY: Profane language risk.
  • ANSWER_RELEVANCE: Answer relevance to the question risk.
  • JAILBREAK: Jailbreak attempt risk.
  • FUNCTION_CALL: Unsafe or invalid function-call risk.
  • SOCIAL_BIAS: Social bias risk.
  • VIOLENCE: Violent content risk.
  • SEXUAL_CONTENT: Explicit sexual content risk.
  • UNETHICAL_BEHAVIOR: Unethical behaviour risk.
Methods:

FUNC get_available_risks

get_available_risks(cls) -> list[str]
Return a list of all available risk type identifiers.
Returns:
  • list[str]: String values of all GuardianRisk enum members.
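
The behaviour described above can be sketched with a plain Python enum. This is a minimal illustration, not the library's source: RiskSketch is a hypothetical stand-in for GuardianRisk, it lists only three of the members, and the lowercase string values are assumptions.

```python
from enum import Enum


class RiskSketch(Enum):
    """Hypothetical stand-in for GuardianRisk; the string values are assumed."""

    HARM = "harm"
    GROUNDEDNESS = "groundedness"
    JAILBREAK = "jailbreak"

    @classmethod
    def get_available_risks(cls) -> list[str]:
        # Iterating an Enum class yields its members in definition order,
        # so the result is the list of each member's string value.
        return [member.value for member in cls]


available = RiskSketch.get_available_risks()
```

A caller could use such a list to validate a user-supplied risk name before constructing a check.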

CLASS GuardianCheck

Enhanced risk checking using Granite Guardian 3.3 8B with multiple backend support.
[DEPRECATED as of v0.4: use Intrinsics instead.]
Args:
  • risk: The type of risk to check for. Required unless custom_criteria is provided.
  • backend_type: Backend type to use — "ollama" or "huggingface".
  • model_version: Specific Guardian model version. Defaults to the appropriate 8B model for the chosen backend.
  • device: Device string for HuggingFace inference (e.g. "cuda").
  • ollama_url: Base URL for the Ollama server.
  • thinking: Enable chain-of-thought reasoning mode in the Guardian model.
  • custom_criteria: Free-text criteria string used in place of a standard GuardianRisk value.
  • context_text: Context document for groundedness checks.
  • tools: Tool schemas for function-call validation.
  • backend: Pre-initialised backend instance to reuse; avoids loading the model multiple times.
Methods:

FUNC get_effective_risk

get_effective_risk(self) -> str
Return the effective risk criteria to use for validation. Returns the custom_criteria string when one was provided; otherwise returns the risk identifier set during initialisation.
Returns:
  • The active risk/criteria string forwarded to the Guardian model.
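
The precedence rule described here can be sketched as follows. CheckSketch is a hypothetical stand-in that models only the risk and custom_criteria arguments; raising ValueError when both are missing is an assumption based on the note that risk is required unless custom_criteria is provided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckSketch:
    """Hypothetical stand-in for GuardianCheck; models only two arguments."""

    risk: Optional[str] = None
    custom_criteria: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a check with neither argument is rejected up front.
        if self.risk is None and self.custom_criteria is None:
            raise ValueError("Provide either risk or custom_criteria")

    def get_effective_risk(self) -> str:
        # Custom criteria take precedence over the standard risk identifier.
        if self.custom_criteria is not None:
            return self.custom_criteria
        assert self.risk is not None  # guaranteed by __post_init__
        return self.risk


standard = CheckSketch(risk="harm").get_effective_risk()
custom = CheckSketch(risk="harm", custom_criteria="Mentions pricing").get_effective_risk()
```

In this sketch, standard holds the risk identifier and custom holds the free-text criteria, mirroring the two cases in the description.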

FUNC get_available_risks

get_available_risks(cls) -> list[str]
Return a list of all available standard risk type identifiers.
Returns:
  • list[str]: String values of all GuardianRisk enum members.

FUNC validate

validate(self, backend: Backend, ctx: Context) -> ValidationResult
Validate a conversation using Granite Guardian via the selected backend. Builds a minimal chat context from the current session context, invokes the Guardian model, and parses its <score>yes/no</score> output. A "No" label (risk not detected) is treated as a passing validation result.
Args:
  • backend: The session backend (used as a fallback context source; the Guardian’s own backend is used for generation).
  • ctx: The current conversation context to validate.
  • format: Unused; present for interface compatibility.
  • model_options: Additional model options merged into the Guardian backend call.
Returns:
  • ValidationResult: result=True when the content is considered safe (Guardian returns "No"), result=False otherwise.
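
The score-parsing step can be illustrated with a small standalone sketch. The function names and the regular expression are hypothetical and are not taken from the library; treating output with no <score> tag as a failing result is also an assumption.

```python
import re
from typing import Optional

# Matches <score>yes</score> or <score>no</score>, ignoring case and padding.
_SCORE_RE = re.compile(r"<score>\s*(yes|no)\s*</score>", re.IGNORECASE)


def parse_guardian_label(output: str) -> Optional[str]:
    """Return "yes" or "no" from the model output, or None if no tag is found."""
    match = _SCORE_RE.search(output)
    return match.group(1).lower() if match else None


def is_safe(output: str) -> bool:
    # "no" means the risk was not detected, which counts as a pass.
    # Assumption: output without a recognisable label is treated as a failure.
    return parse_guardian_label(output) == "no"


passed = is_safe("<score> No </score>")
flagged = is_safe("<think>reasoning text</think><score>yes</score>")
```

Failing closed on unparseable output is one conservative design choice; a real integration might instead surface a parsing error to the caller.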