
mellea.telemetry.metrics

OpenTelemetry metrics instrumentation for Mellea.

Provides metrics collection using OpenTelemetry Metrics API with support for:

  • Counters: Monotonically increasing values (e.g., request counts, token usage)
  • Histograms: Value distributions (e.g., latency, token counts)
  • UpDownCounters: Values that can increase or decrease (e.g., active sessions)

Metrics Exporters:

  • Console: Print metrics to console for debugging
  • OTLP: Export to OpenTelemetry Protocol collectors (Jaeger, Grafana, etc.)
  • Prometheus: Register metrics with prometheus_client registry for scraping

Configuration via environment variables:

General:

  • MELLEA_METRICS_ENABLED: Enable/disable metrics collection (default: false)
  • OTEL_SERVICE_NAME: Service name for metrics (default: mellea)

Console Exporter (debugging):

  • MELLEA_METRICS_CONSOLE: Print metrics to console (default: false)

OTLP Exporter (production observability):

  • MELLEA_METRICS_OTLP: Enable OTLP metrics exporter (default: false)
  • OTEL_EXPORTER_OTLP_ENDPOINT: OTLP endpoint for all signals (optional)
  • OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: Metrics-specific endpoint (optional, overrides general)
  • OTEL_METRIC_EXPORT_INTERVAL: Export interval in milliseconds (default: 60000)
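The precedence described above (the metrics-specific endpoint overrides the general one, and the interval defaults to 60000 ms) can be sketched as follows. This is only an illustration of the documented rules, not Mellea's actual resolution code; the function names `resolve_metrics_endpoint` and `resolve_export_interval_ms` are hypothetical.

```python
import os


def resolve_metrics_endpoint(env=os.environ):
    """Return the OTLP endpoint to use for metrics, or None if none is set."""
    # The metrics-specific variable wins over the general one.
    return (
        env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT")
        or env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    )


def resolve_export_interval_ms(env=os.environ):
    """Return the export interval in milliseconds (documented default: 60000)."""
    return int(env.get("OTEL_METRIC_EXPORT_INTERVAL", "60000"))


general_only = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317"}
both = dict(general_only, OTEL_EXPORTER_OTLP_METRICS_ENDPOINT="http://metrics:4317")
print(resolve_metrics_endpoint(general_only))
print(resolve_metrics_endpoint(both))
```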

Prometheus Exporter:

  • MELLEA_METRICS_PROMETHEUS: Enable Prometheus metric reader (default: false)

Pricing (for cost counter):

  • MELLEA_PRICING_FILE: Path to a JSON file with custom model pricing overrides (optional)

Multiple exporters can be enabled simultaneously.

Example - Console debugging:

    export MELLEA_METRICS_ENABLED=true
    export MELLEA_METRICS_CONSOLE=true

Example - OTLP production:

    export MELLEA_METRICS_ENABLED=true
    export MELLEA_METRICS_OTLP=true
    export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

Example - Prometheus monitoring:

    export MELLEA_METRICS_ENABLED=true
    export MELLEA_METRICS_PROMETHEUS=true

Example - Multiple exporters:

    export MELLEA_METRICS_ENABLED=true
    export MELLEA_METRICS_CONSOLE=true
    export MELLEA_METRICS_OTLP=true
    export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
    export MELLEA_METRICS_PROMETHEUS=true
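The same variables can also be set from Python, for instance in a test or notebook. The sketch below assumes that Mellea reads them when its telemetry is first initialised, so they must be set before that point; it also assumes the string "true" is what the flags are compared against, which this page does not spell out.

```python
import os

# Assumption: these must be set before Mellea's telemetry is initialised.
os.environ["MELLEA_METRICS_ENABLED"] = "true"
os.environ["MELLEA_METRICS_CONSOLE"] = "true"

# Quick sanity check: list the exporter flags that are currently switched on.
exporter_flags = (
    "MELLEA_METRICS_CONSOLE",
    "MELLEA_METRICS_OTLP",
    "MELLEA_METRICS_PROMETHEUS",
)
enabled = [
    flag for flag in exporter_flags
    if os.environ.get(flag, "false").lower() == "true"
]
print(enabled)
```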

Built-in metrics (auto-recorded via plugins when metrics are enabled):

  • Token counters: mellea.llm.tokens.input, mellea.llm.tokens.output (unit: tokens)
  • Latency histograms: mellea.llm.request.duration (unit: s), mellea.llm.ttfb (unit: s, streaming only)
  • Error counter: mellea.llm.errors (unit: {error}), categorized by semantic error type
  • Cost counter: mellea.llm.cost.usd (unit: USD), estimated cost when pricing data is available
  • Sampling counters: mellea.sampling.attempts (unit: {attempt}), mellea.sampling.successes (unit: {sample}), mellea.sampling.failures (unit: {failure})
  • Requirement counters: mellea.requirement.checks (unit: {check}), mellea.requirement.failures (unit: {failure})
  • Tool counter: mellea.tool.calls (unit: {call}), tagged by tool name and status

Programmatic usage:

    from mellea.telemetry.metrics import create_counter, create_histogram

    request_counter = create_counter(
        "mellea.requests",
        description="Total number of LLM requests",
        unit="1",
    )
    request_counter.add(1, {"backend": "ollama", "model": "llama2"})

    latency_histogram = create_histogram(
        "mellea.request.duration",
        description="Request latency distribution",
        unit="s",
    )
    latency_histogram.record(1.5, {"backend": "ollama"})

Functions

FUNC is_metrics_enabled

is_metrics_enabled() -> bool

Check if metrics collection is enabled.

Returns:

  • True if MELLEA_METRICS_ENABLED is truthy AND OpenTelemetry is installed.
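The two-part condition can be illustrated with a small stand-in. This is a sketch, not the function's real body: the set of strings treated as truthy is an assumption, and `metrics_enabled_sketch` is a hypothetical name.

```python
import importlib.util
import os

# Assumption: the exact set of accepted truthy strings is not documented here.
TRUTHY = {"1", "true", "yes", "on"}


def metrics_enabled_sketch(env=os.environ):
    """Return True only if the flag is truthy AND OpenTelemetry is importable."""
    flag_on = env.get("MELLEA_METRICS_ENABLED", "false").strip().lower() in TRUTHY
    otel_installed = importlib.util.find_spec("opentelemetry") is not None
    return flag_on and otel_installed


print(metrics_enabled_sketch({"MELLEA_METRICS_ENABLED": "false"}))
```

Guarding on both conditions means a missing optional dependency degrades to "metrics off" rather than an import error.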

FUNC create_counter

create_counter(name: str, description: str = '', unit: str = '1') -> Any

Create a counter instrument for monotonically increasing values.

Counters are used for values that only increase, such as:

  • Total number of requests
  • Total tokens processed
  • Total errors encountered

Args:

  • name: Metric name (e.g., "mellea.requests.total")
  • description: Human-readable description of what this metric measures
  • unit: Unit of measurement (e.g., "1" for count, "ms" for milliseconds)

Returns:

  • Counter instrument (or no-op if metrics disabled)

FUNC create_histogram

create_histogram(name: str, description: str = '', unit: str = '1') -> Any

Create a histogram instrument for recording value distributions.

Histograms are used for values that vary and need statistical analysis:

  • Request latency
  • Token counts per request
  • Response sizes

Args:

  • name: Metric name (e.g., "mellea.request.duration")
  • description: Human-readable description
  • unit: Unit of measurement (e.g., "ms", "tokens", "bytes")

Returns:

  • Histogram instrument (or no-op if metrics disabled)

FUNC create_up_down_counter

create_up_down_counter(name: str, description: str = '', unit: str = '1') -> Any

Create an up-down counter for values that can increase or decrease.

UpDownCounters are used for values that go up and down:

  • Active sessions
  • Items in a queue
  • Memory usage

Args:

  • name: Metric name (e.g., "mellea.sessions.active")
  • description: Human-readable description
  • unit: Unit of measurement

Returns:

  • UpDownCounter instrument (or no-op if metrics disabled)
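One common pattern for an up-down counter is to pair the increment and decrement in a context manager so the value cannot drift when the body raises. The sketch below takes the instrument as a parameter; in real use it would be the object returned by create_up_down_counter("mellea.sessions.active"). The `InMemoryUpDownCounter` class and `track_active` helper are hypothetical and exist only so the example runs without Mellea installed.

```python
from contextlib import contextmanager


@contextmanager
def track_active(counter, attributes=None):
    """Add 1 on entry and -1 on exit, even if the body raises."""
    attrs = attributes or {}
    counter.add(1, attrs)
    try:
        yield
    finally:
        counter.add(-1, attrs)


class InMemoryUpDownCounter:
    """Hypothetical stand-in exposing the same add(amount, attributes) call."""

    def __init__(self):
        self.value = 0

    def add(self, amount, attributes=None):
        self.value += amount


active_sessions = InMemoryUpDownCounter()
with track_active(active_sessions, {"backend": "ollama"}):
    inside = active_sessions.value
after = active_sessions.value
```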

FUNC record_token_usage_metrics

record_token_usage_metrics(input_tokens: int | None, output_tokens: int | None, model: str, provider: str) -> None

Record token usage metrics following OpenTelemetry Gen-AI semantic conventions.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • input_tokens: Number of input tokens (prompt tokens), or None if unavailable
  • output_tokens: Number of output tokens (completion tokens), or None if unavailable
  • model: Model identifier (e.g., "gpt-4", "llama2:7b")
  • provider: Provider name (e.g., "openai", "ollama", "watsonx")
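Because both token arguments accept None, a caller can forward whatever the backend reported without special-casing missing fields. A minimal sketch follows, with the recorder passed in as a parameter so it runs without Mellea installed; in real use `record` would be record_token_usage_metrics. The `usage` dictionary and its key names are hypothetical.

```python
from typing import Callable, Optional


def report_usage(
    usage: Optional[dict],
    model: str,
    provider: str,
    record: Callable[..., None],
) -> None:
    """Forward token counts, passing None for anything the backend omitted."""
    usage = usage or {}
    record(
        input_tokens=usage.get("prompt_tokens"),       # hypothetical key name
        output_tokens=usage.get("completion_tokens"),  # hypothetical key name
        model=model,
        provider=provider,
    )


calls = []


def fake_record(**kwargs):
    calls.append(kwargs)


report_usage({"prompt_tokens": 12, "completion_tokens": 30}, "llama2:7b", "ollama", fake_record)
report_usage(None, "llama2:7b", "ollama", fake_record)
```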

FUNC record_request_duration

record_request_duration(duration_s: float, model: str, provider: str, streaming: bool = False) -> None

Record total LLM request duration.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • duration_s: Request duration in seconds
  • model: Model identifier (e.g., "gpt-4", "llama2:7b")
  • provider: Provider name (e.g., "openai", "ollama", "watsonx")
  • streaming: Whether the request used streaming mode
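A typical way to obtain duration_s is to time the call with a monotonic clock. The following sketch wraps that in a context manager; `timed_request` is a hypothetical helper, and the recorder is injected so the example runs standalone (in real use it would be record_request_duration). Note that this version records the duration even when the request fails, which is a design choice rather than documented behaviour.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_request(model, provider, record, streaming=False):
    """Measure wall-clock time of the body and report it in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        record(time.perf_counter() - start, model, provider, streaming)


durations = []


def fake_record(duration_s, model, provider, streaming=False):
    durations.append((duration_s, model, provider, streaming))


with timed_request("llama2:7b", "ollama", fake_record):
    time.sleep(0.01)  # stands in for the actual LLM call
```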

FUNC record_ttfb

record_ttfb(ttfb_s: float, model: str, provider: str) -> None

Record time-to-first-token for streaming LLM requests.

This is a no-op when metrics are disabled, ensuring zero overhead. Should only be called for streaming requests.

Args:

  • ttfb_s: Time to first token in seconds
  • model: Model identifier (e.g., "gpt-4", "llama2:7b")
  • provider: Provider name (e.g., "openai", "ollama", "watsonx")
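For a streaming response, the measurement can be taken by wrapping the chunk iterator and reporting once, when the first chunk arrives. This is a sketch under the assumption that the stream is a plain Python iterable; `with_ttfb` is a hypothetical helper and the recorder is injected (in real use, record_ttfb).

```python
import time
from typing import Callable, Iterable, Iterator


def with_ttfb(
    chunks: Iterable[str],
    model: str,
    provider: str,
    record: Callable[[float, str, str], None],
) -> Iterator[str]:
    """Yield chunks unchanged, recording the delay before the first one."""
    # The clock starts when the consumer begins iterating this generator.
    start = time.perf_counter()
    first = True
    for chunk in chunks:
        if first:
            record(time.perf_counter() - start, model, provider)
            first = False
        yield chunk


ttfb_calls = []


def fake_record(ttfb_s, model, provider):
    ttfb_calls.append((ttfb_s, model, provider))


received = list(with_ttfb(iter(["Hel", "lo"]), "llama2:7b", "ollama", fake_record))
```

An empty stream records nothing, since no first chunk ever arrives.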

FUNC classify_error

classify_error(exc: BaseException) -> str

Map an exception to a semantic error type string.

Checks OpenAI SDK exception types first (when openai is installed), then falls back to stdlib exceptions and name-based heuristics.

Args:

  • exc: The exception to classify.

Returns:

  • One of the ERROR_TYPE_* constants.

FUNC record_error

record_error(error_type: str, model: str, provider: str, exception_class: str) -> None

Record an LLM error metric.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • error_type: Semantic error category (use ERROR_TYPE_* constants).
  • model: Model identifier (e.g. "gpt-4", "llama2:7b").
  • provider: Provider name (e.g. "openai", "ollama").
  • exception_class: Python exception class name (e.g. "RateLimitError").
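classify_error and record_error are naturally used together in an except block: classify the exception, record it, then re-raise. The sketch below injects both functions so it runs without Mellea; the category strings returned by `fake_classify` are placeholders, not the real ERROR_TYPE_* values.

```python
def call_with_error_metrics(fn, model, provider, classify, record):
    """Run fn(); on failure, record a classified error metric and re-raise."""
    try:
        return fn()
    except Exception as exc:
        record(
            error_type=classify(exc),
            model=model,
            provider=provider,
            exception_class=type(exc).__name__,
        )
        raise


recorded = []


def fake_classify(exc):
    # Placeholder categories; the real constants are ERROR_TYPE_*.
    return "timeout" if isinstance(exc, TimeoutError) else "unknown"


def fake_record(**kwargs):
    recorded.append(kwargs)


def failing_request():
    raise TimeoutError("backend did not respond")


reraised = False
try:
    call_with_error_metrics(failing_request, "gpt-4", "openai", fake_classify, fake_record)
except TimeoutError:
    reraised = True
```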

FUNC record_cost

record_cost(cost: float, model: str, provider: str) -> None

Record estimated LLM request cost in USD.

This is a no-op when metrics are disabled, ensuring zero overhead. Only call this when pricing data is available (i.e., compute_cost returned a non-None value).

Args:

  • cost: Estimated request cost in US dollars.
  • model: Model identifier (e.g. "gpt-4o", "claude-sonnet-4-6").
  • provider: Provider name (e.g. "openai", "ollama").
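The "only call when pricing data is available" rule amounts to a None check before recording. The sketch below uses a hypothetical `estimate_cost` helper and a made-up price table (USD per million tokens); neither reflects Mellea's compute_cost signature, its pricing file format, or any real provider's prices.

```python
def estimate_cost(input_tokens, output_tokens, price):
    """Return estimated USD cost, or None when any input is unavailable."""
    if price is None or input_tokens is None or output_tokens is None:
        return None
    return (
        input_tokens * price["input_per_1m"]
        + output_tokens * price["output_per_1m"]
    ) / 1_000_000


def report_cost(cost, model, provider, record):
    """Record the cost only when an estimate exists."""
    if cost is not None:
        record(cost, model, provider)


cost_calls = []
illustrative_price = {"input_per_1m": 2.0, "output_per_1m": 8.0}  # made-up numbers

known = estimate_cost(1_000_000, 500_000, illustrative_price)
unknown = estimate_cost(1_000_000, 500_000, None)
report_cost(known, "gpt-4o", "openai", lambda c, m, p: cost_calls.append((c, m, p)))
report_cost(unknown, "local-model", "ollama", lambda c, m, p: cost_calls.append((c, m, p)))
```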

FUNC record_sampling_attempt

record_sampling_attempt(strategy: str) -> None

Record one sampling attempt for the given strategy.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • strategy: Sampling strategy class name (e.g. "RejectionSamplingStrategy").

FUNC record_sampling_outcome

record_sampling_outcome(strategy: str, success: bool) -> None

Record the final outcome (success or failure) of a sampling loop.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • strategy: Sampling strategy class name (e.g. "RejectionSamplingStrategy").
  • success: True if at least one attempt passed all requirements.
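The split between per-attempt and per-loop recording can be seen in a simplified rejection-sampling loop: one attempt is counted per iteration, and exactly one outcome is recorded at the end. This is a sketch with injected recorders (in real use, record_sampling_attempt and record_sampling_outcome); `sample_until_valid` is a hypothetical function, not Mellea's sampling implementation.

```python
def sample_until_valid(generate, is_valid, budget, strategy,
                       record_attempt, record_outcome):
    """Try up to `budget` candidates; return the first valid one or None."""
    for _ in range(budget):
        record_attempt(strategy)          # one count per attempt
        candidate = generate()
        if is_valid(candidate):
            record_outcome(strategy, True)
            return candidate
    record_outcome(strategy, False)       # budget exhausted
    return None


attempts, outcomes = [], []
candidates = iter(["bad", "bad", "good"])

result = sample_until_valid(
    generate=lambda: next(candidates),
    is_valid=lambda c: c == "good",
    budget=5,
    strategy="RejectionSamplingStrategy",
    record_attempt=attempts.append,
    record_outcome=lambda s, ok: outcomes.append((s, ok)),
)
```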

FUNC record_requirement_check

record_requirement_check(requirement: str) -> None

Record one requirement validation check.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • requirement: Requirement class name (e.g. "LLMaJRequirement").

FUNC record_requirement_failure

record_requirement_failure(requirement: str, reason: str) -> None

Record one requirement validation failure.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • requirement: Requirement class name (e.g. "LLMaJRequirement").
  • reason: Human-readable failure reason from ValidationResult.reason.
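Checks and failures are recorded separately, so every validation counts as a check and only the unsuccessful ones add a failure. A minimal sketch with injected recorders (in real use, record_requirement_check and record_requirement_failure); the `(ok, reason)` tuple is a hypothetical stand-in for ValidationResult.

```python
def check_requirement(name, validate, output, record_check, record_failure):
    """Validate output, counting the check and, if it fails, the failure."""
    record_check(name)
    ok, reason = validate(output)  # hypothetical stand-in for ValidationResult
    if not ok:
        record_failure(name, reason)
    return ok


checks, failures = [], []


def max_20_chars(text):
    if len(text) <= 20:
        return True, ""
    return False, "output longer than 20 characters"


passed = check_requirement("LengthRequirement", max_20_chars, "short",
                           checks.append, lambda n, r: failures.append((n, r)))
failed = check_requirement("LengthRequirement", max_20_chars, "x" * 50,
                           checks.append, lambda n, r: failures.append((n, r)))
```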

FUNC record_tool_call

record_tool_call(tool: str, status: str) -> None

Record one tool invocation.

This is a no-op when metrics are disabled, ensuring zero overhead.

Args:

  • tool: Name of the tool that was invoked.
  • status: "success" if the tool executed without error, "failure" otherwise.
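A thin wrapper around the tool function is enough to derive the status value from whether the call raised. The following is a sketch with the recorder injected (in real use, record_tool_call); `call_tool` and the `add` tool are hypothetical.

```python
def call_tool(name, fn, record, *args, **kwargs):
    """Invoke a tool and record "success" or "failure" based on the outcome."""
    try:
        result = fn(*args, **kwargs)
    except Exception:
        record(name, "failure")
        raise
    record(name, "success")
    return result


tool_calls = []


def fake_record(tool, status):
    tool_calls.append((tool, status))


def add(a, b):
    return a + b


total = call_tool("add", add, fake_record, 2, 3)
try:
    call_tool("add", add, fake_record, 2, "three")  # raises TypeError
except TypeError:
    pass
```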