
mellea.backends.aloras

Abstract interfaces for Backends that implement Activated LoRAs.

Classes

Alora

Activated LoRAs (aLoRAs) are low-rank adapters that can reuse the KV cache of their underlying model. This class should not be directly subclassed by a specific aLoRA. Each backend that supports aLoRAs should provide a backend-specific abstract class that subclasses Alora; individual aLoRAs are then defined by subclassing that backend-specific class. aLoRAs are always attached to an underlying model and use the following calling convention (see the sketch after the argument list below):
  1. The underlying model is prompted (without the aLoRA active). We call this the input.
  2. The underlying model generates some tokens from the input context (again, without the aLoRA active). We call this the response.
  3. Then the adapter is activated and generates some tokens. We call this the alora_response.
Args:
  • name: An arbitrary name/label to assign to the aLoRA in the model serving engine (e.g. vLLM or a local Hugging Face backend). This is distinct from the aLoRA's (Hugging Face) model id.
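
A minimal sketch of a concrete aLoRA following this convention, assuming a hypothetical backend-specific abstract class (here a stand-in called MyBackendAlora); the constructor argument and the input/response keyword names are illustrative, not part of the documented interface:

from mellea.backends.aloras import Alora
from mellea.stdlib.base import ModelOutputThunk


class MyBackendAlora(Alora):
    """Stand-in for the abstract class a real backend would provide."""


class ConstraintCheckAlora(MyBackendAlora):
    def __init__(self):
        # `name` labels the adapter inside the serving engine; it is distinct
        # from the adapter's Hugging Face model id.
        super().__init__(name="constraint_check")

    def generate_using_strings(self, input: str, response: str, **kwargs) -> ModelOutputThunk:
        # 1. `input` is the prompt that was given to the underlying model.
        # 2. `response` is what the underlying model generated (adapter still inactive).
        # 3. The adapter is activated here; its output is the alora_response.
        raise NotImplementedError("backend-specific generation goes here")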
Methods:

generate_using_strings

generate_using_strings(self, *args, **kwargs) -> ModelOutputThunk
Generates from the aLoRA using raw strings as the input interface. In most cases, this must be called from within a running event loop. The signature is generic because each aLoRA takes different parameters depending on its functionality and how it is called.
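
For example, a hedged usage sketch, assuming an already-constructed alora whose particular parameters happen to be input and response (each real aLoRA defines its own):

import asyncio


async def run_check(alora) -> None:
    # Must be invoked while an event loop is running, as noted above.
    thunk = alora.generate_using_strings(
        input="Summarize the meeting notes.",    # prompt seen by the underlying model
        response="The team agreed to ship v2.",  # what the underlying model generated
    )
    # The returned ModelOutputThunk resolves once the adapter finishes generating.
    print(thunk)


# asyncio.run(run_check(my_constraint_alora))  # `my_constraint_alora` is hypothetical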

generate_using_stdlib

generate_using_stdlib(self, *args, **kwargs) -> CBlock
Generates from the aLoRA using Span-based backends.
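
A similar hedged sketch for the Span-based variant; the single positional span argument is an assumption about what a Span-based backend would pass:

from mellea.stdlib.base import CBlock


def run_check_with_spans(alora, span) -> CBlock:
    # Unlike generate_using_strings, this returns a CBlock rather than a ModelOutputThunk.
    return alora.generate_using_stdlib(span)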

AloraBackendMixin

Mixin class for backends that provide aLoRA functionality.
Methods:

add_alora

add_alora(self, *args, **kwargs)
Loads an aLoRA.
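
A hedged registration sketch; SomeAloraBackend stands in for any backend class that mixes in AloraBackendMixin, and ConstraintCheckAlora is the placeholder adapter sketched earlier:

# Placeholder names throughout; consult the backend-specific docs for the real
# constructor and add_alora signature.
backend = SomeAloraBackend(model_id=...)
backend.add_alora(ConstraintCheckAlora())  # load/register the adapter with the backend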

get_alora

get_alora(self, alora_name: str) -> Alora | None
Returns the aLoRA with the given name, or None if that aLoRA isn't loaded.
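
Because the lookup returns None rather than raising, callers typically guard the result (continuing the placeholder backend from above):

alora = backend.get_alora("constraint_check")
if alora is None:
    raise RuntimeError("aLoRA 'constraint_check' is not loaded on this backend")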

get_aloras

get_aloras(self) -> list[Alora]
Returns a list of all loaded aLoRA adapters.
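
And to enumerate what is currently loaded (again using the placeholder backend):

for alora in backend.get_aloras():
    print(alora.name)  # assumed attribute: the label given when the aLoRA was constructed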