mellea.backends.cache

Cache abstractions and implementations for model state.

Defines the abstract Cache interface, with put, get, and current_size methods, and provides a concrete SimpleLRUCache implementation. SimpleLRUCache evicts the least-recently-used entry when capacity is exceeded and can optionally call an on_evict callback (e.g. to free GPU memory). Local HuggingFace backends use it to store and reuse KV cache state across requests.

Classes

CLASS Cache

Abstract interface for a cache that stores model state (e.g. a KV cache).

Methods:

FUNC put

put(self, key: str | int, value: Any) -> None

Insert a value into the cache under the given key.

May trigger eviction of existing entries if the cache is at capacity.

Args:

  • key: The cache key to store the value under.
  • value: The value to store.

FUNC get

get(self, key: str | int) -> Any | None

Retrieve a value from the cache by key.

May affect which entries are considered for future eviction (e.g. LRU ordering).

Args:

  • key: The cache key to look up.

Returns:

  • Any | None: The cached value, or None if key has no cached entry.

FUNC current_size

current_size(self) -> int

Return the number of entries currently stored in the cache.

Returns:

  • int: Count of items currently held in the cache; useful for debugging.
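
The three methods above can be sketched as an abstract base class. This is only an illustration of the documented signatures, assuming the interface is built on abc.ABC; CacheSketch and UnboundedDictCache are hypothetical names and are not part of mellea.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class CacheSketch(ABC):
    """Hypothetical stand-in mirroring the documented Cache signatures."""

    @abstractmethod
    def put(self, key: str | int, value: Any) -> None:
        """Store value under key; implementations may evict other entries."""

    @abstractmethod
    def get(self, key: str | int) -> Any | None:
        """Return the cached value, or None if key has no cached entry."""

    @abstractmethod
    def current_size(self) -> int:
        """Return the number of entries currently stored."""


class UnboundedDictCache(CacheSketch):
    """Hypothetical trivial implementation that never evicts anything."""

    def __init__(self) -> None:
        self._store: dict[str | int, Any] = {}

    def put(self, key: str | int, value: Any) -> None:
        self._store[key] = value

    def get(self, key: str | int) -> Any | None:
        # dict.get returns None for a missing key, matching the documented contract.
        return self._store.get(key)

    def current_size(self) -> int:
        return len(self._store)


cache = UnboundedDictCache()
cache.put("prompt-1", "kv-state-1")
hit = cache.get("prompt-1")
miss = cache.get("prompt-2")
```

Because get returns None on a miss, a stored value of None cannot be told apart from a missing key through this interface.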

CLASS SimpleLRUCache

A simple LRU cache (https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_Recently_Used_(LRU)).

Evicts the least-recently-used entry when capacity is exceeded, optionally invoking an on_evict callback (e.g. to free GPU memory). Used by local HuggingFace backends to store and reuse KV cache state across requests.

Args:

  • capacity: Maximum number of items to store in the cache.
  • on_evict: Optional callback invoked with the evicted value whenever an entry is removed to make room for a new one.

Attributes:

  • cache: Internal ordered dict used for LRU tracking; always initialised empty at construction.

Methods:

FUNC current_size

current_size(self) -> int

Return the number of entries currently stored in the cache.

Returns:

  • int: Count of items currently held in the cache; useful for debugging.

FUNC get

get(self, key: str | int) -> Any | None

Retrieve a value from the cache, promoting it to most-recently-used.

Args:

  • key: The cache key to look up.

Returns:

  • Any | None: The cached value, or None if key is not present.
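
Promotion on a hit can be illustrated with a plain collections.OrderedDict, which is the kind of structure the cache attribute is described as using. This sketch shows the general technique only; the keys and values are made up, and it does not reproduce the library's code.

```python
from collections import OrderedDict

# Oldest entry sits at the front, newest at the back.
entries = OrderedDict()
entries["a"] = "state-a"
entries["b"] = "state-b"
entries["c"] = "state-c"

# A hit on "a" moves it to the most-recently-used end.
entries.move_to_end("a")

order_after_hit = list(entries)      # ['b', 'c', 'a']
next_to_evict = next(iter(entries))  # 'b' is now the least recently used
```

Reading "a" therefore protects it: the next eviction would remove "b" instead.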

FUNC put

put(self, key: str | int, value: Any) -> None

Insert or update a value in the cache.

If the cache is at capacity and the key is new, the least-recently-used entry is evicted first, invoking the on_evict callback if set.

Args:

  • key: The cache key to store the value under.
  • value: The value to cache.
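
The behaviour described for SimpleLRUCache can be approximated with the following minimal sketch. LRUCacheSketch is a hypothetical name, not the library's class. The sketch assumes capacity is at least 1 and that updating an existing key also counts as a use; the documentation above does not state either point.

```python
from __future__ import annotations

from collections import OrderedDict
from collections.abc import Callable
from typing import Any


class LRUCacheSketch:
    """Hypothetical re-creation of the documented behaviour, not mellea's code."""

    def __init__(self, capacity: int, on_evict: Callable[[Any], None] | None = None):
        self.capacity = capacity  # assumed to be >= 1
        self.on_evict = on_evict
        self.cache = OrderedDict()  # always starts empty

    def current_size(self) -> int:
        return len(self.cache)

    def get(self, key: str | int) -> Any | None:
        if key not in self.cache:
            return None
        self.cache.move_to_end(key)  # promote to most-recently-used
        return self.cache[key]

    def put(self, key: str | int, value: Any) -> None:
        if key in self.cache:
            # Assumption: an update refreshes recency and never evicts.
            self.cache[key] = value
            self.cache.move_to_end(key)
            return
        if len(self.cache) >= self.capacity:
            # New key and the cache is full: drop the oldest entry first.
            _, evicted = self.cache.popitem(last=False)
            if self.on_evict is not None:
                self.on_evict(evicted)
        self.cache[key] = value


freed = []  # records evicted values in place of real GPU cleanup
lru = LRUCacheSketch(capacity=2, on_evict=freed.append)
lru.put("a", "kv-a")
lru.put("b", "kv-b")
lru.get("a")          # "a" becomes most recently used
lru.put("c", "kv-c")  # cache is full, so "b" is evicted
```

After these calls, freed holds the single evicted value "kv-b" and the cache still contains two entries, "a" and "c". In a real backend the callback would release the evicted KV state rather than append it to a list.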