mellea.backends.cache
Cache abstractions and implementations for model state.
Defines the abstract Cache interface with put, get, and
current_size methods, and provides a concrete SimpleLRUCache that evicts
the least-recently-used entry when capacity is exceeded — optionally calling an
on_evict callback (e.g. to free GPU memory). Used by local HuggingFace backends
to store and reuse KV cache state across requests.
Classes
CLASS Cache
A Cache for storing model state (e.g., kv cache).
Methods:
FUNC put
put(self, key: str | int, value: Any) -> None
Insert a value into the cache under the given key.
May trigger eviction of existing entries if the cache is at capacity.
Args:
key: The cache key to store the value under.value: The value to store.
FUNC get
get(self, key: str | int) -> Any | None
Retrieve a value from the cache by key.
May affect which entries are considered for future eviction (e.g. LRU ordering).
Args:
key: The cache key to look up.
Returns:
- Any | None: The cached value, or
Noneifkeyhas no cached entry.
FUNC current_size
current_size(self) -> int
Return the number of entries currently stored in the cache.
Returns:
- Count of items currently held in the cache; useful for debugging.
CLASS SimpleLRUCache
A simple LRU <https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_Recently_Used_(LRU)>_ cache.
Evicts the least-recently-used entry when capacity is exceeded, optionally
invoking an on_evict callback (e.g. to free GPU memory). Used by local
HuggingFace backends to store and reuse KV cache state across requests.
Args:
capacity: Maximum number of items to store in the cache.on_evict: Optional callback invoked with the evicted value whenever an entry is removed to make room for a new one.
Attributes:
cache: Internal ordered dict used for LRU tracking; always initialised empty at construction.
Methods:
FUNC current_size
current_size(self) -> int
Return the number of entries currently stored in the cache.
Returns:
- Count of items currently held in the cache; useful for debugging.
FUNC get
get(self, key: str | int) -> Any | None
Retrieve a value from the cache, promoting it to most-recently-used.
Args:
key: The cache key to look up.
Returns:
- Any | None: The cached value, or
Noneifkeyis not present.
FUNC put
put(self, key: str | int, value: Any) -> None
Insert or update a value in the cache.
If the cache is at capacity and the key is new, the least-recently-used
entry is evicted first, invoking the on_evict callback if set.
Args:
key: The cache key to store the value under.value: The value to cache.