API Reference¶
Core¶
- @memoize(*, disable=None, **kwargs)¶
See charmonium.cache.Memoized.
- Parameters
- Return type
Callable[[Callable[FuncParams, FuncReturn]], Memoized[FuncParams, FuncReturn]]
- class Memoized(func, *, group=<charmonium.cache.util.Future object>, name=None, use_obj_store=True, use_metadata_size=False, pickler=None, extra_func_state=Constant(None))¶
Bases:
Generic[charmonium.cache.util.FuncParams, charmonium.cache.util.FuncReturn]
- Parameters
func (Callable[FuncParams, FuncReturn]) –
group (MemoizedGroup) –
name (Optional[str]) –
use_obj_store (bool) –
use_metadata_size (bool) –
pickler (Optional[Pickler]) –
extra_func_state (Callable[[Callable[FuncParams, FuncReturn]], Any]) –
- Return type
None
- __init__(func, *, group=<charmonium.cache.util.Future object>, name=None, use_obj_store=True, use_metadata_size=False, pickler=None, extra_func_state=Constant(None))¶
Construct a memoized function.
- Parameters
group (MemoizedGroup) – See charmonium.cache.MemoizedGroup.
name (Optional[str]) – A key-to-lookup distinguishing this function from others. Defaults to the Python module and name.
extra_func_state (Callable[[Callable[FuncParams, FuncReturn]], Any]) – An extra state function. Its return value is a key-to-match after the function name.
use_obj_store (bool) – Whether the objects should be put behind an object store, a layer of indirection.
use_metadata_size (bool) – Whether to include the size of the metadata in the size threshold calculation for eviction.
pickler (Optional[Pickler]) – A custom pickler to use with the index. The types it can pickle must include tuples of picklable types, hashable types, and the arguments (__cache_key__ and __cache_var__, if defined).
func (Callable[FuncParams, FuncReturn]) –
- Return type
None
- func: Callable[[FuncParams], FuncReturn]¶
- group: MemoizedGroup¶
- log_usage_report()¶
- Return type
None
- class MemoizedGroup(*, obj_store=None, replacement_policy='gdsize', size=KiB(100.0), pickler=<module 'pickle'>, lock=None, fine_grain_persistence=False, fine_grain_eviction=False, extra_system_state=Constant(None), temporary=False)¶
Bases:
object
A MemoizedGroup holds the memoization for multiple functions.
- Parameters
- Return type
None
- __init__(*, obj_store=None, replacement_policy='gdsize', size=KiB(100.0), pickler=<module 'pickle'>, lock=None, fine_grain_persistence=False, fine_grain_eviction=False, extra_system_state=Constant(None), temporary=False)¶
Construct a memoized group. Use with Memoized.
- Parameters
obj_store (Optional[ObjStore]) – The object store to use for return values.
replacement_policy (Union[str, ReplacementPolicy]) – See policies submodule for options. You can pass an object conforming to the ReplacementPolicy protocol or one of REPLACEMENT_POLICIES.
size (Union[int, str, bitmath.Bitmath]) – The size as an int (in bytes), as a string (e.g. “3 MiB”), or as a bitmath.Bitmath.
pickler (Pickler) – A de/serialization to use on the index, conforming to the Pickler protocol.
lock (Optional[RWLock]) – A ReadersWriterLock to achieve exclusion. If the lock is wrong but the obj_store is atomic, then the memoization is still correct, but it may not be able to borrow values that another machine computed. Defaults to a FileRWLock.
fine_grain_persistence (bool) – De/serialize the index at every access. This is useful if you need to update the cache for multiple simultaneous processes, but it compromises performance in the single-process case.
fine_grain_eviction (bool) – Maintain the cache’s size through eviction at every access (rather than just at the de/serialization points). This is useful if the cache’s size would not otherwise fit in memory, but it compromises performance if not needed.
extra_system_state (Callable[[], Any]) – A callable that returns “extra” system state. If the system state changes, the cache is dumped.
temporary (bool) – Whether the cache should be cleared at the end of the process. This is useful for tests.
- Return type
None
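The extra_system_state behaviour described above (the cache is dumped when the state changes) can be pictured with a small stand-alone sketch. StateGuardedCache and its get method are hypothetical names invented for illustration; this is not how MemoizedGroup is implemented.

```python
from typing import Any, Callable, Dict


class StateGuardedCache:
    """Hypothetical toy cache that is emptied whenever the system state changes."""

    def __init__(self, extra_system_state: Callable[[], Any]) -> None:
        self._state_fn = extra_system_state
        self._last_state = extra_system_state()
        self.data: Dict[Any, Any] = {}

    def get(self, key: Any, compute: Callable[[], Any]) -> Any:
        # Compare the current state with the one seen at the previous access.
        state = self._state_fn()
        if state != self._last_state:
            self.data.clear()          # dump everything computed under the old state
            self._last_state = state
        if key not in self.data:
            self.data[key] = compute()
        return self.data[key]


version = {"value": 1}
compute_calls = []


def expensive() -> str:
    compute_calls.append(1)
    return "result"


cache = StateGuardedCache(lambda: version["value"])
cache.get("k", expensive)   # computed
cache.get("k", expensive)   # served from the cache
version["value"] = 2        # the "system state" changes
cache.get("k", expensive)   # cache was dumped, so this recomputes
```

A callable that returns, say, a library version or a configuration hash would play the role of the lambda here.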
- remove_orphans()¶
Remove data in the objstore that are not referenced by the index.
Orphans can accumulate if there are multiple processes. They might generate orphans if they crash or if there is a bug in my code (yikes!). If you notice accumulation of orphans, I recommend calling this function once or once-per-pipeline to clean them up.
However, this can compromise performance if you do it while peer processes are active. They may have a slightly different index-state, and you might remove something they wanted to keep. I recommend calling this before you fork off processes.
- Return type
None
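Conceptually, removing orphans is a set difference between the keys present in the object store and the keys the index still references. The sketch below uses a plain dict as the store and a set as the index; remove_orphans_sketch is a hypothetical helper, not the library's method.

```python
from typing import Dict, Set


def remove_orphans_sketch(obj_store: Dict[int, bytes], referenced_keys: Set[int]) -> Set[int]:
    """Delete every stored object whose key the index does not reference."""
    orphans = set(obj_store) - referenced_keys
    for key in orphans:
        del obj_store[key]
    return orphans


store = {1: b"a", 2: b"b", 3: b"c"}
removed = remove_orphans_sketch(store, referenced_keys={1, 3})
```

If a peer process had just written key 2 and not yet published its index, this call would delete a value that peer still wanted, which is why calling it before forking is recommended above.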
Components¶
- class ObjStore¶
Bases:
object
An object-store is a persistent mapping from int to bytes.
- clear()¶
- Return type
None
- class DirObjStore(path, key_bytes=16)¶
Bases:
charmonium.cache.obj_store.ObjStore
Use a directory in the filesystem as an object-store.
Each object is a file in the directory.
Note that this directory must not contain any other files.
- __init__(path, key_bytes=16)¶
- clear()¶
- Return type
None
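As a rough picture of a directory-backed mapping from int to bytes, the sketch below stores each object in a file named after its key. TinyDirStore, the fixed-width hexadecimal file names, and the dict-style methods are all assumptions made for illustration; DirObjStore may lay out and access its files differently.

```python
import tempfile
from pathlib import Path


class TinyDirStore:
    """Hypothetical stand-in: maps int keys to bytes, one file per key."""

    def __init__(self, path, key_bytes: int = 16) -> None:
        self.path = Path(path)
        self.key_bytes = key_bytes
        self.path.mkdir(parents=True, exist_ok=True)

    def _file(self, key: int) -> Path:
        # Fixed-width hex name: two hex digits per key byte (an assumed scheme).
        return self.path / format(key, f"0{2 * self.key_bytes}x")

    def __setitem__(self, key: int, value: bytes) -> None:
        self._file(key).write_bytes(value)

    def __getitem__(self, key: int) -> bytes:
        return self._file(key).read_bytes()

    def clear(self) -> None:
        # Deletes every file, which is why the directory must hold nothing else.
        for child in self.path.iterdir():
            child.unlink()


with tempfile.TemporaryDirectory() as tmp:
    store = TinyDirStore(Path(tmp) / "objects", key_bytes=4)
    store[255] = b"hello"
    names_before = sorted(p.name for p in store.path.iterdir())
    value = store[255]
    store.clear()
    names_after = sorted(p.name for p in store.path.iterdir())
```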
- class ReplacementPolicy¶
Bases:
object
A replacement policy for a cache
- abstract add(key, entry)¶
Called when a key, entry pair is added to the index.
- Parameters
key (Any) –
entry (Entry) –
- Return type
None
- abstract access(key, entry)¶
Called when a key, entry pair is accessed/used.
Update last-used-time here.
- Parameters
key (Any) –
entry (Entry) –
- Return type
None
- invalidate(key, entry)¶
Called when a key is invalidated by a subkey-to-match.
This could be useful as a metric for developing a heuristic of how fast a versioned resource is changing.
- Parameters
key (Any) –
entry (Entry) –
- Return type
None
- abstract evict()¶
Select a key, entry pair to evict.
- abstract update(other)¶
Update self with contents of other, but self overrides other.
This is necessary because there could be multiple processes using the same MemoizedGroup. A different process may have made progress. We want to incorporate its progress into this process.
- Parameters
other (ReplacementPolicy) –
- Return type
None
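To make the five methods concrete, here is a minimal least-recently-used policy written as a stand-alone class. LRUPolicy is hypothetical; it does not subclass the real ReplacementPolicy, it treats entries as opaque values, and it assumes evict returns the chosen (key, entry) pair.

```python
from collections import OrderedDict
from typing import Any, Tuple


class LRUPolicy:
    """Hypothetical least-recently-used policy with the five methods listed above."""

    def __init__(self) -> None:
        # Oldest key first, most recently used key last.
        self._order: "OrderedDict[Any, Any]" = OrderedDict()

    def add(self, key: Any, entry: Any) -> None:
        self._order[key] = entry
        self._order.move_to_end(key)

    def access(self, key: Any, entry: Any) -> None:
        # Update last-used-time by moving the key to the recent end.
        self._order[key] = entry
        self._order.move_to_end(key)

    def invalidate(self, key: Any, entry: Any) -> None:
        self._order.pop(key, None)

    def evict(self) -> Tuple[Any, Any]:
        # Pop from the front: the least recently used pair.
        return self._order.popitem(last=False)

    def update(self, other: "LRUPolicy") -> None:
        # Adopt keys only the other process knows about; self wins on conflicts.
        for key, entry in other._order.items():
            self._order.setdefault(key, entry)


policy = LRUPolicy()
policy.add("a", 1)
policy.add("b", 2)
policy.access("a", 1)       # "a" is now the most recently used
victim = policy.evict()     # ("b", 2)

peer = LRUPolicy()
peer.add("a", 99)           # conflicts with our "a"; ours is kept
peer.add("c", 3)            # new to us; adopted
policy.update(peer)
```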
- class GDSize¶
Bases:
charmonium.cache.replacement_policies.ReplacementPolicy
GreedyDual-Size policy, described by Cao et al.
- Return type
None
- add(key, entry)¶
Called when a key, entry pair is added to the index.
- Parameters
key (Any) –
entry (Entry) –
- Return type
None
- access(key, entry)¶
Called when a key, entry pair is accessed/used.
Update last-used-time here.
- Parameters
key (Any) –
entry (Entry) –
- Return type
None
- invalidate(key, entry)¶
Called when a key is invalidated by a subkey-to-match.
This could be useful as a metric for developing a heuristic of how fast a versioned resource is changing.
- evict()¶
Select a key, entry pair to evict.
- update(other)¶
Update self with contents of other, but self overrides other.
This is necessary because there could be multiple processes using the same MemoizedGroup. A different process may have made progress. We want to incorporate its progress into this process.
- Parameters
other (ReplacementPolicy) –
- Return type
None
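The core bookkeeping of GreedyDual-Size can be sketched in a few lines: each key carries a priority H = L + cost / size, the key with the smallest H is evicted, and the running value L is raised to the evicted key's H. The class below is a hypothetical illustration that takes cost and size as plain numbers; how GDSize derives them from an Entry is not specified in this reference.

```python
from typing import Any, Dict


class GreedyDualSizeSketch:
    """Hypothetical sketch of GreedyDual-Size priorities, not the library's GDSize."""

    def __init__(self) -> None:
        self.inflation = 0.0                   # the running value L
        self.priority: Dict[Any, float] = {}   # the H value of each key

    def touch(self, key: Any, cost: float, size: float) -> None:
        # Used on both insertion and access: H = L + cost / size.
        self.priority[key] = self.inflation + cost / size

    def evict(self) -> Any:
        # Evict the key with the smallest H and raise L to that value.
        victim = min(self.priority, key=self.priority.__getitem__)
        self.inflation = self.priority.pop(victim)
        return victim


gds = GreedyDualSizeSketch()
gds.touch("big_cheap", cost=1.0, size=100.0)     # H = 0.01
gds.touch("small_costly", cost=5.0, size=10.0)   # H = 0.5
first = gds.evict()                              # "big_cheap"; L becomes 0.01
gds.touch("new", cost=1.0, size=4.0)             # H = 0.01 + 0.25 = 0.26
second = gds.evict()                             # "new", since 0.26 < 0.5
```

Raising L on each eviction is what lets entries that have not been touched for a long time age out even if their cost-to-size ratio is high.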
- class RWLock¶
Bases:
object
A Readers-Writer Lock guarantees N readers xor 1 writer.
This permits read-concurrency in the underlying resource when there is no writer.
- class FileRWLock(path: PathLikeFrom)¶
Bases:
charmonium.cache.rw_lock.RWLock
- __init__(path)¶
Creates a lockfile at path.
- class NaiveRWLock(lock)¶
Bases:
charmonium.cache.rw_lock.RWLock
RWLock constructed from a regular Lock.
A true readers-writer lock permits read concurrency (N readers xor 1 writer), but in some cases, that may be more maintenance effort than it is worth. A NaiveRWLock permits 1 reader xor 1 writer.
- Parameters
lock (Lock) –
- Return type
None
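The "1 reader xor 1 writer" behaviour amounts to routing both kinds of access through the same ordinary lock. The sketch below shows that idea with threading.Lock; the class name and the reader/writer context-manager methods are assumptions for illustration and may not match the interface RWLock actually exposes.

```python
import threading
from contextlib import contextmanager


class OneAtATimeRWLock:
    """Hypothetical sketch: readers and writers share one ordinary lock."""

    def __init__(self, lock) -> None:
        self._lock = lock

    @contextmanager
    def reader(self):
        # A reader excludes everyone else, including other readers.
        with self._lock:
            yield

    @contextmanager
    def writer(self):
        with self._lock:
            yield


rw = OneAtATimeRWLock(threading.Lock())
with rw.reader():
    held_while_reading = rw._lock.locked()
held_after = rw._lock.locked()
```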
Helpers¶
- class FileContents(path, comparison='crc32')¶
Bases:
object
Wraps the path and its contents to make your function pure.
When FileContents is un/pickled, the contents of path get restored/snapshotted.
When FileContents is used as an argument, the path is the key and the contents are the version.
FileContents is os.PathLike, so you can open(FileContents("file"), "rb"). You won’t even know it’s not a string.
Since this changes the un/pickle protocol, this class might cause unexpected results when used with fine_grain_persistence.
- __init__(path, comparison='crc32')¶
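The "path is the key, contents are the version" idea can be illustrated with the standard library: the path stays the same across edits while a CRC-32 of the bytes changes. path_and_version is a hypothetical helper; it only mirrors the default comparison='crc32' in spirit and is not the class's actual code.

```python
import tempfile
import zlib
from pathlib import Path


def path_and_version(path):
    """Return (key, version): the path itself and a CRC-32 of the file's bytes."""
    data = Path(path).read_bytes()
    return str(path), zlib.crc32(data)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "input.txt"
    target.write_text("first")
    key_1, version_1 = path_and_version(target)
    target.write_text("second")
    key_2, version_2 = path_and_version(target)
```

A cached result keyed on key_1 would be looked up again on the second call, found to have a stale version, and recomputed.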
- class TTLInterval(interval)¶
Bases:
object
TTLInterval(td)() returns a value that changes once every td. td may be a timedelta or a number of seconds.
It can be used as extra_system_state or extra_func_state. For example,
>>> import datetime
>>> from charmonium.cache import memoize, TTLInterval
>>> interval = TTLInterval(datetime.timedelta(seconds=0.5))
>>> # applies a 0.5-second TTL to just this function
>>> @memoize(extra_func_state=interval)
... def func():
...     pass
Underlying usage:
>>> import datetime, time
>>> interval = TTLInterval(datetime.timedelta(seconds=0.5))
>>> start = interval()
>>> start == interval()
True
>>> time.sleep(0.5)
>>> start == interval()
False
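One simple way to obtain "a value that changes once every td" is to divide the current time by the interval and keep the integer part. The helper below is a hypothetical sketch of that approach with the clock passed in explicitly; TTLInterval may compute its value differently.

```python
import datetime


def ttl_bucket(now_seconds: float, interval) -> int:
    """Return an integer that changes once every `interval`."""
    if isinstance(interval, datetime.timedelta):
        interval = interval.total_seconds()
    # Floor division groups all instants inside one interval into one bucket.
    return int(now_seconds // interval)


same_a = ttl_bucket(10.0, 0.5)
same_b = ttl_bucket(10.4, 0.5)
later = ttl_bucket(10.5, datetime.timedelta(seconds=0.5))
```

Here same_a and same_b fall in the same half-second bucket, while later falls in the next one.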
Utils¶
- class PathLike¶
Bases:
object
Duck type of pathlib.Path
- mkdir(*, parents=Ellipsis, exist_ok=Ellipsis)¶
- stat()¶
- Return type
- property parent: Any¶