spdl.dataloader.CacheDataLoader
- class CacheDataLoader(dl: Iterable[T], num_caches: int, return_caches_after: int)
Caches values from the given data loader and returns the cached values after a given number of iterations.
The class is a thin wrapper around a generic data loader instance. It is intended for estimating the maximum performance gain that optimizing the data loader could achieve.
Wrap your data loader with this class, run it in the training pipeline, and compare the performance against the unwrapped loader. If the wrapped run is noticeably faster, the training pipeline is bottlenecked by data loading.
- Parameters:
dl – Source iterator. Expected to be a data loader object.
num_caches – The number of items (batches) to cache.
return_caches_after – The number of iterations to use the original iterator. By default, it uses the same value as num_caches.
delete_src – When this iterator starts returning the cached values, call del on the original data loader so that resources are released.
- Returns:
The new iterator.
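The behaviour described above can be illustrated with a minimal pure-Python sketch. It is a hypothetical stand-in, not the SPDL implementation: the names `cache_then_replay` and `counting_loader` are invented for illustration, the cyclic replay of cached items is an assumption, and the `delete_src` behaviour is not modelled.

```python
from collections.abc import Iterable, Iterator
from itertools import islice
from typing import TypeVar

T = TypeVar("T")


def cache_then_replay(
    dl: Iterable[T], num_caches: int, return_caches_after: int
) -> Iterator[T]:
    """Hypothetical stand-in for the caching logic (not SPDL's code)."""
    cache: list[T] = []
    # Phase 1: pass items through from the original loader for the first
    # `return_caches_after` iterations, keeping the first `num_caches` items.
    for item in islice(dl, return_caches_after):
        if len(cache) < num_caches:
            cache.append(item)
        yield item
    if not cache:
        return  # nothing was cached, so there is nothing to replay
    # Phase 2 (assumed): replay the cached items in a cycle. The original
    # loader is never touched again, so data loading cost drops to ~zero.
    while True:
        yield from cache


def counting_loader(n: int, stats: dict) -> Iterator[int]:
    """Hypothetical loader that records how many items were produced."""
    for i in range(n):
        stats["pulled"] += 1
        yield i


stats = {"pulled": 0}
wrapped = cache_then_replay(
    counting_loader(100, stats), num_caches=2, return_caches_after=4
)
batches = list(islice(wrapped, 8))
print(batches)          # [0, 1, 2, 3, 0, 1, 0, 1]
print(stats["pulled"])  # 4
```

In this sketch only the first four items come from the source loader; every later item is served from the cache. Timing a training loop over such a wrapped loader against the unwrapped one gives a rough upper bound on what a faster data loader could buy.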
See also
spdl.pipeline.cache_iterator(): The helper function that implements the caching logic.
- __iter__() → Iterator[T]
See spdl.pipeline.cache_iterator() for details.