spdl.dataloader.CacheDataLoader

class CacheDataLoader(dl: Iterable[T], num_caches: int, return_caches_after: int)

Caches values from the given data loader and, after a given number of iterations, returns the cached values instead of loading new ones.

The class is a simple wrapper around a generic data loader instance. It is intended for estimating the maximum performance gain achievable by optimizing the data loader.

Wrap your data loader with this class, run it in the training pipeline, and compare the performance against the unwrapped run. If the wrapped run is noticeably faster, the training pipeline is bottlenecked by data loading.
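The comparison described above can be sketched without SPDL. The following is a minimal illustration, not SPDL code: `time_steps`, `slow_loader`, and `train_step` are hypothetical names, and a pre-filled `itertools.cycle` stands in for a loader that has already switched to returning cached batches.

```python
import time
from itertools import cycle, islice
from typing import Callable, Iterable, Iterator, List


def time_steps(loader: Iterable, step: Callable[[object], None], num_steps: int) -> float:
    """Run `step` on at most `num_steps` batches and return the elapsed seconds."""
    t0 = time.perf_counter()
    for batch in islice(loader, num_steps):
        step(batch)
    return time.perf_counter() - t0


def slow_loader(num_batches: int, delay: float = 0.005) -> Iterator[List[int]]:
    """Hypothetical loader; the sleep stands in for decoding/preprocessing cost."""
    for i in range(num_batches):
        time.sleep(delay)
        yield [i] * 4


def train_step(batch: object) -> None:
    """Hypothetical training step that is cheap compared to data loading."""
    sum(batch)  # type: ignore[arg-type]


NUM_STEPS = 20

# Baseline: every batch goes through the (slow) loader.
baseline = time_steps(slow_loader(NUM_STEPS), train_step, NUM_STEPS)

# Stand-in for the cached phase: two batches are loaded once, outside the
# timed region, and then replayed with no loading cost.
cached_batches = cycle(list(slow_loader(2)))
cached = time_steps(cached_batches, train_step, NUM_STEPS)

print(f"baseline: {baseline:.3f}s, cached: {cached:.3f}s")
```

A large gap between the two timings suggests that data loading dominates the step time. Similar timings suggest the bottleneck lies elsewhere, so optimizing the data loader would gain little.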

Parameters:
  • dl – Source iterator. Expected to be a data loader object.

  • num_caches – The number of items (batches) to cache.

  • return_caches_after – The number of iterations for which the original iterator is used before cached values are returned. By default, it uses the same value as num_caches.

  • delete_src – When this iterator starts returning cached values, call del on the original data loader so that its resources are released.

Returns:

The new iterator.
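How the parameters interact can be illustrated with a small generator. This is a sketch under stated assumptions, not SPDL's implementation: `cache_and_replay` is a hypothetical name, and it assumes that the first num_caches items are the ones cached and that cached items are replayed cyclically once the switch happens.

```python
from itertools import cycle, islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def cache_and_replay(
    src: Iterable[T],
    num_caches: int,
    return_caches_after: Optional[int] = None,
) -> Iterator[T]:
    """Hypothetical stand-in for the caching behavior described above."""
    if num_caches < 1:
        raise ValueError("num_caches must be at least 1")
    # Default: switch to the cache as soon as it is full.
    switch_at = num_caches if return_caches_after is None else return_caches_after

    cache = []
    # Phase 1: pass items from the source through, keeping the first few.
    for item in islice(src, switch_at):
        if len(cache) < num_caches:  # assumption: the first items are cached
            cache.append(item)
        yield item

    # Mirrors the idea of delete_src: drop this function's reference to the
    # source so it can be garbage-collected if nothing else holds it.
    del src

    # Phase 2: replay cached items indefinitely (assumption: cyclic order).
    yield from cycle(cache)


first_eight = list(islice(cache_and_replay(range(100), 2, return_caches_after=4), 8))
print(first_eight)
```

In this sketch the source supplies the first four items and the two cached items are repeated afterwards. Because the replay phase never ends, the caller decides when to stop, here with `islice`.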

See also

spdl.pipeline.cache_iterator()

__iter__() → Iterator[T]

See spdl.pipeline.cache_iterator() for details.

__len__() → int

Returns the length of the original data loader if defined.
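Length forwarding of this kind can be sketched as below. `LenForwarding` is a hypothetical class, not SPDL code, and the behavior for a source without a length (a TypeError from `len()`) is an assumption of this sketch rather than documented SPDL behavior.

```python
from typing import Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


class LenForwarding(Generic[T]):
    """Hypothetical wrapper that forwards len() to the wrapped loader."""

    def __init__(self, dl: Iterable[T]) -> None:
        self._dl = dl

    def __iter__(self) -> Iterator[T]:
        return iter(self._dl)

    def __len__(self) -> int:
        # If the wrapped object defines no __len__, len() raises TypeError,
        # which propagates to the caller.
        return len(self._dl)  # type: ignore[arg-type]


sized = LenForwarding([10, 20, 30])           # a list defines __len__
unsized = LenForwarding(x for x in range(3))  # a generator does not

print(len(sized))
try:
    len(unsized)
except TypeError as err:
    print("length not defined:", err)
```

Progress bars and schedulers that call `len()` on the loader keep working on the wrapper as long as the wrapped loader itself is sized.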