spdl.source.utils.embed_shuffle

embed_shuffle(src: IterableWithShuffle[T], /, *, shuffle_last: bool = True, epoch: int = 0) -> Iterable[T]

[Experimental] Convert IterableWithShuffle to Iterable by embedding the shuffle() call into __iter__().

Roughly equivalent to the following code snippet.

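while True:
    if not shuffle_last:
        # Shuffle before the pass, using the current epoch as the seed.
        src.shuffle(seed=epoch)

    yield from src
    epoch += 1

    if shuffle_last:
        # Shuffle after the pass, so the next pass uses the incremented epoch as the seed.
        src.shuffle(seed=epoch)

Parameters: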
  • src – The original iterable with shuffle method.

  • shuffle_last – If True (default), shuffle() is called at the end of each iteration. Otherwise, shuffle() is called before each iteration.

  • epoch – The initial seed value passed to shuffle(). The value is incremented by one after each pass over src.
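
The following is a minimal, self-contained sketch of how the parameters interact. It does not import SPDL: ListSource is a hypothetical source with a seeded shuffle() method, and EmbedShuffleSketch is an illustrative stand-in that follows the snippet above, assuming one pass per __iter__() call. Neither is the library's actual implementation.

```python
import random
from collections.abc import Iterable, Iterator


class ListSource:
    """Hypothetical source: a list with a seeded shuffle() method."""

    def __init__(self, items: Iterable[int]) -> None:
        self._original = list(items)
        self._items = list(self._original)

    def shuffle(self, seed: int) -> None:
        # Always shuffle from the original order so the result depends only on the seed.
        self._items = list(self._original)
        random.Random(seed).shuffle(self._items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)


class EmbedShuffleSketch:
    """Illustrative stand-in for the documented behavior (assumption, not SPDL code)."""

    def __init__(self, src: ListSource, *, shuffle_last: bool = True, epoch: int = 0) -> None:
        self.src = src
        self.shuffle_last = shuffle_last
        self.epoch = epoch

    def __iter__(self) -> Iterator[int]:
        if not self.shuffle_last:
            self.src.shuffle(seed=self.epoch)

        yield from self.src
        self.epoch += 1

        if self.shuffle_last:
            self.src.shuffle(seed=self.epoch)


# With shuffle_last=True, the first pass sees the source in its initial order,
# and the shuffle for the second pass (seed=1) runs after the first pass ends.
wrapped = EmbedShuffleSketch(ListSource(range(5)))
first = list(wrapped)
second = list(wrapped)

# With shuffle_last=False, the shuffle runs before each pass instead.
eager = EmbedShuffleSketch(ListSource(range(5)), shuffle_last=False, epoch=1)
eager_first = list(eager)
```

In this sketch, the first pass with shuffle_last=True yields the source in its initial, unshuffled order. If the first pass must already be shuffled, either shuffle the source before wrapping it or set shuffle_last=False.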

Why default to shuffle after iteration?

Shuffling at the beginning of an iteration blocks the pipeline entirely. Shuffling after the iteration gives an opportunity to hide the overhead of shuffling behind the pipeline execution.