spdl.source.utils.embed_shuffle
embed_shuffle(src: IterableWithShuffle[T], /, *, shuffle_last: bool = True, epoch: int = 0) → Iterable[T]
[Experimental] Convert IterableWithShuffle to Iterable by embedding the shuffle() call into __iter__().

Roughly equivalent to the following code snippet.
    while True:
        if not shuffle_last:
            src.shuffle(seed=epoch)
        yield from src
        epoch += 1
        if shuffle_last:
            src.shuffle(seed=epoch)
Parameters:
- src – The original iterable with a shuffle method.
- shuffle_last – If True (default), then shuffle is called at the end of the iteration. Otherwise, shuffle is called before each iteration.
- epoch – The initial seed value passed to shuffle().
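The effect of the parameters can be seen in a small, self-contained sketch. It does not import spdl: `ShuffledList` and `embed_shuffle_sketch` are hypothetical stand-ins written for this illustration, and the generator simply mirrors the "roughly equivalent" snippet above rather than the library's actual implementation.

```python
import random
from itertools import islice


class ShuffledList:
    """Hypothetical stand-in for an IterableWithShuffle (not part of spdl)."""

    def __init__(self, items):
        self._items = list(items)
        self._order = list(self._items)
        self.seeds = []  # seeds passed to shuffle(), recorded for inspection

    def shuffle(self, seed: int) -> None:
        # Derive a deterministic permutation from the seed.
        self.seeds.append(seed)
        self._order = list(self._items)
        random.Random(seed).shuffle(self._order)

    def __iter__(self):
        return iter(self._order)


def embed_shuffle_sketch(src, *, shuffle_last=True, epoch=0):
    """Mirror of the 'roughly equivalent' snippet (an assumption, not spdl code)."""
    while True:
        if not shuffle_last:
            src.shuffle(seed=epoch)
        yield from src
        epoch += 1
        if shuffle_last:
            src.shuffle(seed=epoch)


# Default (shuffle_last=True): take two passes over a 4-item source.
src = ShuffledList(range(4))
items = list(islice(embed_shuffle_sketch(src), 8))
first, second = items[:4], items[4:]

# shuffle_last=False: shuffle is called before each pass instead.
eager_src = ShuffledList(range(4))
eager_items = list(islice(embed_shuffle_sketch(eager_src, shuffle_last=False), 8))
```

In this sketch, the default mode yields the first pass in the source's existing order and calls `shuffle(seed=1)` only after that pass completes, so `src.seeds` ends up as `[1]`. With `shuffle_last=False`, `shuffle` is called with seeds 0 and then 1, once before each pass.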
Why default to shuffle after iteration?
Shuffling at the beginning of an iteration blocks the pipeline entirely. Shuffling after the iteration gives an opportunity to hide the overhead of shuffling behind the pipeline execution.