spdl.pipeline.iterate_in_subprocess

iterate_in_subprocess(fn: Callable[[], Iterable[T]], *, buffer_size: int = 3, initializer: Callable[[], None] | Sequence[Callable[[], None]] | None = None, mp_context: str | None = None, timeout: float | None = None, daemon: bool = False) -> Iterable[T]

[Experimental] Run the given iterable in a subprocess.

The subprocess is created once and reused across iterations. The returned Iterable supports multiple iterations — each call to iter() (or for ... in) instructs the worker to create a fresh iterator from the underlying iterable without spawning a new process. Because process creation involves overhead (fork/spawn, initializer execution, and pickling), reusing the same worker is more efficient than calling this function repeatedly.

Note

fn() is called once in the subprocess to create the iterable. Each subsequent iter() call creates a fresh iterator by calling iter(iterable) on the same object. If fn() returns a proper Iterable (a class with __iter__ that creates a new iterator each time), re-iteration works as expected.

However, if fn() returns a generator (or any single-use iterator), re-iteration will silently yield no items. This is because a generator is its own iterator — iter(generator) returns self — so once exhausted, calling iter() again returns the same exhausted object. The first iteration will work correctly, but all subsequent iterations will appear empty.
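The difference between the two cases can be reproduced in plain Python, without any subprocess. The following is a minimal sketch; `count_up_gen` and `CountUp` are hypothetical names used only for illustration and are not part of spdl.

```python
from collections.abc import Iterator


def count_up_gen(n: int) -> Iterator[int]:
    """Generator function: the returned generator is single-use."""
    yield from range(n)


class CountUp:
    """Re-iterable: every __iter__ call builds a fresh iterator."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self) -> Iterator[int]:
        return iter(range(self.n))


gen = count_up_gen(3)
gen_is_own_iterator = iter(gen) is gen  # a generator returns itself from iter()
first_gen, second_gen = list(gen), list(gen)

obj = CountUp(3)
first_obj, second_obj = list(obj), list(obj)

print(gen_is_own_iterator)    # True
print(first_gen, second_gen)  # [0, 1, 2] []
print(first_obj, second_obj)  # [0, 1, 2] [0, 1, 2]
```

If re-iteration is needed, having fn return an object like `CountUp` (or a built-in re-iterable such as a `list` or `range`) rather than a generator avoids the silent empty second pass.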

Parameters:
  • fn – Function that returns an iterable. Use functools.partial() to pass arguments to the function.

  • buffer_size – Maximum number of items to buffer in the queue.

  • initializer – Function, or sequence of functions, executed in the subprocess before iteration starts.

  • mp_context – Context to use for multiprocessing. If not specified, a default method is used.

  • timeout – Timeout for inactivity. If the iterable does not yield any item for this amount of time, the process is terminated.

  • daemon – Whether to run the process as a daemon. Use it only for debugging.
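As a rough illustration of how these parameters fit together, the sketch below binds an argument with functools.partial() and passes the resulting zero-argument callable as fn. The `Squares` class, the `run_two_epochs` helper, and the chosen `buffer_size` and `timeout` values are assumptions made for this example. Only the keyword names come from the signature above.

```python
import functools


class Squares:
    """Hypothetical re-iterable data source (not part of spdl)."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self):
        # A new generator is created on every call, so re-iteration works.
        return (i * i for i in range(self.n))


# Bind the argument up front so that the callable takes no arguments.
make_squares = functools.partial(Squares, 4)


def run_two_epochs() -> list[list[int]]:
    """Iterate the same subprocess-backed iterable twice."""
    # Imported here so the rest of the sketch runs without spdl installed.
    from spdl.pipeline import iterate_in_subprocess

    src = iterate_in_subprocess(make_squares, buffer_size=2, timeout=10.0)
    # Each list() call starts a new iteration on the same worker process.
    return [list(src), list(src)]
```

When a start method that imports the main module in the child is used (for example "spawn"), `Squares` likely needs to live in an importable module and `run_two_epochs()` should be called under an `if __name__ == "__main__":` guard, as is usual for Python multiprocessing.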

Returns:

Iterable over the items yielded by the iterable that fn returns.

Note

The function and the values yielded by the iterable it returns must be picklable.
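One way to catch picklability problems early is to try pickling the function and a sample value in the parent process before starting the worker. This is a minimal sketch using only the standard library; `is_picklable` is a hypothetical helper, not an spdl API.

```python
import functools
import pickle


def is_picklable(obj: object) -> bool:
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


# A partial over an importable callable pickles by reference.
ok_fn = functools.partial(range, 5)
# A lambda has no importable name, so pickling it fails.
bad_fn = lambda: range(5)

fn_ok = is_picklable(ok_fn)
fn_bad = is_picklable(bad_fn)
# Yielded values must be picklable too: an int is, a generator object is not.
item_ok = is_picklable(next(iter(ok_fn())))
item_bad = is_picklable(i for i in range(3))

print(fn_ok, fn_bad, item_ok, item_bad)  # True False True False
```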

See also