spdl.pipeline

Overview

Implements Pipeline, a generic task execution engine.

API Reference

Functions

cache_iterator

Caches values from the iterator and returns the cached values after the given number of iterations.
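The caching idea can be illustrated with a small standard-library sketch. This is not SPDL's implementation: the helper name `cache_then_replay` and its parameters `num_caches` and `stop_after` are hypothetical, and the sketch simplifies by switching to cached values as soon as the cache is full.

```python
import itertools


def cache_then_replay(src, num_caches, stop_after):
    """Hypothetical helper (not the SPDL API).

    Yields items from ``src`` while storing the first ``num_caches`` of them,
    then stops reading ``src`` and replays the stored items in a cycle until
    ``stop_after`` items have been yielded in total.
    """
    if num_caches < 1:
        raise ValueError("num_caches must be at least 1")

    it = iter(src)
    cache = []
    yielded = 0

    # Phase 1: pass items through from the source and remember them.
    while yielded < stop_after and len(cache) < num_caches:
        try:
            item = next(it)
        except StopIteration:
            break
        cache.append(item)
        yield item
        yielded += 1

    # Phase 2: the source is no longer touched; replay the cache in order.
    remaining = stop_after - yielded
    if cache and remaining > 0:
        yield from itertools.islice(itertools.cycle(cache), remaining)


source = iter(range(100))
result = list(cache_then_replay(source, num_caches=3, stop_after=8))
```

A pattern like this can help separate the cost of the data source from the cost of downstream processing, since the later iterations never wait on the source.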

create_task

iterate_in_subprocess

Run an iterator in a separate process, and yield the results one by one.

run_pipeline_in_subprocess

Run the given Pipeline in a subprocess, and iterate over the results.

Classes

Pipeline

Data processing pipeline.

PipelineBuilder

Builds a Pipeline object.

TaskHook

Base class for hooks to be used in the pipeline.

TaskStatsHook

Track the task runtimes and success rate.
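As a rough illustration of what tracking runtimes and success rate involves, here is a standard-library sketch. It does not use or reproduce the SPDL hook interface: the names `RunStats`, `StatsRecorder`, and `track` are hypothetical and exist only for this example.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class RunStats:
    """Hypothetical container for the numbers being tracked."""

    num_tasks: int = 0
    num_failures: int = 0
    total_time: float = 0.0

    @property
    def success_rate(self):
        if self.num_tasks == 0:
            return 0.0
        return (self.num_tasks - self.num_failures) / self.num_tasks


class StatsRecorder:
    """Hypothetical recorder: wrap each task invocation in ``track()``."""

    def __init__(self):
        self.stats = RunStats()

    @contextmanager
    def track(self):
        start = time.monotonic()
        try:
            yield
        except Exception:
            # Count the failure, then let the caller decide how to handle it.
            self.stats.num_failures += 1
            raise
        finally:
            # Runs for both successful and failed tasks.
            self.stats.num_tasks += 1
            self.stats.total_time += time.monotonic() - start


recorder = StatsRecorder()
for value in ["1", "2", "oops", "4"]:
    try:
        with recorder.track():
            int(value)  # the "task"; fails for "oops"
    except ValueError:
        pass
```

Wrapping each invocation in a context manager keeps the bookkeeping out of the task itself, which is the general appeal of a hook-based design.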

TaskPerfStats

Performance statistics of a task measured by TaskStatsHook.

AsyncQueue

Extends asyncio.Queue with init/finalize logic.

StatsQueue

Measures the time stages are blocked on the upstream or downstream stage.
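The measurement can be pictured with a plain `asyncio.Queue` subclass that times how long `put` and `get` wait. This is only a sketch under assumptions, not SPDL's implementation: `TimedQueue` and its `put_wait` and `get_wait` attributes are hypothetical names.

```python
import asyncio
import time


class TimedQueue(asyncio.Queue):
    """Hypothetical queue that accumulates time spent waiting in put/get."""

    def __init__(self, maxsize=0):
        super().__init__(maxsize)
        self.put_wait = 0.0  # producer blocked because the queue was full
        self.get_wait = 0.0  # consumer blocked because the queue was empty

    async def put(self, item):
        start = time.monotonic()
        try:
            await super().put(item)
        finally:
            self.put_wait += time.monotonic() - start

    async def get(self):
        start = time.monotonic()
        try:
            return await super().get()
        finally:
            self.get_wait += time.monotonic() - start


async def demo():
    queue = TimedQueue(maxsize=1)

    async def producer():
        # Fast producer: blocks on put() whenever the queue is full.
        for i in range(3):
            await queue.put(i)

    async def consumer():
        # Slow consumer: simulated work before each get().
        received = []
        for _ in range(3):
            await asyncio.sleep(0.05)
            received.append(await queue.get())
        return received

    _, received = await asyncio.gather(producer(), consumer())
    return received, queue


items, queue = asyncio.run(demo())
```

In this run the producer spends most of its time waiting in `put`, while `get` returns almost immediately, which points to the consumer side as the bottleneck.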

QueuePerfStats

Performance statistics collected by StatsQueue.

Exceptions

PipelineFailure

Raised by spdl.pipeline.Pipeline when the pipeline encounters an error.