Overview¶
What is SPDL?¶
SPDL (Scalable and Performant Data Loading) is a library for building efficient data loaders for ML/AI training. It was created by a group of engineers and researchers who work on improving the efficiency of GPU workloads at Meta.
Core Concept¶
Its design incorporates the authors’ experience in optimizing pipelines and the UX/DevX feedback from pipeline owners. The key features include:
- The pipeline execution is fast and efficient.
- The pipeline abstraction is flexible, so users can choose structures that fit their environment, data, and requirements.
- The pipeline can export runtime statistics of its subcomponents, which helps identify bottlenecks.

These features allow users to create a feedback loop with which they can iteratively improve performance.
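To make the feedback loop concrete, here is a minimal, self-contained sketch (plain Python, not the SPDL API) of a staged pipeline that records per-stage wall time, so the slowest subcomponent can be identified and then optimized:

```python
# Illustrative sketch (not SPDL's API): a staged pipeline that records
# per-stage wall time so the bottleneck stage can be identified.
import time
from collections import defaultdict

def run_pipeline(source, stages):
    """Apply each named stage to every item, accumulating per-stage time."""
    stats = defaultdict(float)
    results = []
    for item in source:
        for name, fn in stages:
            start = time.perf_counter()
            item = fn(item)
            stats[name] += time.perf_counter() - start
        results.append(item)
    return results, dict(stats)

def decode(x):      # stand-in for media decoding (the slow stage here)
    time.sleep(0.002)
    return x * 2

def preprocess(x):  # stand-in for lightweight tensor preprocessing
    time.sleep(0.0005)
    return x + 1

results, stats = run_pipeline(range(10), [("decode", decode), ("preprocess", preprocess)])
bottleneck = max(stats, key=stats.get)  # the stage with the largest total time
```

With the timing statistics in hand, the user can restructure the pipeline (for example, raise the concurrency of the slow stage) and re-measure, which is the iterative loop described above.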

Performance & Efficiency¶
Data loading is an important component of AI training, but high CPU utilization can degrade training performance, so the data loading pipeline must be efficient.
The following plot shows the throughput of SPDL and other data loading solutions, along with their CPU utilization.
SPDL is fast and efficient in orchestrating tasks.

The core pipeline is a pure-Python package, so you can install it anywhere. By default, it uses multi-threading as its core parallelism mechanism. Performance improves with recent Python versions, and enabling free-threading makes it even faster.
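The following is a minimal illustration of why thread-based parallelism works well for data loading (standard-library Python only, not SPDL's API): I/O-bound work such as fetching and decoding releases the GIL, so a thread pool can overlap many requests without spawning sub-processes.

```python
# Sketch of thread-based data loading (not SPDL's API). The sleep below
# stands in for GIL-releasing I/O such as a network fetch or media decode.
import time
from concurrent.futures import ThreadPoolExecutor

def load_sample(i):
    time.sleep(0.01)  # simulate I/O: disk read, network fetch, decode
    return i * i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    samples = list(pool.map(load_sample, range(8)))
elapsed = time.perf_counter() - start
# Eight ~10 ms loads overlap in the thread pool, so the total wall time
# is close to one load's latency rather than eight times it.
```

Because the workers are threads, they share memory with the trainer process, avoiding the serialization and copy overhead that sub-process-based loaders pay.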

Media I/O¶
Oftentimes, the bottleneck of data loading lies in media decoding and pre-processing. So, in addition to the pipeline abstraction, SPDL also provides an I/O module for multimedia (audio, video, and image) processing. This I/O module was designed from scratch for high performance and high throughput.
What SPDL is NOT¶
- SPDL is not a drop-in replacement for existing data loading solutions.
- SPDL does not guarantee automagical performance improvements.

SPDL started as an experiment to achieve high throughput with thread-based parallelism, even under the constraint of the GIL. It is not expected to be the fastest solution out of the box; rather, it paves the way for practitioners to properly optimize their data loading pipelines. Adopting SPDL also paves the way for adopting free-threaded Python in ML/AI applications.
How to use SPDL?¶
SPDL is highly flexible; you can use it in a variety of ways.
As a new end-to-end data loading pipeline. The primary goal of SPDL is to build performant data loading solutions for ML. The project mostly discusses performance in an end-to-end (from data storage to GPUs) context. Using SPDL as a replacement for an existing data loading solution is what the development team intends.
As a replacement for the media processor. SPDL uses multi-threading for fast data processing, and it can also run inside sub-processes. If your current data loading pipeline is elaborate and replacing it wholesale is not ideal, you can start adopting SPDL by replacing only the media processing part. This reduces the number of sub-processes needed, improving overall performance.
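As a hedged sketch of this partial-adoption path (the class and function names here are hypothetical, not SPDL's API): a thread-pooled decoder object can be dropped into an existing pipeline wherever per-sample decoding currently happens, replacing a pool of decoder sub-processes with threads.

```python
# Hypothetical sketch: a thread-pooled batch decoder that can replace
# the media processing step of an existing pipeline. Not SPDL's API.
import time
from concurrent.futures import ThreadPoolExecutor

class ThreadedDecoder:
    """Decode a batch of raw records concurrently using threads."""
    def __init__(self, decode_fn, num_threads=4):
        self._decode = decode_fn
        self._pool = ThreadPoolExecutor(max_workers=num_threads)

    def __call__(self, records):
        # map() preserves input order, so the batch layout is unchanged.
        return list(self._pool.map(self._decode, records))

def fake_decode(raw):
    time.sleep(0.005)  # stand-in for image/audio decoding work
    return len(raw)    # stand-in for the decoded result

decoder = ThreadedDecoder(fake_decode, num_threads=4)
batch = decoder([b"jpeg-bytes-1", b"jpeg-bytes-22"])
```

Because decoding happens in threads inside the worker, the surrounding pipeline (sampler, batching, collation) can stay exactly as it is.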
As a research tool for free-threaded Python and high-performance computing. SPDL’s task execution engine uses an async event loop at its core. The event loop itself is single-threaded; only the functions passed to the executors are executed concurrently. This makes SPDL an ideal test bed for experimenting with free-threaded Python and high-performance computing.
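The execution model described above can be sketched in a few lines of standard-library Python (a simplification, not SPDL's internals): a single-threaded asyncio event loop orchestrates the tasks, while the callables themselves run concurrently in an executor's threads.

```python
# Simplified sketch of the described execution model (not SPDL internals):
# one event-loop thread orchestrates; the work runs in executor threads.
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

def work(i):
    # Runs inside an executor thread, not on the event-loop thread.
    return (i, threading.current_thread().name)

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as pool:
        futures = [loop.run_in_executor(pool, work, i) for i in range(4)]
        # gather() preserves submission order while the calls run concurrently.
        return await asyncio.gather(*futures)

results = asyncio.run(main())
# Each result carries a "worker_*" thread name, confirming the functions
# executed off the single event-loop thread.
```

Swapping the functions, the executor, or the Python build (e.g., a free-threaded interpreter) changes the concurrency behavior without touching the orchestration logic, which is what makes this structure convenient for experiments.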