Benchmark wav

This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

  • spdl.io.load_wav(): Fast native WAV parser optimized for simple PCM formats

  • spdl.io.load_audio(): General-purpose audio loader using FFmpeg backend

  • soundfile (libsndfile): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

  • Various audio configurations (sample rates, channels, bit depths, durations)

  • Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling

  • Statistical analysis with 95% confidence intervals using Student’s t-distribution

  • Queries per second (QPS) as the primary performance metric
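As an illustration of the statistical analysis listed above, the following is a minimal sketch of how a mean QPS and its 95% confidence interval can be derived from per-run QPS values with Student's t-distribution. The function name qps_with_ci and the sample numbers are hypothetical; the helper in benchmark_utils that the benchmark actually uses may compute this differently.

```python
import math
import statistics

from scipy import stats


def qps_with_ci(
    run_qps: list[float], confidence: float = 0.95
) -> tuple[float, float, float]:
    """Return (mean, ci_lower, ci_upper) for a list of per-run QPS values.

    Hypothetical helper for illustration; needs at least two runs.
    """
    n = len(run_qps)
    mean = statistics.fmean(run_qps)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(run_qps) / math.sqrt(n)
    # Two-sided critical value of Student's t with n - 1 degrees of freedom.
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


# Made-up QPS values from five runs of the same configuration.
mean, ci_lower, ci_upper = qps_with_ci([100.0, 102.0, 98.0, 101.0, 99.0])
print(f"{mean:.2f} QPS, 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")
```

With few runs the t-distribution gives a wider interval than a normal approximation would, which matters here because the slower loaders are measured with only five runs.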

Example

$ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
# Plot results
$ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
# Plot results without load_wav
$ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

Result

The following plot shows the QPS (measured as the number of files processed per second) of each function for different audio durations.

../_static/data/example-benchmark-wav.webp

spdl.io.load_wav() is much faster than the others because all it does is reinterpret the input byte string as an array. Its performance is the same for longer audio.

Because parsing WAV is nearly instant, spdl.io.load_wav() spends more time creating the NumPy array than parsing. Creating the array requires the GIL, so the performance does not scale with the number of threads. (The performance pattern of this function is much the same as that of spdl.io.load_npz.)
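The idea of reinterpreting bytes as an array can be sketched with NumPy alone. The snippet below is an illustration, not the implementation of spdl.io.load_wav(): the helpers make_wav and view_pcm16 are hypothetical, and the parser assumes little-endian 16-bit PCM.

```python
import io
import struct
import wave

import numpy as np


def make_wav(samples: np.ndarray, sample_rate: int) -> bytes:
    """Encode an int16 array of shape (frames, channels) as WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(samples.shape[1])
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(samples.astype("<i2").tobytes())
    return buf.getvalue()


def view_pcm16(wav: bytes) -> np.ndarray:
    """Return a read-only view of the samples without copying them."""
    if wav[:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    num_channels = None
    # Walk the RIFF chunks until the sample data is found.
    while pos + 8 <= len(wav):
        chunk_id, size = struct.unpack_from("<4sI", wav, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            fmt, num_channels, _, _, _, bits = struct.unpack_from(
                "<HHIIHH", wav, body
            )
            if fmt != 1 or bits != 16:
                raise ValueError("only 16-bit PCM is handled in this sketch")
        elif chunk_id == b"data":
            if num_channels is None:
                raise ValueError("data chunk found before fmt chunk")
            # No decoding happens here: the bytes are viewed as int16 in place.
            flat = np.frombuffer(wav, dtype="<i2", count=size // 2, offset=body)
            return flat.reshape(-1, num_channels)
        pos = body + size + (size & 1)  # chunks are padded to even sizes
    raise ValueError("no data chunk found")


samples = np.arange(8, dtype=np.int16).reshape(4, 2)
view = view_pcm16(make_wav(samples, sample_rate=8000))
print(view.shape, view.flags.writeable)  # (4, 2) False
```

Because the array is only a view over the input bytes, the cost does not grow with the amount of audio, which is consistent with the flat QPS across durations in the plot.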

The following is the same plot without load_wav.

../_static/data/example-benchmark-wav-2.webp

soundfile (libsndfile) has to process data iteratively (through io.BytesIO) because it does not support loading directly from a byte string, so longer audio takes longer to process. The performance trend (a single thread being the fastest) suggests that it does not release the GIL most of the time.

The spdl.io.load_audio() function (the generic FFmpeg-based implementation) does a lot of work, so its overall performance is not as good, but it scales with the number of threads because it releases the GIL almost entirely.
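The effect of the GIL on thread scaling can be reproduced with the standard library alone. The following sketch is not the benchmark's BenchmarkRunner; measure_qps, gil_releasing, and gil_holding are hypothetical names, with time.sleep standing in for a function that releases the GIL and a pure-Python loop standing in for one that holds it.

```python
import time
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor


def measure_qps(fn: Callable[[], object], num_threads: int, iterations: int) -> float:
    """Run ``fn`` ``iterations`` times on a thread pool and return calls per second."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        t0 = time.perf_counter()
        futures = [pool.submit(fn) for _ in range(iterations)]
        for future in futures:
            future.result()  # re-raises any exception from the worker
        elapsed = time.perf_counter() - t0
    return iterations / elapsed


def gil_releasing() -> None:
    time.sleep(0.005)  # sleeping releases the GIL, so threads overlap


def gil_holding() -> int:
    return sum(range(50_000))  # pure-Python bytecode runs under the GIL


for num_threads in (1, 4):
    releasing = measure_qps(gil_releasing, num_threads, iterations=40)
    holding = measure_qps(gil_holding, num_threads, iterations=40)
    print(f"{num_threads} threads: releasing={releasing:.0f} QPS, holding={holding:.0f} QPS")
```

On a standard (GIL-enabled) CPython build, the QPS of the GIL-releasing workload should grow with the thread count while the GIL-holding one stays roughly flat or drops, which mirrors the contrast between load_audio and soundfile described above.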

Source

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict

"""This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

- :py:func:`spdl.io.load_wav`: Fast native WAV parser optimized for simple PCM formats
- :py:func:`spdl.io.load_audio`: General-purpose audio loader using FFmpeg backend
- ``soundfile`` (``libsndfile``): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

- Various audio configurations (sample rates, channels, bit depths, durations)
- Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling
- Statistical analysis with 95% confidence intervals using Student's t-distribution
- Queries per second (QPS) as the primary performance metric

**Example**

.. code-block:: shell

   $ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
   # Plot results
   $ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
   # Plot results without load_wav
   $ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

**Result**

The following plot shows the QPS (measured as the number of files processed per second)
of each function for different audio durations.

.. image:: ../../_static/data/example-benchmark-wav.webp


:py:func:`spdl.io.load_wav` is much faster than the others because all it
does is reinterpret the input byte string as an array.
Its performance is the same for longer audio.

Because parsing WAV is nearly instant, :py:func:`spdl.io.load_wav` spends more time
creating the NumPy array than parsing.
Creating the array requires the GIL, so the performance does not scale with the
number of threads.
(The performance pattern of this function is much the same as that of
:ref:`spdl.io.load_npz <example-data-formats>`.)

The following is the same plot without ``load_wav``.

.. image:: ../../_static/data/example-benchmark-wav-2.webp

``soundfile`` (``libsndfile``) has to process data iteratively (through ``io.BytesIO``)
because it does not support loading directly from a byte string, so longer audio takes
longer to process.
The performance trend (a single thread being the fastest) suggests that
it does not release the GIL most of the time.

The :py:func:`spdl.io.load_audio` function (the generic FFmpeg-based implementation) does
a lot of work, so its overall performance is not as good,
but it scales with the number of threads because it releases the GIL almost entirely.
"""

__all__ = [
    "BenchmarkConfig",
    "create_wav_data",
    "load_sf",
    "load_spdl_audio",
    "load_spdl_wav",
    "main",
]

import argparse
import io
import os
from collections.abc import Callable
from dataclasses import dataclass

import numpy as np
import scipy.io.wavfile
import soundfile as sf
import spdl.io
from numpy.typing import NDArray

try:
    from examples.benchmark_utils import (  # pyre-ignore[21]
        BenchmarkResult,
        BenchmarkRunner,
        ExecutorType,
        get_default_result_path,
        save_results_to_csv,
    )
except ImportError:
    from spdl.examples.benchmark_utils import (
        BenchmarkResult,
        BenchmarkRunner,
        ExecutorType,
        get_default_result_path,
        save_results_to_csv,
    )


DEFAULT_RESULT_PATH: str = get_default_result_path(__file__)


@dataclass(frozen=True)
class BenchmarkConfig:
    """Configuration for a single WAV benchmark run.

    Combines both audio file parameters and benchmark execution parameters.

    Attributes:
        function_name: Name of the function being tested
        function: The actual function to benchmark
        sample_rate: Audio sample rate in Hz
        num_channels: Number of audio channels
        bits_per_sample: Bit depth per sample (16 or 32)
        duration_seconds: Duration of the audio file in seconds
        num_threads: Number of concurrent threads
        iterations: Number of iterations per run
        num_runs: Number of runs for statistical analysis
    """

    function_name: str
    function: Callable[[bytes], NDArray]
    sample_rate: int
    num_channels: int
    bits_per_sample: int
    duration_seconds: float
    num_threads: int
    iterations: int
    num_runs: int


def create_wav_data(
    sample_rate: int = 44100,
    num_channels: int = 2,
    bits_per_sample: int = 16,
    duration_seconds: float = 1.0,
) -> tuple[bytes, NDArray]:
    """Create a WAV file in memory for benchmarking.

    Args:
        sample_rate: Sample rate in Hz
        num_channels: Number of audio channels
        bits_per_sample: Bits per sample (16 or 32)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (WAV file as bytes, audio samples array)
    """
    num_samples = int(sample_rate * duration_seconds)

    dtype_map = {
        16: np.int16,
        32: np.int32,
    }
    dtype = dtype_map[bits_per_sample]
    max_amplitude = 32767 if bits_per_sample == 16 else 2147483647

    # One sine wave per channel, starting at 440 Hz and spaced 110 Hz apart
    t = np.linspace(0, duration_seconds, num_samples)
    frequencies = 440.0 + np.arange(num_channels) * 110.0
    sine_waves = np.sin(2 * np.pi * frequencies[:, np.newaxis] * t)
    # Transpose to (num_samples, num_channels), the layout scipy expects
    samples = (sine_waves.T * max_amplitude).astype(dtype)

    wav_buffer = io.BytesIO()
    scipy.io.wavfile.write(wav_buffer, sample_rate, samples)
    wav_data = wav_buffer.getvalue()

    return wav_data, samples


def load_sf(wav_data: bytes) -> NDArray:
    """Load WAV data using soundfile library.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as int16 numpy array
    """
    audio_file = io.BytesIO(wav_data)
    data, _ = sf.read(audio_file, dtype="int16")
    return data


def load_spdl_audio(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_audio` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_audio(wav_data, filter_desc=None))


def load_spdl_wav(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_wav` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_wav(wav_data))


def _parse_args() -> argparse.Namespace:
    """Parse command line arguments for the benchmark script.

    Returns:
        Parsed command line arguments
    """
    parser = argparse.ArgumentParser(description="Benchmark WAV loading performance")
    parser.add_argument(
        "--output",
        type=os.path.realpath,
        default=DEFAULT_RESULT_PATH,
        help="Output file path.",
    )
    return parser.parse_args()


def main() -> None:
    """Run comprehensive benchmark suite for WAV loading performance.

    Benchmarks multiple configurations of audio files with different durations,
    comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries
    across various thread counts (1, 2, 4, 8, 16).
    """
    args = _parse_args()

    # Define audio configurations to test
    audio_configs = [
        # (sample_rate, num_channels, bits_per_sample, duration_seconds)
        # (8000, 1, 16, 1.0),  # Low quality mono
        # (16000, 1, 16, 1.0),  # Speech quality mono
        # (48000, 2, 16, 1.0),  # High quality stereo
        # (48000, 8, 16, 1.0),  # Multi-channel audio
        (44100, 2, 16, 1.0),  # CD quality stereo, 1 second
        (44100, 2, 16, 10.0),  # CD quality stereo, 10 seconds
        (44100, 2, 16, 60.0),  # CD quality stereo, 60 seconds
        # (44100, 2, 24, 1.0),  # 24-bit audio
    ]

    thread_counts = [1, 2, 4, 8, 16]

    # Define benchmark function configurations
    # (function_name, function, iterations_multiplier, num_runs)
    benchmark_functions = [
        ("3. spdl.io.load_wav", load_spdl_wav, 100, 100),  # Fast but unstable
        ("2. spdl.io.load_audio", load_spdl_audio, 10, 5),  # Slower but stable
        ("1. soundfile", load_sf, 10, 5),  # Slower but stable
    ]

    results: list[BenchmarkResult[BenchmarkConfig]] = []

    for sample_rate, num_channels, bits_per_sample, duration_seconds in audio_configs:
        # Create WAV data for this audio configuration
        wav_data, ref = create_wav_data(
            sample_rate=sample_rate,
            num_channels=num_channels,
            bits_per_sample=bits_per_sample,
            duration_seconds=duration_seconds,
        )

        print(
            f"\n{sample_rate}Hz, {num_channels}ch, {bits_per_sample}bit, {duration_seconds}s"
        )
        print(
            f"Threads,"
            f"SPDL WAV QPS ({duration_seconds} sec),CI Lower,CI Upper,"
            f"SPDL Audio QPS ({duration_seconds} sec),CI Lower,CI Upper,"
            f"soundfile QPS ({duration_seconds} sec),CI Lower,CI Upper"
        )

        for num_threads in thread_counts:
            thread_results: list[BenchmarkResult[BenchmarkConfig]] = []

            with BenchmarkRunner(
                executor_type=ExecutorType.THREAD,
                num_workers=num_threads,
            ) as runner:
                for (
                    function_name,
                    function,
                    iterations_multiplier,
                    num_runs,
                ) in benchmark_functions:
                    config = BenchmarkConfig(
                        function_name=function_name,
                        function=function,
                        sample_rate=sample_rate,
                        num_channels=num_channels,
                        bits_per_sample=bits_per_sample,
                        duration_seconds=duration_seconds,
                        num_threads=num_threads,
                        iterations=iterations_multiplier * num_threads,
                        num_runs=num_runs,
                    )

                    result, output = runner.run(
                        config,
                        lambda fn=function, data=wav_data: fn(data),
                        config.iterations,
                        num_runs=config.num_runs,
                    )

                    # Check the decoded samples against the reference array;
                    # mono output is expanded to (num_samples, 1) first
                    output_to_validate = output
                    if output_to_validate.ndim == 1:
                        output_to_validate = output_to_validate[:, None]
                    np.testing.assert_array_equal(output_to_validate, ref)

                    thread_results.append(result)
                    results.append(result)

            # Print results for this thread count (all 3 benchmarks)
            spdl_wav_result = thread_results[0]
            spdl_audio_result = thread_results[1]
            soundfile_result = thread_results[2]
            print(
                f"{num_threads},"
                f"{spdl_wav_result.qps:.2f},{spdl_wav_result.ci_lower:.2f},{spdl_wav_result.ci_upper:.2f},"
                f"{spdl_audio_result.qps:.2f},{spdl_audio_result.ci_lower:.2f},{spdl_audio_result.ci_upper:.2f},"
                f"{soundfile_result.qps:.2f},{soundfile_result.ci_lower:.2f},{soundfile_result.ci_upper:.2f}"
            )

    save_results_to_csv(results, args.output)
    print(
        f"\nBenchmark complete. To generate plots, run:\n"
        f"python benchmark_wav_plot.py --input {args.output} "
        f"--output {args.output.replace('.csv', '.png')}"
    )


if __name__ == "__main__":
    main()

Functions

create_wav_data(sample_rate: int = 44100, num_channels: int = 2, bits_per_sample: int = 16, duration_seconds: float = 1.0) -> tuple[bytes, ndarray[tuple[Any, ...], dtype[_ScalarT]]]

Create a WAV file in memory for benchmarking.

Parameters:
  • sample_rate – Sample rate in Hz

  • num_channels – Number of audio channels

  • bits_per_sample – Bits per sample (16 or 32)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (WAV file as bytes, audio samples array)
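A quick way to sanity-check data generated this way is to decode it again and compare it with the reference array. The sketch below repeats the same synthesis steps with small, arbitrary parameters (8 kHz, half a second) and uses scipy.io.wavfile.read as the decoder; it is an illustration, not part of the benchmark.

```python
import io

import numpy as np
import scipy.io.wavfile

sample_rate, num_channels, duration_seconds = 8000, 2, 0.5
num_samples = int(sample_rate * duration_seconds)

# One sine wave per channel (440 Hz, 550 Hz, ...), scaled to the int16 range.
t = np.linspace(0, duration_seconds, num_samples)
frequencies = 440.0 + np.arange(num_channels) * 110.0
samples = (np.sin(2 * np.pi * frequencies[:, np.newaxis] * t).T * 32767).astype(
    np.int16
)

# Encode to WAV entirely in memory.
wav_buffer = io.BytesIO()
scipy.io.wavfile.write(wav_buffer, sample_rate, samples)
wav_data = wav_buffer.getvalue()

# Decode and confirm the samples survive the round trip unchanged.
decoded_rate, decoded = scipy.io.wavfile.read(io.BytesIO(wav_data))
np.testing.assert_array_equal(decoded, samples)
print(decoded_rate, decoded.shape, decoded.dtype)
```

The returned samples array has shape (num_samples, num_channels), which is why the benchmark expands one-dimensional (mono) loader output before comparing it against the reference.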

load_sf(wav_data: bytes) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using soundfile library.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as int16 numpy array

load_spdl_audio(wav_data: bytes) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_audio() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

load_spdl_wav(wav_data: bytes) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_wav() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

main() -> None

Run comprehensive benchmark suite for WAV loading performance.

Benchmarks multiple configurations of audio files with different durations, comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries across various thread counts (1, 2, 4, 8, 16).

Classes

class BenchmarkConfig(function_name: str, function: Callable[[bytes], ndarray[tuple[Any, ...], dtype[_ScalarT]]], sample_rate: int, num_channels: int, bits_per_sample: int, duration_seconds: float, num_threads: int, iterations: int, num_runs: int)

Configuration for a single WAV benchmark run.

Combines both audio file parameters and benchmark execution parameters.

function_name

Name of the function being tested

Type:

str

function

The actual function to benchmark

Type:

collections.abc.Callable[[bytes], numpy.ndarray[tuple[Any, …], numpy.dtype[numpy._typing._array_like._ScalarT]]]

sample_rate

Audio sample rate in Hz

Type:

int

num_channels

Number of audio channels

Type:

int

bits_per_sample

Bit depth per sample (16 or 32)

Type:

int

duration_seconds

Duration of the audio file in seconds

Type:

float

num_threads

Number of concurrent threads

Type:

int

iterations

Number of iterations per run

Type:

int

num_runs

Number of runs for statistical analysis

Type:

int
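Because the configuration is a frozen dataclass, each run gets its own immutable, hashable record. The sketch below uses RunConfig, a hypothetical trimmed-down stand-in for BenchmarkConfig, to show how a thread-count sweep can be derived from one base configuration with dataclasses.replace, scaling the iteration count with the thread count as main() does.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical, reduced version of BenchmarkConfig for illustration."""

    function_name: str
    duration_seconds: float
    num_threads: int
    iterations: int


# Mirrors the multiplier of 10 used for the slower loaders in main().
ITERATIONS_PER_THREAD = 10

base = RunConfig(
    function_name="1. soundfile",
    duration_seconds=1.0,
    num_threads=1,
    iterations=ITERATIONS_PER_THREAD,
)

# replace() returns a new instance, so the base configuration is never mutated.
sweep = [
    replace(base, num_threads=n, iterations=ITERATIONS_PER_THREAD * n)
    for n in (1, 2, 4, 8, 16)
]

try:
    base.num_threads = 2  # type: ignore[misc]
except FrozenInstanceError:
    print("configurations are immutable")

print([config.iterations for config in sweep])  # [10, 20, 40, 80, 160]
```

Scaling iterations with the thread count keeps the amount of work per thread constant, so runs at different thread counts last a comparable time.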
