Benchmark wav

This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

  • spdl.io.load_wav(): Fast native WAV parser optimized for simple PCM formats

  • spdl.io.load_audio(): General-purpose audio loader using FFmpeg backend

  • soundfile (libsndfile): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

  • Various audio configurations (sample rates, channels, bit depths, durations)

  • Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling

  • Statistical analysis with 95% confidence intervals using Student’s t-distribution

  • Queries per second (QPS) as the primary performance metric
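
As a rough illustration of the confidence interval mentioned above, the following sketch computes a 95% interval from per-run QPS samples. It is not the code in benchmark_utils; the function name qps_confidence_interval and the sample values are made up for illustration, and the critical values are standard two-sided t-table entries hard-coded to keep the sketch dependency-free.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t-distribution, keyed by degrees
# of freedom (standard table values rounded to three decimals).
T_CRIT_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def qps_confidence_interval(qps_per_run: list[float]) -> tuple[float, float, float]:
    """Return (mean, ci_lower, ci_upper) for a list of per-run QPS samples."""
    n = len(qps_per_run)
    if n < 2:
        raise ValueError("need at least two runs to estimate the spread")
    df = n - 1
    if df not in T_CRIT_95:
        # For larger sample counts, scipy.stats.t.ppf(0.975, df) gives the value.
        raise ValueError("extend T_CRIT_95 for this number of runs")
    mean = statistics.fmean(qps_per_run)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(qps_per_run) / math.sqrt(n)
    margin = T_CRIT_95[df] * sem
    return mean, mean - margin, mean + margin


# Hypothetical QPS values from five runs of the same configuration.
mean, lower, upper = qps_confidence_interval([980.0, 1010.0, 1005.0, 995.0, 1010.0])
print(f"{mean:.1f} QPS, 95% CI [{lower:.1f}, {upper:.1f}]")
```

With only five runs (as used for the slower loaders), the t-distribution gives a noticeably wider interval than a normal approximation would, which is presumably why it is used here.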

Example

$ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
# Plot results
$ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
# Plot results without load_wav
$ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

Result

The following plot shows the QPS (measured by the number of files processed) of each function for different audio durations.

../_static/data/example-benchmark-wav.webp

spdl.io.load_wav() is much faster than the others because all it does is reinterpret the input byte string as an array. Its throughput stays the same as the audio gets longer.

Because parsing WAV takes almost no time, spdl.io.load_wav() spends more of its time creating the NumPy array than parsing. That step needs to acquire the GIL, so the performance does not scale with the number of threads. (The performance pattern of this function is much the same as that of spdl.io.load_npz.)
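
The idea of reinterpreting the byte string as an array can be sketched with the standard library alone. This is not how spdl.io.load_wav is implemented; make_pcm16_wav and view_pcm16 are hypothetical helpers, and the sketch assumes 16-bit PCM and a little-endian host.

```python
import io
import struct
import wave


def make_pcm16_wav(samples: list[int], sample_rate: int = 8000) -> bytes:
    """Build a mono 16-bit PCM WAV byte string with the standard library."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes = 16 bits per sample
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def view_pcm16(wav_data: bytes) -> memoryview:
    """Return an int16 view over the ``data`` chunk without copying samples."""
    if wav_data[:4] != b"RIFF" or wav_data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE byte string")
    offset = 12  # the first chunk follows the 12-byte RIFF header
    while offset + 8 <= len(wav_data):
        chunk_id, size = struct.unpack_from("<4sI", wav_data, offset)
        if chunk_id == b"data":
            payload = memoryview(wav_data)[offset + 8 : offset + 8 + size]
            # cast() reinterprets the bytes in native byte order; this matches
            # WAV's little-endian layout only on a little-endian host.
            return payload.cast("h")
        offset += 8 + size + (size & 1)  # chunks are padded to an even size
    raise ValueError("no data chunk found")


wav_bytes = make_pcm16_wav([0, 1000, -1000, 32767, -32768])
pcm = view_pcm16(wav_bytes)
print(pcm.tolist())
```

The work done by view_pcm16 depends on the number of chunks in the header, not on the number of samples, which is consistent with the flat QPS across durations seen in the plot.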

The following is the same plot without load_wav.

../_static/data/example-benchmark-wav-2.webp

soundfile (libsndfile) has to process data iteratively (through io.BytesIO) because it does not support loading directly from a byte string, so it takes longer to process longer audio data. The performance trend (single thread being the fastest) suggests that it holds the GIL most of the time.
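
To see why reading through a file-like wrapper grows with duration, here is a small sketch that uses the standard library's wave module as a stand-in for soundfile. It does not reflect libsndfile's internals; make_silence_wav, read_in_blocks, and the block size of 1024 frames are assumptions chosen for illustration.

```python
import io
import wave


def make_silence_wav(duration_seconds: float, sample_rate: int = 8000) -> bytes:
    """Build a mono 16-bit PCM WAV byte string filled with silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(bytes(2 * int(sample_rate * duration_seconds)))
    return buf.getvalue()


def read_in_blocks(wav_data: bytes, block_frames: int = 1024) -> tuple[bytes, int]:
    """Decode through a file-like wrapper, one block of frames at a time.

    Returns the PCM payload and the number of non-empty reads it took.
    """
    chunks = []
    with wave.open(io.BytesIO(wav_data), "rb") as reader:
        while True:
            block = reader.readframes(block_frames)
            if not block:
                break
            chunks.append(block)
    return b"".join(chunks), len(chunks)


for seconds in (1.0, 10.0):
    payload, num_reads = read_in_blocks(make_silence_wav(seconds))
    print(f"{seconds:>4}s: {len(payload)} bytes in {num_reads} read calls")
```

Both the number of read calls and the amount of data copied grow linearly with the audio duration, unlike the constant-time reinterpretation used by spdl.io.load_wav().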

The spdl.io.load_audio() function (the generic FFmpeg-based implementation) does much more work, so its overall throughput is lower, but it scales with the number of threads because it releases the GIL almost entirely.
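
The difference between a function that holds the GIL and one that releases it can be observed with a small thread-pool measurement. This is a sketch, not the BenchmarkRunner used by this example; measure_qps, gil_bound, and gil_free are hypothetical names, and time.sleep merely stands in for native code that releases the GIL. The behavior described assumes a standard GIL-enabled CPython build.

```python
import time
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor


def measure_qps(task: Callable[[], object], num_threads: int, iterations: int) -> float:
    """Run ``task`` ``iterations`` times on a thread pool; return calls per second."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        start = time.perf_counter()
        futures = [pool.submit(task) for _ in range(iterations)]
        for future in futures:
            future.result()  # re-raises any exception from the worker
        elapsed = time.perf_counter() - start
    return iterations / elapsed


def gil_bound() -> int:
    # Pure-Python arithmetic holds the GIL for the whole call.
    return sum(i * i for i in range(20_000))


def gil_free() -> None:
    # time.sleep releases the GIL while waiting.
    time.sleep(0.002)


for num_threads in (1, 2, 4):
    # Scale iterations with the thread count, as the benchmark does.
    bound = measure_qps(gil_bound, num_threads, iterations=50 * num_threads)
    free = measure_qps(gil_free, num_threads, iterations=50 * num_threads)
    print(f"{num_threads} threads: GIL-bound {bound:8.0f} QPS, GIL-releasing {free:8.0f} QPS")
```

On such a build, the GIL-bound task is expected to stay roughly flat (or get slower) as threads are added, while the GIL-releasing task should scale close to linearly, mirroring the soundfile and spdl.io.load_audio() trends described above.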

Source

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict

"""This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

- :py:func:`spdl.io.load_wav`: Fast native WAV parser optimized for simple PCM formats
- :py:func:`spdl.io.load_audio`: General-purpose audio loader using FFmpeg backend
- ``soundfile`` (``libsndfile``): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

- Various audio configurations (sample rates, channels, bit depths, durations)
- Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling
- Statistical analysis with 95% confidence intervals using Student's t-distribution
- Queries per second (QPS) as the primary performance metric

**Example**

.. code-block:: shell

   $ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
   # Plot results
   $ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
   # Plot results without load_wav
   $ python benchmark_wav_plot.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

**Result**

The following plot shows the QPS (measured by the number of files processed) of each
function for different audio durations.

.. image:: ../../_static/data/example-benchmark-wav.webp


:py:func:`spdl.io.load_wav` is much faster than the others because all it
does is reinterpret the input byte string as an array.
Its throughput stays the same as the audio gets longer.

Because parsing WAV takes almost no time, :py:func:`spdl.io.load_wav` spends more of
its time creating the NumPy array than parsing.
That step needs to acquire the GIL, so the performance does not scale with the
number of threads.
(The performance pattern of this function is much the same as that of
:ref:`spdl.io.load_npz <example-benchmark-numpy>`.)

The following is the same plot without ``load_wav``.

.. image:: ../../_static/data/example-benchmark-wav-2.webp

``soundfile`` (``libsndfile``) has to process data iteratively (through ``io.BytesIO``)
because it does not support loading directly from a byte string, so it takes longer to
process longer audio data.
The performance trend (single thread being the fastest) suggests that
it holds the GIL most of the time.

The :py:func:`spdl.io.load_audio` function (the generic FFmpeg-based implementation) does
much more work, so its overall throughput is lower,
but it scales with the number of threads because it releases the GIL almost entirely.
"""

__all__ = [
    "BenchmarkConfig",
    "create_wav_data",
    "load_sf",
    "load_spdl_audio",
    "load_spdl_wav",
    "main",
]

import argparse
import io
import os
from collections.abc import Callable
from dataclasses import dataclass

import numpy as np
import scipy.io.wavfile
import soundfile as sf
import spdl.io
from numpy.typing import NDArray

try:
    from examples.benchmark_utils import (  # pyre-ignore[21]
        BenchmarkResult,
        BenchmarkRunner,
        ExecutorType,
        get_default_result_path,
        save_results_to_csv,
    )
except ImportError:
    from spdl.examples.benchmark_utils import (
        BenchmarkResult,
        BenchmarkRunner,
        ExecutorType,
        get_default_result_path,
        save_results_to_csv,
    )


DEFAULT_RESULT_PATH: str = get_default_result_path(__file__)


@dataclass(frozen=True)
class BenchmarkConfig:
    """BenchmarkConfig()

    Configuration for a single WAV benchmark run.

    Combines both audio file parameters and benchmark execution parameters.
    """

    function_name: str
    """Name of the function being tested"""

    function: Callable[[bytes], NDArray]
    """The actual function to benchmark"""

    sample_rate: int
    """Audio sample rate in Hz"""

    num_channels: int
    """Number of audio channels"""

    bits_per_sample: int
    """Bit depth per sample (16 or 32)"""

    duration_seconds: float
    """Duration of the audio file in seconds"""

    num_threads: int
    """Number of concurrent threads"""

    iterations: int
    """Number of iterations per run"""

    num_runs: int
    """Number of runs for statistical analysis"""


def create_wav_data(
    sample_rate: int = 44100,
    num_channels: int = 2,
    bits_per_sample: int = 16,
    duration_seconds: float = 1.0,
) -> tuple[bytes, NDArray]:
    """Create a WAV file in memory for benchmarking.

    Args:
        sample_rate: Sample rate in Hz
        num_channels: Number of audio channels
        bits_per_sample: Bits per sample (16 or 32)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (WAV file as bytes, audio samples array)
    """
    num_samples = int(sample_rate * duration_seconds)

    dtype_map = {
        16: np.int16,
        32: np.int32,
    }
    dtype = dtype_map[bits_per_sample]
    max_amplitude = 32767 if bits_per_sample == 16 else 2147483647

    # One sine wave per channel: 440 Hz, 550 Hz, 660 Hz, ...
    t = np.linspace(0, duration_seconds, num_samples)
    frequencies = 440.0 + np.arange(num_channels) * 110.0
    sine_waves = np.sin(2 * np.pi * frequencies[:, np.newaxis] * t)
    # Transpose to (num_samples, num_channels), the layout scipy expects
    samples = (sine_waves.T * max_amplitude).astype(dtype)

    wav_buffer = io.BytesIO()
    scipy.io.wavfile.write(wav_buffer, sample_rate, samples)
    wav_data = wav_buffer.getvalue()

    return wav_data, samples


def load_sf(wav_data: bytes) -> NDArray:
    """Load WAV data using soundfile library.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as int16 numpy array
    """
    audio_file = io.BytesIO(wav_data)
    data, _ = sf.read(audio_file, dtype="int16")
    return data


def load_spdl_audio(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_audio` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_audio(wav_data, filter_desc=None))


def load_spdl_wav(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_wav` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_wav(wav_data))


def _parse_args() -> argparse.Namespace:
    """Parse command line arguments for the benchmark script.

    Returns:
        Parsed command line arguments
    """
    parser = argparse.ArgumentParser(description="Benchmark WAV loading performance")
    parser.add_argument(
        "--output",
        type=lambda p: os.path.realpath(p),
        default=DEFAULT_RESULT_PATH,
        help="Output file path.",
    )
    return parser.parse_args()


def main() -> None:
    """Run comprehensive benchmark suite for WAV loading performance.

    Benchmarks multiple configurations of audio files with different durations,
    comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries
    across various thread counts (1, 2, 4, 8, 16).
    """
    args = _parse_args()

    # Define audio configurations to test
    audio_configs = [
        # (sample_rate, num_channels, bits_per_sample, duration_seconds)
        # (8000, 1, 16, 1.0),  # Low quality mono
        # (16000, 1, 16, 1.0),  # Speech quality mono
        # (48000, 2, 16, 1.0),  # High quality stereo
        # (48000, 8, 16, 1.0),  # Multi-channel audio
        (44100, 2, 16, 1.0),  # CD quality stereo, 1 second
        (44100, 2, 16, 10.0),  # CD quality stereo, 10 seconds
        (44100, 2, 16, 60.0),  # CD quality stereo, 60 seconds
        # (44100, 2, 24, 1.0),  # 24-bit audio
    ]

    thread_counts = [1, 2, 4, 8, 16]

    # Define benchmark function configurations
    # (function_name, function, iterations_multiplier, num_runs)
    benchmark_functions = [
        ("3. spdl.io.load_wav", load_spdl_wav, 100, 100),  # Fast but unstable
        ("2. spdl.io.load_audio", load_spdl_audio, 10, 5),  # Slower but stable
        ("1. soundfile", load_sf, 10, 5),  # Slower but stable
    ]

    results: list[BenchmarkResult[BenchmarkConfig]] = []

    for sample_rate, num_channels, bits_per_sample, duration_seconds in audio_configs:
        # Create WAV data for this audio configuration
        wav_data, ref = create_wav_data(
            sample_rate=sample_rate,
            num_channels=num_channels,
            bits_per_sample=bits_per_sample,
            duration_seconds=duration_seconds,
        )

        print(
            f"\n{sample_rate}Hz, {num_channels}ch, {bits_per_sample}bit, {duration_seconds}s"
        )
        print(
            f"Threads,"
            f"SPDL WAV QPS ({duration_seconds} sec),CI Lower,CI Upper,"
            f"SPDL Audio QPS ({duration_seconds} sec),CI Lower,CI Upper,"
            f"soundfile QPS ({duration_seconds} sec),CI Lower,CI Upper"
        )

        for num_threads in thread_counts:
            thread_results: list[BenchmarkResult[BenchmarkConfig]] = []

            with BenchmarkRunner(
                executor_type=ExecutorType.THREAD,
                num_workers=num_threads,
            ) as runner:
                for (
                    function_name,
                    function,
                    iterations_multiplier,
                    num_runs,
                ) in benchmark_functions:
                    config = BenchmarkConfig(
                        function_name=function_name,
                        function=function,
                        sample_rate=sample_rate,
                        num_channels=num_channels,
                        bits_per_sample=bits_per_sample,
                        duration_seconds=duration_seconds,
                        num_threads=num_threads,
                        iterations=iterations_multiplier * num_threads,
                        num_runs=num_runs,
                    )

                    result, output = runner.run(
                        config,
                        lambda fn=function, data=wav_data: fn(data),
                        config.iterations,
                        num_runs=config.num_runs,
                    )

                    # Check the decoded samples against the reference;
                    # mono output may be 1-D, so add a channel axis first
                    output_to_validate = output
                    if output_to_validate.ndim == 1:
                        output_to_validate = output_to_validate[:, None]
                    np.testing.assert_array_equal(output_to_validate, ref)

                    thread_results.append(result)
                    results.append(result)

            # Print results for this thread count (all 3 benchmarks),
            # in the same order as benchmark_functions
            spdl_wav_result = thread_results[0]
            spdl_audio_result = thread_results[1]
            soundfile_result = thread_results[2]
            print(
                f"{num_threads},"
                f"{spdl_wav_result.qps:.2f},{spdl_wav_result.ci_lower:.2f},{spdl_wav_result.ci_upper:.2f},"
                f"{spdl_audio_result.qps:.2f},{spdl_audio_result.ci_lower:.2f},{spdl_audio_result.ci_upper:.2f},"
                f"{soundfile_result.qps:.2f},{soundfile_result.ci_lower:.2f},{soundfile_result.ci_upper:.2f}"
            )

    save_results_to_csv(results, args.output)
    print(
        f"\nBenchmark complete. To generate plots, run:\n"
        f"python benchmark_wav_plot.py --input {args.output} "
        f"--output {args.output.replace('.csv', '.png')}"
    )


if __name__ == "__main__":
    main()

API Reference

Functions

create_wav_data(sample_rate: int = 44100, num_channels: int = 2, bits_per_sample: int = 16, duration_seconds: float = 1.0) → tuple[bytes, ndarray[tuple[Any, ...], dtype[_ScalarT]]]

Create a WAV file in memory for benchmarking.

Parameters:
  • sample_rate – Sample rate in Hz

  • num_channels – Number of audio channels

  • bits_per_sample – Bits per sample (16 or 32)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (WAV file as bytes, audio samples array)
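
For readers without NumPy or SciPy at hand, the signal layout produced by create_wav_data can be approximated with the standard library. This is only a sketch: make_sine_wav is a hypothetical helper, it supports 16-bit samples only, and its time axis uses n / sample_rate, whereas the original uses np.linspace with the endpoint included, so the sample values differ slightly.

```python
import io
import math
import struct
import wave


def make_sine_wav(
    sample_rate: int = 44100,
    num_channels: int = 2,
    duration_seconds: float = 1.0,
) -> bytes:
    """Write interleaved 16-bit sine waves, one frequency per channel."""
    num_samples = int(sample_rate * duration_seconds)
    # Same frequency layout as create_wav_data: 440 Hz, 550 Hz, 660 Hz, ...
    frequencies = [440.0 + 110.0 * ch for ch in range(num_channels)]
    frames = bytearray()
    for n in range(num_samples):
        t = n / sample_rate
        # A frame holds one sample per channel, stored back to back.
        for freq in frequencies:
            value = int(32767 * math.sin(2 * math.pi * freq * t))
            frames += struct.pack("<h", value)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(num_channels)
        writer.setsampwidth(2)  # 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(bytes(frames))
    return buf.getvalue()


wav_data = make_sine_wav(sample_rate=8000, num_channels=2, duration_seconds=0.5)
with wave.open(io.BytesIO(wav_data), "rb") as reader:
    print(reader.getnchannels(), reader.getframerate(), reader.getnframes())
```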

load_sf(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using soundfile library.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as int16 numpy array

load_spdl_audio(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_audio() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

load_spdl_wav(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_wav() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

main() → None

Run comprehensive benchmark suite for WAV loading performance.

Benchmarks multiple configurations of audio files with different durations, comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries across various thread counts (1, 2, 4, 8, 16).

Classes

class BenchmarkConfig

Configuration for a single WAV benchmark run.

Combines both audio file parameters and benchmark execution parameters.

bits_per_sample: int

Bit depth per sample (16 or 32)

duration_seconds: float

Duration of the audio file in seconds

function: Callable[[bytes], ndarray[tuple[Any, ...], dtype[_ScalarT]]]

The actual function to benchmark

function_name: str

Name of the function being tested

iterations: int

Number of iterations per run

num_channels: int

Number of audio channels

num_runs: int

Number of runs for statistical analysis

num_threads: int

Number of concurrent threads

sample_rate: int

Audio sample rate in Hz