Benchmark wav

This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

  • spdl.io.load_wav(): Fast native WAV parser optimized for simple PCM formats

  • spdl.io.load_audio(): General-purpose audio loader using FFmpeg backend

  • soundfile (libsndfile): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

  • Various audio configurations (sample rates, channels, bit depths, durations)

  • Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling

  • Statistical analysis with 95% confidence intervals using Student’s t-distribution

  • Queries per second (QPS) as the primary performance metric
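The confidence-interval step can be sketched with the standard library alone. This is a minimal illustration, not the benchmark's own code (which calls `scipy.stats.t.interval`): the helper name `qps_confidence_interval` is hypothetical, the critical value is hard-coded for five samples (four degrees of freedom), and the sample numbers are placeholders rather than measurements.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t-distribution for 4 degrees of
# freedom (5 samples). Hard-coded so the sketch needs no third-party package.
T_CRIT_95_DF4 = 2.776


def qps_confidence_interval(qps_samples: list[float]) -> tuple[float, float, float]:
    """Return (mean, ci_lower, ci_upper) for exactly five QPS samples."""
    if len(qps_samples) != 5:
        raise ValueError("this sketch hard-codes the critical value for 5 samples")
    mean = statistics.fmean(qps_samples)
    # Sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(qps_samples)
    # Half-width of the interval: t * standard error of the mean.
    half_width = T_CRIT_95_DF4 * std / math.sqrt(len(qps_samples))
    return mean, mean - half_width, mean + half_width


# Placeholder numbers for illustration only, not measured results.
mean, lower, upper = qps_confidence_interval([98.0, 101.0, 100.0, 102.0, 99.0])
print(f"QPS = {mean:.2f} (95% CI: {lower:.2f} to {upper:.2f})")
```

With only a handful of sets the t-distribution gives a wider interval than a normal approximation would, which is presumably why the benchmark uses it.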

Example

$ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
# Plot results
$ python plot_wav_benchmark.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
# Plot results without load_wav
$ python plot_wav_benchmark.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

Result

The following plot shows the QPS (measured by the number of files processed) of each function for different audio durations.

../_static/data/example-benchmark-wav.webp

spdl.io.load_wav() is much faster than the others because all it does is reinterpret the input byte string as an array. Its throughput stays the same for longer audio.
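To make "reinterpret the byte string as an array" concrete, here is a small standard-library sketch. It is not SPDL's implementation. It assumes a canonical 44-byte PCM header (which is what Python's `wave` module writes), 16-bit samples, and a little-endian host; real WAV files can carry extra chunks, so a fixed offset is not a general parser.

```python
import io
import wave

# Assumption: canonical RIFF/WAVE header with no extra chunks.
HEADER_SIZE = 44

# Build a tiny 16-bit stereo WAV in memory. Each tuple is one (left, right) frame.
frames = [(0, 0), (1000, -1000), (2000, -2000)]
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)  # 2 bytes = 16 bits per sample
    writer.setframerate(44100)
    writer.writeframes(
        b"".join(
            left.to_bytes(2, "little", signed=True)
            + right.to_bytes(2, "little", signed=True)
            for left, right in frames
        )
    )
wav_bytes = buffer.getvalue()

# Skip the header and view the payload as a (frames, channels) int16 array.
# memoryview.cast does not copy the samples; it only changes how the existing
# bytes are interpreted. "h" is the native 16-bit signed integer format.
view = memoryview(wav_bytes)[HEADER_SIZE:].cast("h", shape=[len(frames), 2])
print(view.tolist())
```

Because no sample is touched, the cost of this step does not depend on the length of the audio, which matches the flat throughput described above.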

Because parsing the WAV data is nearly instant, spdl.io.load_wav() spends more of its time creating the NumPy array than parsing. That step needs to acquire the GIL, so the performance does not scale with the number of threads. (The performance pattern of this function is much the same as that of spdl.io.load_npz.)

The following is the same plot without load_wav.

../_static/data/example-benchmark-wav-2.webp

soundfile (libsndfile) has to process data iteratively (through io.BytesIO) because it does not support loading directly from a byte string, so it takes longer to process longer audio data. The performance trend (a single thread being the fastest) suggests that it does not release the GIL for the majority of the time.

The spdl.io.load_audio() function (the generic FFmpeg-based implementation) does much more work, so its overall throughput is lower, but it scales with the number of threads because it releases the GIL almost entirely.
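The effect of GIL handling on thread scaling can be reproduced with a toy experiment. The sketch below is an illustration under assumptions, not part of the benchmark: `measure_qps`, `holds_gil`, and `releases_gil` are hypothetical names, and `time.sleep` merely stands in for native code that releases the GIL while it works.

```python
import time
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor


def measure_qps(func: Callable[[], object], num_threads: int, iterations: int) -> float:
    """Run `func` `iterations` times on a thread pool and return calls per second."""
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        t0 = time.perf_counter()
        futures = [executor.submit(func) for _ in range(iterations)]
        for future in futures:
            future.result()
        elapsed = time.perf_counter() - t0
    return iterations / elapsed


def holds_gil() -> int:
    # Pure-Python arithmetic keeps the GIL on a regular (non-free-threaded) build,
    # so extra threads cannot run it in parallel.
    return sum(i * i for i in range(20_000))


def releases_gil() -> None:
    # time.sleep releases the GIL while waiting, so several threads overlap.
    time.sleep(0.005)


results: dict[str, tuple[float, float]] = {}
for name, func in [("holds_gil", holds_gil), ("releases_gil", releases_gil)]:
    qps_1 = measure_qps(func, num_threads=1, iterations=40)
    qps_4 = measure_qps(func, num_threads=4, iterations=40)
    results[name] = (qps_1, qps_4)
    print(f"{name}: 1 thread = {qps_1:.0f} QPS, 4 threads = {qps_4:.0f} QPS")
```

On a regular CPython build, `releases_gil` should gain throughput with four threads while `holds_gil` should stay roughly flat or get slower, mirroring the contrast between load_audio and the other two loaders.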

Source

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict

"""This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

- :py:func:`spdl.io.load_wav`: Fast native WAV parser optimized for simple PCM formats
- :py:func:`spdl.io.load_audio`: General-purpose audio loader using FFmpeg backend
- ``soundfile`` (``libsndfile``): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

- Various audio configurations (sample rates, channels, bit depths, durations)
- Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling
- Statistical analysis with 95% confidence intervals using Student's t-distribution
- Queries per second (QPS) as the primary performance metric

**Example**

.. code-block:: shell

   $ numactl --membind 0 --cpubind 0 python benchmark_wav.py --output wav_benchmark_results.csv
   # Plot results
   $ python plot_wav_benchmark.py --input wav_benchmark_results.csv --output wav_benchmark_plot.png
   # Plot results without load_wav
   $ python plot_wav_benchmark.py --input wav_benchmark_results.csv --output wav_benchmark_plot_2.png --filter '3. spdl.io.load_wav'

**Result**

The following plot shows the QPS (measured by the number of files processed) of each
function for different audio durations.

.. image:: ../../_static/data/example-benchmark-wav.webp


:py:func:`spdl.io.load_wav` is much faster than the others because all it
does is reinterpret the input byte string as an array.
Its throughput stays the same for longer audio.

Because parsing the WAV data is nearly instant, :py:func:`spdl.io.load_wav` spends
more of its time creating the NumPy array than parsing.
That step needs to acquire the GIL, so the performance does not scale with the
number of threads.
(The performance pattern of this function is much the same as that of
:ref:`spdl.io.load_npz <example-data-formats>`.)

The following is the same plot without ``load_wav``.

.. image:: ../../_static/data/example-benchmark-wav-2.webp

``soundfile`` (``libsndfile``) has to process data iteratively (through ``io.BytesIO``)
because it does not support loading directly from a byte string, so it takes longer
to process longer audio data.
The performance trend (a single thread being the fastest) suggests that
it does not release the GIL for the majority of the time.

The :py:func:`spdl.io.load_audio` function (the generic FFmpeg-based implementation) does
much more work, so its overall throughput is lower,
but it scales with the number of threads because it releases the GIL almost entirely.
"""

__all__ = [
    "BenchmarkResult",
    "BenchmarkConfig",
    "create_wav_data",
    "load_sf",
    "load_spdl_audio",
    "load_spdl_wav",
    "benchmark",
    "run_benchmark_suite",
    "save_results_to_csv",
    "main",
]

import argparse
import csv
import io
import sys
import time
from collections.abc import Callable
from concurrent.futures import as_completed, ThreadPoolExecutor
from dataclasses import dataclass

import numpy as np
import scipy.io.wavfile
import scipy.stats
import soundfile as sf
import spdl.io
from numpy.typing import NDArray


def _get_python_info() -> tuple[str, bool]:
    """Get Python version and free-threaded ABI information.

    Returns:
        Tuple of (python_version, is_free_threaded)
    """
    python_version = (
        f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"
    )
    # Check if Python is running with free-threaded ABI (PEP 703)
    # _is_gil_enabled is only available in Python 3.13+
    try:
        # _is_gil_enabled() returns True while the GIL is active,
        # so the interpreter is free-threaded when it returns False.
        is_free_threaded = not sys._is_gil_enabled()  # pyre-ignore[16]
    except AttributeError:
        is_free_threaded = False
    return python_version, is_free_threaded


def create_wav_data(
    sample_rate: int = 44100,
    num_channels: int = 2,
    bits_per_sample: int = 16,
    duration_seconds: float = 1.0,
) -> tuple[bytes, NDArray]:
    """Create a WAV file in memory for benchmarking.

    Args:
        sample_rate: Sample rate in Hz
        num_channels: Number of audio channels
        bits_per_sample: Bits per sample (16 or 32)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (WAV file as bytes, audio samples array)
    """
    num_samples = int(sample_rate * duration_seconds)

    dtype_map = {
        16: np.int16,
        32: np.int32,
    }
    dtype = dtype_map[bits_per_sample]
    max_amplitude = 32767 if bits_per_sample == 16 else 2147483647

    # One sine wave per channel: 440 Hz, 550 Hz, 660 Hz, ...
    t = np.linspace(0, duration_seconds, num_samples)
    frequencies = 440.0 + np.arange(num_channels) * 110.0
    sine_waves = np.sin(2 * np.pi * frequencies[:, np.newaxis] * t)
    # Transpose to (num_samples, num_channels), the layout scipy expects
    samples = (sine_waves.T * max_amplitude).astype(dtype)

    wav_buffer = io.BytesIO()
    scipy.io.wavfile.write(wav_buffer, sample_rate, samples)
    wav_data = wav_buffer.getvalue()

    return wav_data, samples


def load_sf(wav_data: bytes) -> NDArray:
    """Load WAV data using soundfile library.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as int16 numpy array
    """
    audio_file = io.BytesIO(wav_data)
    data, _ = sf.read(audio_file, dtype="int16")
    return data


def load_spdl_audio(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_audio` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_audio(wav_data, filter_desc=None))


def load_spdl_wav(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_wav` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_wav(wav_data))


@dataclass(frozen=True)
class BenchmarkResult:
    """Results from a single benchmark run."""

    duration: float
    qps: float
    ci_lower: float
    ci_upper: float
    num_threads: int
    function_name: str
    duration_seconds: float
    python_version: str
    free_threaded: bool


def benchmark(
    name: str,
    func: Callable[[], NDArray],
    iterations: int,
    num_threads: int,
    num_sets: int,
    duration_seconds: float,
) -> tuple[BenchmarkResult, NDArray]:
    """Benchmark a function using multiple threads and calculate statistics.

    Executes a warmup phase followed by multiple benchmark sets to compute
    performance metrics including mean queries per second (QPS) and 95%
    confidence intervals using Student's t-distribution.

    Args:
        name: Descriptive name for the benchmark (used in results)
        func: Callable function to benchmark (takes no args, returns NDArray)
        iterations: Total number of function calls per benchmark set
        num_threads: Number of concurrent threads for parallel execution
        num_sets: Number of independent benchmark sets for confidence interval
        duration_seconds: Duration of audio file being processed (for metadata)

    Returns:
        Tuple containing:
            - BenchmarkResult with timing statistics, QPS, confidence intervals
            - Output NDArray from the last function execution
    """

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # Warmup
        futures = [executor.submit(func) for _ in range(num_threads * 30)]
        for future in as_completed(futures):
            output = future.result()

        # Run multiple sets for confidence interval
        qps_samples = []
        for _ in range(num_sets):
            t0 = time.perf_counter()
            futures = [executor.submit(func) for _ in range(iterations)]
            for future in as_completed(futures):
                output = future.result()
            elapsed = time.perf_counter() - t0
            qps_samples.append(iterations / elapsed)

    # Calculate mean and 95% confidence interval
    qps_mean = np.mean(qps_samples)
    qps_std = np.std(qps_samples, ddof=1)
    confidence_level = 0.95
    degrees_freedom = num_sets - 1
    confidence_interval = scipy.stats.t.interval(
        confidence_level,
        degrees_freedom,
        loc=qps_mean,
        scale=qps_std / np.sqrt(num_sets),
    )

    # Mean time per call in seconds
    duration = 1.0 / qps_mean
    python_version, free_threaded = _get_python_info()
    result = BenchmarkResult(
        duration=duration,
        qps=qps_mean,
        ci_lower=float(confidence_interval[0]),
        ci_upper=float(confidence_interval[1]),
        num_threads=num_threads,
        function_name=name,
        duration_seconds=duration_seconds,
        python_version=python_version,
        free_threaded=free_threaded,
    )
    return result, output  # pyre-ignore[61]


def run_benchmark_suite(
    wav_data: bytes,
    ref: NDArray,
    num_threads: int,
    duration_seconds: float,
) -> tuple[BenchmarkResult, BenchmarkResult, BenchmarkResult]:
    """Run benchmarks for all three loaders with given parameters.

    Args:
        wav_data: WAV file data as bytes
        ref: Reference audio array for validation
        num_threads: Number of threads (use 1 for single-threaded)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (spdl_wav_result, spdl_audio_result, soundfile_result)
    """
    # load_wav is fast but the performance is unstable, so we need to run more
    iterations = 100 * num_threads
    num_sets = 100

    spdl_wav_result, output = benchmark(
        name="3. spdl.io.load_wav",
        func=lambda: load_spdl_wav(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    np.testing.assert_array_equal(output, ref)

    # others are slow but the performance is stable.
    iterations = 10 * num_threads
    num_sets = 5

    spdl_audio_result, output = benchmark(
        name="2. spdl.io.load_audio",
        func=lambda: load_spdl_audio(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    np.testing.assert_array_equal(output, ref)
    soundfile_result, output = benchmark(
        name="1. soundfile",
        func=lambda: load_sf(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    # soundfile returns a 1D array for mono audio; add the channel axis
    if output.ndim == 1:
        output = output[:, None]
    np.testing.assert_array_equal(output, ref)

    return spdl_wav_result, spdl_audio_result, soundfile_result


@dataclass(frozen=True)
class BenchmarkConfig:
    """Configuration for audio file parameters used in benchmarking.

    Attributes:
        sample_rate: Audio sample rate in Hz (e.g., 44100 for CD quality)
        num_channels: Number of audio channels (1=mono, 2=stereo, etc.)
        bits_per_sample: Bit depth per sample (16 or 32)
        duration_seconds: Duration of the audio file in seconds
    """

    sample_rate: int
    num_channels: int
    bits_per_sample: int
    duration_seconds: float


def save_results_to_csv(
    results: list[BenchmarkResult], output_file: str = "benchmark_results.csv"
) -> None:
    """Save benchmark results to a CSV file that Excel can open.

    Args:
        results: List of BenchmarkResult objects containing benchmark data
        output_file: Output file path for the CSV file
    """
    with open(output_file, "w", newline="") as csvfile:
        fieldnames = [
            "function_name",
            "duration_seconds",
            "num_threads",
            "qps",
            "ci_lower",
            "ci_upper",
            "duration",
            "python_version",
            "free_threaded",
        ]
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

        writer.writeheader()
        for r in results:
            writer.writerow(
                {
                    "function_name": r.function_name,
                    "duration_seconds": r.duration_seconds,
                    "num_threads": r.num_threads,
                    "qps": r.qps,
                    "ci_lower": r.ci_lower,
                    "ci_upper": r.ci_upper,
                    "duration": r.duration,
                    "python_version": r.python_version,
                    "free_threaded": r.free_threaded,
                }
            )
    print(f"Results saved to {output_file}")


def _parse_args() -> argparse.Namespace:
    """Parse command line arguments for the benchmark script.

    Returns:
        Parsed command line arguments
    """
    parser = argparse.ArgumentParser(description="Benchmark WAV loading performance")
    parser.add_argument(
        "--output",
        type=str,
        default="wav_benchmark_results.csv",
        help="Output file path.",
    )
    return parser.parse_args()


def main() -> None:
    """Run comprehensive benchmark suite for WAV loading performance.

    Benchmarks multiple configurations of audio files with different durations,
    comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries
    across various thread counts (1, 2, 4, 8, 16).
    """
    args = _parse_args()

    benchmark_configs = [
        # (sample_rate, num_channels, bits_per_sample, duration_seconds)
        # BenchmarkConfig(8000, 1, 16, 1.0),  # Low quality mono
        # BenchmarkConfig(16000, 1, 16, 1.0),  # Speech quality mono
        # BenchmarkConfig(48000, 2, 16, 1.0),  # High quality stereo
        # BenchmarkConfig(48000, 8, 16, 1.0),  # Multi-channel audio
        BenchmarkConfig(44100, 2, 16, 1.0),  # CD quality stereo
        BenchmarkConfig(44100, 2, 16, 10.0),  #
        BenchmarkConfig(44100, 2, 16, 60.0),  #
        # (44100, 2, 24, 1.0, 100),  # 24-bit audio
    ]

    results: list[BenchmarkResult] = []

    for cfg in benchmark_configs:
        print(cfg)
        wav_data, ref = create_wav_data(
            sample_rate=cfg.sample_rate,
            num_channels=cfg.num_channels,
            bits_per_sample=cfg.bits_per_sample,
            duration_seconds=cfg.duration_seconds,
        )
        print(
            f"Threads,"
            f"SPDL WAV QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper,"
            f"SPDL Audio QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper,"
            f"soundfile QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper"
        )
        for num_threads in [1, 2, 4, 8, 16]:
            spdl_wav_result, spdl_audio_result, soundfile_result = run_benchmark_suite(
                wav_data,
                ref,
                num_threads=num_threads,
                duration_seconds=cfg.duration_seconds,
            )
            results.extend([spdl_wav_result, spdl_audio_result, soundfile_result])
            print(
                f"{num_threads},"
                f"{spdl_wav_result.qps:.2f},{spdl_wav_result.ci_lower:.2f},{spdl_wav_result.ci_upper:.2f},"
                f"{spdl_audio_result.qps:.2f},{spdl_audio_result.ci_lower:.2f},{spdl_audio_result.ci_upper:.2f},"
                f"{soundfile_result.qps:.2f},{soundfile_result.ci_lower:.2f},{soundfile_result.ci_upper:.2f}"
            )

    save_results_to_csv(results, args.output)
    print(
        f"\nBenchmark complete. To generate plots, run:\n"
        f"python plot_wav_benchmark.py --input {args.output} --output {args.output.replace('.csv', '.png')}"
    )


if __name__ == "__main__":
    main()

Functions

create_wav_data(sample_rate: int = 44100, num_channels: int = 2, bits_per_sample: int = 16, duration_seconds: float = 1.0) → tuple[bytes, NDArray]

Create a WAV file in memory for benchmarking.

Parameters:
  • sample_rate – Sample rate in Hz

  • num_channels – Number of audio channels

  • bits_per_sample – Bits per sample (16 or 32)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (WAV file as bytes, audio samples array)

load_sf(wav_data: bytes) → NDArray

Load WAV data using soundfile library.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as int16 numpy array

load_spdl_audio(wav_data: bytes) → NDArray

Load WAV data using spdl.io.load_audio() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

load_spdl_wav(wav_data: bytes) → NDArray

Load WAV data using spdl.io.load_wav() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

benchmark(name: str, func: Callable[[], NDArray], iterations: int, num_threads: int, num_sets: int, duration_seconds: float) → tuple[BenchmarkResult, NDArray]

Benchmark a function using multiple threads and calculate statistics.

Executes a warmup phase followed by multiple benchmark sets to compute performance metrics including mean queries per second (QPS) and 95% confidence intervals using Student’s t-distribution.

Parameters:
  • name – Descriptive name for the benchmark (used in results)

  • func – Callable function to benchmark (takes no args, returns NDArray)

  • iterations – Total number of function calls per benchmark set

  • num_threads – Number of concurrent threads for parallel execution

  • num_sets – Number of independent benchmark sets for confidence interval

  • duration_seconds – Duration of audio file being processed (for metadata)

Returns:

Tuple containing:

  • BenchmarkResult with timing statistics, QPS, confidence intervals

  • Output NDArray from the last function execution

run_benchmark_suite(wav_data: bytes, ref: NDArray, num_threads: int, duration_seconds: float) → tuple[BenchmarkResult, BenchmarkResult, BenchmarkResult]

Run benchmarks for all three loaders with given parameters.

Parameters:
  • wav_data – WAV file data as bytes

  • ref – Reference audio array for validation

  • num_threads – Number of threads (use 1 for single-threaded)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (spdl_wav_result, spdl_audio_result, soundfile_result)

save_results_to_csv(results: list[BenchmarkResult], output_file: str = 'benchmark_results.csv') → None

Save benchmark results to a CSV file that Excel can open.

Parameters:
  • results – List of BenchmarkResult objects containing benchmark data

  • output_file – Output file path for the CSV file
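Reading the CSV back is the natural counterpart to saving it. The following sketch round-trips a few rows through an in-memory buffer and drops one function by name, loosely imitating what the plot script's `--filter` option appears to do (that behavior is an assumption; the plot script is not shown here). `Row`, `write_rows`, and `read_rows` are hypothetical names, `Row` keeps only three of the real columns, and the QPS values are placeholders, not measurements.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class Row:
    """Hypothetical, trimmed-down stand-in for BenchmarkResult."""

    function_name: str
    num_threads: int
    qps: float


def write_rows(rows: list[Row], stream: io.TextIOBase) -> None:
    # Derive the header from the dataclass so columns and fields cannot drift apart.
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(Row)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


def read_rows(stream: io.TextIOBase, exclude: str = "") -> list[Row]:
    # CSV stores everything as text, so numeric columns are converted back here.
    rows = []
    for record in csv.DictReader(stream):
        if exclude and record["function_name"] == exclude:
            continue
        rows.append(
            Row(record["function_name"], int(record["num_threads"]), float(record["qps"]))
        )
    return rows


# Placeholder values for illustration only.
sample = [
    Row("1. soundfile", 1, 1.0),
    Row("2. spdl.io.load_audio", 1, 2.0),
    Row("3. spdl.io.load_wav", 1, 3.0),
]
buffer = io.StringIO(newline="")
write_rows(sample, buffer)
buffer.seek(0)
kept = read_rows(buffer, exclude="3. spdl.io.load_wav")
print([row.function_name for row in kept])
```

Deriving the field names from the dataclass is a design choice of this sketch; the benchmark itself lists the columns explicitly, which fixes the column order independently of the class definition.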

main() → None

Run comprehensive benchmark suite for WAV loading performance.

Benchmarks multiple configurations of audio files with different durations, comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile libraries across various thread counts (1, 2, 4, 8, 16).

Classes

class BenchmarkResult(duration: float, qps: float, ci_lower: float, ci_upper: float, num_threads: int, function_name: str, duration_seconds: float, python_version: str, free_threaded: bool)

Results from a single benchmark run.

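ci_lower: float – Lower bound of the 95% confidence interval of the QPS
ci_upper: float – Upper bound of the 95% confidence interval of the QPS
duration: float – Mean time per call in seconds (1 / qps)
duration_seconds: float – Duration of the audio being loaded, in seconds
free_threaded: bool – Whether Python runs with the free-threaded ABI
function_name: str – Name of the benchmarked function
num_threads: int – Number of threads used
python_version: str – Python version used for the run
qps: float – Mean queries per second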

class BenchmarkConfig(sample_rate: int, num_channels: int, bits_per_sample: int, duration_seconds: float)

Configuration for audio file parameters used in benchmarking.

Attributes:
  • sample_rate (int) – Audio sample rate in Hz (e.g., 44100 for CD quality)

  • num_channels (int) – Number of audio channels (1=mono, 2=stereo, etc.)

  • bits_per_sample (int) – Bit depth per sample (16 or 32)

  • duration_seconds (float) – Duration of the audio file in seconds