Benchmark wav

This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

  • spdl.io.load_wav(): Fast native WAV parser optimized for simple PCM formats

  • spdl.io.load_audio(): General-purpose audio loader using FFmpeg backend

  • soundfile (libsndfile): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

  • Various audio configurations (sample rates, channels, bit depths, durations)

  • Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling

  • Statistical analysis with 95% confidence intervals using Student’s t-distribution

  • Queries per second (QPS) as the primary performance metric
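The confidence-interval step can be sketched with the standard library alone. This is an illustration rather than the benchmark's own code, which calls scipy.stats.t.interval; the helper name `mean_ci95`, the small table of two-sided 95% critical values, and the sample QPS numbers are assumptions introduced here.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t-distribution, keyed by
# degrees of freedom. Only a few entries are listed for illustration.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 9: 2.262}


def mean_ci95(samples: list[float]) -> tuple[float, float, float]:
    """Return (mean, lower, upper) for a 95% confidence interval of the mean."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean, based on the sample standard deviation (ddof=1).
    sem = statistics.stdev(samples) / math.sqrt(n)
    half_width = T_CRIT_95[n - 1] * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical QPS measurements from five benchmark sets.
mean, lower, upper = mean_ci95([980.0, 1010.0, 1000.0, 990.0, 1020.0])
print(f"{mean:.1f} QPS, 95% CI [{lower:.1f}, {upper:.1f}]")
```

With only a handful of sets, the t-distribution gives a wider interval than a normal approximation would, which is presumably why the benchmark uses it.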

Example

$ python benchmark_wav.py --plot --output results.png

Result

The following plot shows the QPS (the number of files processed per second) of each function for different audio durations.

../_static/data/example-benchmark-wav.webp

spdl.io.load_wav() is much faster than the others because all it does is reinterpret the input byte string as an array. Its performance stays the same for longer audio.
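The idea of reinterpreting bytes as samples can be sketched with the standard library. This is not SPDL's implementation; the helper names `make_wav`, `find_data_chunk`, and `view_pcm16` are hypothetical, and the sketch assumes 16-bit PCM on a little-endian host.

```python
import io
import struct
import wave


def make_wav(samples: list[int], sample_rate: int = 8000) -> bytes:
    """Write 16-bit mono PCM samples to an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def find_data_chunk(wav: bytes) -> tuple[int, int]:
    """Return (offset, size) of the payload of the 'data' chunk."""
    if wav[:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(wav):
        chunk_id, size = struct.unpack_from("<4sI", wav, pos)
        if chunk_id == b"data":
            return pos + 8, size
        pos += 8 + size + (size & 1)  # chunks are padded to an even size
    raise ValueError("no data chunk found")


def view_pcm16(wav: bytes) -> memoryview:
    """Expose the PCM payload as int16 values without copying it."""
    offset, size = find_data_chunk(wav)
    # cast() reinterprets the same memory using the host's native byte order.
    return memoryview(wav)[offset : offset + size].cast("h")


pcm = view_pcm16(make_wav([0, 1000, -1000, 32767]))
print(pcm.tolist())
```

Only the small header is parsed. The samples are never visited one by one, so the cost does not depend on the audio duration.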

Because parsing WAV is nearly instant, spdl.io.load_wav() spends more of its time creating the NumPy array than parsing. Creating the array requires the GIL, so the performance does not scale with the number of threads. (The performance pattern of this function is much the same as that of spdl.io.load_npz.)

The following is the same plot without load_wav.

../_static/data/example-benchmark-wav-2.webp

soundfile (libsndfile) has to process data iteratively (through io.BytesIO) because it does not support loading directly from a byte string, so it takes longer to process longer audio. The performance trend (a single thread being the fastest) suggests that it holds the GIL most of the time.
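As a rough analogy, the sketch below reads a WAV payload block by block through a file-like object, using Python's built-in wave module as a stand-in. soundfile's internals differ; the helper names and the block size of 1024 frames are assumptions chosen for illustration.

```python
import io
import wave


def make_silence_wav(num_frames: int, sample_rate: int = 8000) -> bytes:
    """Create a 16-bit mono WAV file of silence in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(bytes(2 * num_frames))
    return buf.getvalue()


def read_in_blocks(wav: bytes, block_frames: int = 1024) -> tuple[bytes, int]:
    """Read all frames through a file-like object; return (pcm, num_reads)."""
    chunks = []
    with wave.open(io.BytesIO(wav), "rb") as f:
        while True:
            block = f.readframes(block_frames)
            if not block:
                break
            chunks.append(block)
    return b"".join(chunks), len(chunks)


for seconds in (1, 10):
    pcm, num_reads = read_in_blocks(make_silence_wav(8000 * seconds))
    print(f"{seconds}s: {len(pcm)} bytes in {num_reads} reads")
```

The number of read calls grows with the audio duration, which mirrors the duration-dependent cost described above.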

spdl.io.load_audio() (the generic FFmpeg-based implementation) does much more work, so it is slower overall, but it scales with the number of threads because it releases the GIL almost entirely.
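The effect of the GIL on thread scaling can be observed with a small, self-contained harness. This is a sketch only: `measure_qps`, `releases_gil`, and `holds_gil` are hypothetical names, the sleep merely stands in for native code that releases the GIL, and the numbers depend on the machine and on the Python build (a free-threaded build would behave differently).

```python
import time
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor


def measure_qps(func: Callable[[], object], num_threads: int, iterations: int) -> float:
    """Run func `iterations` times on a thread pool and return calls per second."""
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        t0 = time.perf_counter()
        futures = [executor.submit(func) for _ in range(iterations)]
        for future in futures:
            future.result()
        elapsed = time.perf_counter() - t0
    return iterations / elapsed


def releases_gil() -> None:
    time.sleep(0.002)  # sleeping releases the GIL, as native decoding code can


def holds_gil() -> int:
    return sum(i * i for i in range(20_000))  # pure-Python work keeps the GIL


for num_threads in (1, 4):
    qps_a = measure_qps(releases_gil, num_threads, iterations=50 * num_threads)
    qps_b = measure_qps(holds_gil, num_threads, iterations=50 * num_threads)
    print(f"threads={num_threads}: releases_gil={qps_a:.0f} QPS, holds_gil={qps_b:.0f} QPS")
```

On a standard CPython build, the first function would be expected to gain throughput with more threads while the second stays roughly flat.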

Source


# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict

"""This example measures the performance of loading WAV audio.

It compares three different approaches for loading WAV files:

- :py:func:`spdl.io.load_wav`: Fast native WAV parser optimized for simple PCM formats
- :py:func:`spdl.io.load_audio`: General-purpose audio loader using FFmpeg backend
- ``soundfile`` (``libsndfile``): Popular third-party audio I/O library

The benchmark suite evaluates performance across multiple dimensions:

- Various audio configurations (sample rates, channels, bit depths, durations)
- Different thread counts (1, 2, 4, 8, 16) to measure parallel scaling
- Statistical analysis with 95% confidence intervals using Student's t-distribution
- Queries per second (QPS) as the primary performance metric

**Example**

.. code-block:: shell

   $ python benchmark_wav.py --plot --output results.png

**Result**

The following plot shows the QPS (the number of files processed per second) of each
function for different audio durations.

.. image:: ../../_static/data/example-benchmark-wav.webp


:py:func:`spdl.io.load_wav` is much faster than the others because all it
does is reinterpret the input byte string as an array.
Its performance stays the same for longer audio.

Because parsing WAV is nearly instant, :py:func:`spdl.io.load_wav` spends more of
its time creating the NumPy array than parsing.
Creating the array requires the GIL, so the performance does not scale with the
number of threads.
(The performance pattern of this function is much the same as that of
:ref:`spdl.io.load_npz <data-format>`.)

The following is the same plot without ``load_wav``.

.. image:: ../../_static/data/example-benchmark-wav-2.webp

``soundfile`` has to process data iteratively (through ``io.BytesIO``) because
it does not support loading directly from a byte string, so it takes longer to process
longer audio.
The performance trend (a single thread being the fastest) suggests that
it holds the GIL most of the time.

:py:func:`spdl.io.load_audio` (the generic FFmpeg-based implementation) does
much more work, so it is slower overall,
but it scales with the number of threads because it releases the GIL almost entirely.
"""

__all__ = [
    "BenchmarkResult",
    "BenchmarkConfig",
    "create_wav_data",
    "load_sf",
    "load_spdl_audio",
    "load_spdl_wav",
    "benchmark",
    "run_benchmark_suite",
    "plot_benchmark_results",
    "main",
]

import argparse
import io
import time
import wave
from collections.abc import Callable
from concurrent.futures import as_completed, ThreadPoolExecutor
from dataclasses import dataclass

import numpy as np
import scipy.stats
import soundfile as sf
import spdl.io
from numpy.typing import NDArray


def create_wav_data(
    sample_rate: int = 44100,
    num_channels: int = 2,
    bits_per_sample: int = 16,
    duration_seconds: float = 1.0,
) -> tuple[bytes, NDArray]:
    """Create a WAV file in memory for benchmarking.

    Args:
        sample_rate: Sample rate in Hz
        num_channels: Number of audio channels
        bits_per_sample: Bits per sample (16 or 32)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (WAV file as bytes, audio samples array)
    """
    num_samples = int(sample_rate * duration_seconds)

    # Map the bit depth to the matching NumPy integer dtype
    dtype_map = {
        16: np.int16,
        32: np.int32,
    }
    if bits_per_sample not in dtype_map:
        raise ValueError(f"Unsupported bits_per_sample: {bits_per_sample}")
    dtype = dtype_map[bits_per_sample]
    amplitude = np.iinfo(dtype).max  # 32767 for int16, 2147483647 for int32

    # Create audio samples with a simple sine wave pattern
    samples = np.zeros((num_samples, num_channels), dtype=dtype)
    t = np.linspace(0, duration_seconds, num_samples)
    for channel_idx in range(num_channels):
        # 440 Hz (A4) for the first channel, 110 Hz higher for each next one
        frequency = 440.0 + (channel_idx * 110.0)
        sine_wave = np.sin(2 * np.pi * frequency * t)
        samples[:, channel_idx] = (sine_wave * amplitude).astype(dtype)

    # Use Python's built-in wave module to write the WAV file to a memory buffer
    wav_buffer = io.BytesIO()
    with wave.open(wav_buffer, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(bits_per_sample // 8)
        wav_file.setframerate(sample_rate)

        # The array is C-contiguous with shape (num_samples, num_channels),
        # so its raw bytes are already interleaved frame by frame.
        wav_file.writeframes(samples.tobytes())

    wav_data = wav_buffer.getvalue()

    return wav_data, samples


def load_sf(wav_data: bytes) -> NDArray:
    """Load WAV data using soundfile library.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as int16 numpy array
    """
    audio_file = io.BytesIO(wav_data)
    data, _ = sf.read(audio_file, dtype="int16")
    return data


def load_spdl_audio(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_audio` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_audio(wav_data, filter_desc=None))


def load_spdl_wav(wav_data: bytes) -> NDArray:
    """Load WAV data using :py:func:`spdl.io.load_wav` function.

    Args:
        wav_data: WAV file data as bytes

    Returns:
        Audio samples array as numpy array
    """
    return spdl.io.to_numpy(spdl.io.load_wav(wav_data))


@dataclass(frozen=True)
class BenchmarkResult:
    """Results from a single benchmark run."""

    duration: float
    qps: float
    ci_lower: float
    ci_upper: float
    num_threads: int
    function_name: str
    duration_seconds: float


def benchmark(
    name: str,
    func: Callable[[], NDArray],
    iterations: int,
    num_threads: int,
    num_sets: int,
    duration_seconds: float,
) -> tuple[BenchmarkResult, NDArray]:
    """Benchmark a function using multiple threads and calculate statistics.

    Executes a warmup phase followed by multiple benchmark sets to compute
    performance metrics including mean queries per second (QPS) and 95%
    confidence intervals using Student's t-distribution.

    Args:
        name: Descriptive name for the benchmark (used in results)
        func: Callable function to benchmark (takes no args, returns NDArray)
        iterations: Total number of function calls per benchmark set
        num_threads: Number of concurrent threads for parallel execution
        num_sets: Number of independent benchmark sets for confidence interval
        duration_seconds: Duration of audio file being processed (for metadata)

    Returns:
        Tuple containing:
            - BenchmarkResult with timing statistics, QPS, confidence intervals
            - Output NDArray from the last function execution
    """

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # Warmup
        futures = [executor.submit(func) for _ in range(num_threads * 30)]
        for future in as_completed(futures):
            output = future.result()

        # Run multiple sets for confidence interval
        qps_samples = []
        for _ in range(num_sets):
            t0 = time.perf_counter()
            futures = [executor.submit(func) for _ in range(iterations)]
            for future in as_completed(futures):
                output = future.result()
            elapsed = time.perf_counter() - t0
            qps_samples.append(iterations / elapsed)

    # Calculate mean and 95% confidence interval
    qps_mean = np.mean(qps_samples)
    qps_std = np.std(qps_samples, ddof=1)
    confidence_level = 0.95
    degrees_freedom = num_sets - 1
    confidence_interval = scipy.stats.t.interval(
        confidence_level,
        degrees_freedom,
        loc=qps_mean,
        scale=qps_std / np.sqrt(num_sets),
    )

    duration = 1.0 / qps_mean
    result = BenchmarkResult(
        duration=duration,
        qps=qps_mean,
        ci_lower=float(confidence_interval[0]),
        ci_upper=float(confidence_interval[1]),
        num_threads=num_threads,
        function_name=name,
        duration_seconds=duration_seconds,
    )
    return result, output  # pyre-ignore[61]


def run_benchmark_suite(
    wav_data: bytes,
    ref: NDArray,
    num_threads: int,
    duration_seconds: float,
) -> tuple[BenchmarkResult, BenchmarkResult, BenchmarkResult]:
    """Run benchmarks for all three loaders with given parameters.

    Args:
        wav_data: WAV file data as bytes
        ref: Reference audio array for validation
        num_threads: Number of threads (use 1 for single-threaded)
        duration_seconds: Duration of audio in seconds

    Returns:
        Tuple of (spdl_wav_result, spdl_audio_result, soundfile_result)
    """
    # load_wav is fast but its performance is unstable, so we run more iterations
    iterations = 100 * num_threads
    num_sets = 100

    spdl_wav_result, output = benchmark(
        name="spdl.io.load_wav",
        func=lambda: load_spdl_wav(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    np.testing.assert_array_equal(output, ref)

    # The others are slower but their performance is stable.
    iterations = 10 * num_threads
    num_sets = 5

    spdl_audio_result, output = benchmark(
        name="spdl.io.load_audio",
        func=lambda: load_spdl_audio(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    np.testing.assert_array_equal(output, ref)
    soundfile_result, output = benchmark(
        name="soundfile",
        func=lambda: load_sf(wav_data),
        iterations=iterations,
        num_threads=num_threads,
        num_sets=num_sets,
        duration_seconds=duration_seconds,
    )
    # soundfile returns a 1-D array for mono audio; add the channel axis
    if output.ndim == 1:
        output = output[:, None]
    np.testing.assert_array_equal(output, ref)

    return spdl_wav_result, spdl_audio_result, soundfile_result


@dataclass(frozen=True)
class BenchmarkConfig:
    """Configuration for audio file parameters used in benchmarking.

    Attributes:
        sample_rate: Audio sample rate in Hz (e.g., 44100 for CD quality)
        num_channels: Number of audio channels (1=mono, 2=stereo, etc.)
        bits_per_sample: Bit depth per sample (16 or 32)
        duration_seconds: Duration of the audio file in seconds
    """

    sample_rate: int
    num_channels: int
    bits_per_sample: int
    duration_seconds: float


def plot_benchmark_results(
    results: list[BenchmarkResult], output_file: str = "benchmark_results.png"
) -> None:
    """Plot benchmark results and save to file.

    Args:
        results: List of BenchmarkResult objects containing benchmark data
        output_file: Output file path for the saved plot
    """
    import matplotlib

    matplotlib.use("Agg")  # Select the non-interactive backend before importing pyplot

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    data = [
        {
            "num_threads": r.num_threads,
            "qps": r.qps,
            "ci_lower": r.ci_lower,
            "ci_upper": r.ci_upper,
            "function": r.function_name,
            "duration": f"{r.duration_seconds}s",
        }
        for r in results
    ]
    df = pd.DataFrame(data)

    sns.set_theme(style="whitegrid")
    _, ax = plt.subplots(figsize=(12, 6))
    df["label"] = df["function"] + " (" + df["duration"] + ")"
    for label in df["label"].unique():
        subset = df[df["label"] == label].sort_values("num_threads")
        line = ax.plot(
            subset["num_threads"],
            subset["qps"],
            marker="o",
            label=label,
            linewidth=2,
        )

        # Add confidence interval as shaded region
        ax.fill_between(
            subset["num_threads"],
            subset["ci_lower"],
            subset["ci_upper"],
            alpha=0.2,
            color=line[0].get_color(),
        )

    ax.set_xlabel("Number of Threads", fontsize=12)
    ax.set_ylabel("QPS (Queries Per Second)", fontsize=12)
    ax.set_title("WAV Loading Performance Benchmark", fontsize=14, fontweight="bold")
    ax.legend(title="Function", bbox_to_anchor=(1.05, 1), loc="upper left")
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig(output_file, dpi=300, bbox_inches="tight")
    print(f"Plot saved to {output_file}")


def _parse_args() -> argparse.Namespace:
    """Parse command line arguments for the benchmark script.

    Returns:
        Parsed command line arguments
    """
    parser = argparse.ArgumentParser(description="Benchmark WAV loading performance")
    parser.add_argument(
        "--plot",
        action="store_true",
        help="Generate and save a plot of the benchmark results",
    )
    parser.add_argument(
        "--output",
        type=str,
        default="benchmark_results.png",
        help="Output file path for the plot (default: benchmark_results.png)",
    )
    return parser.parse_args()


def main() -> None:
    """Run comprehensive benchmark suite for WAV loading performance.

    Benchmarks multiple configurations of audio files with different durations,
    comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile
    across various thread counts (1, 2, 4, 8, 16).
    """
    args = _parse_args()

    benchmark_configs = [
        # BenchmarkConfig(sample_rate, num_channels, bits_per_sample, duration_seconds)
        # BenchmarkConfig(8000, 1, 16, 1.0),  # Low quality mono
        # BenchmarkConfig(16000, 1, 16, 1.0),  # Speech quality mono
        # BenchmarkConfig(48000, 2, 16, 1.0),  # High quality stereo
        # BenchmarkConfig(48000, 8, 16, 1.0),  # Multi-channel audio
        BenchmarkConfig(44100, 2, 16, 1.0),  # CD quality stereo, 1 second
        BenchmarkConfig(44100, 2, 16, 10.0),  # CD quality stereo, 10 seconds
        BenchmarkConfig(44100, 2, 16, 60.0),  # CD quality stereo, 60 seconds
        # BenchmarkConfig(44100, 2, 24, 1.0),  # 24-bit audio (not supported by create_wav_data)
    ]

    results: list[BenchmarkResult] = []

    for cfg in benchmark_configs:
        print(cfg)
        wav_data, ref = create_wav_data(
            sample_rate=cfg.sample_rate,
            num_channels=cfg.num_channels,
            bits_per_sample=cfg.bits_per_sample,
            duration_seconds=cfg.duration_seconds,
        )
        print(
            f"Threads,"
            f"SPDL WAV QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper,"
            f"SPDL Audio QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper,"
            f"soundfile QPS ({cfg.duration_seconds} sec),CI Lower, CI Upper"
        )
        for num_threads in [1, 2, 4, 8, 16]:
            spdl_wav_result, spdl_audio_result, soundfile_result = run_benchmark_suite(
                wav_data,
                ref,
                num_threads=num_threads,
                duration_seconds=cfg.duration_seconds,
            )
            results.extend([spdl_wav_result, spdl_audio_result, soundfile_result])
            print(
                f"{num_threads},"
                f"{spdl_wav_result.qps:.2f},{spdl_wav_result.ci_lower:.2f},{spdl_wav_result.ci_upper:.2f},"
                f"{spdl_audio_result.qps:.2f},{spdl_audio_result.ci_lower:.2f},{spdl_audio_result.ci_upper:.2f},"
                f"{soundfile_result.qps:.2f},{soundfile_result.ci_lower:.2f},{soundfile_result.ci_upper:.2f}"
            )

    if args.plot:
        plot_benchmark_results(results, args.output)


if __name__ == "__main__":
    main()

Functions


create_wav_data(sample_rate: int = 44100, num_channels: int = 2, bits_per_sample: int = 16, duration_seconds: float = 1.0) → tuple[bytes, ndarray[tuple[Any, ...], dtype[_ScalarT]]]

Create a WAV file in memory for benchmarking.

Parameters:
  • sample_rate – Sample rate in Hz

  • num_channels – Number of audio channels

  • bits_per_sample – Bits per sample (16 or 32)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (WAV file as bytes, audio samples array)

load_sf(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using soundfile library.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as int16 numpy array

load_spdl_audio(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_audio() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

load_spdl_wav(wav_data: bytes) → ndarray[tuple[Any, ...], dtype[_ScalarT]]

Load WAV data using spdl.io.load_wav() function.

Parameters:

wav_data – WAV file data as bytes

Returns:

Audio samples array as numpy array

benchmark(name: str, func: Callable[[], ndarray[tuple[Any, ...], dtype[_ScalarT]]], iterations: int, num_threads: int, num_sets: int, duration_seconds: float) → tuple[BenchmarkResult, ndarray[tuple[Any, ...], dtype[_ScalarT]]]

Benchmark a function using multiple threads and calculate statistics.

Executes a warmup phase followed by multiple benchmark sets to compute performance metrics including mean queries per second (QPS) and 95% confidence intervals using Student’s t-distribution.

Parameters:
  • name – Descriptive name for the benchmark (used in results)

  • func – Callable function to benchmark (takes no args, returns NDArray)

  • iterations – Total number of function calls per benchmark set

  • num_threads – Number of concurrent threads for parallel execution

  • num_sets – Number of independent benchmark sets for confidence interval

  • duration_seconds – Duration of audio file being processed (for metadata)

Returns:

A tuple containing:

  • BenchmarkResult with timing statistics, QPS, and confidence intervals

  • Output NDArray from the last function execution

run_benchmark_suite(wav_data: bytes, ref: ndarray[tuple[Any, ...], dtype[_ScalarT]], num_threads: int, duration_seconds: float) → tuple[BenchmarkResult, BenchmarkResult, BenchmarkResult]

Run benchmarks for all three loaders with given parameters.

Parameters:
  • wav_data – WAV file data as bytes

  • ref – Reference audio array for validation

  • num_threads – Number of threads (use 1 for single-threaded)

  • duration_seconds – Duration of audio in seconds

Returns:

Tuple of (spdl_wav_result, spdl_audio_result, soundfile_result)

plot_benchmark_results(results: list[BenchmarkResult], output_file: str = 'benchmark_results.png') → None

Plot benchmark results and save to file.

Parameters:
  • results – List of BenchmarkResult objects containing benchmark data

  • output_file – Output file path for the saved plot

main() → None

Run comprehensive benchmark suite for WAV loading performance.

Benchmarks multiple configurations of audio files with different durations, comparing spdl.io.load_wav, spdl.io.load_audio, and soundfile across various thread counts (1, 2, 4, 8, 16).

Classes


class BenchmarkResult(duration: float, qps: float, ci_lower: float, ci_upper: float, num_threads: int, function_name: str, duration_seconds: float)

Results from a single benchmark run.

ci_lower: float
ci_upper: float
duration: float
duration_seconds: float
function_name: str
num_threads: int
qps: float

class BenchmarkConfig(sample_rate: int, num_channels: int, bits_per_sample: int, duration_seconds: float)

Configuration for audio file parameters used in benchmarking.

sample_rate

Audio sample rate in Hz (e.g., 44100 for CD quality)

Type:

int

num_channels

Number of audio channels (1=mono, 2=stereo, etc.)

Type:

int

bits_per_sample

Bit depth per sample (16 or 32)

Type:

int

duration_seconds

Duration of the audio file in seconds

Type:

float
