spdl.io.cuda_config

cuda_config(device_index: int, stream: int = 2, allocator: tuple[Callable[[int, int, int], int], Callable[[int], None]] | None = None) → CUDAConfig

Specify the CUDA device, stream, and memory management options.

Parameters:
  • device_index (int) – The index of the CUDA device to move the data to.

  • stream (int) –

    Optional: Pointer to a custom CUDA stream, given as the uintptr_t value of the underlying CUDA stream handle.

    By default, SPDL uses the per-thread default stream (cudaStreamPerThread, value 0x2). To use the legacy default stream, explicitly pass stream=0. To use a custom stream, pass the stream’s uintptr_t value.
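
    For example, a minimal sketch of explicitly selecting the legacy default stream (any operation that consumes the config is omitted):

    >>> import spdl.io
    >>> # stream=0 selects the legacy default stream instead of cudaStreamPerThread.
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=0)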

    Using PyTorch CUDA stream

    You can create a custom CUDA stream and use it with SPDL:

    >>> import spdl.io
    >>> import torch
    >>> stream = torch.cuda.Stream()  # Create a new CUDA stream
    >>> cuda_stream = stream.cuda_stream  # Get the raw stream handle
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
    

    To use PyTorch’s current stream (the stream PyTorch operations use by default):

    >>> import spdl.io
    >>> import torch
    >>> current_stream = torch.cuda.current_stream()
    >>> cuda_stream = current_stream.cuda_stream  # Get the raw stream handle
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
    

    Warning

    Using the same stream that the model is running on might introduce undesired synchronization between SPDL operations and model operations.

  • allocator (tuple[Callable[[int, int, int], int], Callable[[int], None]]) –

    Optional: A pair of custom CUDA memory allocator and deleter functions.

    Allocator

    The allocator function takes the following arguments and returns the address of the allocated memory.

    • Size: int

    • CUDA device index: int

    • CUDA stream address: int (uintptr_t)

    Deleter

    The deleter takes the address of memory allocated by the allocator and frees the memory.

    An example of such a pair is PyTorch’s caching_allocator_alloc() and caching_allocator_delete().
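
    For example, a minimal sketch wiring the allocator to PyTorch’s caching allocator (the wrapper functions below are illustrative; they only adapt the argument order described above):

    >>> import spdl.io
    >>> import torch
    >>>
    >>> def allocator(size: int, device_index: int, stream: int) -> int:
    ...     # Allocate `size` bytes from PyTorch's caching allocator on the given device and stream.
    ...     return torch.cuda.caching_allocator_alloc(size, device=device_index, stream=stream)
    ...
    >>> def deleter(ptr: int) -> None:
    ...     # Return memory previously obtained from the caching allocator.
    ...     torch.cuda.caching_allocator_delete(ptr)
    ...
    >>> cuda_config = spdl.io.cuda_config(device_index=0, allocator=(allocator, deleter))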