spdl.io.cuda_config
- cuda_config(device_index: int, stream: int = 2, allocator: tuple[Callable[[int, int, int], int], Callable[[int], None]] | None = None) → CUDAConfig
Specify the CUDA device and memory management.
- Parameters:
device_index (int) – The device to move the data to.
stream (int) –
Optional: Pointer to a custom CUDA stream. The value corresponds to a uintptr_t of the CUDA API.
By default, SPDL uses the per-thread default stream (cudaStreamPerThread, value 0x2). To use the legacy default stream, explicitly pass stream=0. To use a custom stream, pass the stream's uintptr_t value.
Using PyTorch CUDA stream
You can create a custom CUDA stream and use it with SPDL:
>>> import spdl.io
>>> import torch
>>> stream = torch.cuda.Stream()  # Creates a new CUDA stream
>>> cuda_stream = stream.cuda_stream  # Get the stream handle
>>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
To use PyTorch’s current stream (the stream PyTorch operations use by default):
>>> import spdl.io
>>> import torch
>>> current_stream = torch.cuda.current_stream()
>>> cuda_stream = current_stream.cuda_stream
>>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
Warning
Using the same stream that the model is running on might introduce undesired synchronization between SPDL operations and model operations.
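For reference, a minimal sketch of selecting the legacy default stream instead of the per-thread default, as described above (no custom stream object is involved):
>>> import spdl.io
>>> # Explicitly select CUDA's legacy default stream
>>> cuda_config = spdl.io.cuda_config(device_index=0, stream=0)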
allocator (tuple[Callable[[int, int, int], int], Callable[[int], None]]) –
Optional: A pair of custom CUDA memory allocator and deleter functions.
Allocator
The allocator function takes the following arguments and returns the address of the allocated memory.
- Size: int
- CUDA device index: int
- CUDA stream address: int (uintptr_t)
Deleter
The deleter takes the address of the memory allocated by the allocator and frees it.
Examples of such functions are PyTorch’s caching_allocator_alloc() and caching_allocator_delete().
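For illustration, here is a minimal sketch of wiring those PyTorch functions into cuda_config. The wrapper signatures follow the argument order listed above; passing the raw stream address through to torch.cuda.caching_allocator_alloc() is an assumption, so adjust for your PyTorch version if needed.
>>> import spdl.io
>>> import torch
>>>
>>> def allocator(size: int, device_index: int, stream: int) -> int:
...     # Allocate from PyTorch's caching allocator and return the raw address.
...     # Forwarding the raw stream address is an assumption; some versions may
...     # expect a torch.cuda.Stream object instead.
...     return torch.cuda.caching_allocator_alloc(size, device=device_index, stream=stream)
...
>>> def deleter(ptr: int) -> None:
...     # Return the memory to PyTorch's caching allocator.
...     torch.cuda.caching_allocator_delete(ptr)
...
>>> cuda_config = spdl.io.cuda_config(device_index=0, allocator=(allocator, deleter))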