spdl.io.cuda_config

cuda_config(device_index: int, stream: int = 2, allocator: tuple[Callable[[int, int, int], int], Callable[[int], None]] | None = None) → CUDAConfig

Specify the CUDA device, stream, and memory management options.

Parameters:
  • device_index (int) – The index of the CUDA device to move the data to.

  • stream (int) –

    Optional: Pointer to a custom CUDA stream, given as the uintptr_t value of the underlying CUDA stream handle.

    By default, SPDL uses the per-thread default stream (cudaStreamPerThread, value 0x2). To use the legacy default stream, explicitly pass stream=0. To use a custom stream, pass the stream’s uintptr_t value.
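
    For example, a minimal sketch of explicitly selecting the legacy default stream (any operation that consumes the config is omitted):

    >>> import spdl.io
    >>> # stream=0 selects the legacy default stream instead of cudaStreamPerThread.
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=0)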

    Using PyTorch CUDA stream

    You can create a custom CUDA stream and use it with SPDL:

    >>> import spdl.io
    >>> import torch
    >>> stream = torch.cuda.Stream()  # Create a new CUDA stream
    >>> cuda_stream = stream.cuda_stream  # Get the raw stream handle
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
    

    To use PyTorch’s current stream (the stream PyTorch operations use by default):

    >>> import spdl.io
    >>> import torch
    >>> current_stream = torch.cuda.current_stream()
    >>> cuda_stream = current_stream.cuda_stream  # Get the raw stream handle
    >>> cuda_config = spdl.io.cuda_config(device_index=0, stream=cuda_stream)
    

    Warning

    Using the same stream that the model is running on might introduce undesired synchronization between SPDL operations and model operations.

  • allocator (tuple[Callable[[int, int, int], int], Callable[[int], None]]) –

    Optional: A pair of custom CUDA memory allocator and deleter functions.

    Allocator

    The allocator function takes the following arguments and returns the address of the allocated memory.

    • Size: int

    • CUDA device index: int

    • CUDA stream address: int (uintptr_t)

    Deleter

    The deleter takes the address of memory allocated by the allocator and frees the memory.

    An example of such a pair is PyTorch’s caching_allocator_alloc() and caching_allocator_delete().
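
    For example, a minimal sketch wiring the allocator to PyTorch’s caching allocator (the wrapper functions below are illustrative; they only adapt the argument order described above):

    >>> import spdl.io
    >>> import torch
    >>>
    >>> def allocator(size: int, device_index: int, stream: int) -> int:
    ...     # Allocate `size` bytes from PyTorch's caching allocator on the given device and stream.
    ...     return torch.cuda.caching_allocator_alloc(size, device=device_index, stream=stream)
    ...
    >>> def deleter(ptr: int) -> None:
    ...     # Return memory previously obtained from the caching allocator.
    ...     torch.cuda.caching_allocator_delete(ptr)
    ...
    >>> cuda_config = spdl.io.cuda_config(device_index=0, allocator=(allocator, deleter))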