.. _tc_with_pytorch:

Getting Started
===============

We provide integration of Tensor Comprehensions (TC) with PyTorch for both
**training** and **inference**. Using TC with PyTorch, you can express an
operator in Einstein notation and get a fast CUDA implementation for that
layer with just a few lines of code (examples below).

Here are a few cases where TC can be useful:

* specialize your layer for uncommon tensor sizes and get better performance than libraries, *or*
* experiment with layer fusions such as group convolution, ReLU, and FC, *or*
* synthesize new layers and get an efficient kernel automatically, *or*
* synthesize layers for tensors with unconventional memory layouts.

TC makes it easy to synthesize CUDA kernels for these cases and more. By
integrating TC with PyTorch, we hope to make it easy for PyTorch users to
express their operations and to bridge the gap between research and
engineering.

Installation
------------

See the instructions here: :ref:`installation_guide`.

Example
-------

For demonstration purposes, we illustrate a simple :code:`matmul` operation
backed by TC. Note that :code:`B` is declared with shape :code:`(N, K)`, so
the second operand is stored transposed and the reduction :code:`+=!` runs
over the shared :code:`K` dimension.

.. code-block:: python

    import tensor_comprehensions as tc
    import torch

    mm = """
    def matmul(float(M,K) A, float(N,K) B) -> (output) {
        output(m, n) +=! A(m, r_k) * B(n, r_k)
    }
    """
    TC = tc.define(mm, tc.make_naive_options_factory())
    # B is (N, K) = (5, 4): it holds the transpose of the usual (K, N) operand
    A, B = torch.randn(3, 4).cuda(), torch.randn(5, 4).cuda()
    C = TC.matmul(A, B)

With a few lines of code, you can get a functional CUDA implementation for an
operation expressed in TC. Note, however, that this simplest example is not
expected to be fast. Read the documentation to find out more.
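To build intuition for what the Einstein-notation expression computes, the
following is a plain NumPy sketch (it does not use TC or CUDA) of the same
reduction. The shapes and the variable names :code:`M`, :code:`K`, :code:`N`
are illustrative assumptions chosen to mirror the example above; with
:code:`B` stored as :code:`(N, K)`, the TC expression is equivalent to
:code:`A @ B.T`.

.. code-block:: python

    import numpy as np

    # Shapes mirror the TC example: A is (M, K), B is (N, K) (second operand transposed)
    M, K, N = 3, 4, 5
    rng = np.random.default_rng(0)
    A = rng.standard_normal((M, K))
    B = rng.standard_normal((N, K))

    # output(m, n) +=! A(m, r_k) * B(n, r_k): sum over the shared K dimension
    output = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            for r_k in range(K):
                output[m, n] += A[m, r_k] * B[n, r_k]

    # The same computation written directly in Einstein notation via einsum,
    # and as an ordinary matrix product against the transpose of B
    assert np.allclose(output, np.einsum('mk,nk->mn', A, B))
    assert np.allclose(output, A @ B.T)

This is purely for understanding the notation; the TC version above generates
an actual CUDA kernel rather than looping on the host.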