Skip to main content

Design Decisions

NumPy vs PyTorch Tensors

PyMomentum mixes NumPy arrays and PyTorch tensors throughout the codebase. This explains the reasoning and current status.

Why the Mixed Approach?

Old Reason (Historical):

  • Auto-conversion from Eigen types to NumPy arrays works natively in pybind11
  • Can use Python buffer interface to wrap data without copies
  • PyTorch tensors are "more painful to work with" in pybind11
  • Used PyTorch only where differentiability was needed

Internal Convention:

  • PyTorch tensor = differentiable operation
  • NumPy array = non-differentiable operation

New Reason (Current): The dependence on aten makes building Python code that uses torch.Tensor somewhat challenging compared to code that uses the basic buffer interface:

  • Blocking pymomentum usage in downstream projects
  • Preventing demonstration projects
  • Causing various compatibility problems

Current State

The codebase is mixed which can be confusing because we're between approaches. Discussions are happening around:

  • Stripping all PyTorch from pymomentum.geometry
  • Isolating differentiability (and hence PyTorch dependencies) in specific libraries, such as diff_solver
  • Using GPU_character for ML workloads instead

Examples:

  • solver2 uses NumPy arrays (not differentiable, PyTorch-independent)
  • Batching support being removed from non-ML contexts (like rendering)
  • Rendering code will strip batch support for open sourcing

PyTorch advantages being lost:

  • Extra tensor manipulation functionality
  • Batching/unbatching support
  • But interfaces become more confusing for non-ML use

Future Directions

Potential approach: If you need batching + differentiability + GPU support, use GPU_character instead of pymomentum. This would let us:

  • Strip enormous amounts of code from pymomentum
  • Fix build system issues
  • Simplify interfaces

(Obviously could only happen after open sourcing GPU_character)

Current recommendation: Accept manual conversions between NumPy/PyTorch until the architecture stabilizes.