
Photoreal Scene Reconstruction

Overview

Photoreal Scene Reconstruction is a system for reconstructing photorealistic 3D scenes from recordings captured by an egocentric device. Off-the-shelf Gaussian Splatting reconstruction pipelines take videos as input and process them through structure-from-motion. This system instead introduces two major innovations that greatly improve reconstruction quality:

  • Visual-inertial bundle adjustment (VIBA): The mainstream approach treats an RGB camera as a frame-rate camera. VIBA instead calibrates the precise timestamps and motion of the RGB camera in a high-frequency trajectory format. This lets the system precisely model the online RGB camera calibration and the pixel movements of a rolling-shutter camera. The VIBA input comes from Project Aria’s Machine Perception Services.

  • Gaussian Splatting model: This physical image formation model, based on the Gaussian Splatting algorithm, accounts for sensor characteristics, including the rolling-shutter effect of RGB cameras and the dynamic ranges measured by the sensors. The formulation can also be applied to other rasterization-based techniques.
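
To make the rolling-shutter point concrete, here is a minimal sketch of how a high-frequency trajectory can supply a separate camera position for each image row. It is not the project's implementation. The function names, the constant-rate top-to-bottom readout, the 1 kHz sampling, and the 16 ms readout time are all assumptions chosen for illustration. Only positions are interpolated; a real pipeline would also interpolate orientation.

```python
from bisect import bisect_right


def row_timestamps(t_start_ns, readout_ns, num_rows):
    """Capture time of each image row (hypothetical helper).

    Assumes the sensor reads rows at a constant rate, from the first row
    at t_start_ns to the last row at t_start_ns + readout_ns.
    """
    if num_rows == 1:
        return [t_start_ns]
    step = readout_ns / (num_rows - 1)
    return [t_start_ns + row * step for row in range(num_rows)]


def interpolate_position(times_ns, positions, t_ns):
    """Linearly interpolate a 3D position at time t_ns (hypothetical helper).

    times_ns must be sorted in increasing order. Queries outside the
    trajectory are clamped to the first or last sample.
    """
    if t_ns <= times_ns[0]:
        return positions[0]
    if t_ns >= times_ns[-1]:
        return positions[-1]
    i = bisect_right(times_ns, t_ns) - 1  # last sample at or before t_ns
    t0, t1 = times_ns[i], times_ns[i + 1]
    w = (t_ns - t0) / (t1 - t0)
    return tuple(a + w * (b - a) for a, b in zip(positions[i], positions[i + 1]))


# Assumed trajectory: one sample per millisecond, camera moving along x at 1 m/s.
times_ns = [k * 1_000_000 for k in range(21)]
positions = [(k * 0.001, 0.0, 0.0) for k in range(21)]

# Assumed image: 5 rows read out over 16 ms, starting at t = 0.
for row, t in enumerate(row_timestamps(0, 16_000_000, 5)):
    x, y, z = interpolate_position(times_ns, positions, t)
    print(f"row {row}: t = {t / 1e6:.1f} ms, x = {x:.4f} m")
```

Under these assumptions, each row gets its own position rather than sharing one position for the whole frame, which is the distinction the VIBA bullet draws against the frame-rate camera treatment.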

The system includes comprehensive guidelines for using data recorded by Aria Gen 1 devices, covering how to preprocess the recordings and reconstruct them using different variations of the Gaussian Splatting algorithm. The provided examples reconstruct scenes using RGB sensors, SLAM cameras, and all cameras combined.

For more details on this method, check out the GitHub repo.

References