ASE Data Format

This page provides an overview of Aria Synthetic Environments (ASE) data formats and organization.

Using the code snippets and tools listed in Data Tools and Visualization, researchers should be able to quickly onboard this data into ML pipelines.

Overall Data Organization

  • Each scene has its own subdirectory with a unique ID (0-100K)
  • Each scene directory contains separate files and directories for each type of data
<sceneID>
├── rgb
│ └── vignette0000000.jpg
│ └── vignette0000001.jpg
│ ...
│ └── vignette0xxn.jpg
├── depth
│ └── depth0000000.png
│ └── depth0000001.png
│ ...
│ └── depth0xxn.png
├── instances
│ └── instance0000000.png
│ └── instance0000001.png
│ ...
│ └── instance0xxn.png
├── ase_scene_language.txt
├── trajectory.txt
├── semidense_points.csv.gz
├── semidense_observations.csv.gz
└── object_instances_to_classes.json
  • rgb - 2D RGB fisheye images
    • Synthetically generated Aria RGB images at 10 FPS
    • Each image is saved in JPEG format
  • depth - 2D depth maps (16 bit)
    • Each depth image is the same size as the corresponding synthetic RGB image, where the pixel contents are integers expressing the depth along the pixel’s ray direction, in units of mm.
      • This should not be confused with ADT depth images, which describe the depth in the camera’s Z-axis
    • Each image is saved in PNG format
  • instances - 2D segmentation maps (16 bit)
    • Each segmentation image is the same size as the corresponding synthetic RGB image, where the pixel contents are integers expressing the object Id that was observed by the pixel
    • Each image is saved in PNG format
  • ase_scene_language.txt - 3D floor plan definition
    • Describes the scene in the form of a language.
    • Each row is a command that includes its own set of parameters. Together, these commands describe the geometry of the scene.
    • Go to ASE scene language format below for more details
  • trajectory.txt - Ground-truth trajectory
    • Go to MPS Output - Trajectory for how the data is structured
      • The file structure is the same, but note that this is the ground-truth trajectory, not an output generated by MPS
  • semidense_points.csv.gz - Semi-dense map points
  • semidense_observations.csv.gz - Semi-dense map observations
  • object_instances_to_classes.json
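
The layout above can be turned into per-frame file paths with a small helper. This is a minimal sketch, not part of the official tooling: the function name `frame_paths` is hypothetical, the 7-digit zero-padded index is inferred from the example file names, and the `.png` extension for depth and instance images follows the format descriptions in the list above.

```python
from pathlib import Path


def frame_paths(scene_dir, frame_index):
    """Return the expected RGB, depth, and instance paths for one frame.

    Assumes 7-digit zero-padded frame indices, as in vignette0000000.jpg.
    """
    stem = f"{frame_index:07d}"
    scene = Path(scene_dir)
    return {
        "rgb": scene / "rgb" / f"vignette{stem}.jpg",
        "depth": scene / "depth" / f"depth{stem}.png",
        "instances": scene / "instances" / f"instance{stem}.png",
    }


paths = frame_paths("0", 12)  # "0" stands in for a <sceneID> directory
print(paths["depth"].name)
```

Only the paths are built here; whether a given frame exists on disk still has to be checked by the caller.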

Aria RGB Sensor - Image, Depth and Instance Segmentation

For each frame from the RGB sensor we provide:

  • A vignetted sensor image
  • Simulated 16 bit metric depth (mm) in PNG image format
  • A segmentation image (16 bit PNG)

The images in each folder are in sync, so each folder contains the same number of images. We also provide example data visualizers to load these images and/or associate them.

Image: sample_rgb_depth_instance_images.png
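
Because ASE depth is measured along each pixel's ray rather than along the camera's Z-axis, comparing it with Z-depth data requires a conversion. The sketch below shows the arithmetic for a single pixel. The function name `ray_depth_mm_to_z_m` is hypothetical, and the per-pixel ray direction is assumed to come from unprojecting the pixel with the camera calibration, which is not covered here.

```python
import math


def ray_depth_mm_to_z_m(depth_mm, ray_dir):
    """Convert depth along a pixel ray (mm) to depth along the camera Z-axis (m).

    ray_dir is the pixel's ray direction in the camera frame (assumed input);
    it is normalized here, so it does not need to be a unit vector.
    """
    x, y, z = ray_dir
    norm = math.sqrt(x * x + y * y + z * z)
    depth_m = depth_mm / 1000.0  # stored values are integers in millimeters
    return depth_m * (z / norm)  # project the ray-length onto the Z-axis


# A pixel on the optical axis: ray depth and Z-depth coincide.
print(ray_depth_mm_to_z_m(1500, (0.0, 0.0, 1.0)))
```

For off-axis pixels the Z-depth is always smaller than the ray depth, which matters most near the edge of a fisheye image.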

ASE Scene Language Format

The ASE Scene Language format is a set of hand-designed procedural commands in pure text form. To handle commonly encountered static indoor layout elements, we use three commands:

  • make_wall - the full set of parameters specifies a gravity-aligned oriented box
  • make_door - specifies box-based cutouts from walls
  • make_window - specifies box-based cutouts from walls

Each command includes its own set of parameters, as described below. Given the command’s full set of parameters, a geometry is completely specified.

A single scene is described via a sequence of multiple commands stored in ase_scene_language.txt. The sequence length is arbitrary and follows no specific ordering. The interpretation of the command and its arguments is carried out by a customized interpreter responsible for parsing the sequence and generating a 3D mesh of the scene.
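
A sequence of commands like this can be read into a simple list of (command, parameters) pairs before any geometry is built. The sketch below is illustrative only. It assumes each line is a comma-separated command name followed by `key=value` fields, which this page does not spell out, and the parameter names in the sample text are hypothetical. It does not reproduce the interpreter that generates the 3D mesh.

```python
def parse_scene_language(text):
    """Parse command lines into (name, params) tuples.

    Assumed line syntax: "command, key=value, key=value, ...".
    Numeric values become floats; anything else is kept as a string.
    """
    commands = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        name, *fields = [tok.strip() for tok in line.split(",")]
        params = {}
        for field in fields:
            if not field:
                continue  # tolerate a trailing comma
            key, _, value = field.partition("=")
            try:
                params[key.strip()] = float(value)
            except ValueError:
                params[key.strip()] = value.strip()
        commands.append((name, params))
    return commands


# Hypothetical sample; real parameter names may differ.
sample = "make_wall, id=0, height=2.5\nmake_door, id=1, wall_id=0\n"
for name, params in parse_scene_language(sample):
    print(name, params)
```

Since the sequence follows no specific ordering, a consumer would typically collect all walls first and only then apply door and window cutouts.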

Image: language_format.png

Trajectory and Semi-Dense Map Points

Ground-truth trajectory data provides poses for each frame generated from a simulation at 10 FPS. We follow the same trajectory format as the closed loop trajectory used by Machine Perception Services (MPS).

For semi-dense map point clouds and their observations, we follow the same points and observations formats as MPS. The semi-dense map point cloud is generated using the same algorithm as MPS, with the addition of ground-truth trajectory and simulated SLAM camera images.
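
The gzip-compressed CSV files can be inspected with the Python standard library alone. This is a minimal sketch that assumes only that the first row is a header, so it makes no claims about specific column names. The helper name `read_csv_gz` is hypothetical, and the official data tools remain the recommended way to load this data.

```python
import csv
import gzip


def read_csv_gz(path, max_rows=None):
    """Read a gzip-compressed CSV into a list of dicts keyed by the header row.

    Values are returned as strings; convert them as needed downstream.
    """
    rows = []
    with gzip.open(path, "rt", newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            if max_rows is not None and i >= max_rows:
                break  # stop early when only a preview is wanted
            rows.append(row)
    return rows
```

For example, `read_csv_gz("<sceneID>/semidense_points.csv.gz", max_rows=5)` returns the first five points, and the keys of the first row show which columns are available.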