How to Download the AEA Dataset
Overview
This page covers how to download sample Aria Everyday Activities (AEA) sequences, as well as how to download specific sequences and types of data. Follow the instructions to download the sample sequence and from there you'll be able to use the CLI to download more data.
By downloading the datasets you agree that you have read and accepted the terms of the Aria Everyday Activities Dataset License Agreement.
Download the sample AEA sequence
Step 0: Install project_aria_tools package and create a virtual environment if not already done
Follow Step 0 to Step 3 in Getting Started.
Step 1 : Visit the AEA website and sign up.
Scroll down to the bottom of the page. Enter your email and select Access the Datasets.
Step 2 : Download the download-links file
Once you've selected Access the Datasets you'll be taken back to the top of the AEA page.
Scroll down the page to select AEA Download Links and download the file to the folder $HOME/Downloads.
You can re-download the download-links whenever they expire
Step 3 : Set up a folder for AEA data
mkdir -p $HOME/Documents/projectaria_tools_aea_data
mv $HOME/Downloads/aria_everyday_activities_dataset_download_urls.json $HOME/Documents/projectaria_tools_aea_data/
Step 4 : Download the sample sequence (~3GB) via CLI:
From your Python virtual environment, run:
aea_dataset_downloader -c $HOME/Documents/projectaria_tools_aea_data/aria_everyday_activities_dataset_download_urls.json \
-o $HOME/Documents/projectaria_tools_aea_data/ \
-d 0 1 2 3 -e
The sample sequence is representative of a typical single-user sequence, which gives you an idea of to expect from the dataset.
Download the AEA dataset
Data size
The AEA dataset contains 143 sequences and the total size of the dataset is about 353GB. The dataset is split into main data and MPS outputs, eye gaze and Multi-SLAM (SLAM outputs created in shared coordinate frame) results. Go to Project Aria Machine Perception Services for more information about MPS data. The MPS data is also broken into chunks that can be included or excluded at download time.
Data type | What's included | Per sequence size | Total size for all sequences |
---|---|---|---|
main | Aria raw data, speech to text, metadata json | 2 - 4 GB | ~309 GB |
MPS eyegaze | Eyegaze, summary file | < 1 MB | ~31 MB |
MPS SLAM points | Semi-dense points and observations | 200 - 500 MB | ~31 GB |
MPS SLAM trajectories | Open and closed loop trajectories | 100 - 200 MB | 12 GB |
MPS SLAM online calibration | Online calibrations | < 20 MB | 1.2 GB |
Download via CLI
Follow the AEA Getting Started Guide to download the example data. This section will introduce how to download the dataset using the aea_dataset_downloader
.
Resumable download
The aea_dataset_downloader
checks the previous download status of the sequences in the --output_folder. If the downloading breaks in the middle, relaunch the CLI and it will continue the downloading.
Detailed arguments
Arguments | Type | Description |
---|---|---|
--cdn_file | str | The download-urls file you downloaded from the AEA website page after signing up |
--output_folder | str | A local path where the downloaded files and metadata will be stored |
--metadata_only | flag | Only download the metadata file for the dataset |
--data_types | list of int | 0→main, 1→MPS eyegaze, 2→MPS trajectories, 3→MPS semidense pointclouds and observations, 4→MPS online calibrations |
--example_only | flag | Only download example data |
--overwrite | flag | Disable resumable download. Force download and overwrite existing data |
--sequence_names | list of str | list of sequence names. If not specified, download all sequences |
Download Examples
All these commands must be run from your Python virtual environment that has the projectaria-tools package and dependencies installed.
Download metadata for all datasets
This will download the aria_everyday_activities_metadata.json
, which contains metadata for each AEA sequence, including: location number, script number, sequence number, recording number, dataset version, dataset name, and list of concurrent recordings.
You can use this data to select specific sequences to download.
aea_dataset_downloader --cdn_file ${PATH_TO_YOUR_CDN_FILE} --output_folder ${OUTPUT_FOLDER_PATH} --metadata_only
Download main data for all sequences
aea_dataset_downloader --cdn_file ${PATH_TO_YOUR_CDN_FILE} --output_folder ${OUTPUT_FOLDER_PATH} --data_types 0
Download all data for all sequences
aea_dataset_downloader --cdn_file ${PATH_TO_YOUR_CDN_FILE} --output_folder ${OUTPUT_FOLDER_PATH} --data_types 0 1 2 3
Download main data for 2 specific sequences
aea_dataset_downloader --cdn_file ${PATH_TO_YOUR_CDN_FILE} --output_folder ${OUTPUT_FOLDER_PATH} --data_types 0 --sequence_names loc1_script1_seq1_rec1 loc2_script1_seq1_rec1
Download main data for all sequences and overwrite
aea_dataset_downloader --cdn_file ${PATH_TO_YOUR_CDN_FILE} --output_folder ${OUTPUT_FOLDER_PATH} --data_types 0 --overwrite
Troubleshooting
Go to troubleshooting if you experience issues using this guide.