Speech2Text Output Data

Caution

Aria Pilot Dataset documentation is stored in Archive: Aria Data Tools because it was Project Aria's first open source initiative and uses a different data structure from our latest open releases. For the most up-to-date tooling, and to find out about our other open datasets, go to Project Aria Tools.

This website will be deleted in September 2024.

Speech2Text Output Data provides text strings generated by Automatic Speech Recognition (ASR). Each token has a start timestamp, an end timestamp, and a confidence rating.

Each recording has two .csv files with the same content except for the time domain: speech2text/speech.csv uses the wav file time domain, and speech2text/speech_aria_domain.csv uses the Aria time domain.

Table 1: speech.csv Structure

| startTime_ms | endTime_ms | written | confidence |
| --- | --- | --- | --- |
| 54040 | 55040 | I’m | 0.25608 |
| 72920 | 73920 | looking | 0.84339 |

Note: token times are in the wav file time domain (start = 0).
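As a minimal sketch of reading this file, the Python below parses rows using the column names from Table 1 and joins the tokens above a confidence threshold into a transcript. It assumes the file is comma-separated, has a header row matching Table 1, and stores timestamps as integers. `SpeechToken`, `parse_speech_csv`, and `transcript` are hypothetical names used only for illustration, and the inline sample reuses the two rows from Table 1.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, List, TextIO


@dataclass
class SpeechToken:
    """One recognized token (hypothetical container, not part of any Aria API)."""
    start_ms: int
    end_ms: int
    written: str
    confidence: float


def parse_speech_csv(handle: TextIO) -> List[SpeechToken]:
    """Parse rows keyed by the Table 1 column names (assumed header row)."""
    reader = csv.DictReader(handle)
    return [
        SpeechToken(
            start_ms=int(row["startTime_ms"]),
            end_ms=int(row["endTime_ms"]),
            written=row["written"],
            confidence=float(row["confidence"]),
        )
        for row in reader
    ]


def transcript(tokens: Iterable[SpeechToken], min_confidence: float = 0.0) -> str:
    """Join the tokens whose confidence is at least min_confidence."""
    return " ".join(t.written for t in tokens if t.confidence >= min_confidence)


# Sample rows copied from Table 1; the comma delimiter is an assumption.
SAMPLE = """startTime_ms,endTime_ms,written,confidence
54040,55040,I’m,0.25608
72920,73920,looking,0.84339
"""

tokens = parse_speech_csv(io.StringIO(SAMPLE))
print(transcript(tokens, min_confidence=0.5))
```

To read a real recording, the same function could be given a file handle opened with `open("speech2text/speech.csv", newline="", encoding="utf-8")`. The 0.5 threshold is arbitrary and only demonstrates filtering on the confidence column.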

Table 2: speech_aria_domain.csv Structure

| startTime_ns | endTime_ns | written | confidence |
| --- | --- | --- | --- |
| 56511040 | 56512040 | I’m | 0.25608 |
| 56529920 | 56530920 | looking | 0.84339 |

Note: token times are in the Aria file time domain (start = 0).
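Because the two files list the same tokens, the offset between the two time domains can be estimated by subtracting paired start times. The sketch below does this for the sample rows in Tables 1 and 2. It assumes both files are comma-separated and list tokens in the same order; `read_starts` and `domain_offsets` are hypothetical helpers. For the sample rows the difference is a constant 56457000, and each token spans 1000 units in both tables. This suggests the example values share a unit despite the `_ns` suffix, but it is worth checking against your own recordings before relying on it.

```python
import csv
import io
from typing import List, TextIO


def read_starts(handle: TextIO, column: str) -> List[int]:
    """Return the integer start times found in the named column."""
    return [int(row[column]) for row in csv.DictReader(handle)]


def domain_offsets(wav_handle: TextIO, aria_handle: TextIO) -> List[int]:
    """Per-token difference: Aria-domain start minus wav-domain start."""
    wav = read_starts(wav_handle, "startTime_ms")
    aria = read_starts(aria_handle, "startTime_ns")
    if len(wav) != len(aria):
        raise ValueError("the two files have different row counts")
    return [a - w for w, a in zip(wav, aria)]


# Sample rows copied from Tables 1 and 2; the comma delimiter is an assumption.
WAV_SAMPLE = """startTime_ms,endTime_ms,written,confidence
54040,55040,I’m,0.25608
72920,73920,looking,0.84339
"""

ARIA_SAMPLE = """startTime_ns,endTime_ns,written,confidence
56511040,56512040,I’m,0.25608
56529920,56530920,looking,0.84339
"""

offsets = domain_offsets(io.StringIO(WAV_SAMPLE), io.StringIO(ARIA_SAMPLE))
print(offsets)
```

A constant offset across all rows is what you would expect if the two files differ only by a shift of the time origin. Varying offsets would indicate that the rows are not aligned one to one.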