Hand Tracking Algorithm Benchmark

This page presents the performance evaluation of the MPS Hand Tracking algorithm for Aria Gen2 glasses.

Benchmark Dataset

The MPS Hand Tracking algorithm is benchmarked on an internal dataset specifically designed for hand-object interaction (HOI) scenarios. These scenarios are particularly valuable for robotics applications and present significant challenges for hand tracking due to object-induced occlusions.

Dataset Specifications

| Property               | Value                          |
| ---------------------- | ------------------------------ |
| Total Duration         | 1 hour                         |
| Frame Rate             | 30 fps                         |
| Cameras Used           | All 4 CV cameras on Aria Gen2  |
| Number of Participants | 7                              |
| Number of Objects      | 20 everyday objects            |

Participant Diversity

Seven participants were recruited to ensure diversity across:

  • Ethnicity
  • Age
  • Gender
  • Hand size
  • Arm length

Recording Scenarios

Each participant was asked to interact naturally with a pool of 20 everyday objects. The dataset captures:

  • Frequent hand occlusions from objects or the other hand during natural interaction
  • Out-of-field-of-view segments where subjects' hands move outside the Aria Gen2 field of view

These challenging scenarios provide an end-to-end evaluation signal for Aria Gen2 hand tracking performance.

Ground Truth Annotations

Hand pose ground truth annotations follow the standard UmeTrack format and are generated using an internal marker-free reconstruction system with millimeter-level keypoint accuracy.

Thanks to the extensive field-of-view coverage of outside-in cameras in the reconstruction system, nearly all frames in the 1-hour VRS dataset have valid hand pose ground truth annotations.

Evaluation Metrics

We report the following metrics (lower is better for both):

| Metric                     | Description                                                                                         |
| -------------------------- | --------------------------------------------------------------------------------------------------- |
| MKPE (Mean Keypoint Error) | Average Euclidean distance between predicted and ground truth keypoints, measured in millimeters     |
| LTR (Lose Track Ratio)     | Percentage of frames where tracking is lost                                                          |
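The two metrics can be computed directly from frame-aligned keypoint arrays. The sketch below is illustrative, not the MPS evaluation code: it assumes predictions and ground truth are given as `(frames, keypoints, 3)` arrays in millimeters, and that a per-frame boolean flag marks whether the tracker produced a pose (the exact lost-track criterion used by MPS is not specified here).

```python
import numpy as np

def mkpe_mm(pred, gt):
    """Mean Keypoint Error: mean Euclidean distance (mm) between
    predicted and ground-truth keypoints over all frames and joints.
    pred, gt: float arrays of shape (frames, keypoints, 3), in mm."""
    per_keypoint_err = np.linalg.norm(pred - gt, axis=-1)  # (frames, keypoints)
    return float(per_keypoint_err.mean())

def lose_track_ratio(tracked):
    """Lose Track Ratio: percentage of frames where tracking is lost.
    tracked: boolean array, True where the tracker reported a pose."""
    tracked = np.asarray(tracked, dtype=bool)
    return 100.0 * (1.0 - tracked.mean())
```

For example, a constant 1 mm error along one axis on every keypoint yields an MKPE of exactly 1.0 mm, and losing track on half the frames yields an LTR of 50%.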

Results

| Method          | MKPE (mm) ↓ | LTR (%) ↓ |
| --------------- | ----------- | --------- |
| On-device HT    | 45.0        | 12.7      |
| MPS HT (3.1.1)  | 20.1        | 8.6       |