Issues with MANUS fingertip data and poor finger extension in retargeting #10

Description

@gray-wei

Hi Alex, thanks for your great work and for sharing this project. I have a few questions about using MANUS data for retargeting to the Allegro Hand.

  1. Using fingertip xyz vs joint angles:
    MANUS states that its fingertip positions (xyz) are highly accurate, so I expected that retargeting directly from these positions would give better results than using joint angles. However, in my experiments the results were counterintuitive:

    • The thumb tracks very well.
    • The other fingers extend poorly, although their general direction is roughly correct.

    Could you please shed some light on why using fingertip xyz might not perform as well as expected here? Is it due to kinematic ambiguities or something related to optimization?

  2. On data quality and sensitivity:
    I collected about 5000 samples of fingertip positions for training. After training the retargeting network, I evaluated it in two ways: offline, by feeding the same training data back through inference, and in online retargeting. Both gave similar results:

    • The overall motion directions look roughly correct.
    • The thumb shows good tracking.
    • However, the other three fingers (index, middle, ring) show poor precision and cannot extend well, appearing much less refined than the thumb.

    I also found that training on different batches sometimes leads to different pairwise alignments: for example, one model aligns the index and thumb well, while another aligns the thumb and middle finger better.
    This makes me wonder if this approach is highly sensitive to the data distribution.

    • Should I pay extra attention during data collection to ensure more varied finger extensions, especially for the non-thumb fingers?
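To make the coverage question concrete, here is a minimal sketch of how I could check how well each finger's extension range is represented in the collected data. It assumes the fingertip positions are stored as an `(N, F, 3)` array expressed in the wrist frame; that layout and the function name `extension_coverage` are my own placeholders, not part of the MANUS SDK or this repository. Reach is normalized by each finger's observed minimum and maximum, so the histogram only shows how samples are spread inside the range that was actually recorded.

```python
import numpy as np

def extension_coverage(tips, n_bins=10):
    """Histogram of normalized fingertip reach, one row per finger.

    tips: (N, F, 3) fingertip positions in the wrist frame (assumed layout).
    Returns an (F, n_bins) array whose rows each sum to 1.
    """
    reach = np.linalg.norm(tips, axis=-1)        # (N, F) distance from wrist origin
    lo = reach.min(axis=0)                       # per-finger minimum reach
    hi = reach.max(axis=0)                       # per-finger maximum reach
    span = np.where(hi > lo, hi - lo, 1.0)       # avoid division by zero
    norm = (reach - lo) / span                   # 0 = most flexed, 1 = most extended
    cov = np.empty((tips.shape[1], n_bins))
    for f in range(tips.shape[1]):
        counts, _ = np.histogram(norm[:, f], bins=n_bins, range=(0.0, 1.0))
        cov[f] = counts / reach.shape[0]
    return cov
```

If the last few bins for the index, middle, and ring fingers are nearly empty while the thumb's are well filled, that would suggest the dataset under-represents extended poses for those fingers.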

  3. On simulation data:
    I also noticed that in many simulation setups, people simply sample joint angles uniformly. Given my observations, should we explicitly ensure that the synthetic dataset covers more fully extended and flexed poses to improve training quality?
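As an illustration of what I have in mind, here is a minimal sketch that compares plain uniform joint sampling with a sampler biased toward the joint limits. The Beta(0.5, 0.5) choice is just one possible way to push samples toward full flexion and extension, and the function names and joint limits are hypothetical placeholders, not the Allegro Hand's real limits or anything from this repository.

```python
import numpy as np

def sample_uniform(lower, upper, n, rng):
    """Draw n joint configurations uniformly within [lower, upper]."""
    return rng.uniform(lower, upper, size=(n, lower.shape[0]))

def sample_limit_biased(lower, upper, n, rng, a=0.5, b=0.5):
    """Draw n joint configurations from a per-joint Beta(a, b) distribution.

    With a = b = 0.5 the density concentrates near both joint limits.
    """
    u = rng.beta(a, b, size=(n, lower.shape[0]))
    return lower + u * (upper - lower)

def fraction_near_limits(q, lower, upper, margin=0.1):
    """Fraction of joint values within `margin` (normalized) of either limit."""
    u = (q - lower) / (upper - lower)
    return float(np.mean((u < margin) | (u > 1.0 - margin)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder limits for a 4-joint finger, in radians (assumed values).
    lower = np.zeros(4)
    upper = np.full(4, 1.6)
    q_uni = sample_uniform(lower, upper, 5000, rng)
    q_bia = sample_limit_biased(lower, upper, 5000, rng)
    print(fraction_near_limits(q_uni, lower, upper))
    print(fraction_near_limits(q_bia, lower, upper))
```

Note that this only changes the per-joint distribution. A fully extended finger needs all of its joints near a limit at the same time, which independent per-joint sampling still produces rarely, so explicitly adding such poses may also be needed.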

Thanks a lot!!
