V²-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence

Jiancheng Pan*, Runze Wang*, Tianwen Qian, Mohammad Mahdi, Xiangyang Xue,

Xiaomeng Huang, Luc Van Gool, Danda Pani Paudel, Yuqian Fu✉

* Equal Contribution   ✉ Corresponding Author

News | Abstract | Dataset | Model | Statement

News

[2026/2/21] V²-SAM has been accepted to CVPR 2026. Thanks to all contributors.
[2025/5/20] Our paper "V²-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence" is available on arXiv.

Abstract

Cross-view object correspondence, exemplified by the representative task of ego-exo object correspondence, aims to establish consistent associations of the same object across different viewpoints (e.g., ego-centric and exo-centric). This task poses significant challenges due to drastic viewpoint and appearance variations, making existing segmentation models, such as SAM2, non-trivial to apply directly. To address this, we present V²-SAM, a unified cross-view object correspondence framework that adapts SAM2 from single-view segmentation to cross-view correspondence through two complementary prompt generators. Specifically, the Cross-View Anchor Prompt Generator (V²-Anchor), built upon DINOv3 features, establishes geometry-aware correspondences and, for the first time, unlocks coordinate-based prompting for SAM2 in cross-view scenarios, while the Cross-View Visual Prompt Generator (V²-Visual) enhances appearance-guided cues via a novel visual prompt matcher that aligns ego-exo representations from both feature and structural perspectives. To effectively exploit the strengths of both prompts, we further adopt a multi-expert design and introduce a Post-hoc Cyclic Consistency Selector (PCCS) that adaptively selects the most reliable expert based on cyclic consistency. Extensive experiments validate the effectiveness of V²-SAM, achieving new state-of-the-art performance on Ego-Exo4D (ego-exo object correspondence), DAVIS-2017 (video object tracking), and HANDAL-X (robotic-ready cross-view correspondence).
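
The selection step can be pictured with a small sketch. It is an illustration under assumptions, not the released PCCS implementation: masks are plain sets of (row, col) pixels, each expert is a hypothetical pair of callables (`forward` maps a source-view mask to the target view, `backward` maps it back), and consistency is scored as the IoU between the original mask and its round trip. The abstract only states that selection relies on cyclic consistency; the IoU score and every name below are assumed.

```python
from typing import Callable, Dict, Set, Tuple

Mask = Set[Tuple[int, int]]          # pixels as (row, col); an assumed toy representation
Mapper = Callable[[Mask], Mask]      # hypothetical view-to-view mask predictor


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_expert(source_mask: Mask, experts: Dict[str, Tuple[Mapper, Mapper]]):
    """Pick the expert whose source -> target -> source round trip best
    reproduces the original source mask. Returns (name, target_mask, score)."""
    best = None
    for name, (forward, backward) in experts.items():
        target_mask = forward(source_mask)      # prediction in the other view
        round_trip = backward(target_mask)      # mapped back to the source view
        score = iou(source_mask, round_trip)    # assumed consistency measure
        if best is None or score > best[2]:
            best = (name, target_mask, score)
    return best


def shift(dx: int) -> Mapper:
    """Toy mapper: translate every pixel by dx columns."""
    return lambda mask: {(r, c + dx) for r, c in mask}


if __name__ == "__main__":
    source = {(0, 0), (0, 1), (1, 0), (1, 1)}
    experts = {
        "anchor": (shift(3), shift(-3)),   # consistent round trip
        "visual": (shift(3), shift(-2)),   # round trip drifts by one column
    }
    name, target, score = select_expert(source, experts)
    print(name, round(score, 3))
```

In this toy setup the expert whose backward mapping exactly undoes its forward mapping wins, which is the intuition behind choosing by cyclic consistency.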

Dataset

Our experiments use Ego-Exo4D (ego-exo object correspondence), DAVIS-2017 (video object tracking), and HANDAL-X (robotic-ready cross-view correspondence).

Our processed data is available on Hugging Face:

Ego-Exo4D: https://huggingface.co/datasets/jaychempan/Ego-Exo4D-Relation-Train and https://huggingface.co/datasets/jaychempan/Ego-Exo4D-Relation-Test

DAVIS-2017: https://huggingface.co/datasets/jaychempan/DAVIS

HANDAL-X: https://huggingface.co/datasets/jaychempan/HANDAL
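
For scripted setups, the repositories above can be fetched with the `huggingface-cli download` tool. The sketch below only builds the command lines and does not run them. The `data/` target directory is an assumption, not a layout the project prescribes; `--repo-type dataset` is passed because these are dataset repositories.

```python
from typing import List

# Repository IDs taken from the links above.
DATASET_REPOS = [
    "jaychempan/Ego-Exo4D-Relation-Train",
    "jaychempan/Ego-Exo4D-Relation-Test",
    "jaychempan/DAVIS",
    "jaychempan/HANDAL",
]


def download_command(repo_id: str, root: str = "data") -> List[str]:
    """Build a huggingface-cli command that downloads one dataset repo.
    The local layout (root/<repo name>) is an assumed convention."""
    name = repo_id.split("/", 1)[1]
    return [
        "huggingface-cli", "download", repo_id,
        "--repo-type", "dataset",
        "--local-dir", f"{root}/{name}",
    ]


if __name__ == "__main__":
    for repo in DATASET_REPOS:
        print(" ".join(download_command(repo)))
```

Adjust `root` to wherever the project configs expect the data to live.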

Model

Environment Setup

conda create -n v2sam python=3.10 -y
conda activate v2sam
cd ~/projects/V2-SAM
export LD_LIBRARY_PATH=/opt/modules/nvidia-cuda-12.1.0/lib64:$LD_LIBRARY_PATH
export PATH=/opt/modules/nvidia-cuda-12.1.0/bin:$PATH
# conda install pytorch==2.3.1 torchvision==0.18.1 pytorch-cuda=12.1 cuda -c pytorch  -c "nvidia/label/cuda-12.1.0" -c "nvidia/label/cuda-12.1.1"
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121

# pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.3/index.html 
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.1.0"
pip install -r requirements.txt
pip install prettytable

# install the bundled local mmengine so the third-party tools can be used
cd mmengine
pip install -e .
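
After installation, a quick sanity check can confirm that the pinned versions are the ones actually installed. The script below is a minimal sketch using only the standard library; the function names are illustrative and not part of the project. It reads package metadata rather than importing the packages, so it does not verify that CUDA itself works.

```python
from importlib import metadata
from typing import Dict, Optional

# Versions pinned in the setup commands above; mmcv only has a lower bound.
EXPECTED = {"torch": "2.3.1", "torchvision": "0.18.1", "torchaudio": "2.3.1"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected: Dict[str, str]) -> Dict[str, str]:
    """Compare installed versions with the pins and describe each result."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found is None:
            report[package] = "missing"
        # PyTorch wheels carry a local suffix such as "+cu121"; ignore it.
        elif found.split("+", 1)[0] == wanted:
            report[package] = "ok"
        else:
            report[package] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for package, status in check_pins(EXPECTED).items():
        print(f"{package}: {status}")
    print("mmcv:", installed_version("mmcv") or "missing", "(needs >= 2.1.0)")
    print("mmengine:", installed_version("mmengine") or "missing")
```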

SAM2 and DINOv3 weights

Download the base model weights you intend to use.

huggingface-cli download jaychempan/sam2 --local-dir weights/sam2 --include sam2_hiera_large.pt

huggingface-cli download jaychempan/dinov2 --local-dir weights/dinov2 --include dinov2_vitg14_reg4_pretrain.pth

huggingface-cli download jaychempan/dinov3 --local-dir weights/dinov3 --include dinov3_vitl16_pretrain_lvd1689m-8aa4cbdd.pth
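
Once the downloads finish, it can help to confirm what actually landed in each folder before starting a run. The following is a small sketch, not a project tool: it lists the `.pt` and `.pth` files under the three `weights/` sub-directories named in the commands above. The function name and the extension filter are assumptions.

```python
from pathlib import Path
from typing import Dict, List

# Sub-directories used by the download commands above.
WEIGHT_DIRS = ["sam2", "dinov2", "dinov3"]


def list_checkpoints(root: str = "weights") -> Dict[str, List[str]]:
    """Map each expected sub-directory to the .pt/.pth files found in it.
    A missing directory is reported as an empty list."""
    found = {}
    for sub in WEIGHT_DIRS:
        folder = Path(root) / sub
        files = []
        if folder.is_dir():
            files = sorted(
                p.name for p in folder.iterdir()
                if p.is_file() and p.suffix in {".pt", ".pth"}
            )
        found[sub] = files
    return found


if __name__ == "__main__":
    for sub, files in list_checkpoints().items():
        print(f"weights/{sub}: {', '.join(files) if files else 'no checkpoint found'}")
```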

Train

bash tools/dist.sh train projects/v2sam/configs/v2sam.py 4

To train V²-Visual, first rename the project directory projects/v2sam_visual to projects/v2sam.

To train V²-Fusion, rename projects/v2sam_fusion to projects/v2sam instead.

Note: V²-Anchor requires no training; it uses the official SAM2 decoder checkpoint.
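
The rename step is easy to get wrong when switching between variants, so a guarded helper may be useful. This is a hypothetical convenience function, not something shipped with the repository: it performs the rename described above and refuses to proceed if projects/v2sam already exists, leaving it to you to move the previous variant aside.

```python
from pathlib import Path

# Variant names map to the directory names mentioned above.
VARIANTS = {"visual": "v2sam_visual", "fusion": "v2sam_fusion"}


def activate_variant(variant: str, projects_dir: str = "projects") -> Path:
    """Rename projects/v2sam_<variant> to projects/v2sam.
    Refuses to overwrite an existing projects/v2sam."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    source = Path(projects_dir) / VARIANTS[variant]
    target = Path(projects_dir) / "v2sam"
    if not source.is_dir():
        raise FileNotFoundError(f"{source} does not exist")
    if target.exists():
        raise FileExistsError(f"{target} already exists; move it aside first")
    source.rename(target)
    return target
```

Calling `activate_variant("visual")` from the repository root would then prepare the tree for the training command above.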

Test

bash tools/test.sh test projects/v2sam/configs/v2sam.py 4 /path/to/checkpoint

bash tools/test_all.sh test projects/v2sam/configs/v2sam.py 4 /path/to/checkpoint/dir
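
What tools/test_all.sh does internally is not shown here. As a rough mental model only, evaluating a directory of checkpoints amounts to running the single-checkpoint command once per file. The sketch below builds those command lines without executing them; the `.pth` extension, the sorted order, and the function name are all assumptions.

```python
from pathlib import Path
from typing import List


def build_test_commands(ckpt_dir: str,
                        config: str = "projects/v2sam/configs/v2sam.py",
                        gpus: int = 4) -> List[List[str]]:
    """Build one tools/test.sh invocation per checkpoint found in ckpt_dir."""
    checkpoints = sorted(Path(ckpt_dir).glob("*.pth"))  # assumed extension
    return [
        ["bash", "tools/test.sh", "test", config, str(gpus), str(ckpt)]
        for ckpt in checkpoints
    ]
```

Each returned list can be passed to `subprocess.run` if you prefer to drive the evaluation from Python.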

Statement

Acknowledgement

This project references and uses the following open source models and datasets.

Related Open Source Models

Related Open Source Datasets

Citation

If you find this work useful or use our dataset, please cite the following paper.

@article{pan2025v,
  title={V$^{2}$-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence},
  author={Pan, Jiancheng and Wang, Runze and Qian, Tianwen and Mahdi, Mohammad and Fu, Yanwei and Xue, Xiangyang and Huang, Xiaomeng and Van Gool, Luc and Paudel, Danda Pani and Fu, Yuqian},
  journal={arXiv preprint arXiv:2511.20886},
  year={2025}
}
