Official implementation of the manuscript "Hierarchical Token Learning and Adaptive Gated Fusion for Robust Multi-Modal Object Re-Identification".
⚠️ This code is directly related to the HTL-ReID manuscript. If you use this code, please cite our paper.
HTL-ReID is a unified framework for multi-modal (RGB / NIR / TIR) object re-identification. It combines three coordinated components:
- Hierarchical Token Selection (HS) — aggregates attention cues from shallow, middle, and deep ViT layers as complementary spatial priors, while keeping all token features in a single deep semantic space.
- Fusion-Aware Synergistic Selection (FACSS) — jointly scores intra-modal discriminability and cross-modal cosine consensus, modulated by an environment-aware dynamic weight.
- Adaptive Gated Fusion (AGF) — channel-wise convex interpolation gating that calibrates fusion intensity according to modality reliability.
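The idea of channel-wise convex interpolation gating can be illustrated with a minimal sketch. This is not the repository's AGF module: the function name `gated_fusion`, the sigmoid gate, and the toy feature vectors are assumptions made for illustration. It only shows that a per-channel gate g in (0, 1) mixes two features as g * a + (1 - g) * b, so each fused channel stays between its two inputs.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(primary, auxiliary, gate_logits):
    """Channel-wise convex interpolation of two feature vectors.

    Hypothetical helper: each channel gets its own gate g = sigmoid(logit),
    and the fused value is g * primary + (1 - g) * auxiliary.
    """
    if not (len(primary) == len(auxiliary) == len(gate_logits)):
        raise ValueError("all inputs must have the same number of channels")
    fused = []
    for p, a, z in zip(primary, auxiliary, gate_logits):
        g = sigmoid(z)  # gate strictly between 0 and 1
        fused.append(g * p + (1.0 - g) * a)
    return fused


# Toy example: three channels from two modalities (values are made up).
rgb = [1.0, 2.0, 3.0]
nir = [3.0, 2.0, 1.0]
logits = [0.0, 4.0, -4.0]  # assumed gate logits; in practice these would be learned
fused = gated_fusion(rgb, nir, logits)
print(fused)
```

A logit of 0 gives an equal mix of the two modalities, while a large positive or negative logit makes the channel lean almost entirely on one of them. This is one way a gate could reflect modality reliability.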
pip install -r requirements.txt
pytorch_wavelets is vendored under ./pytorch_wavelets; do not pip install it separately.
Download RGBNT201, RGBNT100, and MSVR310 from their official sources, then update DATASETS.ROOT_DIR in the corresponding YAML config under ./configs/ to point to your local dataset directory.
Download a pretrained ViT checkpoint and set MODEL.PRETRAIN_PATH_T in the YAML config to its local path. The backbone variant is selected via MODEL.TRANSFORMER_TYPE; supported values are listed in modeling/make_model.py.
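Before launching a run, it can help to confirm that both paths resolve on your machine. The following is a minimal sketch, not part of the repository: `check_paths` is a hypothetical helper, and the two placeholder paths stand in for whatever you set in DATASETS.ROOT_DIR and MODEL.PRETRAIN_PATH_T.

```python
from pathlib import Path


def check_paths(dataset_root, pretrain_path):
    """Return a list of human-readable problems (empty if both paths look fine).

    Hypothetical helper: it checks only that the dataset root is a directory
    and that the pretrained checkpoint is a regular file.
    """
    problems = []
    if not Path(dataset_root).is_dir():
        problems.append(f"DATASETS.ROOT_DIR is not a directory: {dataset_root}")
    if not Path(pretrain_path).is_file():
        problems.append(f"MODEL.PRETRAIN_PATH_T is not a file: {pretrain_path}")
    return problems


if __name__ == "__main__":
    # Placeholder paths; replace them with the values from your YAML config.
    for problem in check_paths("/path/to/datasets", "/path/to/pretrained_vit.pth"):
        print(problem)
```

The check does not inspect the dataset layout or the checkpoint contents; it only catches mistyped or missing paths early.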
Train on a single dataset by selecting its YAML config:
# RGBNT201
python train_net.py --config_file configs/RGBNT201/default.yml
# RGBNT100
python train_net.py --config_file configs/RGBNT100/default.yml
# MSVR310
python train_net.py --config_file configs/MSVR310/default.yml
You can override any config field from the command line, e.g.:
python train_net.py --config_file configs/RGBNT201/default.yml \
DATASETS.ROOT_DIR /path/to/datasets \
MODEL.PRETRAIN_PATH_T /path/to/pretrained_vit.pth
To evaluate a trained checkpoint, pass its path via TEST.WEIGHT:
python test_net.py --config_file configs/RGBNT201/default.yml \
TEST.WEIGHT /path/to/checkpoint.pth
If you find this work useful, please cite:
@article{luo2026htl,
title = {Hierarchical Token Learning and Adaptive Gated Fusion for Robust Multi-Modal Object Re-Identification},
author = {Luo, Hongbin and Ye, Yihan},
note = {Manuscript},
year = {2026}
}
Please also cite the archived release:
@software{htl_reid_zenodo,
title = {HTL-ReID: Hierarchical Token Learning and Adaptive Gated Fusion for Robust Multi-Modal Object Re-Identification},
author = {Luo, Hongbin and Ye, Yihan},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.19769558},
url = {https://doi.org/10.5281/zenodo.19769558}
}
This project is released under the terms of the LICENSE file in this repository.