zhao-chunyu/SaliencyMamba

arXiv | AAAI 2025 Paper | License: MIT

Datasets: Baidu (TrafficGaze, DrFixD(Rainy), BDDA) | Hugging Face (TrafficGaze, DrFixD(Rainy))

Authors: Chunyu Zhao, Wentao Mu, Xian Zhou, Wenbo Liu, Fei Yan, Tao Deng📧
Contact: springyu.zhao@foxmail.com      📧: corresponding author

🔥Update

  • 2025/08/02: We have added multiple download options for datasets.

    • Baidu: Trafficgaze, DrFixD-rainy, BDDA
    • Hugging Face: Trafficgaze, DrFixD-rainy
  • 2025/07/24: The official trained weights have been uploaded. Details, Download

  • 2025/03/03: Complete the contents of the code repository.

    • Datasets upload: Trafficgaze✅, DrFixD-rainy✅, BDDA
    • Environment configuration: environment
    • Visualization code: our code in repository. visualization
    • Evaluation metrics code: our code in repository. python, Matlab (official)
  • 2024/12/10: Our paper is accepted by AAAI🎉🎉🎉. arxiv

  • 2024/11/08: Update supplementary materials. Details

  • 2024/10/23: We released the uniform saliency dataset loader. To use it, import it with: from utils.datasets import build_dataset

  • 2024/07/25: How to use our model (SalM²).

  • 2024/07/24: All the code and models are completed.

  • 2024/07/05: We collect the possible datasets to use, and make a uniform dataloader.

  • 2024/06/14: Our model is proposed!

💬Motivation 🔁

(1) Using semantic information to guide driver attention.

Solution: We propose a dual-branch network that separately extracts semantic information and image information. The semantic information is used to guide the image information at the deepest level of image feature extraction.

(2) Reducing model parameters and computational complexity.

Solution: We develop a highly lightweight saliency prediction network based on the latest Mamba framework, with only 0.0785M parameters (an 88% reduction compared to SOTA) and 4.45G FLOPs (a 37% reduction compared to SOTA).

⚡Proposed Model 🔁

We propose a saliency Mamba model, named SalM², that uses "Top-down" driving scene semantic information to guide "Bottom-up" driving scene image information, simulating how human drivers allocate attention.

📖Datasets 🔁

Name           Train (video/frame)   Valid (video/frame)   Test (video/frame)
TrafficGaze    49080                 6655                  19135
DrFixD-rainy   52291                 9816                  19154
BDDA           286251                63036                 93260
Note: For every dataset we provide our own download link alongside the official link. Please choose according to your needs.

(1) TrafficGaze: This dataset is available on BaiduYun (code: SALM) or on Hugging Face. We crop 5 frames at the beginning and at the end of each video. Official website: link.

(2) DrFixD-rainy: This dataset is available on BaiduYun (code: SALM) or on Hugging Face. We crop 5 frames at the beginning and at the end of each video. Official website: link.

(3) BDDA: We uploaded this dataset to BaiduYun (code: BDDA). Some camera videos and gazemap videos have inconsistent frame rates; we have matched and cropped them. Some camera videos have no corresponding gazemap video; we have filtered them out. Official website: link.
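
The note about cropping 5 frames before and after each video can be read as dropping frames at both ends of a clip. A minimal sketch under that assumption follows; the helper name trim_frames and the list-of-filenames representation are hypothetical and not taken from the repository.

```python
from typing import List, Sequence


def trim_frames(frames: Sequence[str], n: int = 5) -> List[str]:
    """Drop the first and last `n` entries of an ordered frame list.

    Assumption: "crop 5 frames before and after each video" means removing
    5 frames at each end; the repository may implement this differently.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(frames) <= 2 * n:
        return []  # clip too short to keep anything
    return list(frames[n:len(frames) - n])


frames = [f"{i:06d}.jpg" for i in range(1, 21)]  # 000001.jpg ... 000020.jpg
kept = trim_frames(frames)
print(kept[0], kept[-1], len(kept))  # 000006.jpg 000015.jpg 10
```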

TrafficGaze DrFixD-rainy BDDA
./TrafficGaze
  |——fixdata
  |  |——fixdata1.mat
  |  |——fixdata2.mat
  |  |—— ... ...
  |  |——fixdata16.mat
  |——trafficframe
  |  |——01
  |  |  |——000001.jpg
  |  |  |—— ... ...
  |  |——02
  |  |—— ... ...
  |  |——16
  |——test.json
  |——train.json
  |——valid.json
./DrFixD-rainy
  |——fixdata
  |  |——fixdata1.mat
  |  |——fixdata2.mat
  |  |—— ... ...
  |  |——fixdata16.mat
  |——trafficframe
  |  |——01
  |  |  |——000001.jpg
  |  |  |—— ... ...
  |  |——02
  |  |—— ... ...
  |  |——16
  |——test.json
  |——train.json
  |——valid.json
./BDDA
  |——camera_frames
  |  |——0001
  |  |  |——0001.jpg
  |  |  |—— ... ...
  |  |——0002
  |  |—— ... ...
  |  |——2017
  |——gazemap_frames
  |  |——0001
  |  |  |——0001.jpg
  |  |  |—— ... ...
  |  |——0002
  |  |—— ... ...
  |  |——2017
  |——test.json
  |——train.json
  |——valid.json
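
Before training, it can help to confirm that a downloaded dataset matches the directory trees above. The sketch below checks only the top-level folders and the three split files shown in those trees; the function name missing_entries and the EXPECTED table are illustrative and not part of the repository.

```python
from pathlib import Path
from typing import List

# Expected top-level folders per dataset, read off the directory trees above.
EXPECTED = {
    "TrafficGaze": ["fixdata", "trafficframe"],
    "DrFixD-rainy": ["fixdata", "trafficframe"],
    "BDDA": ["camera_frames", "gazemap_frames"],
}
SPLIT_FILES = ["train.json", "valid.json", "test.json"]


def missing_entries(root: str, category: str) -> List[str]:
    """Return the expected folders and split files that are absent from `root`."""
    base = Path(root)
    missing = [d for d in EXPECTED[category] if not (base / d).is_dir()]
    missing += [f for f in SPLIT_FILES if not (base / f).is_file()]
    return missing
```

For example, missing_entries("./BDDA", "BDDA") returns an empty list when the layout is complete, and otherwise lists what is missing.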

🛠️ Deployment 🔁

Environment

👉If you have downloaded our repository code and installed PyTorch and CUDA, install the remaining dependencies with the commands below. More details

pip install -r requirements.txt
pip install -e utils/models/causal-conv1d
pip install -e utils/models/mamba

Run train

​ 👉If you wish to train with our model, please use the command below. More details

python train.py --network salmm --b 32 --g 0 --category xxx --root xxx
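
When training is launched from a script (for example, to loop over datasets), the same command can be assembled in Python. This is a sketch only: the flags are copied from the command above, while the helper names are hypothetical and the reading of --b as batch size and --g as GPU index is an assumption. The xxx placeholders are left for you to fill in.

```python
import subprocess
import sys
from typing import List


def build_train_command(category: str, root: str, batch: int = 32, gpu: int = 0) -> List[str]:
    """Assemble the README's training command as an argument list."""
    return [
        sys.executable, "train.py",
        "--network", "salmm",
        "--b", str(batch),   # assumed: batch size
        "--g", str(gpu),     # assumed: GPU index
        "--category", category,
        "--root", root,
    ]


def run_training(category: str, root: str) -> int:
    """Run training from the repository root and return the process exit code."""
    return subprocess.run(build_train_command(category, root), check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when the dataset root contains spaces.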

Run test

[1] Official test [⭐⭐⭐⭐⭐]

We first compute the predictions and then use Matlab for the evaluation. More details

cd metrics
./run_matlab.sh

[2] General test

Although Python testing is more convenient, our test benchmark is based on previous work (CDNN, DrFixD-rainy, ......), and the results calculated by Python do not match those calculated by Matlab. We provide a Python test script whose results are basically consistent with Matlab for the CC, SIM, and KLD metrics. (We do not recommend using this script for final testing, as it differs from our official evaluation!)

👉If you wish to start with a rough evaluation metric, you can do so using the command. More details

Not Recommended. This script is intended solely for observing the trends of evaluation metrics, and is not suitable for final evaluation.

python evaluate_metrics.py --network salmm --b 1 --g 0 --category xxx --root xxx --test_weight xxx
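
For orientation, CC, SIM, and KLD have widely used definitions in the saliency literature. The NumPy sketch below follows those common definitions; it is not the repository's evaluate_metrics.py nor the official Matlab code, and the normalization choices and the EPS constant are assumptions, so its numbers may differ from both.

```python
import numpy as np

EPS = 1e-7  # assumed smoothing constant to avoid division by zero


def _as_distribution(m):
    """Scale a non-negative map so that its values sum to (almost exactly) 1."""
    m = np.asarray(m, dtype=np.float64)
    return m / (m.sum() + EPS)


def cc(pred, gt):
    """Pearson linear correlation coefficient between two maps."""
    p = np.asarray(pred, dtype=np.float64).ravel()
    g = np.asarray(gt, dtype=np.float64).ravel()
    p = (p - p.mean()) / (p.std() + EPS)
    g = (g - g.mean()) / (g.std() + EPS)
    return float(np.mean(p * g))


def sim(pred, gt):
    """Histogram intersection of the two maps viewed as distributions."""
    return float(np.minimum(_as_distribution(pred), _as_distribution(gt)).sum())


def kld(pred, gt):
    """KL divergence of the prediction from the ground truth (lower is better)."""
    p, g = _as_distribution(pred), _as_distribution(gt)
    return float(np.sum(g * np.log(EPS + g / (p + EPS))))
```

Identical maps give CC and SIM close to 1 and KLD close to 0, which is a quick sanity check before comparing against the Matlab results.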

Run visualization

We also provide visualization code. It accepts several input types, such as str, list, and dataloader. More details

​ 👉If you want to visualize all the data of a certain dataset directly, you can use the following command.

python visualization.py --network salmm --b 1 --g 0 --category xxx --root xxx --test_weight xxx
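
Independently of the repository's visualization.py, a common way to inspect a prediction is to blend the saliency map over the input frame. The NumPy-only sketch below illustrates that idea; the function name, the red tint, and the default alpha are illustrative assumptions rather than the repository's behavior.

```python
import numpy as np


def overlay_saliency(frame, saliency, alpha=0.5):
    """Blend a saliency map (H, W) over an RGB frame (H, W, 3) as a red tint.

    Returns a uint8 image with the same shape as `frame`.
    """
    frame = np.asarray(frame, dtype=np.float64)
    sal = np.asarray(saliency, dtype=np.float64)
    if frame.shape[:2] != sal.shape or frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError("expected frame (H, W, 3) and saliency (H, W)")
    # Min-max normalize the map to [0, 1]; a constant map becomes all zeros.
    span = sal.max() - sal.min()
    sal = (sal - sal.min()) / span if span > 0 else np.zeros_like(sal)
    heat = np.zeros_like(frame)
    heat[..., 0] = 255.0  # pure red layer
    weight = (alpha * sal)[..., None]  # per-pixel blend weight
    blended = (1.0 - weight) * frame + weight * heat
    return np.clip(blended, 0, 255).astype(np.uint8)
```

Pixels with the highest saliency are tinted most strongly, while pixels with the lowest saliency keep the original frame colors.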

🚀 Live Demo 🔁

(Demo clips: BDDA-1, BDDA-2, BDDA-3)

✨ Downstream Tasks 🔁

Some interesting downstream tasks that our work can support are shown here.

  • Saliency object detection: the saliency map guides object detection
  • Event recognition: the saliency map guides event recognition
  • Other downstream tasks......

🙌 Acknowledgements 🔁

Thank you to all collaborators for your support and to those who have helped improve this repository.

zhao-chunyu (Core Author)
MoonTao1 (Core Author)
liu-5658 (Core Author)
taodeng (Core Author)
ly27253 (Repository)

Repository: indicates a contributor who reproduced the repository and helped fix errors in it.

⭐️ Cite 🔁

If you find this repository useful, please use the following BibTeX entry for citation and give us a star⭐.

@article{zhao2025salmamba, 
  title={SalM²: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention}, 
  volume={39}, 
  DOI={10.1609/aaai.v39i2.32157},  
  number={2},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence}, 
  author={Zhao, Chunyu and Mu, Wentao and Zhou, Xian and Liu, Wenbo and Yan, Fei and Deng, Tao}, 
  year={2025}, 
  month={Apr.}, 
  pages={1647-1655} 
}

About

[AAAI'2025] SalM²: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention
