LiQiiiii/Vid-SME

Qi Li Runpeng Yu Xinchao Wang
xML-Lab, National University of Singapore  corresponding author

TL;DR (1) - Introduce Vid-SME, the first dedicated method for video membership inference attacks against large video understanding models.

TL;DR (2) - Benchmark membership inference attack (MIA) performance by training three VULLMs, each on a distinct dataset, using different representative training strategies.

Overview

Figure 1. Vid-SME against Video Understanding Large Language Models (VULLMs). Left: An example of the video instruction context used in our experiments. Middle: The overall pipeline of Vid-SME. Right: A detailed illustration of how Vid-SME calculates the membership score.

Installation & Preparation

  1. Follow the instructions provided in LongVA to build the environment.

  2. Download the models and move them into ./checkpoints. The dataset JSON files are provided in the ./video_json folder; download the corresponding videos and move them into ./video_json/videos.
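
Before running anything, it can help to confirm that the folders named in the steps above are in place. The following is a minimal sketch, not part of the repository: the helper names `missing_dirs` and `list_annotation_files` are hypothetical, and only the three directory paths come from this README.

```python
from pathlib import Path

# Directory names taken from the preparation steps above.
# The helper functions themselves are hypothetical, not part of the repository.
EXPECTED_DIRS = ("checkpoints", "video_json", "video_json/videos")


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def list_annotation_files(repo_root="."):
    """Return the sorted names of the JSON files found directly in video_json."""
    return sorted(p.name for p in (Path(repo_root) / "video_json").glob("*.json"))


if __name__ == "__main__":
    for d in missing_dirs():
        print(f"missing: {d}")
    print("annotation files:", list_annotation_files())
```

Run it from the repository root. An empty "missing" report only means the folders exist; it does not verify that the checkpoints or videos inside them are complete.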

Evaluation

Run Vid-SME on each model with its corresponding script, for example:

python Vid_SME_main_CinePile.py
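
This README does not say which metrics the scripts report. A common way to summarize a membership inference attack is the AUC over per-video membership scores, and the sketch below illustrates that idea only. It is not the repository's evaluation code and does not reproduce the Vid-SME score: the function name `auc_from_scores` is hypothetical, and it assumes that a higher score means "more likely a training member".

```python
def auc_from_scores(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member.

    Ties count as 0.5. Assumes higher score = more likely a member
    (an assumption of this sketch, not a statement about Vid-SME).
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(auc_from_scores([0.9, 0.3], [0.5, 0.1]))
```

An AUC of 0.5 corresponds to random guessing, and 1.0 means every member outscores every non-member. The pairwise loop is quadratic, which is fine for small score lists; a rank-based computation would suit larger ones.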

Citation

If you find our work interesting or helpful, please cite it as follows:

@article{li2025vid,
  title={Vid-sme: Membership inference attacks against large video understanding models},
  author={Li, Qi and Yu, Runpeng and Wang, Xinchao},
  journal={arXiv preprint arXiv:2506.03179},
  year={2025}
}

About

[NeurIPS'25] Vid-SME: Membership Inference Attacks against Large Video Understanding Models
