NEUIR/MemShot

Memory Shot for Long-Term Dialogue

Chunyi Peng1*, Haidong Xin1*, Xuanshuo Sheng1, Xin Dai1, Zhenghao Liu1†,
Shuo Wang2, Yukun Yan2, Zulong Chen3, Yu Gu1, and Ge Yu1

1Northeastern University, 2Tsinghua University, 3Alibaba Group

Introduction

MemShot overview

MemShot redefines memory construction for long-term dialogue modeling by combining dialogue structuring with the model's internal visual reasoning capabilities to associate key episodes within dialogues. Specifically, MemShot renders local contiguous dialogue spans directly into structured visual memory units, explicitly preserving meta-information and the chronological structure of dialogue turns while avoiding heavyweight textual memory construction. Experimental results show that MemShot achieves stable and competitive performance on both LoCoMo and LongMemEval while substantially shortening the memory construction pipeline and delivering a 70× speedup.
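To make the idea of a "memory unit" more concrete, here is a minimal sketch of the step that would precede rendering: grouping turns into contiguous spans and laying out each span with a meta-information header and numbered turns. Everything in it (`Turn`, `build_memory_units`, the header format, the default span size) is hypothetical and not taken from the repository, and the rasterization of each layout into an image is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    text: str


def build_memory_units(session_id, date, turns, span_size=2):
    """Group turns into contiguous spans and lay out each span as text.

    Hypothetical layout: one header line with session meta-information,
    then one numbered line per turn so chronological order stays explicit.
    A renderer would then draw each returned string onto an image.
    """
    if span_size <= 0:
        raise ValueError("span_size must be positive")
    units = []
    for start in range(0, len(turns), span_size):
        span = turns[start:start + span_size]
        lines = [f"Session {session_id} | {date}"]
        for offset, turn in enumerate(span):
            # Turn numbers are global within the session, not per span.
            lines.append(f"[{start + offset + 1}] {turn.speaker}: {turn.text}")
        units.append("\n".join(lines))
    return units
```

Each returned string keeps the session header and global turn numbers, so a unit remains interpretable on its own when it is retrieved without its neighbours.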

Setup

Clone the repository and install dependencies:

git clone https://github.com/NEUIR/MemShot.git
cd MemShot
pip install -r requirements.txt

The current codebase is built around the Qwen3-VL series for generation and Qwen3-VL-Embedding-8B for retrieval. Please prepare the corresponding model checkpoints before running retrieval or inference.

Data Preparation

We conduct experiments on two long-term dialogue benchmarks: LoCoMo and LongMemEval.

MemShot Construction on LoCoMo

For Full-session Rendering

python src/memshot/rendering/locomo/locomo_render_full_session.py

For Split-session Rendering

python src/memshot/rendering/locomo/locomo_render_split_session.py

MemShot Construction on LongMemEval

Before rendering LongMemEval, please first convert it into a LoCoMo-style format. After conversion, you can run the following scripts.

Convert LongMemEval to LoCoMo-style format

python assets/convert_longmemeval_to_locomo.py
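For orientation, the kind of mapping such a conversion performs might look like the sketch below. It is not the repository's script. The input keys (`question_id`, `question`, `answer`, `haystack_sessions`, `haystack_dates`, and `role`/`content` per turn) and the output layout (`session_N`, `session_N_date_time`, `dia_id`) are assumptions based on the publicly released LongMemEval and LoCoMo files, and the function name is hypothetical.

```python
def longmemeval_to_locomo(record):
    """Map one LongMemEval-style record to a LoCoMo-style sample.

    Assumed input: 'haystack_sessions' is a list of sessions, each a list
    of {'role', 'content'} turns, aligned with 'haystack_dates'.
    Assumed output: numbered sessions inside 'conversation', each turn
    tagged with a 'D<session>:<turn>' dialogue id.
    """
    conversation = {"speaker_a": "user", "speaker_b": "assistant"}
    pairs = zip(record["haystack_sessions"], record["haystack_dates"])
    for s_idx, (session, date) in enumerate(pairs, start=1):
        conversation[f"session_{s_idx}_date_time"] = date
        conversation[f"session_{s_idx}"] = [
            {
                "speaker": turn["role"],
                "dia_id": f"D{s_idx}:{t_idx}",
                "text": turn["content"],
            }
            for t_idx, turn in enumerate(session, start=1)
        ]
    return {
        "sample_id": record["question_id"],
        "conversation": conversation,
        "qa": [{"question": record["question"], "answer": record["answer"]}],
    }
```

If the actual files use different keys, only the dictionary lookups need to change; the session and turn numbering logic stays the same.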

For Full-session Rendering

python src/memshot/rendering/longmemeval/longmemeval_render_full_session.py

For Split-session Rendering

python src/memshot/rendering/longmemeval/longmemeval_render_split_session.py

Quick Start

The current pipelines in this repository are script-based and use hard-coded paths in the source files. Before running the pipeline, ensure that the model paths, data paths, cache paths, and output paths in each script are correctly configured.
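Because the paths are hard-coded, a small pre-flight check can catch a wrong path before a long run starts. The helper below is a hypothetical sketch, not part of the repository: you pass it whatever paths you have set in the scripts, and it reports the ones that do not exist on disk.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the names of configured paths that do not exist.

    `paths` maps a descriptive name (e.g. "generation_model") to the path
    string you put into a script. Output directories that a script creates
    itself should be left out of the check.
    """
    return [name for name, location in paths.items()
            if not Path(location).exists()]
```

A call such as `find_missing_paths({"generation_model": "/path/to/your/checkpoint"})` returns an empty list when everything is in place, where the path shown is a placeholder for your own setting.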

1. Run retrieval

We provide a unified launcher for LoCoMo and LongMemEval retrieval variants.

LoCoMo:

python src/memshot/retrieval/locomo/run_locomo_retrieval.py --variant image_split

LongMemEval:

python src/memshot/retrieval/longmemeval/run_longmemeval_retrieval.py --variant image_split
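Conceptually, embedding-based retrieval ranks memory units by their similarity to the query. The sketch below assumes cosine similarity over precomputed vectors, which this README does not specify; the function names and the toy vectors are hypothetical, and real embeddings would come from the retrieval model.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_units(query_vec, unit_vecs, k):
    """Return the ids of the k memory units most similar to the query.

    `unit_vecs` maps a memory-unit id to its precomputed embedding.
    """
    scored = [(cosine(query_vec, vec), unit_id)
              for unit_id, vec in unit_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [unit_id for _, unit_id in scored[:k]]
```

For example, with a query `[1, 0]` and units `{"a": [1, 0], "b": [0, 1], "c": [1, 1]}`, `top_k_units(..., k=2)` ranks `a` first and `c` second.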

2. Run inference

After retrieval, run inference on the retrieved memory units.

LoCoMo:

python src/memshot/inference/locomo/locomo_inference.py

LongMemEval:

python src/memshot/inference/longmemeval/longmemeval_inference.py

3. Evaluate predictions

After inference, run the LLM-based judge to evaluate predictions.

LoCoMo:

python src/memshot/evaluation/locomo/llm_judge.py

LongMemEval:

python src/memshot/evaluation/longmemeval/llm_judge.py
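The three steps above can be chained in a small driver so that a failure in one stage stops the run. This is a convenience sketch, not a script shipped with the repository: the script paths and the `--variant image_split` flag are the ones listed in this README, while `STAGES`, `build_commands`, and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys

# Script paths as listed in this README, relative to the repository root.
STAGES = {
    "locomo": [
        ["src/memshot/retrieval/locomo/run_locomo_retrieval.py",
         "--variant", "image_split"],
        ["src/memshot/inference/locomo/locomo_inference.py"],
        ["src/memshot/evaluation/locomo/llm_judge.py"],
    ],
    "longmemeval": [
        ["src/memshot/retrieval/longmemeval/run_longmemeval_retrieval.py",
         "--variant", "image_split"],
        ["src/memshot/inference/longmemeval/longmemeval_inference.py"],
        ["src/memshot/evaluation/longmemeval/llm_judge.py"],
    ],
}


def build_commands(benchmark):
    """Return the retrieval, inference, and evaluation commands in order."""
    return [[sys.executable, *stage] for stage in STAGES[benchmark]]


def run_pipeline(benchmark):
    """Run each stage in sequence; check=True raises on the first failure."""
    for command in build_commands(benchmark):
        subprocess.run(command, check=True)
```

Calling `run_pipeline("locomo")` from the repository root would run the three LoCoMo stages back to back, assuming the paths inside each script have already been configured.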

Acknowledgement

Our work is built on the following codebases, and we are deeply grateful for their contributions.

📧 Contact

If you have questions, suggestions, or bug reports, please open an issue or email:

hm.cypeng@gmail.com

About

This is the code repo for our paper "Memory Shot for Long-Term Dialogue".
