MemShot redefines memory construction for long-term dialogue modeling by leveraging dialogue structuring and the model’s internal visual reasoning capabilities to associate key episodes within dialogues. Specifically, MemShot directly renders local contiguous dialogue spans into structured visual memory units, explicitly preserving meta-information and the chronological structure of dialogue turns while avoiding heavyweight textual memory construction. Experimental results show that MemShot achieves stable and competitive performance on both LoCoMo and LongMemEval, while substantially shortening the memory construction pipeline and delivering a 70× speedup.
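To make the idea of a "structured visual memory unit" more concrete, here is a minimal sketch of the step that precedes rasterization: grouping a session's turns into contiguous spans and laying each span out with its meta-information and turn order. This is not the repository's implementation. `Turn`, `split_into_spans`, `layout_span`, the header format, and the span size are all hypothetical names and choices made for illustration, and the actual image rendering is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One dialogue turn (hypothetical structure, for illustration only)."""
    speaker: str
    text: str


def split_into_spans(turns: List[Turn], span_size: int) -> List[List[Tuple[int, Turn]]]:
    """Group turns into contiguous spans, keeping each turn's 1-based index."""
    if span_size <= 0:
        raise ValueError("span_size must be positive")
    indexed = list(enumerate(turns, start=1))
    return [indexed[i:i + span_size] for i in range(0, len(indexed), span_size)]


def layout_span(session_id: str, session_date: str, span: List[Tuple[int, Turn]]) -> str:
    """Build the text layout of one memory unit: a meta header, then turns in order."""
    first, last = span[0][0], span[-1][0]
    header = f"[session: {session_id} | date: {session_date} | turns: {first}-{last}]"
    lines = [header]
    for idx, turn in span:
        # Chronological order is preserved because spans are contiguous slices.
        lines.append(f"{idx:>3}. {turn.speaker}: {turn.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_turns = [
        Turn("Alice", "I adopted a dog last week."),
        Turn("Bob", "What did you name it?"),
        Turn("Alice", "Her name is Miso."),
    ]
    for demo_span in split_into_spans(demo_turns, span_size=2):
        print(layout_span("session_1", "2023-05-08", demo_span))
        print()
```

In this sketch, each laid-out span would then be drawn onto an image and stored as one retrievable memory unit, so the session date, speaker names, and turn indices travel with the content instead of being summarized away.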
Clone the repository and install dependencies:
git clone https://github.com/NEUIR/MemShot.git
cd MemShot
pip install -r requirements.txt
The current codebase is built around the Qwen3-VL series for generation and Qwen3-VL-Embedding-8B for retrieval. Please prepare the corresponding model checkpoints before running retrieval or inference.
We conduct experiments on the following long-term dialogue benchmarks: LoCoMo and LongMemEval.
For Full-session Rendering:
python src/memshot/rendering/locomo/locomo_render_full_session.py
For Split-session Rendering:
python src/memshot/rendering/locomo/locomo_render_split_session.py
Before rendering LongMemEval, first convert it into a LoCoMo-style format using the conversion script. After conversion, run the rendering scripts that follow.
python assets/convert_longmemeval_to_locomo.py
For Full-session Rendering:
python src/memshot/rendering/longmemeval/longmemeval_render_full_session.py
For Split-session Rendering:
python src/memshot/rendering/longmemeval/longmemeval_render_split_session.py
The current pipelines in this repository are script-based and use hard-coded paths in the source files. Before running the pipeline, ensure that the model paths, data paths, cache paths, and output paths in each script are correctly configured.
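Because the paths are hard-coded, a quick pre-flight check can catch a misconfigured path before a long run starts. The sketch below is not part of the repository: `CONFIGURED_PATHS`, its keys, and the placeholder values are hypothetical, and you would copy in the values you actually set in each script. It only checks input paths, on the assumption that cache and output directories may not exist yet.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical placeholders: replace these with the values set in the scripts.
CONFIGURED_PATHS: Dict[str, str] = {
    "generation_model": "/path/to/Qwen3-VL-checkpoint",
    "embedding_model": "/path/to/Qwen3-VL-Embedding-8B",
    "dataset": "/path/to/benchmark_data",
}


def find_missing_paths(paths: Dict[str, str]) -> List[str]:
    """Return the names whose configured path does not exist on disk."""
    return [name for name, value in paths.items() if not Path(value).exists()]


if __name__ == "__main__":
    missing = find_missing_paths(CONFIGURED_PATHS)
    if missing:
        # Exit with a non-zero status so a wrapper script stops early.
        raise SystemExit("Missing or misconfigured paths: " + ", ".join(missing))
    print("All configured input paths exist.")
```

Running a check like this once per machine is cheaper than discovering a wrong checkpoint path after rendering has already finished.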
We provide a unified launcher for LoCoMo and LongMemEval retrieval variants.
LoCoMo:
python src/memshot/retrieval/locomo/run_locomo_retrieval.py --variant image_split
LongMemEval:
python src/memshot/retrieval/longmemeval/run_longmemeval_retrieval.py --variant image_split
After retrieval, run inference on the retrieved memory units.
LoCoMo:
python src/memshot/inference/locomo/locomo_inference.py
LongMemEval:
python src/memshot/inference/longmemeval/longmemeval_inference.py
After inference, run the LLM-based judge to evaluate predictions.
LoCoMo:
python src/memshot/evaluation/locomo/llm_judge.py
LongMemEval:
python src/memshot/evaluation/longmemeval/llm_judge.py
Our work is built on the following codebases, and we are deeply grateful for their contributions.
If you have questions, suggestions, or bug reports, please open an issue or email us:
hm.cypeng@gmail.com
