- [Mar 02, 2026] MemCoach is live!
- [Feb 28, 2026] MemBench is live!
- [Feb 26, 2026] arXiv paper is live!
```bash
# clone project
git clone https://github.com/laitifranz/MemCoach
cd MemCoach

# (recommended) use uv to set up the python version
# and to install the required dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
# --frozen ensures uv.lock is respected strictly and not modified,
# guaranteeing you get the exact same environment as intended
uv sync --frozen

# (alternative) use pip to install the required dependencies
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e .
```

> [!NOTE]
> For those who are not familiar with uv, `uv run` is a command that activates the virtual environment at runtime while running the given command.
You can skip this step if you are not using the OpenRouter API or if you are using the default paths.
```bash
# copy .env.example to .env
cp .env.example .env
# edit the .env file
vim .env
```

MemBench is a benchmark dataset, hosted on Hugging Face, introduced alongside MemCoach.
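The `.env` file holds key=value pairs (e.g. an API key and custom paths). As a rough sketch of what loading it amounts to, here is a minimal dependency-free parser; the variable name in the comment is hypothetical (check `.env.example` for the real keys), and in practice a library such as python-dotenv would usually handle this:

```python
import os

def load_dotenv(path=".env"):
    """Minimal .env parser: KEY=VALUE lines; blank lines and '#' comments are skipped."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    os.environ.update(env)  # make the values visible to the current process
    return env

# Hypothetical usage — the key name is illustrative, not taken from .env.example:
# load_dotenv(".env")  # e.g. a file containing: SOME_API_KEY=sk-demo
```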
```bash
uv run hf download --repo-type dataset laitifranz/MemBench --local-dir dataset/
unzip dataset/images.zip -d dataset/ && mv dataset/images dataset/ppr10k # rename the folder to ppr10k
```

> [!IMPORTANT]
> Launch the scripts from the project root directory.
> [!TIP]
> The repository provides a portable wrapper script that activates the virtual environment and sets `PYTHONPATH` to the project root:

```bash
bash scripts/schedule_python.sh <PATH_TO_PYTHON_SCRIPT> <ARGUMENTS_FOR_THE_SCRIPT>
```

You can also use the following command with uv to run a script:

```bash
PYTHONPATH=$(pwd) uv run <PATH_TO_PYTHON_SCRIPT> <ARGUMENTS_FOR_THE_SCRIPT>
```

> [!TIP]
> The `scripts` directory contains various utility scripts for job scheduling on SLURM clusters. Use `bash scripts/schedule_sbatch.sh -h` for more information. Take a look at the examples in the `scripts/slurm_configs` directory to see how to set up a SLURM environment.
Generate the memorability scores with our target predictor model:

```bash
bash scripts/schedule_python.sh src/pipelines/membench_gen/generate_target_scores.py --mlp_checkpoint_path "ckpt/target_predictor/memorability/ours/model_weights.pth"
```

> [!TIP]
> If you run the scripts on a SLURM cluster, you can parallelize the generation using the SLURM array argument.
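To illustrate how a SLURM job array splits such a generation run: each array task reads its index from the standard `SLURM_ARRAY_TASK_ID` / `SLURM_ARRAY_TASK_COUNT` environment variables and processes only its shard of the work items. The sketch below is illustrative, not the project's actual code, and the file names are placeholders:

```python
import os

def shard(items, task_id, task_count):
    """Return the slice of `items` assigned to this array task (round-robin split)."""
    return [item for i, item in enumerate(items) if i % task_count == task_id]

# SLURM exports these variables inside each array task;
# default to a single task when running outside SLURM.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", 0))
task_count = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", 1))

images = [f"img_{i}.png" for i in range(10)]  # placeholder work items
my_items = shard(images, task_id, task_count)  # only this task's share
```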
> [!NOTE]
> The model settings provided in the config files are for the InternVL3.5-8B model. You can change the model settings by editing the config files.

```bash
bash scripts/schedule_python.sh src/pipelines/membench_gen/constr_data_gen/runner.py --config_path config/data_generation/teacher/internvl3_5_8B.yaml
bash scripts/schedule_python.sh src/pipelines/zero_shot/runner.py --config_path config/data_generation/student/internvl3_5_8B.yaml
bash scripts/schedule_python.sh src/pipelines/method/training.py --config_path config/method_steering/training/internvl3_5_8B_positive.yaml
bash scripts/schedule_python.sh src/pipelines/method/training.py --config_path config/method_steering/training/internvl3_5_8B_negative.yaml
```

> [!NOTE]
> We provide the pre-built steering vectors for the InternVL3.5-8B model used in the paper experiments in the `ckpt/memcoach` directory. You can use them to skip the activation extraction stage.
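Conceptually, a steering vector is applied by adding a scaled direction to a hidden state at a chosen layer, h' = h + coeff · v. The toy sketch below illustrates the arithmetic only; it is not the project's implementation, which operates on real model activations inside the network:

```python
def apply_steering(hidden, vector, coeff):
    """Add a scaled steering vector to a hidden-state vector: h' = h + coeff * v."""
    if len(hidden) != len(vector):
        raise ValueError("hidden state and steering vector must have the same dimension")
    return [h + coeff * v for h, v in zip(hidden, vector)]

hidden = [0.5, -1.0, 2.0]   # toy hidden state
steer  = [1.0, 0.0, -0.5]   # toy steering direction
steered = apply_steering(hidden, steer, coeff=2.0)
# steered == [2.5, -1.0, 1.0]
```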
Run inference using the exact activation files produced in Stage B:

```bash
bash scripts/schedule_python.sh src/pipelines/method/inference.py # add --config-name config_paper to use our pre-built steering vectors
```

> [!TIP]
> By default, Hydra uses the `config/method_steering/inference/config.yaml` file, which points to `config/method_steering/inference/internvl3_5_8B.yaml`. You can override keys in the config by passing them as arguments to the script, e.g.:

```bash
bash scripts/schedule_python.sh src/pipelines/method/inference.py activation_settings.coeff=55 activation_settings.target_layer=12 runtime.include_datetime=true
```

Based on the editing evaluation you want to perform, you can choose the appropriate config file. We provide config files for the Flux baseline, teacher oracle, zero-shot, and MemCoach editing evaluations.
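Hydra's dotted overrides (such as `activation_settings.coeff=55`) set nested keys in the loaded config. As a rough illustration of the mechanism, not of Hydra itself, the idea can be mimicked in pure Python:

```python
def apply_overrides(config, overrides):
    """Apply Hydra-style 'a.b.c=value' overrides to a nested dict (simplified sketch)."""
    for item in overrides:
        dotted, _, raw = item.partition("=")
        keys = dotted.split(".")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # create intermediate dicts as needed
        # naive value parsing: bool, int, float, else keep the string
        if raw in ("true", "false"):
            value = raw == "true"
        else:
            try:
                value = int(raw)
            except ValueError:
                try:
                    value = float(raw)
                except ValueError:
                    value = raw
        node[keys[-1]] = value
    return config

cfg = {"activation_settings": {"coeff": 10, "target_layer": 0}}
apply_overrides(cfg, ["activation_settings.coeff=55", "runtime.include_datetime=true"])
# cfg["activation_settings"]["coeff"] == 55 and cfg["runtime"]["include_datetime"] is True
```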
- For the MemCoach evaluation:

```bash
bash scripts/schedule_python.sh src/pipelines/evaluation/editing/runner.py --config_path config/evaluation/editing/memcoach.yaml
bash scripts/schedule_python.sh src/analysis/editing_metrics.py --root experiments/evaluation/editing --run-scope latest
```

Test out MemCoach in real time by capturing images directly from your mobile device through a FastAPI backend exposed via ngrok.
> [!NOTE]
> - Since camera access on mobile devices requires a secure context, we need a proxy to forward requests to the FastAPI server over HTTPS. We use ngrok for this purpose.
> - API requests are logged in the `outputs/api_requests` directory. Check `web/camera/README.md` for more information.
> - The default steering settings loaded in the MemCoach API are configured in the `config/method_steering/inference/internvl3_5_8B_paper.yaml` file.
```bash
NGROK_AUTHTOKEN=<your_token_here> uvx ngrok http 8000
```

Save the ngrok forwarding URL for later use.

```bash
PYTHONPATH=$(pwd) uv run -m uvicorn src.api.app:app --host 0.0.0.0 --port 8000
```

Open the camera page on your mobile device:

```
https://<your_ngrok_subdomain>.ngrok-free.dev/camera/?api=https://<your_ngrok_subdomain>.ngrok-free.dev
```

For transparency and reproducibility, we provide our evaluation artifacts for MemCoach on the InternVL3.5-8B model. We report the IR and RM metrics.
Option A — Compact download (recommended): a single zip archive containing all 4 experiment folders is available, to avoid hitting the Hugging Face rate limit:

```bash
uv run hf download --repo-type dataset laitifranz/MemBench-InternVL3.5-Eval MemBench-InternVL3.5-Eval-Artifacts.zip --local-dir artifacts/
unzip artifacts/MemBench-InternVL3.5-Eval-Artifacts.zip -d artifacts/hf_internvl3_5_8B_eval
```

Option B — Full dataset download (for individual downloads):

```bash
HF_XET_HIGH_PERFORMANCE=1 uv run hf download --repo-type dataset laitifranz/MemBench-InternVL3.5-Eval --include "teacher_oracle/*" --local-dir artifacts/hf_internvl3_5_8B_eval
```

> [!NOTE]
> You may need to resume the download if you hit the Hugging Face rate limit.
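A generic way to cope with rate limits is to retry with exponential backoff. The sketch below is illustrative only: the flaky function stands in for a download call and is not the `hf` CLI or any Hugging Face API:

```python
import time

def with_retries(fn, attempts=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on failure (e.g. rate limits)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * 2 ** attempt)

# Example: a hypothetical flaky operation that succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 rate limited")
    return "downloaded"

result = with_retries(flaky, attempts=5, base_delay=0.0)
# result == "downloaded" after two simulated failures
```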
Then run the analysis pipeline:

```bash
bash scripts/schedule_python.sh src/analysis/editing_metrics.py --root artifacts/hf_internvl3_5_8B_eval --run-scope latest
```

```bash
make setup-pre-commit # one-time setup: install pre-commit globally with uv + install git hooks + autoupdate
make check # run pre-commit hooks (format + lint via ruff + security checks via gitleaks)
make clean-logs # clean logs
make run-tests # run tests via pytest
```

If you find this work useful to your research, please consider citing it as:
```bibtex
@inproceedings{laiti2026memcoach,
  title={How to Take a Memorable Picture? Empowering Users with Actionable Feedback},
  author={Laiti, Francesco and Talon, Davide and Staiano, Jacopo and Ricci, Elisa},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}
```

Thanks to these great repositories: dottxt-ai/outlines for making structured output with LLMs easy, google/python-fire for making CLI management easy, PPR10K for the PPR10K dataset, and web-haptics for the experimental haptic feedback solution in the web demo.
The needs of this project led to a contribution back to the community: PR #1728 was merged into dottxt-ai/outlines, improving the handling of multimodal chat inputs for Transformers models.
Since this project relies heavily on automatic data generation, the generated feedback can differ between runs due to the stochastic nature of the models. We have tried to make the pipeline as reproducible as possible, but there may be some variation in the generated feedback depending on your machine and virtual environment setup. See Reproducing Paper Results for more information.
