reactmultimodalchallenge/baseline_react2026

Official baseline code for the Fourth REACT Challenge (react2026)

[Homepage] [Reference Paper] [Code]

This repository provides baseline methods for the Fourth REACT Challenge.

Baseline paper:

MARS dataset:

Challenge Description

Given the spatio-temporal behaviours expressed by a speaker over a given time period, the REACT 2026 Challenge consists of the following two sub-challenges, whose theoretical underpinnings are defined and detailed in this paper.

Task 1 - Offline Appropriate Facial Reaction Generation

This task aims to develop a deep learning model that takes the entire speaker behaviour sequence as input and generates multiple appropriate and realistic (naturalistic) spatio-temporal facial reactions, each consisting of AUs, facial expressions, and valence and arousal states. Multiple facial reactions must therefore be generated for each input speaker behaviour.

Task 2 - Online Appropriate Facial Reaction Generation

This task aims to develop a deep learning model that estimates each facial reaction frame in turn, rather than taking all frames into consideration at once. The model is expected to gradually generate all facial reaction frames, forming multiple appropriate and realistic (naturalistic) spatio-temporal facial reactions, each consisting of AUs, facial expressions, and valence and arousal states. Multiple facial reactions must therefore be generated for each input speaker behaviour.
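The difference between the two tasks can be sketched with a toy example. This is only an illustration of the access pattern, not the baseline code: `generate_offline`, `generate_online`, and the two stand-in "models" are hypothetical names, and the assumption that the online model may see the current speaker frame (in addition to past frames) should be checked against the challenge definition.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one frame of behaviour features (toy stand-in)


def generate_offline(speaker: Sequence[Frame],
                     model: Callable[[Sequence[Frame]], List[Frame]]) -> List[Frame]:
    """Task 1 sketch: the model receives the entire speaker sequence at once."""
    return model(speaker)


def generate_online(speaker: Sequence[Frame],
                    step: Callable[[Sequence[Frame], Sequence[Frame]], Frame]) -> List[Frame]:
    """Task 2 sketch: frame t is produced from speaker frames 0..t and past reactions."""
    reactions: List[Frame] = []
    for t in range(len(speaker)):
        visible = speaker[: t + 1]  # assumption: no access to future speaker frames
        reactions.append(step(visible, reactions))
    return reactions


# Hypothetical stand-ins for trained models, for illustration only.
def mirror_all(speaker):
    return [list(frame) for frame in speaker]


def mirror_latest(visible, past_reactions):
    return list(visible[-1])


speaker_seq = [[0.1], [0.4], [0.9]]
offline_out = generate_offline(speaker_seq, mirror_all)
online_out = generate_online(speaker_seq, mirror_latest)
```

In the online loop, the step function never receives more than `t + 1` speaker frames, which is the constraint that separates Task 2 from Task 1.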

🛠️ Dependency Installation

We provide detailed instructions for setting up the environment using conda. First, create and activate a new environment:

conda create -n react python=3.10
conda activate react

1. Install PyTorch

First, check your CUDA version:

nvidia-smi

Visit the official PyTorch website to get the appropriate installation command. For example:

conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia

2. Install PyTorch3D Dependencies

Install the following dependencies:

conda install -c fvcore -c iopath -c conda-forge fvcore iopath

For CUDA versions older than 11.7, you will need to install the CUB library.

conda install -c bottler nvidiacub

3. Install PyTorch3D

First, verify your CUDA version in Python:

import torch
torch.version.cuda

Download the appropriate PyTorch3D package from Anaconda based on your Python, CUDA, and PyTorch versions. For example, for Python 3.10, CUDA 11.6, and PyTorch 1.12.0:

# linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
conda install linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
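To pick the matching package, it can help to derive the build tag from your own versions. The helper below is a minimal sketch: `pytorch3d_build_tag` is a hypothetical name, and the tag pattern is inferred only from the example file name above, so confirm it against the actual file listing before downloading.

```python
def pytorch3d_build_tag(python_version: str, cuda_version: str, torch_version: str) -> str:
    """Build a 'py310_cu116_pyt1120'-style tag (pattern inferred from the example above)."""
    py = "".join(python_version.split(".")[:2])  # "3.10" -> "310"
    cu = "".join(cuda_version.split(".")[:2])    # "11.6" -> "116"
    # Drop a local suffix such as "+cu116" before removing the dots.
    pt = "".join(torch_version.split("+")[0].split(".")[:3])  # "1.12.0" -> "1120"
    return f"py{py}_cu{cu}_pyt{pt}"


print(pytorch3d_build_tag("3.10", "11.6", "1.12.0"))  # py310_cu116_pyt1120
```

The three inputs correspond to `python --version`, `torch.version.cuda`, and `torch.__version__`.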

4. Install Additional Dependencies

Install all remaining dependencies specified in requirements.txt:

pip install -r requirements.txt

👨‍🏫 Get Started

Data

Challenge Data Description (Homepage):

We divided the datasets into training, test, and validation sets following an approximate 60%/20%/20% splitting ratio. Specifically, we split the datasets with a subject-independent strategy (i.e., the same subject never appears in both the training and test sets).

  • video-raw folder contains raw videos (with a resolution of 1920 × 1080)
  • video-face-crop folder contains face-cropped videos (with a resolution of 384 × 384)
  • facial-attributes folder contains sequences of frame-level 25-dimension facial attributes (15 AUs’ occurrences, valence and arousal intensities, and the probabilities of eight categorical facial expressions)
  • coefficients folder contains sequences of 58-dimension (52-d expression, 3-d rotation, and 3-d translation) 3DMM coefficients extracted from corresponding videos
  • audio folder contains wav files extracted from raw video files
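As a rough illustration of the dimensions listed above, a single frame can be split into its named parts. This is a sketch under an assumption: the channel order is taken to follow the order in which the components are listed (AUs, then valence/arousal, then expressions; expression, then rotation, then translation), which the description does not state explicitly. `split_frame` and the layout constants are hypothetical names.

```python
from typing import Dict, Sequence, Tuple

Layout = Tuple[Tuple[str, int], ...]

# Assumed channel order, following the order of the folder descriptions above.
ATTRIBUTE_LAYOUT: Layout = (("aus", 15), ("valence_arousal", 2), ("expressions", 8))      # 25-d
COEFFICIENT_LAYOUT: Layout = (("expression", 52), ("rotation", 3), ("translation", 3))    # 58-d


def split_frame(frame: Sequence[float], layout: Layout) -> Dict[str, Sequence[float]]:
    """Slice one frame into named parts; works for lists and NumPy rows alike."""
    expected = sum(width for _, width in layout)
    if len(frame) != expected:
        raise ValueError(f"expected {expected} values, got {len(frame)}")
    parts, start = {}, 0
    for name, width in layout:
        parts[name] = frame[start:start + width]
        start += width
    return parts


attributes = split_frame(list(range(25)), ATTRIBUTE_LAYOUT)
coefficients = split_frame([0.0] * 58, COEFFICIENT_LAYOUT)
```

The length check catches files whose last dimension does not match the documented 25 or 58 values.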

Appropriate real facial reactions (Ground-Truths):

  • During data recording, the semantic contexts are carefully controlled through the 23 distinct sessions (session0, session1, …, session22), each of which is guided by a few pre-defined sentences posted by the speaker. This provides a consistent session-specific context across dyadic interactions between different speakers and listeners. More specifically, for the speaker behaviour expressed in a specific session, we define all facial reactions expressed by different listeners under the same session to be appropriate facial reactions (i.e., ground-truth) for responding to it.
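The session-based ground-truth definition above can be sketched as a simple grouping step. This is not the repository's dataloader: the function names and the example file names are hypothetical, and the session is assumed to be the name of the directory that contains each clip.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_by_session(listener_files: Iterable[str]) -> Dict[str, List[str]]:
    """Map each session name (e.g. 'session0') to all listener clips recorded in it."""
    by_session = defaultdict(list)
    for path in listener_files:
        by_session[PurePosixPath(path).parent.name].append(path)
    return dict(by_session)


def appropriate_reactions(speaker_file: str, by_session: Dict[str, List[str]]) -> List[str]:
    """All listener clips from the speaker clip's session count as ground truth."""
    return by_session.get(PurePosixPath(speaker_file).parent.name, [])


# Hypothetical file names, for illustration only.
listeners = [
    "listener/session0/clip_a.npy",
    "listener/session0/clip_b.npy",
    "listener/session1/clip_c.npy",
]
sessions = group_by_session(listeners)
print(appropriate_reactions("speaker/session0/clip_a.npy", sessions))
```

A speaker clip from session0 is therefore paired with every listener reaction from session0, not only with the listener recorded alongside it.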

The data organization (./data) is shown below as an example of the directory structure:


├── val
├── test
├── train
    ├── coefficients (.npy)
    ├── video-face-crop (.mp4)
    ├── video-raw (.mp4)
        ├── speaker
            ├── session0
                ├── Camera-2024-06-21-103121-103102.mp4
                ├── ...
            ├── ...
            ├── session22
                ├── Camera-2024-07-17-104338-104241.mp4
                ├── ...
        ├── listener
            ├── session0
                ├── Camera-2024-06-21-103121-103102.mp4
                ├── ...
            ├── ...
            ├── session22
                ├── Camera-2024-07-17-104338-104241.mp4
                ├── ...
    ├── facial-attributes (.npy)
        ├── speaker
            ├── session0
                ├── Camera-2024-06-21-103121-103102.npy
                ├── ...
            ├── ...
            ├── session22
                ├── Camera-2024-07-17-104338-104241.npy
                ├── ...
        ├── listener
            ├── session0
                ├── Camera-2024-06-21-103121-103102.npy
                ├── ...
            ├── ...
            ├── session22
                ├── Camera-2024-07-17-104338-104241.npy
                ├── ...
    ├── audio (.wav)
        ├── speaker
            ├── session0
                ├── Camera-2024-06-21-103121-103102.wav
                ├── ...
            ├── ...
        ├── listener
            ├── session0
                ├── Camera-2024-06-21-103121-103102.wav
                ├── ...
            ├── ...
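Given this layout, the files that belong to one clip can be located with a small helper. The sketch below assumes that every modality folder uses the same `<role>/<session>/<clip>` sub-structure; the tree above only expands some of the folders, so this is an assumption for `coefficients` and `video-face-crop`. `clip_paths` is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Dict

# Modality folder -> file extension, as listed in the tree above.
MODALITY_EXT = {
    "coefficients": ".npy",
    "video-face-crop": ".mp4",
    "video-raw": ".mp4",
    "facial-attributes": ".npy",
    "audio": ".wav",
}


def clip_paths(root: str, split: str, role: str, session: str, clip_stem: str) -> Dict[str, str]:
    """Return one path per modality for a clip, e.g. role='speaker', session='session0'."""
    base = PurePosixPath(root) / split
    return {
        modality: str(base / modality / role / session / (clip_stem + ext))
        for modality, ext in MODALITY_EXT.items()
    }


paths = clip_paths("data", "train", "speaker", "session0", "Camera-2024-06-21-103121-103102")
print(paths["audio"])
```

Swapping `role` between `speaker` and `listener` with the same session and clip name yields the two sides of one dyadic interaction.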

External Tool Preparation

We use 3DMM coefficients to represent a 3D listener or speaker, and for subsequent 3D-to-2D frame rendering. The baselines use a 3DMM model to extract 3DMM coefficients and to render 3D facial reactions.

  • You should first download the 3DMM (FaceVerse version 2 model) from this page

    and then put it in the folder (external/FaceVerse/data/).

    We provide our extracted 3DMM coefficients (which are used for our baseline visualisation) at OneDrive.

    We also provide mean_face.npy at this OneDrive link, std_face.npy at this OneDrive link, and reference_full.npy at this OneDrive link for normalizing the 3DMM coefficients. Please download them and put them in the folder (external/FaceVerse/).
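A common way to use a mean and standard deviation file is per-dimension z-score normalization. The sketch below assumes that is what mean_face.npy and std_face.npy are for; how the baseline actually applies them, and what role reference_full.npy plays, should be checked in the repository's data loading code. The function names and the `eps` guard are hypothetical.

```python
from typing import List, Sequence


def normalise(coeffs: Sequence[float], mean: Sequence[float],
              std: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Assumed z-score normalization: (x - mean) / (std + eps), per dimension."""
    return [(c - m) / (s + eps) for c, m, s in zip(coeffs, mean, std)]


def denormalise(values: Sequence[float], mean: Sequence[float],
                std: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Inverse of normalise, mapping model outputs back to raw coefficient scale."""
    return [v * (s + eps) + m for v, m, s in zip(values, mean, std)]


# Toy 3-d example; real coefficient frames are 58-d.
raw = [0.5, -1.0, 2.0]
mean_face = [0.0, -0.5, 1.0]
std_face = [1.0, 0.5, 2.0]
normalised = normalise(raw, mean_face, std_face)
restored = denormalise(normalised, mean_face, std_face)
```

With NumPy arrays loaded via `numpy.load`, the same arithmetic applies element-wise without the explicit loops.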

Then, we use a 3D-to-2D tool PIRender to render final 2D facial reaction frames.

  • We re-trained PIRender, and the trained model is provided at the checkpoint. Please put it in the folder (external/PIRender/).

Finally, please download the compressed folder named pretrained_models from this link, and extract it into the project root directory.

Training

Generic online:

1. PerFRDiff + EEG

python main.py \
    --config-name generic_online/motion_diffusion \
    trainer.batch_size=8 \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.model.diff_model.eeg_head.enabled=true \
    trainer.generic.train_eeg_head_only=false

2. TransVAE + EEG

python main.py \
    --config-name generic_online/motion_transvae \
    trainer.batch_size=2 \
    trainer.max_seq_len=256 \
    trainer.window_size=16 \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.train_eeg_head_only=false \
    trainer.model.eeg_head.enabled=true

Personalized online:

PerFRDiff rewrite-weight + EEG

(a) Condition Input: Listener historical facial behaviours

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.main_model.args.personal_condition_mode=3dmm_only \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

The paths diffusion-prior-model-path, diffusion-decoder-model-path, and eeg-head-checkpoint-path point to the checkpoints saved from training generic_online/motion_diffusion.

(b) Condition Input: Personality_only

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.main_model.args.personal_condition_mode=personality_only \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

The paths diffusion-prior-model-path, diffusion-decoder-model-path, and eeg-head-checkpoint-path point to the checkpoints saved from training generic_online/motion_diffusion.

(c) Condition Input: Listener historical facial behaviours + Personality_only

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.main_model.args.personal_condition_mode=3dmm_personality \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

The paths diffusion-prior-model-path, diffusion-decoder-model-path, and eeg-head-checkpoint-path point to the checkpoints saved from training generic_online/motion_diffusion.
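The three personalized online commands differ only in personal_condition_mode, so they can be generated from one template. The launcher below is a sketch rather than part of the repository: `personalized_online_cmd` is a hypothetical helper, the checkpoint file names in the loop are placeholders, and every override string is copied from the commands above.

```python
import shlex
from typing import List

MODES = ("3dmm_only", "personality_only", "3dmm_personality")


def personalized_online_cmd(mode: str, prior_ckpt: str, decoder_ckpt: str, eeg_ckpt: str,
                            data_dir: str = "./datasets/REACT2026/") -> List[str]:
    """Build the argument list for one PerFRDiff rewrite-weight training run."""
    if mode not in MODES:
        raise ValueError(f"unknown personal_condition_mode: {mode!r}")
    return [
        "python", "main.py",
        "--config-name", "personalized_online/perfrdiff_rewrite_weight",
        "stage=fit",
        f"data_dir={data_dir}",
        "trainer.generic.train_eeg=true",
        "trainer.generic.train_eeg_head_only=false",
        f"trainer.main_model.args.personal_condition_mode={mode}",
        f"trainer.pretrained.diffusion_prior={prior_ckpt}",
        f"trainer.pretrained.diffusion_decoder={decoder_ckpt}",
        f"trainer.pretrained.eeg_head_checkpoint={eeg_ckpt}",
    ]


# Placeholder checkpoint paths; substitute the ones saved by generic_online/motion_diffusion.
for mode in MODES:
    print(shlex.join(personalized_online_cmd(mode, "prior.pth", "decoder.pth", "eeg.pth")))
```

Each returned list can be passed directly to `subprocess.run(cmd, check=True)` to launch the run.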

Generic offline:

1. Motion Diffusion + EEG, first-stage backbone for S-PerReactor

Train this generic offline diffusion backbone before training personalized_offline/perreactor_offline. It saves the DiffusionPriorNetwork, TransformerDenoiser, and EEGPredictionHead checkpoints used by S-PerReactor.

python main.py \
    --config-name generic_offline/motion_diffusion \
    trainer.batch_size=4 \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg_head_only=false \
    trainer.model.diff_model.eeg_head.enabled=true

The checkpoints are saved under:

save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/

2. TransVAE + EEG

python main.py \
    --config-name generic_offline/motion_transvae \
    trainer.batch_size=4 \
    trainer.max_seq_len=750 \
    trainer.window_size=8 \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.train_eeg_head_only=false \
    trainer.model.eeg_head.enabled=true

3. ReGNN + EEG

(a) Change into the regnn/ directory:

cd ./regnn

(b) Extract the image features using the pre-trained swin_transformer (pretrained weights already provided in ./pretrained_models):

python feature_extraction.py

(c) Train ReGNN by running the following command:

python train.py \
    --logs-dir "Gmm-logs-eeg-head" \
    --data-dir ./datasets/REACT2026/ \
    --enable-eeg-head \
    --eeg-loss-weight 0.25 \
    --lr 0.0001 \
    --gamma 0.1 \
    --warmup-factor 0.01 \
    --milestones 9 \
    --batch-size 64 \
    --layers 2 \
    --act "ELU" \
    --seed 1 \
    --train-iters 100 \
    --norm \
    --neighbor-pattern "all" \
    --convert-type "direct" \
    --loss-mid

Personalized offline:

1. S-PerReactor + EEG

S-PerReactor reuses the pretrained generic offline diffusion prior and decoder, freezes the generic backbone by default, and trains the listener personal adapter. Set trainer.perreactor.personal_condition_mode to choose the personal condition: history_only, personality_only, or history_personality. Set trainer.generic.train_eeg=false to train only the S-PerReactor adapter without EEG supervision.

(a) Condition Input: Listener historical emotion behaviours

python main.py \
    --config-name personalized_offline/perreactor_offline \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.perreactor.personal_condition_mode=history_only \
    trainer.pretrained.diffusion_prior=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/DiffusionPriorNetwork/checkpoint_best.pth \
    trainer.pretrained.diffusion_decoder=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/TransformerDenoiser/checkpoint_best.pth \
    trainer.pretrained.eeg_head_checkpoint=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/EEGPredictionHead/checkpoint_best.pth

(b) Condition Input: Personality traits

python main.py \
    --config-name personalized_offline/perreactor_offline \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.perreactor.personal_condition_mode=personality_only \
    trainer.pretrained.diffusion_prior=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/DiffusionPriorNetwork/checkpoint_best.pth \
    trainer.pretrained.diffusion_decoder=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/TransformerDenoiser/checkpoint_best.pth \
    trainer.pretrained.eeg_head_checkpoint=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/EEGPredictionHead/checkpoint_best.pth

(c) Condition Input: Listener historical emotion behaviours + Personality traits

python main.py \
    --config-name personalized_offline/perreactor_offline \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=false \
    trainer.perreactor.personal_condition_mode=history_personality \
    trainer.pretrained.diffusion_prior=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/DiffusionPriorNetwork/checkpoint_best.pth \
    trainer.pretrained.diffusion_decoder=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/TransformerDenoiser/checkpoint_best.pth \
    trainer.pretrained.eeg_head_checkpoint=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/EEGPredictionHead/checkpoint_best.pth

(d) Second-stage EEG head-only training

python main.py \
    --config-name personalized_offline/perreactor_offline \
    stage=fit \
    data_dir=./datasets/REACT2026/ \
    trainer.generic.train_eeg=true \
    trainer.generic.train_eeg_head_only=true \
    trainer.perreactor.personal_condition_mode=<same-as-adapter-training> \
    trainer.pretrained.adapter_checkpoint=<perreactor-adapter-checkpoint/checkpoint_best.pth> \
    trainer.pretrained.diffusion_prior=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/DiffusionPriorNetwork/checkpoint_best.pth \
    trainer.pretrained.diffusion_decoder=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/TransformerDenoiser/checkpoint_best.pth \
    trainer.pretrained.eeg_head_checkpoint=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/EEGPredictionHead/checkpoint_best.pth

The generic offline checkpoints come from the first-stage generic_offline/motion_diffusion run above. The S-PerReactor adapter checkpoint is saved under save/perreactor_offline/<data-name>/offline/<run-id>/PerReactor/.
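Since the three pretrained overrides share one directory pattern, they can be derived from the generic offline run ID alone. The helper below is a sketch: `pretrained_overrides` is a hypothetical name, `my_run` is a placeholder run ID, and the path pattern is copied from the commands above.

```python
from pathlib import PurePosixPath
from typing import List

CKPT_ROOT = PurePosixPath("save/motion_diffusion/react_2025/offline/checkpoints")

# Override key -> checkpoint sub-folder, as used in the commands above.
MODULES = {
    "diffusion_prior": "DiffusionPriorNetwork",
    "diffusion_decoder": "TransformerDenoiser",
    "eeg_head_checkpoint": "EEGPredictionHead",
}


def pretrained_overrides(run_id: str) -> List[str]:
    """Return the three trainer.pretrained.* overrides for a generic offline run."""
    return [
        f"trainer.pretrained.{key}={CKPT_ROOT / run_id / module / 'checkpoint_best.pth'}"
        for key, module in MODULES.items()
    ]


for override in pretrained_overrides("my_run"):  # "my_run" is a placeholder run ID
    print(override)
```

The printed lines can be appended to any of the S-PerReactor commands in place of the long hand-written paths.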

Pretrained weights

  • to be released

Evaluation

For evaluation, please refer to the test function in ./trainer/motion_diffusion.py (PerFRDiff baseline) or ./trainer/motion_transvae.py (Trans-VAE baseline). The metric computations are implemented in ./framework/utils/compute_metrics.py. The validation set can be treated as the test set by loading it via the provided dataloader file. As in the baseline paper, all facial reactions from different participants within the same session are defined as ground-truths. The pretrained model weights will be released soon.

Generic online:

1. PerFRDiff + EEG

python main.py \
    --config-name generic_online/motion_diffusion \
    trainer.batch_size=1 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    resume_id=<train-experiment-id> \
    trainer.generic.eval_eeg=true \
    trainer.model.diff_model.eeg_head.enabled=true

2. TransVAE + EEG

python main.py \
    --config-name generic_online/motion_transvae \
    trainer.batch_size=1 \
    trainer.max_seq_len=256 \
    trainer.window_size=16 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    trainer.data_transform=zero_center \
    resume_id=<train-experiment-id> \
    trainer.eval_eeg=true \
    trainer.eval_eeg_metrics=true \
    trainer.eval_facial_metrics=true \
    trainer.save_results=true \
    trainer.renderer.do_render=false

Personalized online:

PerFRDiff rewrite-weight + EEG

(a) Condition Input: Listener historical facial behaviours

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    trainer.batch_size=1 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    resume_id=<train-experiment-id> \
    trainer.generic.eval_eeg=true \
    trainer.main_model.args.personal_condition_mode=3dmm_only \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

(b) Condition Input: Personality_only

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    trainer.batch_size=1 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    resume_id=<train-experiment-id> \
    trainer.generic.eval_eeg=true \
    trainer.main_model.args.personal_condition_mode=personality_only \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

(c) Condition Input: Listener historical facial behaviours + Personality_only

python main.py \
    --config-name personalized_online/perfrdiff_rewrite_weight \
    trainer.batch_size=1 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    resume_id=<train-experiment-id> \
    trainer.generic.eval_eeg=true \
    trainer.main_model.args.personal_condition_mode=3dmm_personality \
    trainer.pretrained.diffusion_prior=<diffusion-prior-model-path/checkpoint.pth> \
    trainer.pretrained.diffusion_decoder=<diffusion-decoder-model-path/checkpoint.pth> \
    trainer.pretrained.eeg_head_checkpoint=<eeg-head-checkpoint-path/checkpoint.pth>

Generic offline:

1. TransVAE + EEG

python main.py \
    --config-name generic_offline/motion_transvae \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    trainer.batch_size=1 \
    trainer.max_seq_len=750 \
    trainer.window_size=8 \
    trainer.data_transform=zero_center \
    resume_id=<train-experiment-id> \
    trainer.eval_eeg=true \
    trainer.eval_eeg_metrics=true \
    trainer.eval_facial_metrics=true \
    trainer.save_results=true \
    trainer.renderer.do_render=false

2. ReGNN + EEG

python train.py \
  --test \
  --logs-dir "Gmm-logs-eeg-head" \
  --data-dir "./datasets/REACT2026/" \
  --model-pth "./baseline_react2026-main2/regnn/Gmm-logs-eeg-head/mhp-eeg-head-last-seed1.pth" \
  --enable-eeg-head \
  --eval-eeg \
  --metric-threads 1 \
  --eval-clip-batch-size 1 \
  --layers 2 \
  --act "ELU" \
  --seed 1 \
  --norm \
  --neighbor-pattern "all" \
  --convert-type "direct"

Personalized offline:

1. S-PerReactor + EEG

Use the same trainer.perreactor.personal_condition_mode as the training run (history_only, personality_only, or history_personality). Set trainer.generic.eval_eeg=false to evaluate facial metrics only, or trainer.generic.eval_eeg=true to also save GT_EEG, PRED_EEG, and EEG_MASK in results.pt.

python main.py \
    --config-name personalized_offline/perreactor_offline \
    trainer.batch_size=1 \
    stage=test \
    data_dir=./datasets/REACT2026/ \
    resume_id=<perreactor-train-experiment-id> \
    trainer.generic.eval_eeg=true \
    trainer.perreactor.personal_condition_mode=<same-as-training> \
    trainer.pretrained.diffusion_prior=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/DiffusionPriorNetwork/checkpoint_best.pth \
    trainer.pretrained.diffusion_decoder=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/TransformerDenoiser/checkpoint_best.pth \
    trainer.pretrained.eeg_head_checkpoint=save/motion_diffusion/react_2025/offline/checkpoints/<generic-offline-run-id>/EEGPredictionHead/checkpoint_best.pth

If the resumed S-PerReactor checkpoint was trained with trainer.generic.train_eeg=true, eeg_head.* is loaded from that checkpoint. Otherwise, keep trainer.pretrained.eeg_head_checkpoint set when trainer.generic.eval_eeg=true.
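The rule above reduces to a small decision table. The function below only restates that rule as a sketch; `eeg_head_source` and its return labels are hypothetical and do not correspond to anything in the repository.

```python
def eeg_head_source(eval_eeg: bool, trained_with_eeg: bool) -> str:
    """Say where the EEG head weights are expected to come from at test time."""
    if not eval_eeg:
        return "unused"  # facial metrics only, no EEG head needed
    if trained_with_eeg:
        return "resumed_checkpoint"  # eeg_head.* is loaded from the S-PerReactor checkpoint
    return "trainer.pretrained.eeg_head_checkpoint"  # must stay set on the command line


for eval_eeg in (False, True):
    for trained_with_eeg in (False, True):
        print(eval_eeg, trained_with_eeg, eeg_head_source(eval_eeg, trained_with_eeg))
```

Only the combination eval_eeg=true with an adapter trained without EEG supervision depends on the external EEG head checkpoint.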

🖊️ Citation

Submissions should cite the following papers:

Theory paper and baseline paper:

[1] Song, Siyang, Micol Spitale, Yiming Luo, Batuhan Bal, and Hatice Gunes. "Multiple Appropriate Facial Reaction Generation in Dyadic Interaction Settings: What, Why and How?." arXiv preprint arXiv:2302.06514 (2023).

[2] Song, Siyang, Micol Spitale, Xiangyu Kong, Hengde Zhu, Cheng Luo, Cristina Palmero, German Barquero et al. "React 2025: the third multiple appropriate facial reaction generation challenge." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 13979-13984. 2025.

[3] Song, Siyang, Micol Spitale, Cheng Luo, Cristina Palmero, German Barquero, Hengde Zhu, Sergio Escalera et al. "React 2024: the second multiple appropriate facial reaction generation challenge." In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1-5. IEEE, 2024.

[4] Song, Siyang, Micol Spitale, Cheng Luo, Germán Barquero, Cristina Palmero, Sergio Escalera, Michel Valstar et al. "REACT2023: The First Multiple Appropriate Facial Reaction Generation Challenge." In Proceedings of the 31st ACM International Conference on Multimedia, pp. 9620-9624. 2023.

Annotation, basic feature extraction tools and baselines:

[6] Song, Siyang, Yuxin Song, Cheng Luo, Zhiyuan Song, Selim Kuzucu, Xi Jia, Zhijiang Guo, Weicheng Xie, Linlin Shen, and Hatice Gunes. "GRATIS: Deep Learning Graph Representation with Task-specific Topology and Multi-dimensional Edge Features." arXiv preprint arXiv:2211.12482 (2022).

[7] Luo, Cheng, Siyang Song, Weicheng Xie, Linlin Shen, and Hatice Gunes. "Learning multi-dimensional edge feature-based AU relation graph for facial action unit recognition." In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, pp. 1239-1246. 2022.

[8] Toisoul, Antoine, Jean Kossaifi, Adrian Bulat, Georgios Tzimiropoulos, and Maja Pantic. "Estimation of continuous valence and arousal levels from faces in naturalistic conditions." Nature Machine Intelligence 3, no. 1 (2021): 42-50.

[9] Eyben, Florian, Martin Wöllmer, and Björn Schuller. "Opensmile: the munich versatile and fast open-source audio feature extractor." In Proceedings of the 18th ACM international conference on Multimedia, pp. 1459-1462. 2010.

Submissions are encouraged to cite previous personalized facial reaction generation papers:

[10] Zhu, Hengde, Xiangyu Kong, Weicheng Xie, Xin Huang, Linlin Shen, Lu Liu, Hatice Gunes, and Siyang Song. "Perfrdiff: Personalised weight editing for multiple appropriate facial reaction generation." In Proceedings of the 32nd ACM International Conference on Multimedia, pp. 9495-9504. 2024.

[11] Zhu, Hengde, Xiangyu Kong, Weicheng Xie, Xin Huang, Xilin He, Lu Liu, Linlin Shen, Wei Zhang, Hatice Gunes, and Siyang Song. "PerReactor: Offline Personalised Multiple Appropriate Facial Reaction Generation." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 2, pp. 1665-1673. 2025.

[12] Song, Siyang, Zilong Shao, Shashank Jaiswal, Linlin Shen, Michel Valstar, and Hatice Gunes. "Learning Person-specific Cognition from Facial Reactions for Automatic Personality Recognition." IEEE Transactions on Affective Computing (2022).

[13] Shao, Zilong, Siyang Song, Shashank Jaiswal, Linlin Shen, Michel Valstar, and Hatice Gunes. "Personality recognition by modelling person-specific cognitive processes using graph representation." In Proceedings of the 29th ACM International Conference on Multimedia, pp. 357-366. 2021.

Submissions are encouraged to cite previous generic facial reaction generation papers:

[14] Huang, Yuchi, and Saad M. Khan. "Dyadgan: Generating facial expressions in dyadic interactions." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 11-18. 2017.

[15] Huang, Yuchi, and Saad Khan. "A generative approach for dynamically varying photorealistic facial expressions in human-agent interactions." In Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp. 437-445. 2018.

[16] Barquero, German, Sergio Escalera, and Cristina Palmero. "Belfusion: Latent diffusion for behavior-driven human motion prediction." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2317-2327. 2023.

[17] Zhou, Mohan, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, and Tao Mei. "Responsive listening head generation: a benchmark dataset and baseline." In Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXVIII, pp. 124-142. Cham: Springer Nature Switzerland, 2022.

[18] Luo, Cheng, Siyang Song, Weicheng Xie, Micol Spitale, Zongyuan Ge, Linlin Shen, and Hatice Gunes. "Reactface: Online multiple appropriate facial reaction generation in dyadic interactions." IEEE Transactions on Visualization and Computer Graphics 31, no. 9 (2024): 6190-6207.

[19] Xu, Tong, Micol Spitale, Hao Tang, Lu Liu, Hatice Gunes, and Siyang Song. "Reversible graph neural network-based reaction distribution learning for multiple appropriate facial reactions generation." IEEE Transactions on Affective Computing (2026).

[20] Liang, Cong, Jiahe Wang, Haofan Zhang, Bing Tang, Junshan Huang, Shangfei Wang, and Xiaoping Chen. "Unifarn: Unified transformer for facial reaction generation." In Proceedings of the 31st ACM International Conference on Multimedia, pp. 9506-9510. 2023.

[21] Yu, Jun, Ji Zhao, Guochen Xie, Fengxin Chen, Ye Yu, Liang Peng, Minglei Li, and Zonghong Dai. "Leveraging the latent diffusion models for offline facial multiple appropriate reactions generation." In Proceedings of the 31st ACM International Conference on Multimedia, pp. 9561-9565. 2023.

[22] Hoque, Ximi, Adamay Mann, Gulshan Sharma, and Abhinav Dhall. "BEAMER: Behavioral Encoder to Generate Multiple Appropriate Facial Reactions." In Proceedings of the 31st ACM International Conference on Multimedia, pp. 9536-9540. 2023.

[23] Nguyen, Dang-Khanh, Prabesh Paudel, Seung-Won Kim, Ji-Eun Shin, Soo-Hyung Kim, and Hyung-Jeong Yang. "Multiple facial reaction generation using gaussian mixture of models and multimodal bottleneck transformer." In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1-5. IEEE, 2024.

[24] Hu, Guanyu, Jie Wei, Siyang Song, Dimitrios Kollias, Xinyu Yang, Zhonglin Sun, and Odysseus Kaloidas. "Robust facial reactions generation: An emotion-aware framework with modality compensation." In 2024 IEEE International Joint Conference on Biometrics (IJCB), pp. 1-10. IEEE, 2024.

[25] Liu, Zhenjie, Cong Liang, Jiahe Wang, Haofan Zhang, Yadong Liu, Caichao Zhang, Jialin Gui, and Shangfei Wang. "One-to-many appropriate reaction mapping modeling with discrete latent variable." In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1-5. IEEE, 2024.

[26] Dam, Quang Tien, Tri Tung Nguyen Nguyen, Dinh Tuan Tran, and Joo-Ho Lee. "Finite scalar quantization as facial tokenizer for dyadic reaction generation." In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1-5. IEEE, 2024.

[27] Luo, Jiachen, Jiajun He, Shuai Shen, Lin Wang, Huy Phan, Joshua Reiss, Lin Haijun, Bjoern Schuller, Zeyu Fu, and Siyang Song. "MReactor: Offline Multiple Appropriate Facial Reaction Generation with Hierarchical Cognitive Disentanglement." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3354-3363. 2026.

[28] Xie, Weicheng, Chunlin Yan, Siyang Song, Zitong Yu, Linlin Shen, and Laizhong Cui. "Smooth Online Multiple Appropriate Facial Reaction Generation." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 5804-5813. 2025.

[29] Mao, Qirong, Qiwei Wu, Na Liu, Yakui Ding, and Lijian Gao. "Scattering-Conditioned Diffusion Models for Multiple Appropriate Facial Reaction Generation." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 13985-13991. 2025.

[30] Wang, Peng, Pujun Xue, Xiaofeng Liu, and Tongjuan Ji. "Explaining Listener Reactions: Personality-Guided Facial Response Generation with Cross-Modal Attention." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 13997-14003. 2025.

[31] Huang, Jiajian, and Zitong Yu. "Multiple Appropriate Facial Reaction Generation Based on Multi-View Transformation of Speaker Video." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 13992-13996. 2025.

[32] Nguyen, Minh-Duc, Hyung-Jeong Yang, Ngoc-Huynh Ho, Soo-Hyung Kim, Seungwon Kim, and Ji-Eun Shin. "Vector quantized diffusion models for multiple appropriate reactions generation." In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1-5. IEEE, 2024.

[33] Lv, Qincheng, Xiaofeng Liu, Jie Li, Rongrong Ni, Pujun Xue, and Siyang Song. "Hierarchical multimodal decoupling-fusion framework for offline multiple appropriate facial reaction generation." In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5. IEEE, 2025.

[34] Luo, Cheng, Siyang Song, Siyuan Yan, Zhen Yu, and Zongyuan Ge. "ReactDiff: Fundamental Multiple Appropriate Facial Reaction Diffusion Model." In Proceedings of the 33rd ACM International Conference on Multimedia, pp. 5607-5616. 2025.

[35] Li, Jiaming, Sheng Wang, Xin Wang, Yitao Zhu, Honglin Xiong, Zixu Zhuang, and Qian Wang. "Reactdiff: Latent diffusion for facial reaction generation." Neural Networks 189 (2025): 107596.

🤝 Acknowledgement

Thanks to the following open-source projects:
