Effective Knowledge Distillation on Reasoning Chains for Aspect-Based Sentiment and Emotion Analysis
This repository provides a framework for knowledge distillation in aspect-based sentiment and emotion analysis. An LLM (the teacher) generates step-by-step reasoning traces and annotations for unlabelled text, which are then used to train a smaller student model to learn both the intermediate reasoning steps and the final predictions.
Teacher-Student Knowledge Distillation:
- The teacher (large language model) annotates unlabelled data with reasoning chains for five tasks:
- Aspect extraction
- Syntactic parsing
- Opinion extraction
- Sentiment classification
- Emotion classification
- The student (smaller model) is trained on these annotations to mimic the teacher's reasoning and predictions.
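The teacher-student flow above can be sketched as follows. This is a minimal, illustrative example only: the record schema, field names, and reasoning-step wording are assumptions, not the repository's actual format (the real prompts live in scripts/qwen_model/prompts.py and the real annotation logic in the annotation scripts).

```python
# Illustrative sketch of serializing a teacher annotation into a student
# training target, so the student learns intermediate reasoning steps as
# well as the final predictions. All field names here are assumptions.

TASKS = [
    "aspect extraction",
    "syntactic parsing",
    "opinion extraction",
    "sentiment classification",
    "emotion classification",
]

def build_student_target(annotation: dict) -> str:
    """Concatenate the teacher's per-task reasoning steps and final labels
    into one text target for the student model."""
    lines = []
    for task in TASKS:
        step = annotation["reasoning"].get(task, "")
        lines.append(f"[{task}] {step}")
    lines.append(
        f"[answer] sentiment={annotation['sentiment']} "
        f"emotion={annotation['emotion']}"
    )
    return "\n".join(lines)

# Hypothetical teacher output for one unlabelled sentence:
annotation = {
    "reasoning": {
        "aspect extraction": "The aspect term is 'battery life'.",
        "sentiment classification": "'amazing' signals positive polarity.",
    },
    "sentiment": "positive",
    "emotion": "joy",
}
target = build_student_target(annotation)
```

The student is then fine-tuned on pairs of (input text, serialized target), so it imitates the teacher's chain of reasoning rather than only its final labels.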
Pipeline Visualization:
(pipeline diagram omitted here; see the repository for the figure)

Installation:
- Clone the repository:
  git clone https://github.com/onatakca/synchain-absa-emotion
  cd synchain-absa-emotion
- Create and activate a virtual environment:
  python -m venv venv
  source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
Usage:
- Run the teacher model to annotate unlabelled data:
  python scripts/annotation/annotate.py
- Train the student model using the generated annotations and configuration files:
  python scripts/modeling/knowledge_distillation.py
- Configuration files for training are located in:
  scripts/modeling/configs/
- Prompts and emotion label definitions are in:
  scripts/qwen_model/prompts.py
Data:
- Input chunks for teacher annotation:
  data/input_data/chunks_for_teacher_model_ann/
- Teacher annotated outputs (Qwen25-32b-Instruct):
  data/output_data/Qwen25-32b-instruct_annotation/
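A common way to consume annotated outputs like these is one JSON record per line per file. The snippet below is a sketch under that assumption; the actual on-disk format is whatever scripts/annotation/annotate.py writes, so check a sample file first.

```python
# Sketch: load teacher annotations for student training, assuming each
# file in the output directory is JSON Lines (one annotated example per
# line). The .jsonl extension and schema are assumptions.
import json
from pathlib import Path

def load_annotations(ann_dir: str) -> list[dict]:
    """Read every *.jsonl file under ann_dir and return the parsed records."""
    examples = []
    for path in sorted(Path(ann_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines between records
                    examples.append(json.loads(line))
    return examples
```

For example, load_annotations("data/output_data/Qwen25-32b-instruct_annotation/") would return the teacher's annotated examples ready to feed into the distillation script, if the directory does hold JSON Lines files.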

