ANLP Project

Data

Refer to this Google Drive for data, models, embeddings, and results.

Purpose: Detect hateful memes by combining CLIP image/text embeddings with a lightweight cross-modal attention classifier.
Pipeline: generate LMM knowledge → build enriched CLIP embeddings → train classifier → run inference.

Generate LMM knowledge
- python generate_knowledge.py
- Writes knowledge/lmm_knowledge_{train,val,test}.json, which maps each meme id to 10 descriptions and 10 emotions.
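As a quick sanity check on the knowledge files, the sketch below verifies that every meme id has the expected 10 descriptions and 10 emotions. The field names `descriptions` and `emotions`, the helper `check_knowledge`, and the example ids are assumptions made for illustration; the actual JSON layout written by generate_knowledge.py may differ.

```python
import json

# The README states 10 descriptions and 10 emotions per meme.
EXPECTED_PER_FIELD = 10


def check_knowledge(knowledge, fields=("descriptions", "emotions")):
    """Return the ids whose entries lack the expected item count in any field.

    `fields` holds assumed key names; adjust them to the real JSON schema.
    """
    bad_ids = []
    for meme_id, entry in knowledge.items():
        for field in fields:
            if len(entry.get(field, [])) != EXPECTED_PER_FIELD:
                bad_ids.append(meme_id)
                break  # one failing field is enough to flag this id
    return bad_ids


# Tiny in-memory stand-in; a real file would be read with json.load(...).
example = json.loads(json.dumps({
    "meme_a": {"descriptions": ["d"] * 10, "emotions": ["e"] * 10},
    "meme_b": {"descriptions": ["d"] * 3, "emotions": ["e"] * 10},
}))

incomplete = check_knowledge(example)
print(incomplete)
```

Running a check like this before generating embeddings can catch truncated or partially generated entries early.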
Generate embeddings
- python generate_embeddings.py
- Uses the dataset and the generated knowledge to write hateful_memes_clip_embeddings_{train,val,test}.npz, each containing the arrays image_embeddings, text_embeddings, desc_embeddings, emotion_embeddings, text_concat_embeddings, meme_concat_embeddings, labels, ids, and valid_indices.
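To inspect an embeddings archive, a small helper can list each array and its shape. This is a minimal sketch: `summarize_npz`, the demo file name, and the array sizes are hypothetical, and the demo archive holds only a subset of the arrays listed above, filled with zeros.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Map each array name stored in an .npz archive to its shape."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Synthetic stand-in for one split; the sizes are assumed, not the real ones.
n_samples, dim = 4, 8
demo_path = os.path.join(tempfile.mkdtemp(), "demo_embeddings.npz")
np.savez(
    demo_path,
    image_embeddings=np.zeros((n_samples, dim), dtype=np.float32),
    text_concat_embeddings=np.zeros((n_samples, 3 * dim), dtype=np.float32),
    labels=np.array([0, 1, 0, 1]),
    ids=np.arange(n_samples),
)

shapes = summarize_npz(demo_path)
for name, shape in shapes.items():
    print(name, shape)
```

Pointing `summarize_npz` at a real split file should reveal the actual image and text dimensions that training relies on.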
Train
- python training.py
- Trains on image_embeddings and text_concat_embeddings from the train and val splits and saves best_model.pth, whose config includes image_dim and text_dim.
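For orientation, here is one way a lightweight cross-modal attention classifier over the two inputs could look in PyTorch. It is a sketch under assumptions, not the model defined in training.py: the class name, hidden size, head count, the two-token attention layout, and the checkpoint key names are all hypothetical.

```python
import torch
import torch.nn as nn


class CrossModalClassifier(nn.Module):
    """Hypothetical sketch: project both modalities, attend across them, classify."""

    def __init__(self, image_dim, text_dim, hidden_dim=64, num_heads=4):
        super().__init__()
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.head = nn.Linear(2 * hidden_dim, 1)

    def forward(self, image_emb, text_emb):
        # Treat the two modalities as a 2-token sequence: (batch, 2, hidden).
        tokens = torch.stack(
            [self.image_proj(image_emb), self.text_proj(text_emb)], dim=1
        )
        # Each token attends to both the image token and the text token.
        attended, _ = self.attn(tokens, tokens, tokens)
        # Flatten both attended tokens and emit one logit per sample.
        return self.head(attended.flatten(start_dim=1)).squeeze(-1)


image_dim, text_dim = 8, 24  # assumed sizes for the demo only
model = CrossModalClassifier(image_dim, text_dim)
logits = model(torch.randn(5, image_dim), torch.randn(5, text_dim))

# Assumed checkpoint layout: weights plus the dims needed to rebuild the model.
checkpoint = {
    "state_dict": model.state_dict(),
    "config": {"image_dim": image_dim, "text_dim": text_dim},
}
print(logits.shape, checkpoint["config"])
```

Storing the input dimensions next to the weights lets an inference script rebuild the model without re-reading the training data.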
Predict
- python predictions.py
- Loads best_model.pth and the test-split image_embeddings and text_concat_embeddings, writes predictions.npz, and prints accuracy, AUC, and the confusion matrix.
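The three reported metrics can be reproduced from labels and predicted probabilities with NumPy alone. The sketch below is illustrative: `binary_metrics`, the 0.5 threshold, and the toy inputs are assumptions, and predictions.py may compute these values differently (for example with scikit-learn).

```python
import numpy as np


def binary_metrics(labels, probs, threshold=0.5):
    """Return (accuracy, AUC, 2x2 confusion matrix) for binary predictions.

    Both classes must be present in `labels` for the AUC to be defined.
    """
    labels = np.asarray(labels).astype(int)
    probs = np.asarray(probs, dtype=float)
    preds = (probs >= threshold).astype(int)

    accuracy = float((preds == labels).mean())

    # confusion[i, j] counts samples with true class i predicted as class j.
    confusion = np.zeros((2, 2), dtype=int)
    np.add.at(confusion, (labels, preds), 1)

    # AUC as the chance that a random positive outscores a random negative,
    # counting ties as one half.
    pos, neg = probs[labels == 1], probs[labels == 0]
    diff = pos[:, None] - neg[None, :]
    auc = float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)
    return accuracy, auc, confusion


# Toy inputs, not real model output.
accuracy, auc, confusion = binary_metrics([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9])
print(accuracy, auc)
print(confusion)
```

The pairwise AUC computation uses memory proportional to positives times negatives, which is fine for a quick check but less suitable for very large test sets.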
