Tutorials for the software project on mechanistic interpretability (WS24/25).

tanikina/mi-tutorials


Model Interpretability with Captum and Logit Lens @ UdS WS 2024/2025

🧭 Tutorial Roadmap


Main Path Notebooks

| Topic | Keywords | Jupyter Notebook | Colab Link |
|---|---|---|---|
| Interpreting LLMs for text generation | Llama, Shapley values, Integrated gradients | LLM_Attribution_with_Llama | Open in Colab |
| Interpreting BERT QA models | BERT, embeddings, attention attributions | BERT_QA_Interpretability | Open in Colab |
| Interpreting BERT QA models | BERT, attention matrices, importance scores | BERT_QA_Interpretability2 | Open in Colab |
| Logit Lens Example | Logit Lens | LogitLens | Open in Colab |
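To give a feel for the gradient-based attribution methods covered in the notebooks above, here is a minimal, library-free sketch of Integrated Gradients. A toy differentiable function with a hand-written gradient stands in for a real model (the notebooks use Captum's `IntegratedGradients` with autograd instead); the function `f`, the baseline, and the step count are illustrative choices, not part of the tutorials.

```python
# Toy stand-in for a model output: f(x) = x0^2 + 3*x1.
def f(x):
    return x[0] ** 2 + 3 * x[1]

def grad_f(x):
    # Analytic gradient of f; a real pipeline computes this via autograd.
    return [2 * x[0], 3.0]

def integrated_gradients(x, baseline, steps=100):
    """Approximate IG by averaging gradients along the straight-line
    path from `baseline` to `x` (midpoint Riemann sum)."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(1, steps + 1):
        alpha = (k - 0.5) / steps
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        g = grad_f(point)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Scale by (input - baseline); by the completeness axiom the
    # attributions sum to f(x) - f(baseline) up to discretization error.
    return [(x[i] - baseline[i]) * avg_grad[i] for i in range(n)]

attr = integrated_gradients([2.0, 1.0], [0.0, 0.0])
print(attr)            # per-feature attributions
print(sum(attr))       # ≈ f(x) - f(baseline) = 7.0
```

The same recipe applies to token attribution in the Llama notebook: each input dimension becomes a token embedding, and the baseline is typically a pad/zero embedding.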

Mini Project on Token Attribution

| Notebook | Colab Link |
|---|---|
| Mini Project | Open in Colab |
| Mini Project (with solution) | Open in Colab |

