A curated list of Jupyter notebooks for querying and analyzing library collections as data.
- V2.0 -- Updated March 2026. Current links, status assessments, and new resources including the GLAM Workbench, Library of Congress Jupyter Book, and modern OCR correction tools.
- V1.0 -- Original list compiled 2020-2021. Archived for reference. Includes a BERT-based OCR correction notebook (BERT_OCR.ipynb) that uses deprecated libraries.
Collections as data is a movement to make library collections available in computational formats. Jupyter notebooks offer an accessible way to introduce data-driven exploration of those collections. This repo curates notebooks from galleries, libraries, archives, and museums (GLAM) worldwide.