ChronoQuill

ChronoQuill's transformation pipeline combines handwritten text recognition (HTR), layout classification, and few-shot learning to convert handwritten documents into structured Markdown. The indexing model is available on Hugging Face. For more technical details, see the working paper.

Setup Instructions

git clone git@github.com:eth-library/ChronoQuill.git
cd ChronoQuill

Environment and Libraries

uv venv chrono
source chrono/bin/activate

uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
uv pip install google-genai timm python-dotenv
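
After installation, a quick check can confirm that PyTorch imports and whether it sees a CUDA device. This is a minimal sketch and not part of the repository; the function name `describe_torch` is made up for illustration.

```python
import importlib.util


def describe_torch() -> str:
    """Return a one-line summary of the local PyTorch installation."""
    # Avoid an ImportError when the virtual environment is not active yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch

    if torch.cuda.is_available():
        # Report the first visible GPU.
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} with CUDA on {name}"
    return f"torch {torch.__version__} without a usable CUDA device (CPU only)"


if __name__ == "__main__":
    print(describe_torch())
```

If the summary reports CPU only on a machine with an NVIDIA GPU, the driver may be too old for the CUDA 12.8 wheels selected by the index URL above.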

Environment Variables

Create a .env file in the project root and add your Gemini API key:

GEMINI_API_KEY=your_api_key_here
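
To verify that the key is picked up before starting a long run, something like the following can help. It is a sketch under assumptions: the helper name `get_gemini_api_key` is hypothetical, and how `chrono_quill.py` itself reads the key is not shown here. `load_dotenv` comes from the python-dotenv package; the sketch falls back to the plain process environment if that package is missing.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    load_dotenv = None


def get_gemini_api_key() -> str:
    """Return GEMINI_API_KEY, or raise if it is missing or still the placeholder."""
    if load_dotenv is not None:
        # Searches for a .env file and loads it without overriding
        # variables that are already set in the environment.
        load_dotenv()

    key = os.getenv("GEMINI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to .env in the project root"
        )
    return key


if __name__ == "__main__":
    # Print only the length so the secret never lands in logs.
    print(f"Found a Gemini API key with {len(get_gemini_api_key())} characters")
```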

Classifier & Few-Shot Samples

chmod u+x setup.sh && ./setup.sh

Project Structure

chrono_quill.py — Main pipeline script
utils.py — Utility functions and helper classes
prompts.py — System prompts for Gemini API
few_shot/ — Few-shot ground truth samples
models/ — Pretrained model files
data/ — Input and output data

Transform TIFF & JPG into Markdown

python chrono_quill.py
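
Before running the pipeline, it can be useful to list which TIFF and JPG files would be candidates for processing. The sketch below is illustrative only: it assumes the inputs sit somewhere under `data/`, which this README does not spell out, and `find_input_images` is a hypothetical helper rather than a function from `utils.py`.

```python
from pathlib import Path

# File extensions treated as TIFF or JPG inputs (compared case-insensitively).
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg"}


def find_input_images(data_dir="data"):
    """Return all TIFF and JPG files under data_dir, sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    images = find_input_images()
    print(f"{len(images)} candidate image(s) found")
    for path in images:
        print(path)
```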

License

We release ChronoQuill under the Apache 2.0 license.

References

Remarks

The pipeline is specialized for ETH's school council protocols. For other use cases, consider pretraining your own classifier and providing suitable ground truth for few-shot learning.

Citation

If you use this pipeline in your work, please cite:

@article{marbach2026closed,
  title={Closed-Vocabulary Multi-Label Indexing Pipeline for Historical Documents},
  author={Marbach, Jeremy},
  year={2026},
  publisher={ETH Zurich},
  url={https://www.research-collection.ethz.ch/server/api/core/bitstreams/8053d4d8-51b4-4103-8164-b5068ddb3903/content}
}
