Raw science-fiction novel corpus and the scripts that turn it into model-ready data. Contains the source .txt novels (text/), per-novel chunked JSON outputs for BERT (data/), and the conversion scripts.
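As a rough illustration of the chunking step, the sketch below splits one novel's text into fixed-size pieces and builds JSON-serializable records. It is not the repository's conversion script: the function names, the record fields (novel, chunk_id, text), and the 200-word chunk size are all assumptions, and whitespace-separated words stand in for the BERT tokenizer's subword tokens.

```python
import json


def chunk_words(text, max_words=200):
    """Split text into consecutive chunks of at most max_words words.

    Assumption: whitespace-separated words approximate tokens here; a real
    pipeline would count subword tokens with the model's tokenizer instead.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def novel_to_records(title, text, max_words=200):
    """Build one JSON-serializable record per chunk (hypothetical schema)."""
    return [
        {"novel": title, "chunk_id": i, "text": chunk}
        for i, chunk in enumerate(chunk_words(text, max_words))
    ]


# Stand-in for the contents of one .txt file from text/.
sample_text = "word " * 450
records = novel_to_records("example_novel", sample_text, max_words=200)

# Serialize the way a per-novel JSON output could be written to data/.
payload = json.dumps(records, ensure_ascii=False, indent=2)
```

Keeping the chunk size well under BERT's 512-token input limit leaves headroom for words that split into several subword tokens.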
Multi-task learning code for plot element classification. Includes the training/evaluation entry points (main.py, train.py, evaluation.py), model definitions (BERT / ELECTRA / RoBERTa / BGE-M3), configs, and the .pkl datasets used for training.
GPT-based science-fiction novel generator. Uses the token-length distribution from plot_element.json to sample target lengths, then prompts GPT chapter by chapter; output novels are written to novel_output/. The API key is loaded from a local .env file, which is gitignored.
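The length-sampling idea can be sketched as drawing from an empirical distribution. The layout of plot_element.json is not described here, so the example assumes a hypothetical mapping from plot element name to a list of observed token lengths; the element names, the numbers, and the function name are illustrative only.

```python
import random

# Hypothetical stand-in for the data loaded from plot_element.json:
# plot element name -> token lengths observed in the corpus.
token_lengths = {
    "exposition": [320, 410, 385],
    "climax": [600, 720, 655],
}


def sample_target_length(element, lengths, rng):
    """Draw one target length from the observed lengths of an element.

    Picking uniformly among observed values samples the empirical
    distribution directly, with no parametric fit.
    """
    return rng.choice(lengths[element])


# A seeded generator keeps the sampled targets reproducible across runs.
rng = random.Random(0)
targets = {
    name: sample_target_length(name, token_lengths, rng)
    for name in token_lengths
}
```

Each sampled value could then be passed into the per-chapter prompt as the requested length.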