A fine-tuned DistilBERT model that classifies text as human-written or AI-generated. Comes with a Gradio web interface for easy testing.
I built this as part of my work on AI literacy at VU Amsterdam. Detecting AI-generated text is a real and growing problem in education, and I wanted to see how well a simple transformer classifier could do.
The model is a DistilBERT (66M params) fine-tuned on the GPT-wiki-intro dataset, which contains ~150k pairs of human-written and GPT-generated Wikipedia introductions. After 3 epochs of training it gets around 98% accuracy on the validation set.
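Under the hood, the classification head produces two logits (human, AI), and a softmax turns them into the probabilities the model reports as confidence. A minimal sketch of that step in plain Python (independent of the repo, with made-up logit values):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from the (human, AI) classification head
probs = softmax([-1.2, 3.4])
print(probs)  # second entry = probability the text is AI-generated
```

The predicted label is just the argmax of these two probabilities, and the larger one is what gets reported as confidence.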
Obviously this won't catch everything. It's trained on Wikipedia-style text and older GPT output, so newer models like GPT-4 or Claude will be harder to detect. But it's a solid demo of how transfer learning works for this kind of task.
```bash
git clone https://github.com/jasp-nerd/ai-text-detector.git
cd ai-text-detector
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

The dataset downloads automatically from Hugging Face when you first run training.
```bash
python train.py
```

Takes about 30-60 min on a GPU, or 2-3 hours on CPU. The trained model gets saved to `./model/`.
Web interface:

```bash
python app.py
# opens at http://localhost:7860
```

Command line:

```bash
python predict.py
```

In your own code:
```python
from predict import TextDetectorPredictor

p = TextDetectorPredictor('./model')
result = p.predict("Some text to check...")
print(result['prediction'], result['confidence'])
```

Limitations:

- Trained on Wikipedia text, so it may underperform on other domains (tweets, essays, code, etc.)
- Trained on older, GPT-2-style output; newer models are harder to detect
- Short texts (<100 chars) are unreliable
- Binary classification only; it doesn't tell you which model generated the text
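Given these caveats, calling code may want to gate on input length and confidence rather than trusting every label. A small hypothetical wrapper around the predictor's output (the threshold values are illustrative, not tuned):

```python
MIN_CHARS = 100        # below this, predictions are unreliable
MIN_CONFIDENCE = 0.9   # illustrative cutoff, not tuned

def gated_prediction(result, text):
    """Return the model's label only when it is worth trusting.

    `result` is the dict returned by TextDetectorPredictor.predict,
    assumed to contain 'prediction' and 'confidence' keys.
    """
    if len(text) < MIN_CHARS:
        return "too-short"
    if result["confidence"] < MIN_CONFIDENCE:
        return "uncertain"
    return result["prediction"]
```

In practice you'd treat "too-short" and "uncertain" as abstentions rather than counting them as either class.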