jasp-nerd/ai-text-detector

AI Text Detector

A fine-tuned DistilBERT model that classifies text as human-written or AI-generated. Comes with a Gradio web interface for easy testing.

I built this as part of my work on AI literacy at VU Amsterdam. Detecting AI-generated text is a real and growing problem in education, and I wanted to see how well a simple transformer classifier could do.

How it works

The model is DistilBERT (66M parameters) fine-tuned on the GPT-wiki-intro dataset, which contains ~150k pairs of human-written and GPT-generated Wikipedia introductions. After 3 epochs of training, it reaches around 98% accuracy on the validation set.
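
To make the data side concrete, here is a minimal sketch of how paired rows can be flattened into a binary classification set. It is an illustration, not the code in train.py: the column names `wiki_intro` and `generated_intro`, the helper `pairs_to_examples`, and the label order (0 = human, 1 = AI) are all assumptions.

```python
from typing import Dict, Iterable, List

# Assumed label convention (not confirmed by the repository).
HUMAN, AI = 0, 1


def pairs_to_examples(rows: Iterable[Dict[str, str]],
                      human_key: str = "wiki_intro",
                      ai_key: str = "generated_intro") -> List[Dict[str, object]]:
    """Flatten paired rows into one labeled example per text."""
    examples = []
    for row in rows:
        examples.append({"text": row[human_key], "label": HUMAN})
        examples.append({"text": row[ai_key], "label": AI})
    return examples


# Toy row standing in for a real dataset record.
rows = [
    {"wiki_intro": "Amsterdam is the capital of the Netherlands.",
     "generated_intro": "Amsterdam is a city known for its canals."},
]
examples = pairs_to_examples(rows)
print(len(examples), [e["label"] for e in examples])
```

Each pair yields one human and one AI example, so the two classes come out balanced by construction.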

This won't catch everything. The model is trained on Wikipedia-style text and output from an older GPT model, so text from newer models like GPT-4 or Claude will be harder to detect. Still, it's a solid demo of how transfer learning works for this kind of task.

Setup

git clone https://github.com/jasp-nerd/ai-text-detector.git
cd ai-text-detector

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

pip install -r requirements.txt

Training

The dataset downloads automatically from Hugging Face when you first run training.

python train.py

Training takes about 30-60 minutes on a GPU or 2-3 hours on a CPU. The trained model is saved to ./model/.

Usage

Web interface:

python app.py
# opens at http://localhost:7860

Command line:

python predict.py

In your own code:

from predict import TextDetectorPredictor

p = TextDetectorPredictor('./model')
result = p.predict("Some text to check...")
print(result['prediction'], result['confidence'])
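
If you're wondering what those two fields could mean: a common way to get a label and a confidence out of a two-class classifier is to apply softmax to the logits and report the largest probability. The sketch below shows that calculation in plain Python. Whether predict.py computes its confidence exactly this way is an assumption, and the label strings and the helper `to_result` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label names; the strings predict.py returns may differ.
LABELS = ["human", "ai"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_result(logits: Sequence[float]) -> Dict[str, object]:
    """Map raw class logits to a predicted label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"prediction": LABELS[best], "confidence": probs[best]}


result = to_result([0.0, 2.0])
print(result["prediction"], round(result["confidence"], 3))
```

Under this reading, a confidence near 0.5 means the model is close to guessing, so it is worth treating such results with caution.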

Limitations

  • Trained on Wikipedia text, so it may underperform on other domains (tweets, essays, code, etc.)
  • Trained on output from an older GPT model, so text from newer models is harder to detect
  • Predictions on short texts (<100 chars) are unreliable
  • Binary classification only; it doesn't tell you which AI wrote the text
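
One way to act on the short-text limitation above is to wrap the predictor so it declines to label inputs it is likely to get wrong. This is a sketch, not part of the repository: `guarded_predict`, the use of 100 characters as a hard cutoff, the 0.9 confidence floor, and the assumption that `confidence` is a float between 0 and 1 are all illustrative. `StubPredictor` is a stand-in so the example runs without a trained model.

```python
from typing import Any, Dict

MIN_CHARS = 100        # cutoff taken from the short-text limitation
MIN_CONFIDENCE = 0.9   # hypothetical floor; tune on your own data


def guarded_predict(predictor: Any, text: str) -> Dict[str, Any]:
    """Return the predictor's result, or 'uncertain' when it is not trustworthy."""
    stripped = text.strip()
    if len(stripped) < MIN_CHARS:
        return {"prediction": "uncertain", "confidence": None,
                "reason": "text too short"}
    result = predictor.predict(stripped)
    if result["confidence"] < MIN_CONFIDENCE:
        return {"prediction": "uncertain", "confidence": result["confidence"],
                "reason": "low confidence"}
    return {**result, "reason": None}


class StubPredictor:
    """Stand-in with the same predict() shape as TextDetectorPredictor."""

    def predict(self, text: str) -> Dict[str, Any]:
        return {"prediction": "ai", "confidence": 0.95}


short = guarded_predict(StubPredictor(), "Too short.")
long_ = guarded_predict(StubPredictor(), "word " * 40)
print(short["prediction"], long_["prediction"])
```

With a real model you would pass `TextDetectorPredictor('./model')` in place of the stub.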
