verify-bib

A Claude Code skill (also usable as a standalone CLI) that catches AI-hallucinated citations in BibTeX files. Inspired by TrueCite, built on Semantic Scholar.

Why

LLMs confidently fabricate plausible-looking citations. Running a paper draft through this tool surfaces:

Pure hallucinations — papers that don't exist anywhere.
Author corruption — real paper, wrong author list.
Venue sloppiness — real paper, venue field garbled (e.g. arXiv preprint cited as a journal article).

Install

git clone https://github.com/mufanq/verify-bib-skill.git
cd verify-bib-skill
pip install -r requirements.txt

To use as a Claude Code skill, symlink into your skills directory:

ln -s "$(pwd)" ~/.claude/skills/verify-bib

API key is optional

Runs out of the box without a key, using the shared unauthenticated pool (5000 requests per 5 minutes). For higher throughput, request a free key at https://www.semanticscholar.org/product/api and export it:

export SEMANTIC_SCHOLAR_API_KEY=your_key_here   # ~/.zshrc or ~/.bashrc

The script auto-detects whether a key is set:

No key → unauthenticated requests, 0.2 s sleep between entries.
Key present → x-api-key header, 1.05 s sleep between entries.
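The auto-detection above could look roughly like the following. This is a minimal sketch, not the code in verify_bib.py: the function name `request_settings` and its parameters are hypothetical, and only the environment variable name, the x-api-key header, and the two sleep values come from this README.

```python
import os


def request_settings(cli_key=None, env=None):
    """Return (headers, delay_seconds) for Semantic Scholar requests.

    Hypothetical helper: the name and signature are illustrative only.
    """
    env = os.environ if env is None else env
    # A key passed on the command line takes precedence over the environment.
    key = cli_key or env.get("SEMANTIC_SCHOLAR_API_KEY")
    if key:
        # Authenticated: send the key in the x-api-key header, sleep 1.05 s.
        return {"x-api-key": key}, 1.05
    # Unauthenticated: no extra header, sleep 0.2 s between entries.
    return {}, 0.2
```

Returning the headers and the delay together keeps the two modes in one place, so the request loop does not need to know whether a key is set.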

You can also pass --api-key sk_... on the command line to override the environment variable. The key is never logged or committed: .env and common secret files are listed in .gitignore.

Use

# Human-readable report
python3 verify_bib.py references.bib

# Machine-readable (pipe into jq, CI, etc.)
python3 verify_bib.py references.bib --json

Exit codes: 0 = all clean, 1 = issues found, 2 = file not found. This makes the tool suitable as a pre-submission gate.
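One way to use the exit codes as a gate from Python is sketched below. The helper names `describe_exit` and `gate` are hypothetical, and the sketch assumes verify_bib.py is in the current directory. Only the three exit codes and the command line are taken from this README.

```python
import subprocess
import sys

# Exit codes as documented above.
EXIT_MEANINGS = {0: "all clean", 1: "issues found", 2: "file not found"}


def describe_exit(code):
    """Translate an exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def gate(bib_path, script="verify_bib.py"):
    """Run the checker on bib_path and return its exit code (hypothetical wrapper)."""
    result = subprocess.run([sys.executable, script, bib_path])
    print(f"verify-bib: {describe_exit(result.returncode)}")
    return result.returncode
```

A pre-submission script could end with `sys.exit(gate("references.bib"))`, so that any non-zero code fails the surrounding CI job.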

How it works

1. Parse the .bib file with pybtex.
2. For each entry, query Semantic Scholar's /paper/search/match with the title, falling back to /paper/search if the match endpoint returns nothing.
3. Compute three fuzzy scores: rapidfuzz token-set ratio for title and venue, and last-name set overlap for authors.
4. Set verified = title_score ≥ 0.85. Author and venue scores are surfaced as additional flags.
5. Cache successful lookups in ~/.cache/verify-bib/s2_cache.sqlite for 30 days.
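The caching step could be sketched as below, using only the standard library. The table layout, the function names, and the choice of a normalized title as cache key are all assumptions for illustration. Only the SQLite backend and the 30-day lifetime come from this README.

```python
import json
import sqlite3
import time

TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, as stated above


def open_cache(path=":memory:"):
    """Open (or create) the cache. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lookups ("
        "title TEXT PRIMARY KEY, payload TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return conn


def cache_put(conn, title, payload, now=None):
    """Store a successful lookup, keyed by the normalized title."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO lookups VALUES (?, ?, ?)",
        (title.lower().strip(), json.dumps(payload), now),
    )
    conn.commit()


def cache_get(conn, title, now=None):
    """Return the cached payload, or None if it is missing or older than the TTL."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT payload, fetched_at FROM lookups WHERE title = ?",
        (title.lower().strip(),),
    ).fetchone()
    if row is None or now - row[1] > TTL_SECONDS:
        return None
    return json.loads(row[0])
```

Expired rows are simply treated as misses here. A real implementation might also delete them to keep the file small.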

Design mirrors TrueCite

The scoring and judgment model follows the reverse-engineered behavior of wispaper.ai/agents/true-cite: title match is the primary verdict, while author and venue mismatches are surfaced as secondary flags rather than hard failures. This matches how real BibTeX files drift: the paper is usually real, but author lists and venue strings are often truncated or auto-generated from lossy sources.
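The verdict model described above could be sketched as follows. Several details are assumptions rather than the tool's confirmed behavior: the overlap is defined here as the fraction of cited last names found in the matched record, the `flag_below` cutoff of 0.5 is hypothetical, and the flag names are made up. Title and venue scores are taken as inputs on a 0 to 1 scale, which assumes dividing rapidfuzz's 0 to 100 token_set_ratio by 100.

```python
def last_names(authors):
    """Lower-cased last names, accepting 'Last, First' or 'First Last'."""
    names = set()
    for author in authors:
        author = author.strip()
        if not author:
            continue
        last = author.split(",")[0] if "," in author else author.split()[-1]
        names.add(last.strip().lower())
    return names


def author_overlap(cited, found):
    """Fraction of cited last names present in the matched record (assumed definition)."""
    cited_set, found_set = last_names(cited), last_names(found)
    if not cited_set:
        return 0.0
    return len(cited_set & found_set) / len(cited_set)


def judge(title_score, author_score, venue_score, threshold=0.85, flag_below=0.5):
    """Title decides the verdict. Author and venue only add flags.

    threshold comes from the README. flag_below is a hypothetical cutoff.
    """
    flags = []
    if author_score < flag_below:
        flags.append("author_mismatch")
    if venue_score < flag_below:
        flags.append("venue_mismatch")
    return {"verified": title_score >= threshold, "flags": flags}
```

With this definition, a truncated author list in the .bib file still scores 1.0 as long as every cited last name appears in the matched record, which fits the drift pattern described above.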

License

MIT
