Tired of PDFs with cryptic names like hhaf081.pdf or 1-s2.0-S0377221718308774-main.pdf?
rename-academic-pdf automatically renames academic paper PDFs to meaningful filenames built from the title, author(s), year, and journal. It can also generate BibTeX files and convert PDFs to markdown in one command. For example:
paper.pdf → Author2024-PaperTitle-Journal.pdf
- 📄 Smart renaming — Extracts metadata from DOI, arXiv, or paper content
- 📚 BibTeX export — Automatically build your bibliography file
- 📝 Markdown conversion — Convert PDFs to markdown for AI/LLM workflows
- 🔄 Batch processing — Rename hundreds of papers with one command
- 🌐 7+ academic APIs — CrossRef, OpenAlex, Semantic Scholar, arXiv, PubMed, and more
- ⚡ No API key required — Works out of the box
# Install
pip install -U rename-academic-pdf
# Rename a single PDF
rename-academic-pdf paper.pdf
# Batch rename all PDFs
rename-academic-pdf *.pdf
# Preview changes without renaming
rename-academic-pdf paper.pdf --dry-run
# Export BibTeX entries
rename-academic-pdf *.pdf --bib-file references.bib
# Generate markdown versions along with BibTeX entries
pip install -U "rename-academic-pdf[all]"
rename-academic-pdf *.pdf --markdown-dir ./markdown/ --bib-file references.bib
pip install rename-academic-pdf
# With optional features
pip install "rename-academic-pdf[all]" # LLM fallback + markdown conversion
git clone https://github.com/maifeng/rename-academic-pdf.git
cd rename-academic-pdf
pip install -e .
Requirements: Python 3.7+
- Intelligent identifier extraction: DOI, arXiv ID, PMID from filename, PDF text, and metadata
- Multi-API cascade: Queries 7+ academic databases with smart fallbacks
- BibTeX export: Fetch or generate BibTeX entries with PDF/markdown paths
- Markdown conversion: Convert PDFs to markdown using markitdown
- Journal abbreviations: Built-in abbreviations for 100+ journals and custom overrides
- Batch processing: Rename multiple PDFs with wildcards (*.pdf, **/*.pdf)
- LLM fallback: Use GPT models to extract metadata when APIs fail (optional)
- No API key required: Most APIs are free (optional keys for better rate limits)
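As a rough illustration of the identifier extraction listed above, the sketch below scans a filename or a chunk of PDF text for a DOI, an arXiv ID, or a PMID. The regular expressions and the `find_identifier` helper are assumptions made for this example; they are not taken from the package and the real patterns may differ.

```python
import re
from typing import Optional, Tuple

# Assumed patterns for illustration only; the tool's own regexes may differ.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
ARXIV_RE = re.compile(r"(?<![\d.])(\d{4}\.\d{4,5})(?:v\d+)?(?!\d)")
PMID_RE = re.compile(r"\bPMID:?\s*(\d{1,8})\b", re.IGNORECASE)


def find_identifier(text: str) -> Optional[Tuple[str, str]]:
    """Return (kind, value) for the first identifier type found, else None."""
    match = DOI_RE.search(text)
    if match:
        # Trailing punctuation is usually sentence residue, not part of the DOI.
        return ("doi", match.group(0).rstrip(".,;)"))
    match = ARXIV_RE.search(text)
    if match:
        # Keep the bare ID and drop any version suffix such as "v2".
        return ("arxiv", match.group(1))
    match = PMID_RE.search(text)
    if match:
        return ("pmid", match.group(1))
    return None
```

A cryptic filename such as hhaf081.pdf contains none of these patterns, which is the case where the PDF text and metadata have to be searched instead.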
Default format: AuthorsYear-Title-Journal.pdf
- ≤ 5 authors: All authors concatenated (e.g., SmithJones2024-...)
- > 5 authors: First author + "EtAl" (e.g., SmithEtAl2024-...)
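The author rule above can be sketched in a few lines of Python. The function name `author_key` and the empty-string result for a missing author list are assumptions for this example, not the package's actual code.

```python
from typing import Sequence


def author_key(last_names: Sequence[str], max_authors: int = 5) -> str:
    """Build the author part of a filename from a list of last names."""
    if not last_names:
        # Assumption: no authors yields an empty author part.
        return ""
    if len(last_names) <= max_authors:
        # Up to five authors: concatenate every last name.
        return "".join(last_names)
    # More than five authors: first author followed by "EtAl".
    return f"{last_names[0]}EtAl"
```

For example, `author_key(["Smith", "Jones"])` gives `SmithJones`, while a six-author paper led by Smith gives `SmithEtAl`.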
You can override the default format string using command line options or in a config file (see Configuration File section).
| Preset | Template | Example |
|---|---|---|
| default | {author}{year}-{title}-{journal} | Author2025-PaperTitle-JournalName.pdf |
| compact | {author}{year}-{title} | Author2025-PaperTitle.pdf |
| full | {author}-{year}-{title}-{journal} | Author-2025-PaperTitle-JournalName.pdf |
| minimal | {author}{year} | Author2025.pdf |
| year_first | {year}-{author}-{title} | 2025-Author-PaperTitle.pdf |
| journal_first | {journal}-{author}{year}-{title} | JournalName-Author2025-PaperTitle.pdf |
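The preset templates in the table use the same placeholder syntax as Python's `str.format`, so one way to picture how a filename is built is the sketch below. The `PRESETS` dictionary copies the table; the `build_filename` helper and its signature are hypothetical and only meant to show the substitution step.

```python
from typing import Dict, Optional

# Templates copied from the preset table.
PRESETS: Dict[str, str] = {
    "default": "{author}{year}-{title}-{journal}",
    "compact": "{author}{year}-{title}",
    "full": "{author}-{year}-{title}-{journal}",
    "minimal": "{author}{year}",
    "year_first": "{year}-{author}-{title}",
    "journal_first": "{journal}-{author}{year}-{title}",
}


def build_filename(metadata: Dict[str, object],
                   preset: str = "default",
                   format_string: Optional[str] = None) -> str:
    """Fill a template with metadata; a custom string wins over the preset."""
    template = format_string or PRESETS[preset]
    # str.format ignores unused keys, so every preset can share one metadata dict.
    return template.format(**metadata) + ".pdf"


meta = {"author": "Author", "year": 2025,
        "title": "PaperTitle", "journal": "JournalName"}
print(build_filename(meta, preset="year_first"))
```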
rename-academic-pdf paper.pdf --format compact # No journal
rename-academic-pdf paper.pdf --format minimal # Author + year only
rename-academic-pdf paper.pdf --format year_first # Year first
Create your own format using template variables:
- {author} - Author name(s): all authors if ≤ 5, FirstAuthorEtAl if > 5
- {year} - Publication year
- {title} - Paper title
- {journal} - Journal abbreviation
rename-academic-pdf paper.pdf --format-string '{journal}_{year}_{author}'
rename-academic-pdf paper.pdf --format-string '{author}-{title}'
--first-author-only: Use only first author
rename-academic-pdf paper.pdf --first-author-only
# Output: Smith2024-Title-Journal.pdf (instead of SmithJonesBrown2024-...)
--separator (- or _): Change separator character
rename-academic-pdf paper.pdf --separator _
# Output: Smith2024_Title_Journal.pdf
--journal-abbrev-file: Use custom journal abbreviations file
rename-academic-pdf paper.pdf --journal-abbrev-file ~/my-journals.json
# Uses custom abbreviations from the specified JSON file
# Can be saved in ~/.rename-academic-pdf/journal_abbreviations.json for automatic loading
--max-title-length: Maximum title length in filename (default: 80)
rename-academic-pdf paper.pdf --max-title-length 120
# Longer titles allowed (truncates at word boundary, never mid-word)
--bib-file: Append BibTeX entries to a file
rename-academic-pdf paper.pdf --bib-file ~/references.bib
# Fetches BibTeX from DOI.org or arXiv, or generates from metadata
--markdown-dir: Generate markdown versions of PDFs
rename-academic-pdf paper.pdf --markdown-dir ~/markdown/
# Converts PDFs to markdown using markitdown
# Requires: pip install "rename-academic-pdf[markdown]"
The script tries multiple APIs in cascade order:
- DOI → DOI.org → CrossRef → DataCite → Semantic Scholar
- arXiv ID → arXiv API → Semantic Scholar
- SSRN ID → Convert to DOI (10.2139/ssrn.{id}) → DOI.org → CrossRef
- PMID → PubMed API
- Semantic Scholar (200M+ papers, CS/AI focus)
- DBLP (Computer science bibliography)
- OpenAlex (200M+ papers, all fields)
- DOI.org: Authoritative DOI resolver (Citeproc JSON)
- CrossRef: 130M+ journal articles (including SSRN)
- DataCite: Datasets, conferences, grey literature
- arXiv: STEM preprints
- SSRN: Working papers (via DOI lookup)
- PubMed: Biomedical literature
- Semantic Scholar: CS/AI papers (optional API key)
- DBLP: Computer science papers
- OpenAlex: Comprehensive, free, no API key
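The cascade order described above amounts to trying one source after another and keeping the first usable answer. Below is a minimal sketch of that pattern, assuming each source is wrapped in a function that returns a metadata dict or `None`. The names `first_hit`, `ssrn_to_doi`, and the `Resolver` type are hypothetical; only the `10.2139/ssrn.{id}` DOI pattern comes from this README.

```python
from typing import Callable, Dict, List, Optional

Metadata = Dict[str, str]
Resolver = Callable[[str], Optional[Metadata]]


def ssrn_to_doi(ssrn_id: str) -> str:
    """Convert an SSRN ID to its DOI using the 10.2139/ssrn.{id} pattern."""
    return f"10.2139/ssrn.{ssrn_id}"


def first_hit(identifier: str, resolvers: List[Resolver]) -> Optional[Metadata]:
    """Query resolvers in order and return the first non-empty result."""
    for resolve in resolvers:
        try:
            result = resolve(identifier)
        except Exception:
            # A failing source (timeout, HTTP error) should not stop the cascade.
            continue
        if result:
            return result
    # Every source missed; a caller could fall back to other strategies here.
    return None
```

With this shape, a DOI lookup would pass resolvers for DOI.org, CrossRef, DataCite, and Semantic Scholar in that order, and an SSRN lookup would first call `ssrn_to_doi` and then reuse the DOI resolvers.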
# ~/.bashrc or ~/.zshrc
export SEMANTIC_SCHOLAR_API_KEY="your-api-key-here"
export PUBMED_API_KEY="your-api-key-here" # For faster rate limits
export EMAIL="your@email.com" # For CrossRef polite pool
export OPENAI_API_KEY="your-api-key-here" # For --llm flag (OpenAI)
export OPENROUTER_API_KEY="your-api-key-here" # For --llm flag (OpenRouter)
Get a free Semantic Scholar API key: https://www.semanticscholar.org/product/api
When the --llm flag is enabled, the script uses an LLM as a fallback after all API-based methods fail. It extracts metadata from the first 3 pages of the PDF text, which can be useful for working papers without a DOI. The default model is gpt-4.1-mini; other OpenAI and OpenRouter models are also supported.
# Uses OPENAI_API_KEY
rename-academic-pdf *.pdf --llm
rename-academic-pdf *.pdf --llm --llm-model gpt-4o-mini
Use provider/model format to automatically use OpenRouter:
# Uses OPENROUTER_API_KEY (auto-detected from model format)
rename-academic-pdf *.pdf --llm --llm-model anthropic/claude-3-haiku
rename-academic-pdf *.pdf --llm --llm-model google/gemini-2.0-flash-001
Requirements:
- pip install openai (or pip install "rename-academic-pdf[llm]")
- OPENAI_API_KEY for OpenAI models, or OPENROUTER_API_KEY for OpenRouter models, set in environment variables.
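The provider auto-detection can be pictured as a check on the model name: a `provider/model` string goes to OpenRouter, anything else to OpenAI. The sketch below assumes that a slash is the only signal used; the helper names `select_provider` and `api_key_for` are invented for this example and are not part of the package.

```python
import os
from typing import Mapping, Optional, Tuple


def select_provider(model: str) -> Tuple[str, str]:
    """Return (provider, environment variable name) for a model string."""
    if "/" in model:
        # Assumption: "provider/model" names are routed to OpenRouter.
        return ("openrouter", "OPENROUTER_API_KEY")
    return ("openai", "OPENAI_API_KEY")


def api_key_for(model: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Look up the API key for a model, failing loudly if it is missing."""
    env = os.environ if env is None else env
    provider, variable = select_provider(model)
    key = env.get(variable)
    if not key:
        raise RuntimeError(f"{variable} is not set (needed for {provider} model {model!r})")
    return key
```

Passing `env` explicitly keeps the lookup easy to test without touching the real environment.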
The package includes built-in abbreviations for 100+ major academic journals. For example:
- "Journal of Management Information Systems" → "JMIS"
- "Information Systems Research" → "ISR"
- "Review of Financial Studies" → "RFS"
You can provide your own journal abbreviations to override or extend the built-in list. The package searches for custom abbreviation files in the following order:
- Command-line argument: --journal-abbrev-file path/to/file.json
- User's home directory: ~/.rename-academic-pdf/journal_abbreviations.json
- Default bundled file: Built-in abbreviations
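The search order above could be implemented roughly as follows. This is a sketch under two assumptions: that the custom file stores its entries under an `abbreviations` key, and that custom entries are merged over the built-in ones rather than replacing them. The function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Optional

HOME_FILE = Path.home() / ".rename-academic-pdf" / "journal_abbreviations.json"


def find_custom_file(cli_path: Optional[str] = None,
                     home_file: Path = HOME_FILE) -> Optional[Path]:
    """Return the custom abbreviation file to use, or None for built-ins only."""
    if cli_path:
        # 1. An explicit --journal-abbrev-file argument comes first.
        return Path(cli_path).expanduser()
    if home_file.is_file():
        # 2. Otherwise the file in the user's home directory, if present.
        return home_file
    # 3. Nothing custom: only the bundled abbreviations apply.
    return None


def load_abbreviations(builtin: Dict[str, str],
                       cli_path: Optional[str] = None,
                       home_file: Path = HOME_FILE) -> Dict[str, str]:
    """Merge custom abbreviations over the built-in ones (assumed behavior)."""
    merged = dict(builtin)
    custom = find_custom_file(cli_path, home_file)
    if custom is not None:
        with open(custom, encoding="utf-8") as handle:
            merged.update(json.load(handle).get("abbreviations", {}))
    return merged
```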
Create a JSON file with the following structure:
{
  "comment": "My custom journal abbreviations",
  "abbreviations": {
    "Journal of Interesting Research": "JIR",
    "Quarterly Review of Examples": "QRE",
    "Proceedings of Example Conference": "PEC"
  }
}
Option 1: Command-line argument
rename-academic-pdf paper.pdf --journal-abbrev-file ~/my-journals.json
Option 2: User home directory (automatically loaded)
# Create the directory
mkdir -p ~/.rename-academic-pdf
# Copy or create your custom file
cp my-journals.json ~/.rename-academic-pdf/journal_abbreviations.json
# Run normally - custom abbreviations will be used automatically
rename-academic-pdf paper.pdf
You can set default options by creating a config file at ~/.rename-academic-pdf/config.json:
{
  "format_string": "{author}_{year}_{journal}_{title}",
  "first_author_only": true,
  "max_title_length": 100,
  "llm": true,
  "llm_model": "gpt-4o-mini",
  "bib_file": "~/papers.bib",
  "markdown_dir": "~/paper_markdown"
}
| Option | Type | Default | Description |
|---|---|---|---|
| format | string | "default" | Format preset (default, compact, full, minimal, year_first, journal_first) |
| format_string | string | - | Custom format string (overrides format if both set) |
| separator | string | "-" | Separator character ("-" or "_") |
| first_author_only | boolean | false | Use only first author |
| max_title_length | integer | 80 | Maximum title length in filename (truncates at word boundary) |
| llm | boolean | false | Enable LLM fallback |
| llm_model | string | "gpt-4.1-mini" | LLM model for --llm mode |
| bib_file | string | - | Path to BibTeX file to append entries to |
| markdown_dir | string | - | Directory to save markdown versions of PDFs |
Command-line arguments always override config file settings.
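The precedence rule can be sketched as three layers merged in order: built-in defaults, then the config file, then command-line arguments. The defaults below are the ones documented in the options table; the `resolve_options` helper and the convention that `None` means "flag not given" are assumptions for this example.

```python
from typing import Any, Dict

# Documented defaults from the options table.
DEFAULTS: Dict[str, Any] = {
    "format": "default",
    "separator": "-",
    "first_author_only": False,
    "max_title_length": 80,
    "llm": False,
    "llm_model": "gpt-4.1-mini",
}


def resolve_options(config: Dict[str, Any], cli: Dict[str, Any]) -> Dict[str, Any]:
    """Layer config-file values over defaults, then CLI values over both."""
    options = dict(DEFAULTS)
    options.update(config)
    # Assumption: a CLI value of None means the flag was not passed,
    # so it must not overwrite a value coming from the config file.
    options.update({key: value for key, value in cli.items() if value is not None})
    return options


config = {"max_title_length": 100, "llm": True}
cli = {"max_title_length": 120, "llm": None}
print(resolve_options(config, cli)["max_title_length"])
```

Here the command line wins for max_title_length, while llm keeps the value from the config file because the flag was not passed.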
MIT License - see LICENSE file
Created by Feng Mai.
☕ If this tool saved you time, consider buying me a coffee