A Rust CLI for searching arXiv, listing the latest papers in a category, downloading papers, and managing a local paper library.
This project is not published to crates.io yet. Install it from source:
git clone <your-repo-url> arxiv-cli
cd arxiv-cli
cargo install --path .or install it from npm:
npm install -g arxiv-cliVerify the install:
arxiv --helpRun without global install during development:
cargo run -- --help- Search arXiv by keyword, title, author, category, and submitted date range
- List the newest papers in a category, always sorted by submission time
- Show full paper metadata including abstract, PDF URL, and source URL
- Download PDF, LaTeX source, or both
- Automatically extract downloaded source archives by default
- Register a default download directory
- Download multiple papers in parallel
- Maintain a local JSON-backed library
- Output either human-readable tables or JSON
# keyword search
arxiv search "diffusion models"
# title-only search
arxiv search --title "skill"
# search with filters
arxiv search "transformer" --author "Vaswani" --category cs.CL --from 2024-01-01 --to 2024-12-31
# list newest papers in a category
arxiv latest cs.CL --limit 10
# show one paper
arxiv show 1706.03762
# download PDF and source
arxiv download 1706.03762 2401.12345 --format both --jobs 4
# set a default download directory
arxiv config set-download-dir ~/Documents/papers/arxivUse search when you want flexible retrieval across general arXiv text fields.
Examples:
arxiv search "skill"
arxiv search "diffusion models" --limit 20
arxiv search "llm" --category cs.CL --sort submitted
arxiv search --title "large language model"
arxiv search --author "Yann LeCun"
arxiv search "transformer" --from 2025-01-01 --to 2025-12-31
arxiv search "skill" --include-abstract
arxiv search "skill" --json
arxiv search "skill" --json --include-abstractKey parameters:
QUERY: optional keyword query. Multiple words are combined withAND--title <TITLE>: search title only--author <AUTHOR>: filter by author name--category <CATEGORY>: filter by arXiv category such ascs.CL,cs.LG,math.PR--from <YYYY-MM-DD>/--to <YYYY-MM-DD>: submitted-date range--sort relevance|updated|submitted: choose ranking field--order asc|desc: choose sort direction--limit <N>: number of results to return--start <N>: pagination offset--include-abstract: include abstracts in table output and JSON output--json: print machine-readable JSON. By default JSON omitsabstract_text; add--include-abstractto include it
Search behavior:
arxiv search "diffusion models"meansdiffusion AND modelsarxiv search --title "diffusion models"searches that phrase in the title field- Plain keyword search uses arXiv's broad text search fields, which include title/abstract-style matching
Use latest when you already know the category and only want the newest submissions.
Examples:
arxiv latest cs.CL
arxiv latest cs.CL --limit 20
arxiv latest cs.CL --from 2025-01-01 --to 2025-12-31
arxiv latest math.PR --include-abstract
arxiv latest cs.LG --json
arxiv latest cs.LG --json --include-abstractKey parameters:
<CATEGORY>: required arXiv category, such ascs.CL,cs.LG,stat.ML--limit <N>: number of papers to return, default10--from <YYYY-MM-DD>/--to <YYYY-MM-DD>: submitted-date range--include-abstract: include abstracts in table output and JSON output--json: print JSON instead of a table. By default JSON omitsabstract_text; add--include-abstractto include it
Behavior:
latestis always sorted by submitted time descending- It is designed for category feeds, not institution/affiliation search
Use show to inspect one paper in detail.
arxiv show 1706.03762
arxiv show 1706.03762 --jsonReturned information includes:
- title
- authors
- abstract
- categories
- published / updated time
- PDF URL
- source URL
- local library status if the paper is already saved
Use download to save PDF, source, or both to disk.
Examples:
arxiv download 1706.03762
arxiv download 1706.03762 --format source
arxiv download 1706.03762 2401.12345 --format both --jobs 4
arxiv download 1706.03762 --output ~/Downloads/arxiv
arxiv download 1706.03762 --force
arxiv download 1706.03762 --no-library-updateKey parameters:
[IDS]...: one or more arXiv IDs--format pdf|source|both: what to download--output <DIR>: override the default download directory for this command--jobs <N>: parallel download concurrency, default4--force: re-download even if the file already exists--no-library-update: skip writing download status back to the local library
Output behavior:
- successful downloads print the paper ID
- the CLI also prints the saved file path for PDF and/or source
- source downloads are extracted by default; the printed
source:path is the extracted directory
Example output:
downloaded 1706.03762v7
pdf: /.../1706.03762v7.pdf
source: /.../arXiv-1706.03762v7 # tar.gz is extracted to this directory automatically
Use library to maintain a local index of papers you care about.
Examples:
arxiv library add 1706.03762
arxiv library list
arxiv library list --downloaded-only
arxiv library list --category cs.CL
arxiv library list --author "Vaswani"
arxiv library show 1706.03762
arxiv library remove 1706.03762
arxiv library remove 1706.03762 --purge-filesSubcommands:
library add <ID...>: save metadata into the local library without downloadinglibrary list: list library entrieslibrary show <ID>: show one local entry with file paths and statuslibrary remove <ID>: remove one entry from the index--purge-files: remove associated local files too
Use config to inspect or update default behavior.
Examples:
arxiv config show
arxiv config set-download-dir ~/Documents/papers/arxiv
arxiv pathUse cases:
- register a default download folder once, then omit
--output - print resolved config/data/library/download paths for scripting
arxiv search "test-time scaling" --category cs.CL --limit 5
arxiv show 2501.12345arxiv latest cs.CL --limit 20
arxiv latest cs.CL --from 2026-01-01 --to 2026-04-01arxiv config set-download-dir ~/Documents/papers/arxiv
arxiv download 1706.03762 2401.12345 --format both --jobs 4arxiv library add 1706.03762 2401.12345
arxiv library list
arxiv library show 1706.03762By default the CLI stores:
config.tomlin the platform config directorylibrary.jsonin the platform data directory- downloaded files under
papers/<arxiv_id>/
Print the resolved paths with:
arxiv pathlatestcurrently works on arXiv categories, not institution/affiliation feeds- Search date filters apply to submitted time
- If you want script-friendly output, prefer
--json
cargo test