danielphan-dp/prism-docs

Prism Docs

CLI toolkit for PDF editing, security, OCR, and image conversion.

Installation

git clone https://github.com/danielphan-dp/pdf-tools.git prism-docs
cd prism-docs
uv sync

For image operations:

uv sync --extra images

For OCR operations:

uv sync --extra ocr
# OCR also requires the Tesseract binary, e.g. on Debian/Ubuntu: sudo apt install tesseract-ocr

Quick Start

prism-docs --help
prism-docs <command> --help

pdf-tools remains available as a command alias for compatibility.
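
For scripts that shell out to the tool, the executable name can be resolved at runtime so that either the primary name or the alias works. This is a minimal sketch, assuming both names are installed on PATH as ordinary executables; `find_cli` and its `which` parameter are hypothetical helpers, not part of Prism Docs.

```python
import shutil
from typing import Callable, Optional


def find_cli(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the path of the first executable found, preferring prism-docs."""
    for name in ("prism-docs", "pdf-tools"):  # pdf-tools is the compatibility alias
        path = which(name)
        if path:
            return path
    return None
```

Calling `find_cli()` with no arguments searches the current PATH and returns `None` when neither name is found.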

Commands

See docs/commands for the full reference. For quick scanning, here are all CLI subcommands with their signatures (from prism-docs <command> --help).

Note: global flags like -c/--config, -v/--verbose, -q/--quiet, --dry-run, --parallel, and --output-dir apply to every command.

Basic operations

| Command | Signature |
| --- | --- |
| encrypt | `prism-docs encrypt [-o OUTPUT] [--owner-password OWNER_PASSWORD] [--algorithm {RC4-40,RC4-128,AES-128,AES-256}] input password` |
| decrypt | `prism-docs decrypt [-o OUTPUT] input password` |
| merge | `prism-docs merge output inputs [inputs ...]` |
| watermark | `prism-docs watermark [-o OUTPUT] [--layer {above,below}] [--pages PAGES] input watermark` |
| compress | `prism-docs compress [-o OUTPUT] inputs [inputs ...]` |
| metadata | `prism-docs metadata [--action {view,edit}] [-o OUTPUT] [--title TITLE] [--author AUTHOR] [--subject SUBJECT] input` |
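
A signature can be turned into an argument list programmatically. The sketch below does this for `encrypt`, using only the flags and choices shown in its signature; `encrypt_argv` is a hypothetical helper, not something Prism Docs provides.

```python
from typing import List, Optional

# Choices copied from the documented encrypt signature.
ALGORITHMS = ("RC4-40", "RC4-128", "AES-128", "AES-256")


def encrypt_argv(
    input_pdf: str,
    password: str,
    output: Optional[str] = None,
    owner_password: Optional[str] = None,
    algorithm: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `prism-docs encrypt` (hypothetical helper)."""
    if algorithm is not None and algorithm not in ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    argv = ["prism-docs", "encrypt"]
    if output is not None:
        argv += ["-o", output]
    if owner_password is not None:
        argv += ["--owner-password", owner_password]
    if algorithm is not None:
        argv += ["--algorithm", algorithm]
    # Positional arguments come last, in the documented order: input, password.
    return argv + [input_pdf, password]
```

The resulting list can be passed to `subprocess.run(argv, check=True)`. Passing a list rather than a single string avoids shell quoting problems with passwords and file names.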

Page manipulation

| Command | Signature |
| --- | --- |
| extract-pages | `prism-docs extract-pages [-o OUTPUT] [--start START] [--end END] [--pages PAGES] input` |
| extract-text | `prism-docs extract-text [-o OUTPUT] [--separator SEPARATOR] inputs [inputs ...]` |
| rotate | `prism-docs rotate [-o OUTPUT] [--pages PAGES] input {90,180,270}` |
| split | `prism-docs split [--mode {pages,ranges}] [--ranges RANGES] [--output-dir OUTPUT_DIR] input` |
| page-numbers | `prism-docs page-numbers [-o OUTPUT] [--position {bottom-center,bottom-left,bottom-right,top-center,top-left,top-right}] [--format FORMAT] [--font-size FONT_SIZE] [--margin MARGIN] [--start START] [--skip-first] input` |
| stamp | `prism-docs stamp [-o OUTPUT] [--position {center,top-left,top-right,bottom-left,bottom-right}] [--font-size FONT_SIZE] [--rotation ROTATION] [--opacity OPACITY] [--color COLOR] [--pages PAGES] input text` |
| reverse | `prism-docs reverse [-o OUTPUT] input` |
| interleave | `prism-docs interleave [-o OUTPUT] [--reverse-second] input1 input2` |
| remove-pages | `prism-docs remove-pages [-o OUTPUT] input pages` |
| overlay | `prism-docs overlay [-o OUTPUT] [--pages PAGES] input overlay` |

Image operations (requires the images extra)

| Command | Signature |
| --- | --- |
| images-to-pdf | `prism-docs images-to-pdf -o OUTPUT [--page-size PAGE_SIZE] [--margin MARGIN] [--fit {contain,cover,stretch}] images [images ...]` |
| pdf-to-images | `prism-docs pdf-to-images [--output-dir OUTPUT_DIR] [--format {png,jpeg,webp}] [--dpi DPI] [--pages PAGES] input` |
| extract-images | `prism-docs extract-images [--output-dir OUTPUT_DIR] [--format {original,png,jpeg}] [--min-size MIN_SIZE] input` |

Security

| Command | Signature |
| --- | --- |
| flatten | `prism-docs flatten [-o OUTPUT] [--annotations] [--forms] input` |
| permissions | `prism-docs permissions --owner-password OWNER_PASSWORD [-o OUTPUT] [--allow-print] [--allow-copy] [--allow-modify] [--allow-annotate] [--allow-forms] input` |
| redact | `prism-docs redact [-o OUTPUT] [--regions REGIONS] [--text TEXT] [--color COLOR] input` |

Utilities

| Command | Signature |
| --- | --- |
| info | `prism-docs info [-v] [--json] input` |
| validate | `prism-docs validate [--strict] inputs [inputs ...]` |
| crop | `prism-docs crop [-o OUTPUT] [--left LEFT] [--right RIGHT] [--top TOP] [--bottom BOTTOM] [--margin MARGIN] [--percent PERCENT] [--pages PAGES] input` |
| resize | `prism-docs resize [-o OUTPUT] [--size {A4,A3,A5,Letter,Legal,Tabloid}] [--width WIDTH] [--height HEIGHT] [--scale SCALE] [--fit {contain,cover,stretch}] [--pages PAGES] input` |
| bookmarks | `prism-docs bookmarks [--action {view,extract,add}] [-o OUTPUT] [--from-file FROM_FILE] input` |
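
The `--json` flag on `info` suggests the output can be consumed from other programs. The sketch below shows one way to do that from Python. It assumes the command writes exactly one JSON document to stdout, which this README does not confirm, and it makes no assumption about the keys in that document. `pdf_info` and `_run` are hypothetical helpers.

```python
import json
import subprocess
from typing import Any, Callable, List


def _run(argv: List[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit code."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def pdf_info(path: str, run: Callable[[List[str]], str] = _run) -> Any:
    """Return the parsed output of `prism-docs info --json <path>`.

    Assumption: stdout contains exactly one JSON document.
    """
    return json.loads(run(["prism-docs", "info", "--json", path]))
```

The `run` parameter exists so the parsing step can be exercised without the CLI installed; in normal use it is left at its default.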

OCR operations (requires the ocr extra)

| Command | Signature |
| --- | --- |
| ocr | `prism-docs ocr [-o OUTPUT] [--lang LANG] [--dpi DPI] [--psm PSM] [--oem OEM] [--pages PAGES] [--timeout TIMEOUT] input` |
| searchable-pdf | `prism-docs searchable-pdf [-o OUTPUT] [--lang LANG] [--dpi DPI] [--psm PSM] [--timeout TIMEOUT] input` |
| ocr-extract | `prism-docs ocr-extract [-o OUTPUT] [--lang LANG] [--dpi DPI] [--psm PSM] [--preprocess {none,threshold,blur,sharpen,denoise}] [--threshold THRESHOLD] [--contrast CONTRAST] [--brightness BRIGHTNESS] [--invert] [--format {text,hocr,tsv,box,data}] input` |
| ocr-batch | `prism-docs ocr-batch [--output-dir OUTPUT_DIR] [--lang LANG] [--dpi DPI] [--psm PSM] [--output-type {txt,pdf}] [--fast] inputs [inputs ...]` |
| ocr-data | `prism-docs ocr-data [-o OUTPUT] [--lang LANG] [--dpi DPI] [--psm PSM] [--min-confidence MIN_CONFIDENCE] [--level {word,line,block,page}] input` |
| ocr-detect-lang | `prism-docs ocr-detect-lang [-o OUTPUT] [--dpi DPI] [--fallback-lang FALLBACK_LANG] [--sample-pages SAMPLE_PAGES] input` |
| ocr-multi-lang | `prism-docs ocr-multi-lang [-o OUTPUT] [--langs LANGS] [--dpi DPI] [--psm PSM] input` |
| ocr-table | `prism-docs ocr-table [-o OUTPUT] [--lang LANG] [--dpi DPI] [--format {csv,tsv,json}] [--pages PAGES] input` |
| ocr-table-v2 | `prism-docs ocr-table-v2 [-o OUTPUT] [--lang LANG] [--format {csv,tsv,json,xlsx}] [--pages PAGES] [--implicit-rows] [--no-implicit-rows] [--borderless] [--no-borderless] [--min-confidence MIN_CONFIDENCE] input` |

CLI utilities

| Command | Signature |
| --- | --- |
| config | `prism-docs config {show,init,path}` |
| list | `prism-docs list` |

Configuration

~/.config/prism-docs/config.yaml   # Global
./prism-docs.yaml                  # Project

See docs/configuration.md for options.
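
To check which of the two files are present on a given machine, something like the following could be used. It is a sketch only: how the tool merges or prioritizes the two files is not stated here, so the code just lists the locations shown above. `config_candidates` and `existing_configs` are hypothetical helpers.

```python
from pathlib import Path
from typing import List, Optional


def config_candidates(home: Optional[Path] = None, cwd: Optional[Path] = None) -> List[Path]:
    """List the two documented config locations, global first, then project."""
    home = home if home is not None else Path.home()
    cwd = cwd if cwd is not None else Path.cwd()
    return [
        home / ".config" / "prism-docs" / "config.yaml",  # Global
        cwd / "prism-docs.yaml",  # Project
    ]


def existing_configs(home: Optional[Path] = None, cwd: Optional[Path] = None) -> List[Path]:
    """Return only the candidate files that exist on disk."""
    return [p for p in config_candidates(home, cwd) if p.is_file()]
```

Calling `existing_configs()` with no arguments uses the current user's home directory and the current working directory.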

Python API

from prism_docs import run_operation

run_operation("encrypt", "input.pdf", password="secret")
run_operation("merge", "first.pdf", output_path="out.pdf", merge_inputs=["a.pdf", "b.pdf"])

See docs/api.md for full reference.
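
The `run_operation` call shown above can be wrapped for batch use. The sketch below encrypts several files and records which ones failed. It assumes that `run_operation` signals failure by raising an exception, which this README does not state; `encrypt_all` and its `run` parameter are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


def encrypt_all(
    paths: Iterable[str],
    password: str,
    run: Optional[Callable[..., object]] = None,
) -> Dict[str, Optional[Exception]]:
    """Encrypt each PDF and map its path to None on success or the exception raised."""
    if run is None:
        # Imported lazily so the helper can be exercised with a stand-in runner.
        from prism_docs import run_operation as run
    results: Dict[str, Optional[Exception]] = {}
    for path in paths:
        try:
            run("encrypt", path, password=password)
            results[path] = None
        except Exception as exc:  # assumption: failures surface as exceptions
            results[path] = exc
    return results
```

For example, `encrypt_all(["a.pdf", "b.pdf"], "secret")` would call `run_operation("encrypt", ...)` once per file and return a dictionary keyed by path.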

Development

uv sync
uv run pytest
uv run ruff check src/

About

PDF Document Processing Research Toolkit
