PDF-Auto-A11y

PDF-Auto-A11y is a Java-based PDF accessibility remediation tool that validates and automatically fixes common tagging and document-structure issues in tagged PDFs. It is designed for accessibility professionals who want to automate the repetitive structural corrections that would otherwise be done manually in Acrobat or similar remediation tools.

Quick Start

Requirements:

Java 21+
Maven 3.x

Clone the repository and run the CLI wrapper:

git clone https://github.com/boyechko/pdf-autoa11y.git
cd pdf-autoa11y

./pdf-autoa11y input.pdf
# writes: input_autoa11y.pdf

Launch the GUI:

./pdf-autoa11y-gui

The wrapper scripts compile the project on first run and download Maven dependencies as needed. If you prefer to build explicitly first:

mvn clean package

What It Does

PDF-Auto-A11y runs a pipeline of checks against the input PDF. Each check gets its own pipeline step so intermediate results can be inspected. Checks cover document-level metadata and annotations as well as structure-tree validation and repair.

By default, the tool can automatically remediate issues such as:

Missing document language (Lang, defaulting to en-US)
Missing logical tab order (Tabs/S)
Missing tagged-PDF marker (MarkInfo/Marked)
Missing Document root element
Untagged link annotations by creating Link tags
Unexpected pushbutton widget annotations
Broken ligature mappings
Unverified PDF/UA conformance claims in XMP metadata
Common list-structure problems such as missing LI / LBody, flattened list hierarchies, and mislabeled bullets
Needless nesting, empty tags, some mistagged artifacts, and related structure cleanup

Some issues are detected but not automatically fixed. For example:

Missing alt text on figures
PDFs that are image-only and need OCR first
PDFs with no structure tree at all

Documents must already be tagged. The tool validates missing structure trees, but it does not create a full structure tree for an untagged PDF.

When It Helps

This tool is useful when:

You have tagged PDFs with recurring list or tag-shape issues from Word, InDesign, or similar authoring tools
You want consistent remediation across many files
You want to identify structural issues before manual review
You want to automate the repetitive fixes and focus manual effort on issues that require judgment

Usage

CLI

./pdf-autoa11y [options] <input.pdf> [output.pdf|output_dir/]

If output is omitted, the result is written next to the input as <base>_autoa11y.pdf. If output is a directory, the output filename is generated inside that directory.

Common options:

-a, --analyze analyze only; do not write an output PDF
-r, --report save the console output to a text report file
-r=<file>, --report=<file> write the report to a custom path
-p, --password <pw> open encrypted PDFs
-q, --quiet show only errors and final status
-v, --verbose show detailed processing information
-vv, --debug show full debug output
-t, --print-tree print the structure tree during processing
--dump-tree print the structure tree with MCRs/annotations and exit
--dump-roles print the structure tree roles and exit
--skip-checks <a,b,c> skip specific checks
--only-checks <a,b,c> run only specific checks
--sidecar <file> use a specific sidecar config file
--create-sidecar create a template sidecar config file and exit
-f, --force write an output PDF even if no fixes were applied

Run ./pdf-autoa11y --help for the full CLI help text.

Examples:

# Remediate using the default output name
./pdf-autoa11y document.pdf

# Analyze only and save a report next to the input
./pdf-autoa11y -a -r document.pdf

# Choose both output and report paths
./pdf-autoa11y --report=out/report.txt document.pdf out/fixed.pdf

# Write output into a directory
./pdf-autoa11y document.pdf out/

# Process an encrypted PDF
./pdf-autoa11y -p mypassword document.pdf

# Inspect the structure tree without remediating
./pdf-autoa11y --dump-tree document.pdf

GUI

./pdf-autoa11y-gui

The GUI lets you select a PDF, enter a password if needed, view processing output in real time, and save the remediated PDF.

Reports and Output

Use -r/--report to save the same console output to a text file.

If you pass -r/--report without a path, the report defaults to <base>_autoa11y.txt:

In analyze mode, the report is written next to the input PDF
In remediation mode, the report is written next to the output PDF

Analysis and remediation can report different issue counts. Analysis scans the original document in a single pass, while remediation applies fixes sequentially, so earlier fixes may resolve issues before later checks run.

The output includes:

Checks that passed
Issues found
Fixes applied
Warnings that still require manual review
Page references or object references when available

Temporary Pipeline Files

Remediation runs as a stepwise pipeline and writes intermediate PDFs to a temp directory. By default, that location is:

/tmp/pdf-autoa11y/pipeline/<input-filename>/

You can override the base directory with either:

AUTOA11Y_PIPELINE_DIR=/path/to/dir
-Dautoa11y.pipeline.dir=/path/to/dir

Intermediate files are currently kept by default, which is useful for debugging and inspecting each processing step.

Project Docs

CHANGELOG.md for release notes
CONTRIBUTING.md for development workflow
docs/sidecar.md for per-PDF sidecar config reference
docs/checks-and-fixes.md for the full list of checks and fixes
docs/architecture-decisions.md for architecture decision records

Development

Run tests:

mvn test

Persist test PDFs for review:

mvn test -Dpdf.autoa11y.testOutputDir=/tmp/pdf-autoa11y-tests

Run the app through Maven:

mvn exec:java -Dexec.args="input.pdf"

Apply code formatting:

mvn spotless:apply

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

This project uses iText PDF, which is also licensed under AGPL. As a derivative work, PDF-Auto-A11y must remain under AGPL-compatible terms.

In practical terms:

You may use, modify, and distribute this software
If you modify it and make it available over a network, you must also make the modified source available under AGPL-3.0
Software incorporating this code must also comply with AGPL obligations

See LICENSE for the full license text.

Name		Name	Last commit message	Last commit date
Latest commit History 668 Commits
docs		docs
src		src
.git-blame-ignore-revs		.git-blame-ignore-revs
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
pdf-autoa11y		pdf-autoa11y
pdf-autoa11y-gui		pdf-autoa11y-gui
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PDF-Auto-A11y

Quick Start

What It Does

When It Helps

Usage

CLI

GUI

Reports and Output

Temporary Pipeline Files

Project Docs

Development

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PDF-Auto-A11y

Quick Start

What It Does

When It Helps

Usage

CLI

GUI

Reports and Output

Temporary Pipeline Files

Project Docs

Development

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages