📄 Automated PDF Time Extractor

A Node.js CLI tool that batch-parses PDF reports, extracts time-tracking data with regular expressions, and calculates aggregated totals. It supports controlled parallel processing and ships with tests and a CI pipeline.

💡 Context & Motivation

This project was created to solve a real-world productivity bottleneck encountered during a consulting engagement.

Billable hours were distributed across dozens of auto-generated, unstructured PDF reports. Manually opening each file, locating time entries, and summing values was slow, repetitive, and error-prone.

The goal was to build a reliable, reusable, and auditable CLI tool that performs this task automatically.

🚀 Features

Batch processing of PDF files
Controlled parallel processing
Configurable concurrency
Command Line Interface (CLI)
Optional debug logging
Unit tests and CLI contract tests
GitHub Actions CI pipeline
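
The extraction and aggregation step can be pictured with a minimal sketch. It assumes durations appear in the PDF text as H:MM or HH:MM (for example "1:30"); the actual pattern depends on the report layout, and the names DURATION_PATTERN, extractMinutes, sumMinutes, and formatTotal are hypothetical, not the tool's real API.

```javascript
// Hypothetical pattern: durations written as H:MM or HH:MM (e.g. "1:30", "07:45").
// Note that clock times such as "09:15" would match as well, so a real
// pattern would likely need more context from the report layout.
const DURATION_PATTERN = /\b(\d{1,3}):([0-5]\d)\b/g;

// Return every duration found in a block of text, converted to minutes.
function extractMinutes(text) {
  const entries = [];
  for (const match of text.matchAll(DURATION_PATTERN)) {
    entries.push(Number(match[1]) * 60 + Number(match[2]));
  }
  return entries;
}

// Sum the durations found across several documents' text, in minutes.
function sumMinutes(texts) {
  return texts.flatMap(extractMinutes).reduce((total, m) => total + m, 0);
}

// Format a minute count as H:MM.
function formatTotal(totalMinutes) {
  const hours = Math.floor(totalMinutes / 60);
  const minutes = String(totalMinutes % 60).padStart(2, "0");
  return `${hours}:${minutes}`;
}
```

Working in whole minutes keeps the arithmetic exact; converting to a display format only at the end avoids rounding drift when many entries are summed.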

📦 Installation

Local development

npm install

Run via npx (after publish)

npx pdf-time-extractor

🛠️ Usage

pdf-time-extractor [directory] [options]

Examples

# Default (uses ./documents)
pdf-time-extractor

# Custom directory
pdf-time-extractor ./documents

# Parallel processing
pdf-time-extractor ./documents --concurrency 6

# Enable debug logs
pdf-time-extractor ./documents --verbose

⚙️ Options

--concurrency <number>   Number of parallel workers (default: 4)
--verbose                Enable debug output
--help                   Display usage information
--version                Display the current version
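
To illustrate what a concurrency limit such as --concurrency means in practice, here is a minimal sketch of one common approach: a fixed number of async workers pulling items from a shared queue. This is not necessarily how the tool implements it; mapWithConcurrency and the worker callback are hypothetical names used only for illustration.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner keeps claiming the next unprocessed item until none remain.
  async function runWorker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Never start more runners than there are items, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, runWorker));
  return results;
}
```

With this shape, a call like mapWithConcurrency(pdfPaths, 4, parsePdf) would process four files at a time, where pdfPaths and parsePdf stand in for the caller's own file list and parsing function. If any worker call rejects, the returned promise rejects as well, so a real tool would need to decide whether one unreadable PDF should abort the whole batch.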

📂 Project Structure

src/
├── cli/           # CLI interface and argument parsing
├── core/          # Core processing logic
├── services/      # External I/O and integrations
└── utils/         # Pure utility functions
tests/             # Unit and CLI tests
.github/workflows/ # CI configuration
documents/         # Sample / input PDFs

🧪 Testing & CI

Run tests locally:

npm test
npm run test:coverage

All tests and coverage checks run automatically on every push and pull request via GitHub Actions.

📄 License

MIT License
