A service for uploading PDF documents and extracting their text content asynchronously.
The system consists of two components:
- PdfProcessor.Api — REST API for uploading PDF files and retrieving processing results. Accepts a file, saves it to disk, creates a record in the database, and publishes a task to the queue.
- PdfProcessor.Worker — background service that listens to the RabbitMQ queue, extracts text from PDFs, and saves the result to the database.
Infrastructure: PostgreSQL (data storage), RabbitMQ (message queue), shared Docker volume for PDF files.
| Method | Path | Description |
|---|---|---|
| POST | /api/documents | Upload a PDF (multipart/form-data, field `file`, max 50 MB) |
| GET | /api/documents | List all documents |
| GET | /api/documents/{id}/content | Get extracted text for a document |
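
The endpoints above can be called with `curl`. The following is a minimal sketch, not part of the project: the function names (`fits_upload_limit`, `upload_pdf`, `list_documents`, `get_content`) and the `API_URL` variable are hypothetical, and the 50 MB limit is interpreted as 50 × 1024 × 1024 bytes, which is an assumption about how the server counts it.

```shell
# Assumed base URL; the API is listed at http://localhost:8080 in this README.
API_URL="${API_URL:-http://localhost:8080}"

# ASSUMPTION: "max 50 MB" is treated as 50 * 1024 * 1024 bytes.
MAX_BYTES=$((50 * 1024 * 1024))

# Return 0 if the file exists and does not exceed the upload limit.
fits_upload_limit() {
  local file="$1"
  [ -f "$file" ] || return 1
  # Arithmetic expansion strips any padding that wc may print.
  local size=$(( $(wc -c < "$file") ))
  [ "$size" -le "$MAX_BYTES" ]
}

# POST /api/documents: send the PDF as multipart/form-data in the "file" field.
upload_pdf() {
  local file="$1"
  if ! fits_upload_limit "$file"; then
    echo "missing or larger than 50 MB: $file" >&2
    return 1
  fi
  curl --silent --show-error --fail \
    -F "file=@${file};type=application/pdf" \
    "$API_URL/api/documents"
}

# GET /api/documents: list all documents.
list_documents() {
  curl --silent --show-error --fail "$API_URL/api/documents"
}

# GET /api/documents/{id}/content: fetch the extracted text for one document.
get_content() {
  curl --silent --show-error --fail "$API_URL/api/documents/$1/content"
}
```

After sourcing the snippet, `upload_pdf ./report.pdf` uploads a file and `get_content <id>` retrieves its text once processing has finished. The local size check only avoids sending a request that the server would reject anyway; the server remains the authority on the limit.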
Swagger UI is available at http://localhost:8080.
Document statuses: pending → processing → completed / failed.
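
Because processing is asynchronous, a client typically polls until a document reaches one of the two end states. The sketch below shows one way to do that. It rests on assumptions that this README does not confirm: that `GET /api/documents` returns a JSON array whose items carry `id` and `status` fields, that status values are lowercase as written above, and that `jq` is installed. All function names are hypothetical.

```shell
API_URL="${API_URL:-http://localhost:8080}"

# completed and failed are the end states of the lifecycle described above.
is_terminal_status() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# ASSUMPTION: the list endpoint returns a JSON array of objects with
# "id" and "status" fields. Adjust the jq filter to the real response shape.
fetch_status() {
  curl --silent --show-error --fail "$API_URL/api/documents" |
    jq -r --arg id "$1" '.[] | select((.id | tostring) == $id) | .status'
}

# Poll until the document reaches a terminal status or attempts run out.
# Prints the last status seen; returns 0 only when that status is "completed".
wait_for_document() {
  local id="$1" attempts="${2:-30}" delay="${3:-2}"
  local status="" i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(fetch_status "$id") || status=""
    if is_terminal_status "$status"; then
      break
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  printf '%s\n' "$status"
  [ "$status" = "completed" ]
}
```

For example, `wait_for_document <id> 30 2` checks up to 30 times with a 2-second pause between checks. Bounding the number of attempts keeps a script from hanging if the worker is down and the document stays in `pending`.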
# Copy the environment configuration and fill in the values
cp .env.example .env
# Start all services
docker compose up --build

Once started:
- API: http://localhost:8080
- RabbitMQ Management UI: http://localhost:15672 (guest / guest)
Requirements: .NET 10 SDK, PostgreSQL, RabbitMQ.
# API
dotnet run --project src/PdfProcessor.Api
# Worker (in a separate terminal)
dotnet run --project src/PdfProcessor.Worker

Set the following environment variables or add them to appsettings.Development.json:
ConnectionStrings__DefaultConnection=Host=localhost;Database=pdfprocessing;Username=postgres;Password=postgres
RabbitMQ__Host=localhost
RabbitMQ__Username=guest
RabbitMQ__Password=guest
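
In .NET configuration, a double underscore in an environment variable name stands for the `:` section separator, so `RabbitMQ__Host` supplies the `RabbitMQ:Host` key (the `Host` property inside a `RabbitMQ` section in a JSON settings file). The sketch below exports the values listed above for the current shell session; `config_key` is a hypothetical helper added only to illustrate the name mapping.

```shell
# Values copied from the list above; adjust them to your local setup.
export ConnectionStrings__DefaultConnection="Host=localhost;Database=pdfprocessing;Username=postgres;Password=postgres"
export RabbitMQ__Host="localhost"
export RabbitMQ__Username="guest"
export RabbitMQ__Password="guest"

# Hypothetical helper: print the .NET configuration key that an
# environment variable name maps to ("__" becomes ":").
config_key() {
  printf '%s\n' "$1" | sed 's/__/:/g'
}
```

Running `config_key ConnectionStrings__DefaultConnection` prints `ConnectionStrings:DefaultConnection`. Variables exported this way apply only to processes started from the same shell, so run both `dotnet run` commands from a session where they are set.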
# API
docker build -t pdf-processor-api -f Dockerfile .
# Worker
docker build -t pdf-processor-worker -f src/PdfProcessor.Worker/Dockerfile .