Text Mate Backend is a Python FastAPI service that provides AI-powered text analysis and transformation. This repository contains the backend services for the Text Mate application; the frontend is built with Nuxt.js and is available at https://github.com/DCC-BS/text-mate-frontend.
- Text Rewriting: Advanced text transformation with customizable parameters
- Document Advisor: Validates text against reference documents and style guides
- Word Synonyms: Intelligent synonym suggestions based on context
- Sentence Rewrite: Context-aware sentence transformation
- Document Conversion: Convert documents using Docling service (PDF, DOCX, etc.)
Many specialized AI-powered text transformations:
- Summarize: Generate concise summaries of long texts
- Bullet Points: Convert paragraphs into structured bullet points
- Formality: Adjust text formality level (formal/informal)
- Medium Length: Optimize text for medium-length output
- Plain Language: Simplify complex text to plain language
- Social Media: Optimize content for social media platforms
- Proofread: Comprehensive grammar and style checking
- Character Speech: Adapt text to character voice and speech patterns
- Custom: Flexible custom text transformations
- Streaming Responses: Real-time text generation with streaming support
- Health Probes: Built-in health check endpoints for all services
- Logfire Integration: Advanced debugging and monitoring in development mode
- Azure AD Authentication: Enterprise-ready authentication with Azure AD
- Framework: FastAPI with Python 3.13+
- Package Manager: uv
- Dependency Injection: Dependency-Injector
- LLM Integration: pydantic-ai for AI model integration
- AI Model: Qwen3 32B served via vLLM
- Document Processing: Docling
- Containerization: Docker and Docker Compose
- Monitoring: Logfire for observability
- Python: 3.13 or higher
- uv package manager: Installation guide
- Docker & Docker Compose: For containerized deployment
- NVIDIA GPU with CUDA support:
- Minimum 2 GPUs recommended (one for vLLM, one for Docling)
- GPU memory: ~20GB for Qwen3-32B-AWQ model
- CUDA toolkit installed
- varlock: For environment variables validation (optional but recommended)
- pass-cli: For varlock with Proton Pass integration
Create a .env file in the project root with the required environment variables:
AUTH_MODE=none # or azure
LOG_LEVEL=debug
HMAC_SECRET=... # create a new secret with openssl rand 32 | base64
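As an alternative to the openssl command, a secret of the same shape (32 random bytes, Base64-encoded) can be produced with the Python standard library. This is a minimal sketch; the function name generate_hmac_secret is illustrative and not part of the backend.

```python
import base64
import secrets


def generate_hmac_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into .env as HMAC_SECRET=...
    print(generate_hmac_secret())
```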
The following environment variables have defaults and can be overridden as needed:
| Variable | Description | Default | Type |
|---|---|---|---|
| Environment Settings | | | |
| APP_MODE | Application mode (controls varlock validation) | dev | enum: dev, ci, build, prod |
| IS_PROD | Flag for production mode (used by logger) | Auto-calculated from APP_MODE | boolean |
| Ports | | | |
| PORT | FastAPI backend app port | 8000 | port |
| LLM_API_PORT | LLM API port | 8001 | port |
| CLIENT_PORT | Client application port | 3000 | port |
| DOCLING_API_PORT | Docling API port | 5001 | port |
| URLs | | | |
| CLIENT_URL | Client application URL | http://localhost:3000 (dev) | URL |
| DOCLING_URL | Docling service URL | http://localhost:5001/v1 (dev) | URL |
| LLM_URL | LLM API URL | http://localhost:8001/v1 (dev) | URL |
| LLM_HEALTH_CHECK_URL | LLM health check URL | http://localhost:8001/health (dev) | URL |
| LLM Configuration | | | |
| LLM_MODEL | Model for LLM API | Qwen/Qwen3-32B-AWQ | string |
| LLM_API_KEY | API key for OpenAI authentication | none | string (sensitive in prod) |
| Service Keys | | | |
| DOCLING_API_KEY | Docling API key | none | string (sensitive in prod) |
| HUGGING_FACE_HUB_TOKEN | Hugging Face API token | - | string (optional, sensitive) |
| Docker Cache Directories | | | |
| CACHE_DIR | Base cache directory | ~/.cache | path |
| HUGGING_FACE_CACHE_DIR | Hugging Face cache directory | ${CACHE_DIR}/huggingface | path |
Note: URLs are automatically set based on the APP_MODE. In production, these must be configured explicitly.
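The note above can be illustrated with a small sketch. It assumes that an explicitly set value always wins, that the dev defaults are the ones listed in the table, and that every mode other than dev requires explicit configuration (the source only states this for production). The helper resolve_url is hypothetical; in the project itself this resolution is driven by the varlock schema, not by this code.

```python
# Dev defaults copied from the URLs section of the table above.
DEV_URL_DEFAULTS = {
    "CLIENT_URL": "http://localhost:3000",
    "DOCLING_URL": "http://localhost:5001/v1",
    "LLM_URL": "http://localhost:8001/v1",
    "LLM_HEALTH_CHECK_URL": "http://localhost:8001/health",
}


def resolve_url(name: str, env: dict) -> str:
    """Hypothetical helper: an explicit value wins, dev falls back to a default."""
    value = env.get(name)
    if value:
        return value
    app_mode = env.get("APP_MODE", "dev")
    if app_mode == "dev" and name in DEV_URL_DEFAULTS:
        return DEV_URL_DEFAULTS[name]
    # Assumption: all non-dev modes need the URL to be configured explicitly.
    raise ValueError(f"{name} must be set explicitly when APP_MODE={app_mode}")
```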
When AUTH_MODE=azure, the following Azure AD variables are required:
| Variable | Description | Default | Type |
|---|---|---|---|
| AZURE_CLIENT_ID | Azure AD application client ID | - | UUID (required) |
| AZURE_TENANT_ID | Azure AD tenant ID | - | UUID (required) |
| AZURE_FRONTEND_CLIENT_ID | Azure AD frontend application client ID | - | UUID (required) |
| AZURE_SCOPE_DESCRIPTION | Azure AD authentication scope | user_impersonation | string |
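A quick pre-flight check for the Azure variables might look like the sketch below. It only verifies that the three required IDs parse as UUIDs when AUTH_MODE=azure; the function name azure_config_errors is illustrative, and the project's actual validation is done by varlock.

```python
import uuid

REQUIRED_AZURE_UUIDS = (
    "AZURE_CLIENT_ID",
    "AZURE_TENANT_ID",
    "AZURE_FRONTEND_CLIENT_ID",
)


def azure_config_errors(env: dict) -> list:
    """Return a list of problems; an empty list means the config looks valid."""
    if env.get("AUTH_MODE") != "azure":
        return []  # the Azure variables are only required in azure mode
    errors = []
    for name in REQUIRED_AZURE_UUIDS:
        try:
            uuid.UUID(env.get(name, ""))
        except ValueError:
            errors.append(f"{name} is missing or not a valid UUID")
    return errors
```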
Note: You can create a Hugging Face token at https://huggingface.co/settings/tokens.
Use varlock to validate the environment variables:
varlock load
Install dependencies and pre-commit hooks:
make install
Or manually:
uv sync
uv run pre-commit install
The application consists of the following main services:
| Service | Port | Description |
|---|---|---|
| FastAPI Backend | 8000 | Main application API |
| vLLM Service | 8001 | Qwen3-32B-AWQ model inference (v0.17.1) |
| Docling | 5001 | Document conversion service |
- GPU 0 (device_ids: ["0"]): vLLM service for LLM inference
- GPU 1 (device_ids: ["1"]): Docling for document processing
# Start all required services with Docker
make docker-up
# Start the development server
make dev
The API will be available at http://localhost:8000
Access the interactive API documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Health Check: http://localhost:8000/health
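To check the health endpoint from a script instead of a browser, something like the following could be used. It is a sketch that assumes a healthy backend answers with HTTP 200 (the response format is not described here); fetch_status and is_healthy are illustrative names. The fetch parameter exists so the check can be exercised without a running server.

```python
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses still carry a status code


def is_healthy(url: str, fetch: Callable[[str], int] = fetch_status) -> bool:
    """True if the endpoint answers with 200, False on errors or timeouts."""
    try:
        return fetch(url) == 200
    except OSError:  # URLError, refused connections, and timeouts
        return False


# Usage: is_healthy("http://localhost:8000/health")
```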
# Run code quality checks (format, lint, type check)
make check
# Run tests
make test
# Run tests with coverage
uv run pytest --cov=src/text_mate_backend tests/
# Run specific test file
uv run pytest tests/test_example.py
| Command | Description |
|---|---|
| make install | Install dependencies and pre-commit hooks |
| make dev | Start development server with hot reload |
| make run | Start production server |
| make test | Run test suite |
| make check | Run format, lint, type check, and varlock scan |
| make docker-up | Start all Docker services |
| make docker-down | Stop all Docker services |
| make docker-logs | View Docker service logs |
| make help | Show all available commands |
make run
Or manually:
FORCE_COLOR=1 varlock run -- uv run fastapi run ./src/text_mate_backend/app.py --port 8000
docker pull ghcr.io/dcc-bs/text-mate-backend:latest
# Build the Docker image
docker build -t text-mate-backend .
# Run the container
docker run -p 8000:8000 text-mate-backend
src/text_mate_backend/
├── app.py                          # FastAPI application entry point
├── container.py                    # Dependency injection container
├── agents/                         # AI agent implementations
│   └── agent_types/
│       ├── quick_actions/          # Quick action agents (8 types)
│       ├── advisor_agent.py        # Document advisor
│       ├── sentence_rewrite_agent.py
│       └── word_synonym_agent.py
├── models/                         # Pydantic data models and schemas
├── routers/                        # API endpoint definitions
│   ├── advisor.py                  # Document advisor endpoint
│   ├── convert_route.py            # Document conversion endpoint
│   ├── quick_action.py             # Quick actions endpoint
│   ├── sentence_rewrite.py         # Sentence rewrite endpoint
│   └── word_synonym.py             # Word synonym endpoint
├── services/                       # Business logic services
│   ├── actions/                    # Quick action service
│   └── document_conversion_service.py
└── utils/                          # Utility functions and helpers
    ├── auth.py                     # Authentication utilities
    ├── configuration.py            # Configuration management
    └── middleware.py               # Request/response middleware
text_mate_tools/                    # Utility scripts
├── preprocess_document_rules.py    # AI-assisted rule extraction from PDFs
├── count_rules_per_file.py         # Rule count per collection and source PDF
└── analyse_ruels.py                # Rule analysis across all collections
assets/docs/
├── rules/                          # Rule collections (one JSON per collection)
│   ├── bundeskanzlei.json          # Merged Bundeskanzlei rules (51 rules)
│   └── merkblatt_behoerdenbriefe.json  # Behördenbriefe rules (14 rules)
├── meta/
│   └── bund_dokumente.json         # Collection metadata shown to API consumers
└── *.pdf                           # Source PDF documents
assets/actions/                     # Role-gated custom quick actions (Markdown)
├── goblin.md                       # Example: admin-only action
└── middleage-slang.md              # Example: admin-only action
tests/                              # Unit and integration tests
Custom quick actions let you add role-gated LLM instructions without touching Python code. They appear alongside the built-in quick actions (Summarize, Plain Language, etc.) in the frontend and are executed by the same POST /quick-action endpoint.
- At startup the backend scans assets/actions/*.md and loads every file as a UserAction.
- GET /user-action returns only the actions the current user may see, filtered by their Azure Entra ID roles.
- The frontend calls POST /quick-action with { "action": "<id>", "text": "..." }.
- If the action value is not a built-in action ID, the service looks it up in the loaded user actions and uses its Markdown body as the LLM system prompt.
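The lookup described in the last step can be sketched as follows. This is not the service's actual code: the UserAction fields and the function resolve_system_prompt are assumptions made for illustration, and only the built-in action IDs are taken from this README.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

BUILT_IN_ACTIONS = {
    "plain_language", "bullet_points", "summarize", "social_mediafy",
    "formality", "medium", "custom", "proofread", "character_speech",
}


@dataclass
class UserAction:
    id: str
    name: str
    body: str  # Markdown body, used as the LLM system prompt
    groups: List[str] = field(default_factory=list)


def resolve_system_prompt(action: str, user_actions: Dict[str, UserAction]) -> Optional[str]:
    """Return the custom prompt for action, or None if the action is built in."""
    if action in BUILT_IN_ACTIONS:
        return None  # handled by the built-in quick action agents
    if action not in user_actions:
        raise KeyError(f"Unknown quick action: {action}")
    return user_actions[action].body
```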
Each action is a single Markdown file in assets/actions/ with a YAML frontmatter block:
---
id: my-action-id
name: Display Name
groups: ["role-name-in-azure"]
---
Write the LLM instruction here. This becomes the system prompt.
You can use full Markdown (headings, lists, code blocks) to structure the prompt.
| Field | Required | Description |
|---|---|---|
| id | yes | Unique identifier. Used as the action value in POST /quick-action. Must not clash with built-in action IDs (plain_language, bullet_points, summarize, social_mediafy, formality, medium, custom, proofread, character_speech). |
| name | yes | Display name shown to the user in the frontend. |
| groups | no | List of Azure Entra ID role names that may see and run this action. Empty list ([] or omitted) makes it visible to all authenticated users. |
The body (everything after the closing ---) is sent verbatim as the LLM instruction. It has access to the user's input text.
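To make the file layout concrete, here is a minimal standard-library sketch that splits such a file into its frontmatter fields and body. It is deliberately simplistic: it handles only flat key: value lines and reads groups as JSON, which works for the list syntax shown above but is not a full YAML parser. The backend presumably uses a proper YAML library; parse_action_file is a hypothetical name.

```python
import json


def parse_action_file(text: str) -> dict:
    """Split a '---' delimited frontmatter block from the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a '---' frontmatter line")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is not closed with '---'")

    meta = {"groups": []}  # groups is optional and defaults to an empty list
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # The list syntax used for groups is also valid JSON.
        meta[key] = json.loads(value) if key == "groups" else value

    meta["body"] = "\n".join(lines[end + 1:]).strip()
    return meta
```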
groups values are matched against the roles on the authenticated user's Azure Entra ID token. A user must have at least one of the listed roles to see the action. When AUTH_MODE=none (dev), the /user-action endpoint returns an empty list (no user context available).
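The visibility rule amounts to a set intersection. The sketch below assumes actions are plain dictionaries with an optional groups list and that the user's roles have already been extracted from the token; visible_actions is an illustrative name, not the endpoint's implementation.

```python
def visible_actions(actions: list, user_roles: set, auth_mode: str = "azure") -> list:
    """Keep actions whose groups are empty or share a role with the user."""
    if auth_mode == "none":
        return []  # no user context is available in dev mode
    return [
        action
        for action in actions
        if not action.get("groups") or user_roles & set(action["groups"])
    ]
```

For example, a user whose only role is editor would see an action with groups [] but not one restricted to ["admin"].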
---
id: goblin
name: Goblin Rewrite
groups: ["admin"]
---
Rewrite a text like you are a goblin.
This action is only visible to users with the admin role in Azure Entra ID. Any other user will not see it in GET /user-action and cannot trigger it.
- Create a .md file in assets/actions/ following the format above.
- Restart the backend; actions are loaded once at startup.
- Verify the action appears for the right users via GET /user-action.
No code change is required. The file name does not matter; only the id field is used.
The Document Advisor validates text against editorial rules sourced from Bundeskanzlei PDFs. Rules are organized into collections — logical groups exposed to API consumers:
| Collection ID | File | Source PDFs |
|---|---|---|
| bundeskanzlei | assets/docs/rules/bundeskanzlei.json | Schreibweisungen, Rechtschreibleitfaden, Empfehlungen Anglizismen, Geschlechtergerechte Sprache |
| merkblatt_behoerdenbriefe | assets/docs/rules/merkblatt_behoerdenbriefe.json | Merkblatt Behördenbriefe |
Each rule has:
- name: short rule title
- description: full rule description
- file_name: source PDF filename (used for citation in violations)
- page_number: page in the source PDF
- example: Falsch: ... | Richtig: ... string
- collection: collection ID (used for filtering; must match id in bund_dokumente.json)
Collection metadata shown to API consumers is in assets/docs/meta/bund_dokumente.json. Each entry has:
- id: collection ID (matches Rule.collection)
- title / description / author / edition: display metadata
- files: list of downloadable source PDFs
- access: list of roles, or ["all"] for public access
Option A — Manual: Edit the collection JSON directly.
Add a rule object to the rules array in the appropriate file (e.g. assets/docs/rules/bundeskanzlei.json):
{
"name": "Rule name",
"description": "Full rule description.",
"file_name": "schreibweisungen.pdf",
"page_number": 42,
"example": "Falsch: ... | Richtig: ...",
"collection": "bundeskanzlei"
}
file_name must be an existing PDF under assets/docs/. collection must match the id in bund_dokumente.json.
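The two constraints above are easy to check before committing a new rule. The sketch below is a hypothetical helper, not what make check runs; it also assumes that all six rule fields are mandatory, which this README implies but does not state outright.

```python
REQUIRED_RULE_FIELDS = (
    "name", "description", "file_name", "page_number", "example", "collection",
)


def rule_errors(rule: dict, collection_ids: set, pdf_names: set) -> list:
    """Return a list of problems found in a single rule object."""
    errors = [f"missing field: {name}" for name in REQUIRED_RULE_FIELDS if name not in rule]
    if "collection" in rule and rule["collection"] not in collection_ids:
        errors.append(f"unknown collection: {rule['collection']}")
    if "file_name" in rule and rule["file_name"] not in pdf_names:
        errors.append(f"unknown source PDF: {rule['file_name']}")
    return errors
```

Here collection_ids would be the id values from bund_dokumente.json and pdf_names the PDF file names found under assets/docs/.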
Option B — AI extraction from a PDF: Use the preprocessing tool to extract rules automatically, then review and merge.
# Extract rules from a PDF into a staging directory
uv run --env-file .env src/text_mate_tools/preprocess_document_rules.py \
assets/docs/schreibweisungen.pdf \
--collection bundeskanzlei \
--output ./staging/rules
# Review the output
cat staging/rules/schreibweisungen.json
# Copy rules into the collection file (manual merge or jq)
After editing, run make check to verify everything is valid.
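For the merge step, a small script can stand in for hand-editing or jq. The sketch below works on already-loaded rule lists and skips staged rules that duplicate an existing one. Using (file_name, name) as the duplicate key is an assumption made here, as is the idea that the staging file holds rule objects of the same shape as the collection file.

```python
def merge_rules(existing: list, staged: list) -> list:
    """Append staged rules, skipping any whose (file_name, name) already exists."""
    seen = {(rule["file_name"], rule["name"]) for rule in existing}
    merged = list(existing)  # leave the input list untouched
    for rule in staged:
        key = (rule["file_name"], rule["name"])
        if key not in seen:
            seen.add(key)
            merged.append(rule)
    return merged
```

The inputs would typically come from json.load on the collection file's rules array and on the reviewed staging file; the merged list still needs a human review before it is written back.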
- Add rules JSON: create assets/docs/rules/<collection-id>.json with the collection field set on every rule.
- Add the source PDF: place the PDF in assets/docs/.
- Register the collection: add an entry to assets/docs/meta/bund_dokumente.json:
{
"title": "Collection display name",
"description": "Short description for the UI",
"author": "Author name",
"edition": "Edition string",
"id": "<collection-id>",
"files": ["source.pdf"],
"access": ["all"]
}
- Run checks: make check.
API impact: Adding a new collection is a non-breaking change; consumers only see the new entry when they call GET /advisor/docs. Renaming or removing a collection ID is breaking.
# Count rules per collection and source PDF
uv run src/text_mate_tools/count_rules_per_file.py
# Detailed analysis (char counts per collection)
uv run src/text_mate_tools/analyse_ruels.py
Issue: Out of memory errors when starting vLLM service
Solutions:
- Ensure GPU has at least 20GB memory
- Reduce --gpu-memory-utilization in docker-compose.yml (default: 0.90)
- Reduce --max-model-len (default: 6000)
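A rough back-of-the-envelope calculation shows why these settings matter. The sketch treats the model as 32 billion parameters stored at 4 bits each (a typical AWQ setting, assumed here rather than taken from the model card) and ignores quantization metadata and runtime overhead, so the numbers are estimates only.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def vllm_budget_gb(gpu_memory_gb: float, utilization: float = 0.90) -> float:
    """Memory vLLM may claim on one GPU for weights, activations, and KV cache."""
    return gpu_memory_gb * utilization


weights = weight_memory_gb(32e9, 4)  # about 16 GB for the weights alone
budget = vllm_budget_gb(24.0)        # budget on a hypothetical 24 GB GPU
headroom = budget - weights          # what remains for activations and KV cache
```

Whatever remains after the weights is shared by activations and the KV cache. A smaller --max-model-len lowers the KV cache requirement, while a lower --gpu-memory-utilization shrinks the total budget and mainly helps when other processes need memory on the same GPU.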
Issue: Cannot download model from Hugging Face
Solutions:
- Verify HUGGING_FACE_HUB_TOKEN is set correctly
- Ensure token has read access to the model repository
- Create token at https://huggingface.co/settings/tokens
Issue: Health check endpoint returns errors
Solutions:
- Check if all services are running: docker ps
- View service logs: make docker-logs
- Verify URLs in .env match Docker service names
- Check GPU availability: nvidia-smi
Issue: varlock validation fails
Solutions:
- Ensure pass-cli is installed and authenticated
- Check Proton Pass credentials
- Verify .env.schema syntax
- Run varlock load for detailed errors
Issue: Azure AD authentication fails
Solutions:
- Verify all Azure environment variables are set
- Check AZURE_CLIENT_ID and AZURE_TENANT_ID are correct
- Ensure Azure AD app registration is configured properly
- Verify redirect URIs match your application URL
MIT © Data Competence Center Basel-Stadt
Datenwissenschaften und KI
Developed with ❤️ by DCC - Data Competence Center
