HTTP service that automatically tags and captions photos using a vision AI model, writing the results back into the image's EXIF/IPTC/XMP metadata. Tags and captions are generated in both English and German.
- Accepts HEIC, HEIF, and JPEG images
- Returns the same format it receives (HEIC in → HEIC out)
- Sends images to any OpenAI-compatible vision endpoint
- Writes metadata using exiftool:
  - XMP:Subject: all keywords (EN + DE), works in HEIC and JPEG
  - XMP-dc:Description-en/de: language-tagged captions
  - IPTC:Keywords + IPTC:Caption-Abstract: JPEG only
- All prompts and endpoint settings are configurable via config.yaml
- Docker + Docker Compose
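
The metadata rules in the feature list above (XMP for both formats, IPTC for JPEG only) can be sketched as a small argument builder. This is an illustrative sketch, not the service's actual metadata.py: the function name `build_exiftool_args` is hypothetical, and writing all keywords plus the English caption into the IPTC fields is an assumption the feature list does not spell out.

```python
from pathlib import Path

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def build_exiftool_args(path, tags_en, tags_de, caption_en, caption_de):
    """Build an exiftool command line for one image (hypothetical helper)."""
    keywords = [*tags_en, *tags_de]

    # -overwrite_original stops exiftool from leaving a "_original" backup copy.
    args = ["exiftool", "-overwrite_original"]

    # Repeating the same list tag with "=" writes one list item per argument.
    args += [f"-XMP:Subject={kw}" for kw in keywords]

    # A language code appended to a lang-alt tag name selects that language.
    args.append(f"-XMP-dc:Description-en={caption_en}")
    args.append(f"-XMP-dc:Description-de={caption_de}")

    # IPTC is written for JPEG only; HEIC relies on XMP alone.
    if Path(path).suffix.lower() in JPEG_SUFFIXES:
        args += [f"-IPTC:Keywords={kw}" for kw in keywords]
        # Assumption: the English caption goes into the single IPTC caption field.
        args.append(f"-IPTC:Caption-Abstract={caption_en}")

    args.append(str(path))
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where exiftool is installed. Building the list separately keeps the format-specific logic easy to test without touching any files.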
1. Clone and configure
Copy config.yaml and edit it:

vision:
  base_url: "https://api.openai.com/v1"  # any OpenAI-compatible endpoint
  api_key: ""  # or use VISION_API_KEY env var
  model: "gpt-4o"
  max_tokens: 512
  image_max_size: 1920  # resize longest edge before sending
prompts:
  system: >
    You are an expert photo metadata specialist. Analyze images carefully and return
    accurate, descriptive tags and captions in both English and German.
    Always respond with valid JSON only.
  user: >
    Analyze this image and return a JSON object with exactly four fields:
    - "tags_en": an array of concise English keyword strings
    - "tags_de": the same tags translated into German
    - "caption_en": a single English sentence describing the image
    - "caption_de": the same caption translated into German
server:
  host: "0.0.0.0"
  port: 8000

The API key can also be set via environment variable (takes precedence over config.yaml):

export VISION_API_KEY=sk-...

2. Start

docker compose up --build

The config.yaml is mounted into the container, so you can edit it and restart without rebuilding:

docker compose restart

Tag an image:

curl -X POST http://localhost:8000/tag \
  -F "file=@photo.heic" \
  --output tagged.heic

curl -X POST http://localhost:8000/tag \
  -F "file=@photo.jpg" \
  --output tagged.jpg

Verify written metadata:

exiftool tagged.heic | grep -E "Subject|Description|Keywords|Caption"

Health check:

curl http://localhost:8000/health

taggerv2/
├── main.py # FastAPI app, POST /tag endpoint
├── vision.py # Image preparation and vision API call
├── metadata.py # exiftool wrapper for writing EXIF/IPTC/XMP
├── config.py # Config loader (YAML + env var overrides)
├── config.yaml # Runtime configuration
├── Dockerfile
├── docker-compose.yml
└── requirements.txt
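
The "YAML + env var overrides" behavior listed for config.py, where VISION_API_KEY takes precedence over config.yaml, might look roughly like the sketch below. It is not the project's actual loader: `apply_env_overrides` is a hypothetical name, the input is assumed to be the dict already parsed from config.yaml (for example with PyYAML's `yaml.safe_load`), and treating an empty variable as unset is an assumption.

```python
import copy
import os


def apply_env_overrides(config, env=None):
    """Return a copy of the parsed config with environment overrides applied."""
    env = os.environ if env is None else env

    # Work on a deep copy so the caller's parsed YAML dict is left untouched.
    merged = copy.deepcopy(config)

    # Assumption: an empty or missing VISION_API_KEY means "use config.yaml".
    api_key = env.get("VISION_API_KEY")
    if api_key:
        merged.setdefault("vision", {})["api_key"] = api_key

    return merged
```

Passing the environment as a parameter keeps the precedence rule testable without mutating `os.environ`.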
- HEIC metadata: exiftool writes XMP into HEIC files. IPTC is not supported in HEIC — XMP is used instead, which is recognized by Lightroom, Photos, and digiKam.
- Model compatibility: response_format: json_object is intentionally not sent, as many OpenAI-compatible backends do not support it. The service extracts JSON from the model response with a fallback parser that handles markdown code fences.
- Image resizing: images are resized to image_max_size on the longest edge before being sent to the API. The original file is not affected; resizing is only for the API call.
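
The fallback parsing mentioned in the model compatibility note can be approximated as follows. This is a minimal sketch of one possible approach, not the service's own parser: `extract_json` is a hypothetical name, and the order of attempts (raw text, fenced block, outermost braces) is an assumption.

```python
import json
import re

# Built from single backticks so this example does not contain a literal fence.
FENCE = "`" * 3
_FENCE_RE = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def extract_json(text):
    """Return the first JSON object found in a model response."""
    # 1. The whole response, in case the model returned bare JSON.
    candidates = [text.strip()]

    # 2. The contents of any markdown code fence, with or without a "json" tag.
    candidates += [body.strip() for body in _FENCE_RE.findall(text)]

    # 3. Everything between the first "{" and the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start : end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed

    raise ValueError("no JSON object found in model response")
```

A caller would then check that the four expected fields (tags_en, tags_de, caption_en, caption_de) are present before writing any metadata.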