Stop guessing macros or manually logging food. Calorie-Counter-AI is a high-performance Telegram bot that "looks" at your food and tells you exactly what's on the plate.
Built with Aiogram 3.x, it combines Vision Models (like Gemini 2.0 Flash) with Whisper Audio Transcription to act as your personal nutritionist.
Unlike rigid apps, this is a conversation.
- See: Snap a photo, and the AI estimates volume, weight, and calories using 3D logic.
- Hear: Send a voice note to describe ingredients or ask for advice.
- Chat: Ask general nutrition questions ("Is this healthy?") in plain text.
- ποΈ Visual 3D Analysis: Uses advanced Vision AI to estimate food volume based on reference objects (cutlery, plates).
- π£οΈ Voice-First: Integrated with self-hosted Whisper to transcribe your meal descriptions or questions instantly.
- π§ Context Aware: Knows the current date and time to give relevant advice (e.g., "It's late for a heavy meal").
- π Bilingual: Automatically detects English or Ukrainian input and responds in the correct language.
- β‘ Webhook Powered: Zero-latency response using Aiohttp architecture.
The project is fully containerized. You can run it in seconds.
docker build -t calorie-bot .
Replace the variables with your actual data.
docker run -d \
--name calorie-bot \
-p 8000:8000 \
-e TELEGRAM_TOKEN="123456:ABC-DEF1234ghIkl..." \
-e BASE_URL="https://your-public-domain.com" \
-e WHISPER_API_URL="http://whisper-service:9000/v1" \
-e OPENROUTER_API_URL="sk-or-v1-..." \
-e MODEL_NAME="google/gemini-2.0-flash-exp" \
-e PORT=8000 \
calorie-bot
Control the bot using Environment Variables.
| Variable | Required | Description |
|---|---|---|
| TELEGRAM_TOKEN | β | Your bot token from @BotFather. |
| BASE_URL | β | Your public HTTPS domain (SSL required for Webhooks). |
| OPENROUTER_API_KEY | β | API Key from OpenRouter (to access LLMs). |
| WHISPER_API_URL | β | Address of your Whisper backend (internal or cloud). |
| MODEL_NAME | β | AI Model to use (Default: google/gemini-2.0-flash-exp). |
| PORT | β | Internal app port (Default: 8000). |
| TOPIC_ID | β | Telegram Forum Topic ID. When set, the bot only responds to messages from that topic and silently ignores all others. Useful for supergroups with multiple Topics where each topic hosts a separate bot. |
This bot is a Gateway. It orchestrates different AI services to provide a seamless experience.
1. Vision & Reasoning (OpenRouter) We use OpenRouter to access top-tier models like Gemini 2.0 Flash or GPT-4o.
- Task: Analyze image pixels, estimate density/volume, calculate macros, and translate text.
- Config: Change
MODEL_NAMEto swap models without redeploying code.
2. Ears (Whisper) We use a dedicated Whisper service for privacy and speed.
- Task: Transcribe voice notes into text before sending them to the LLM.
- Setup: Run a local docker container (e.g.,
onerahmet/openai-whisper-asr-webservice).
- User sends a Photo, Voice, or Text π€
- Bot receives Webhook β‘
- If Voice: Bot sends audio to
WHISPER_API_URL-> gets text π - If Photo: Bot encodes image to Base64 πΌοΈ
- Bot sends payload (Image + Text + Date) to
OPENROUTER_API_KEYπ§ - AI analyzes calories, healthiness, and detects language πΊπ¦/πΊπΈ
- Bot formats the JSON response and replies to user π¬
Got a cool idea? Maybe adding a database for history tracking or user profiles? Fork the repo, make your changes, and open a Pull Request.
This project is open-source and available under the MIT License.
