94 changes: 94 additions & 0 deletions .github/workflows/validate.yml
@@ -0,0 +1,94 @@
name: Validate

on:
  push:
    branches: [master, main]
  pull_request:
    branches: [master, main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: LICENSE exists
        run: test -s LICENSE || (echo "::error::LICENSE missing or empty" && exit 1)

      - name: CHANGELOG.md exists
        run: test -s CHANGELOG.md || (echo "::error::CHANGELOG.md missing or empty" && exit 1)

      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest pytest-asyncio

      - name: Python compile check
        run: |
          set -e
          fail=0
          for f in $(git ls-files '*.py'); do
            if ! python -m py_compile "$f"; then
              echo "::error file=$f::python compile error"
              fail=1
            fi
          done
          exit $fail

      - name: Run pytest suite
        env:
          TELEGRAM_TOKEN: dummy-test-token
          ADMIN_IDS: "0"
          GEMINI_API_KEY: dummy
          OPENROUTER_API_KEY: dummy
          GROQ_API_KEY: dummy
        run: python -m pytest tests/ -v --tb=short

      - name: SVG files are well-formed XML
        run: |
          set -e
          fail=0
          for f in $(git ls-files 'docs/*.svg' '*.svg'); do
            if ! python -c "import xml.etree.ElementTree as ET; ET.parse('$f')" 2>/dev/null; then
              echo "::error file=$f::malformed SVG XML"
              fail=1
            fi
          done
          exit $fail

      - name: All docs/* assets referenced from README exist
        run: |
          set -e
          fail=0
          for ref in $(grep -hoE 'docs/[a-zA-Z0-9_/-]+\.(svg|png|jpg|jpeg|gif)' README.md README.ru.md | sort -u); do
            if [ ! -f "$ref" ]; then
              echo "::error file=README.md::missing referenced asset $ref"
              fail=1
            fi
          done
          exit $fail

      - name: Internal Markdown links resolve
        run: |
          set -e
          fail=0
          for src in README.md README.ru.md CHANGELOG.md CONTRIBUTING.md CLAUDE.md; do
            [ -f "$src" ] || continue
            base="$(dirname "$src")"
            for tgt in $(grep -hoE '\]\([^)]+\)' "$src" | sed 's/](\(.*\))/\1/' | sed 's/#.*$//'); do
              case "$tgt" in
                http*|mailto:*|"") continue ;;
              esac
              [ "$base" = "." ] && resolved="$tgt" || resolved="$base/$tgt"
              if [ ! -e "$resolved" ] && [ ! -e "$tgt" ]; then
                echo "::error file=$src::broken internal link → $tgt"
                fail=1
              fi
            done
          done
          exit $fail
65 changes: 65 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,65 @@
# Changelog

All notable changes to this project will be documented in this file.
Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) · [SemVer](https://semver.org/spec/v2.0.0.html).

## [0.3.0] — 2026-05-03

### Changed

- **License changed from MIT to PolyForm Noncommercial 1.0.0.** This project is non-commercial: it was built for friends and family by a developer whose sibling has type 1 diabetes (T1D), and it is shared with the community for personal and family use only. Commercial use, including hosting it as a paid service or bundling it into a paid product, is not permitted.

### Added

- `LICENSE` is now the official PolyForm Noncommercial 1.0.0 text (replacing the previous MIT license)
- `README.ru.md` — full Russian version mirroring the new English structure (the previous bilingual inline format has been split into two locale files for consistency with sister repos and easier translation maintenance)
- Hero `Status — non-commercial use only` callout linking to the license and the contributor priorities
- `Roadmap & known limitations` section grouping safety disclaimer, active improvements (LLM prompt-tightening, dietetic accuracy, user-facing answer verification, external KBJU databases, regional adaptation), and current limits (LLM misidentification edge cases, no CGM integration, EN-locale verification ongoing, grounding only on Gemini chain, metric units only)
- Hero screenshot `docs/screenshots/01-kbju-result.png` — final KBJU table + daily progress
- Four supporting screenshots — `02-photo-recognition`, `03-daily-report`, `04-onboarding-setup`, `05-onboard-confirm` — arranged in a 2-column gallery with captions
- `docs/architecture.svg` — pipeline diagram (Telegram → handlers → litellm Router with two chains → user confirmation gate → nutrition + database → daily progress, with Google Search grounding sidecar)
- `CHANGELOG.md` (this file) — earlier history reconstructed below from `git log` / commit messages
- `CONTRIBUTING.md` with explicit non-commercial clause and a concrete priority list
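- `.github/workflows/validate.yml`: runs the 72-test pytest suite on push and PR, plus `python -m py_compile` on every tracked `.py` file, a well-formedness check for SVG files, a check that `docs/` assets referenced from the READMEs exist, a check that internal Markdown links resolve, and presence checks for `LICENSE` and `CHANGELOG.md`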
- "Stars" and "Validate CI" badges; static `Tests: 72` badge replaced with the dynamic Validate badge
- "Related" section cross-linking to all sister Claude Code repos by the same author (anti-regression-setup, ai-context-hierarchy, claude-statusline, lingua-companion)
- Author signature expanded with full name and Habr / dev.to profile links
- Local `tests/manual screenshots/` paths and any deployment URLs are deliberately omitted from the public README per author instruction

### Notes

- Topics on GitHub applied separately via `gh api` after merge.
- Default branch remains `master` for now; rename to `main` deferred to a separate change because it would invalidate any external bookmarks / CI badges pointing at `master`.

## [0.2.0] — 2026-04-03

### Added — initial portfolio-quality README & live fixes
- Bilingual inline RU/EN `README.md` rewritten as a portfolio piece (commit `f0745f0`)
- `CLAUDE.md` aligned with the README (same commit)
- `PicklePersistence` for `python-telegram-bot` so the bot survives systemd restarts without losing in-flight conversations (commit `eceebf4`)
- All command handlers added as `entry_points` so the bot works even if a user types `/today` before `/start` (commit `41acab9`)
- Service injection in `post_init` after persistence restore — fixes a race where the database service was unavailable to the conversation handler on cold start (commit `ecb735d`)
- Correction-prompt always returns `is_food=true` and falls back to the original items when the LLM produces an empty correction — prevents silent food deletion on edit (commit `59c1a30`)
- `allow_reentry=False` on `entry_points` to prevent the user from accidentally re-entering onboarding while in the middle of a correction (commit `2619043`)
- LLM response handler tolerates unexpected fields like `volume_ml` that some providers add (commit `fb9811b`)
- Access check moved to before onboarding; database reference fix in settings (commit `0f638e1`)

### Architecture (as of this version)

- Two LLM chains via `litellm` Router with auto-failover: vision (Gemini 2.5 Flash → OpenRouter Gemini) for photos, text (Gemini → OpenRouter → Groq Llama 4) for descriptions
- Google Search grounding via `google-genai` for low-confidence branded products
- 72 pytest cases covering every handler and service
- `diabot.service` systemd unit for production
- Privacy-first: photos never written to disk (only Telegram `file_id`), `/export`, `/delete_my_data`, GDPR-style consent on first launch
- Multi-user with admin approval workflow (admins from `.env`, additional users via `/adduser` / approval queue)
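
The auto-failover idea in the first bullet can be pictured with a minimal sketch in plain Python. It is not the `litellm` Router API and not the project's code: `call_with_failover`, `primary` and `backup` are hypothetical names, and the providers are stubs, so the sketch only shows the "try the next provider on error" shape.

```python
from typing import Callable, Sequence

# Hypothetical provider type: takes a prompt, returns the model's reply.
Provider = Callable[[str], str]


def call_with_failover(
    chain: Sequence[tuple[str, Provider]], prompt: str
) -> tuple[str, str]:
    """Try providers in order; return (name, reply) from the first that succeeds."""
    errors: list[str] = []
    for name, provider in chain:
        try:
            return name, provider(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary(prompt: str) -> str:
    """Stub provider that always fails, to exercise the failover path."""
    raise TimeoutError("primary timed out")


def backup(prompt: str) -> str:
    """Stub provider that always succeeds."""
    return f"backup answered: {prompt}"


text_chain = [("primary", primary), ("backup", backup)]
used, reply = call_with_failover(text_chain, "apple, 150 g")
```

Here `used` ends up as `"backup"` because the first stub raises. Keeping the chain as ordered data is what makes it easy to add or reorder providers without touching the calling code.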

## [0.1.0] — earlier

### Added
- Initial implementation: photo → recognition → confirm → KBJU + XE → diary
- Onboarding state machine: consent → gender → height → weight → age → targets
- Daily KBJU targets via the Mifflin-St Jeor formula with manual override
- Reply keyboard for navigation, inline keyboard for confirmation
- SQLite via `aiosqlite` for users / meals / glucose / targets
- Bilingual i18n (RU default, EN fully supported) with prompts living next to the locale strings in `locales/`
- 1 XE = 12 g carbs default, configurable per user
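
For readers unfamiliar with the two calculations named in this list, here is a minimal sketch, assuming the standard published Mifflin-St Jeor equation and the 12 g default mentioned above. The function names are hypothetical, and the sketch computes only the basal metabolic rate; how the project turns that into daily KBJU targets is not shown in this changelog.

```python
def mifflin_st_jeor_bmr(
    weight_kg: float, height_cm: float, age_years: int, gender: str
) -> float:
    """Basal metabolic rate in kcal/day using the Mifflin-St Jeor equation."""
    offsets = {"male": 5.0, "female": -161.0}
    if gender not in offsets:
        raise ValueError("gender must be 'male' or 'female'")
    return 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years + offsets[gender]


def carbs_to_xe(carbs_g: float, grams_per_xe: float = 12.0) -> float:
    """Convert grams of carbohydrate to bread units (XE)."""
    if grams_per_xe <= 0:
        raise ValueError("grams_per_xe must be positive")
    return carbs_g / grams_per_xe


bmr = mifflin_st_jeor_bmr(weight_kg=70, height_cm=175, age_years=30, gender="male")
xe_default = carbs_to_xe(36)                  # 12 g per XE (default)
xe_custom = carbs_to_xe(30, grams_per_xe=10)  # per-user override
```

Passing `grams_per_xe` as a parameter mirrors the "configurable per user" note: the conversion stays a pure function and the user's setting is supplied by the caller.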
45 changes: 45 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,45 @@
# Contributing

Thanks for considering a contribution. **Important: this project is licensed under [PolyForm Noncommercial 1.0.0](LICENSE).** Contributions are welcome under the same terms — code you submit becomes part of a non-commercial project, and you agree that you (and downstream users) are not allowed to use it commercially. If you need a commercial licence, contact the author first.

DiaBot is in active development. The core flow works end to end — but several layers need genuine improvement. **Contributors with diabetes / dietetics / endocrinology backgrounds are especially welcome** because the gaps below are domain-knowledge gaps, not just engineering gaps.

## Priorities (highest impact first)

1. **Tighter LLM prompting & schema validation.** The vision chain currently estimates portion sizes only loosely. PRs welcome to: lower temperature, add explicit JSON-schema constraints, validate units more strictly, return confidence per item, retry on schema violations with a tighter follow-up prompt. Files to look at: `services/llm.py`, `locales/ru.py` and `locales/en.py` (prompts live in the locale files because they are language-tied).
2. **More precise dietetic calculations.** Verified portion-size heuristics, glycaemic load (not just GI label), fibre subtraction in digestible carbs, lactose handling, protein-impact-on-glucose for high-protein meals. Files: `services/nutrition.py`, `models/schemas.py`.
3. **User-facing answer verification.** Add an explicit confidence score per recognised item, render it in the confirmation card, surface a "double-check this" flag at low confidence, prompt the user to weigh portions when the LLM is uncertain. Files: `services/llm.py` (return structured confidence), `handlers/confirm.py` (render).
4. **External KBJU database integration.** Today the bot relies entirely on the LLM for KBJU values. Adding [USDA FoodData Central](https://fdc.nal.usda.gov/), [OpenFoodFacts](https://world.openfoodfacts.org/), or regional Russian / Eastern-European sources as a verification layer would make values much more reliable. Sketch: after recognition, look up each item in the external DB, fall back to LLM only on miss.
5. **Regional adaptation.** Food norms and cuisines differ between EU / US / RU / Asia. Branded products differ. Locally common dishes are recognised unevenly. Roadmap: locale-specific prompt addenda, locale-specific external DB priority, optional locale flag in the user profile.
6. **Native localisations beyond RU / EN.** New locale files in `locales/<code>.py` mirroring the structure of `ru.py` / `en.py`. Translation must include both UI strings and LLM prompts (they are tied to language).
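
The sketch in priority 4 ("look up each item in the external DB, fall back to LLM only on miss") could look roughly like this. Everything here is an assumption for illustration: `Kbju`, `EXTERNAL_DB` and `resolve_kbju` are hypothetical names, the dictionary is an in-memory stand-in for a real database client, and its numbers are placeholders rather than real database entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kbju:
    """Values per 100 g, tagged with where they came from."""
    kcal: float
    protein_g: float
    fat_g: float
    carbs_g: float
    source: str  # "external" or "llm"


# Hypothetical stand-in for an external database; placeholder values only.
EXTERNAL_DB: dict[str, tuple[float, float, float, float]] = {
    "apple": (52.0, 0.3, 0.2, 14.0),
}


def resolve_kbju(name: str, llm_estimate: Kbju) -> Kbju:
    """Prefer the external database; use the LLM estimate only on a miss."""
    hit = EXTERNAL_DB.get(name.strip().lower())
    if hit is None:
        return llm_estimate
    kcal, protein_g, fat_g, carbs_g = hit
    return Kbju(kcal, protein_g, fat_g, carbs_g, source="external")


llm_guess = Kbju(60.0, 0.0, 0.0, 15.0, source="llm")
verified = resolve_kbju(" Apple ", llm_guess)    # found in the stand-in DB
unverified = resolve_kbju("borscht", llm_guess)  # miss, keeps the LLM estimate
```

Carrying a `source` field on the result would let the confirmation card show whether a value was verified, which also fits priority 3.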

## What we will not merge

- Anything that bypasses the two-step confirmation flow. The `recognise → confirm` step is safety-critical; do not "optimise" it away even if it adds a tap.
- Changes that expose more raw LLM output to the user without confirmation. The bot's response always passes through human review.
- Anything that stores food photos to disk. Only Telegram `file_id` is stored; bytes are streamed and discarded.
- Hard dependencies on a single LLM provider. The whole point of the `litellm` Router is auto-failover; do not collapse the chain.
- Commercial-use forks or features that gate functionality behind a paywall. The licence forbids commercial use.

## Pull request checklist

- [ ] `python -m pytest tests/ -v` — all 72 tests pass (and any new tests you added pass)
- [ ] `python -m py_compile $(git ls-files '*.py')` clean
- [ ] If you touched a handler: thin I/O only — business logic moved into `services/`
- [ ] If you touched LLM prompts: both `locales/ru.py` and `locales/en.py` updated
- [ ] If user-visible behaviour changed: both `README.md` and `README.ru.md` mirrored
- [ ] `CHANGELOG.md` entry added under a new minor / patch version
- [ ] No new file written to disk for user data without an explicit `/export` / `/delete_my_data` path
- [ ] Adheres to the existing code style (English code / comments / docstrings, type hints everywhere, HTML parse_mode for Telegram, no hardcoded user-facing strings)

## Style

- All code, comments, docstrings in **English**. Bot-facing strings only in `locales/`.
- Logging via `logging` (INFO for actions, DEBUG for LLM requests). No `print()`.
- Telegram messages use HTML parse_mode (not Markdown). Existing helpers handle escaping.
- LLM responses use JSON mode (`response_format: json_object`) with retry on invalid JSON.
- One feature per PR. Stack PRs if you have multiple.
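
As a rough illustration of two of the rules above (retry on invalid JSON, and escaping for HTML parse_mode), here is a minimal sketch using only the standard library. It is not the project's implementation: `parse_json_with_retry`, `render_item` and the `ask` callable are hypothetical names, and the repository's existing helpers should be preferred in real PRs.

```python
import html
import json
from typing import Callable, Optional


def parse_json_with_retry(
    ask: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call `ask` until it returns a JSON object, tightening the prompt on failure."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = ask(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            prompt += "\nReturn a single valid JSON object only."
            continue
        if isinstance(data, dict):
            return data
        last_error = ValueError("expected a JSON object")
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error


def render_item(name: str, grams: int) -> str:
    """Build an HTML-mode message line; user or LLM text is escaped first."""
    return f"<b>{html.escape(name, quote=False)}</b>: {grams} g"


# Stub LLM: the first reply is invalid, the second is a valid JSON object.
replies = iter(["not json", '{"is_food": true}'])
result = parse_json_with_retry(lambda _prompt: next(replies), "Describe the meal as JSON.")
line = render_item("fish & chips", 250)
```

Escaping at render time means names coming back from the LLM cannot break the message markup, whatever characters they contain.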

## Author / maintainer

[@CreatmanCEO](https://github.com/CreatmanCEO) — Nick Podolyak. Open an issue first for anything larger than a single fix or a single locale.
152 changes: 131 additions & 21 deletions LICENSE
@@ -1,21 +1,131 @@
MIT License

Copyright (c) 2026 Creatman

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# PolyForm Noncommercial License 1.0.0

<https://polyformproject.org/licenses/noncommercial/1.0.0>

## Acceptance

In order to get any license under these terms, you must agree
to them as both strict obligations and conditions to all
your licenses.

## Copyright License

The licensor grants you a copyright license for the
software to do everything you might do with the software
that would otherwise infringe the licensor's copyright
in it for any permitted purpose. However, you may
only distribute the software according to [Distribution
License](#distribution-license) and make changes or new works
based on the software according to [Changes and New Works
License](#changes-and-new-works-license).

## Distribution License

The licensor grants you an additional copyright license
to distribute copies of the software. Your license
to distribute covers distributing the software with
changes and new works permitted by [Changes and New Works
License](#changes-and-new-works-license).

## Notices

You must ensure that anyone who gets a copy of any part of
the software from you also gets a copy of these terms or the
URL for them above, as well as copies of any plain-text lines
beginning with `Required Notice:` that the licensor provided
with the software. For example:

> Required Notice: Copyright Yoyodyne, Inc. (http://example.com)

## Changes and New Works License

The licensor grants you an additional copyright license to
make changes and new works based on the software for any
permitted purpose.

## Patent License

The licensor grants you a patent license for the software that
covers patent claims the licensor can license, or becomes able
to license, that you would infringe by using the software.

## Noncommercial Purposes

Any noncommercial purpose is a permitted purpose.

## Personal Uses

Personal use for research, experiment, and testing for
the benefit of public knowledge, personal study, private
entertainment, hobby projects, amateur pursuits, or religious
observance, without any anticipated commercial application,
is use for a permitted purpose.

## Noncommercial Organizations

Use by any charitable organization, educational institution,
public research organization, public safety or health
organization, environmental protection organization,
or government institution is use for a permitted purpose
regardless of the source of funding or obligations resulting
from the funding.

## Fair Use

You may have "fair use" rights for the software under the
law. These terms do not limit them.

## No Other Rights

These terms do not allow you to sublicense or transfer any of
your licenses to anyone else, or prevent the licensor from
granting licenses to anyone else. These terms do not imply
any other licenses.

## Patent Defense

If you make any written claim that the software infringes or
contributes to infringement of any patent, your patent license
for the software granted under these terms ends immediately. If
your company makes such a claim, your patent license ends
immediately for work on behalf of your company.

## Violations

The first time you are notified in writing that you have
violated any of these terms, or done anything with the software
not covered by your licenses, your licenses can nonetheless
continue if you come into full compliance with these terms,
and take practical steps to correct past violations, within
32 days of receiving notice. Otherwise, all your licenses
end immediately.

## No Liability

***As far as the law allows, the software comes as is, without
any warranty or condition, and the licensor will not be liable
to you for any damages arising out of these terms or the use
or nature of the software, under any kind of legal claim.***

## Definitions

The **licensor** is the individual or entity offering these
terms, and the **software** is the software the licensor makes
available under these terms.

**You** refers to the individual or entity agreeing to these
terms.

**Your company** is any legal entity, sole proprietorship,
or other kind of organization that you work for, plus all
organizations that have control over, are under the control of,
or are under common control with that organization. **Control**
means ownership of substantially all the assets of an entity,
or the power to direct its management and policies by vote,
contract, or otherwise. Control can be direct or indirect.

**Your licenses** are all the licenses granted to you for the
software under these terms.

**Use** means anything you do with the software requiring one
of your licenses.