This weekly audit covers the miso-gallery codebase (Flask app, server-rendered gallery), with a focus on code consistency, operational gaps, dependency hygiene, and incremental degradation since the last audit (2026-06-03). The app is in a generally healthy state with strong security posture, but several areas of accumulated technical debt and gaps were identified.
Top Findings
P0 — Critical
None identified. No active security vulnerabilities, data loss risks, or blocked workflows were found.
P1 — High
iter_gallery_media() / iter_gallery_folders() coexist with newer iter_gallery_items() — dead code ambiguity
The newer iter_gallery_items() centralizes the bounded iterator pattern, but iter_gallery_media() and iter_gallery_folders() are still present and used by several endpoints (llm_images, llm_folders, llm_recent, llm_tags, add_tag). These legacy functions duplicate the same bounded-scan logic with slightly different exclusion handling, so one iterator can drift into excluding paths differently from another. Evidence: app.py ~lines 165-210: three separate rglob-based iterators with near-identical loop bodies. The is_excluded_gallery_path check is applied in iter_gallery_items(), but iter_gallery_media() uses inline excluded_dirs logic.
llm_tags tag storage is a no-op (logs only)
The /api/llm/tags endpoint and the web UI /tag route both accept tags but only log them; no backend storage exists. This was a UI-first partial fix for issue #91 (feat(gallery): add tagging and filtering). Tags are a visible feature: the "Tag" button renders on every image card and the API accepts and validates them, but they disappear on reload. Users and LLM callers have no way to retrieve tags later. Evidence: app.py /tag route (~line 590): log_security_event("add_tag", "success", ...) then return {"status": "ok"}, with no DB or file writes. llm_tags similarly only logs.
Duplicate thumbnail cache removal across delete paths — logic divergence
remove_thumbnail_cache_for() walks THUMBNAIL_CACHE_DIR with iterdir(), filtering by prefix. The /delete/ route, /bulk-delete, and the LLM delete routes each call it, and every call does its own directory walk. The pattern is repeated in 5+ places. Refactoring into a single batch-purge function would reduce inconsistency risk. Evidence: appears in delete (~line 410), bulk_delete (~lines 455, 465), llm_delete (~line 841), llm_bulk_delete (~line 868), llm_dedup (~line 896).
RATE_LIMIT_ROUTE_LIMITS JSON env var is fragile and untested
The override mechanism in security.py (_load_route_overrides()) parses a JSON env var for custom rate limits. If the JSON is malformed, it falls back to defaults with only a warning log. There are no tests for this path. An operator who sets a bad value and assumes it is active gets the defaults with no other signal. Evidence: security.py _load_route_overrides(): no unit tests cover this function in test_security_edges.py or any other test file.
Release workflow publishes Docker tags to ghcr.io/misospace/miso-gallery but manual-release creates a tag — the release.yaml and publish-release.yml publish to different tags
The Release workflow (release.yaml) publishes ghcr.io/misospace/miso-gallery with SHA-based tags (via docker/metadata-action with type=sha,prefix={{branch}}-) during the merge step, but creates no semver tag. The Publish Release workflow (publish-release.yml) creates the semver tag and GH release. Tag creation then triggers the Release workflow again, which rebuilds the multi-arch image. The image is therefore built twice: once during the PR merge build (as main-<sha>) and once for the release tag (as :<semver>). The :latest tag is never updated by any workflow. Evidence: .github/workflows/release.yaml merges digests with type=sha,prefix={{branch}}- tags. .github/workflows/publish-release.yml tags the merge commit. :latest is never explicitly pushed.
P2 — Medium
file_sha256() reads 1MB chunks but uses iter() pattern that may leave files open on error
The finding is that the file handle may leak if an exception occurs mid-iteration. Evidence: app.py line ~224: with path.open("rb") as handle:. Because a with block closes the handle even when an exception propagates out of it, verify that a leak is actually reachable (for example, a generator left suspended inside the block) before adding try/finally or contextlib.closing.
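grep-based APP_VERSION extraction in release workflow version check
The Release workflow extracts APP_VERSION with a grep | head -1 | grep -oP 'or "\K[^"]+' | head -1 pipeline. The result depends on the exact text layout of the assignment line, and the pipeline picks the wrong value if an earlier line that starts with APP_VERSION = (for example, inside a docstring) matches first. A more deterministic extraction (e.g., importing app.py or parsing it in Python) is safer. Evidence: .github/workflows/release.yaml: VERSION=$(grep '^APP_VERSION = ' app.py | head -1 | grep -oP 'or "\K[^"]+' | head -1)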
dir_size() and _dir_size() functions are duplicated
trash.py has both a module-level dir_size() and a private _dir_size(). The module-level one skips symlinks (a security measure), while the private one does not. move_to_trash() calls _dir_size() (not dir_size()) for the post-move metadata, so the trash metadata size estimate may follow symlinks. Evidence: trash.py lines ~16-26 (dir_size skips symlinks) vs lines ~135-141 (_dir_size does not skip symlinks).
Python 3.14 in CI but 3.12 in dependency audit — inconsistency
The lint and test workflows use python-version: '3.14', while the dependency audit workflow uses "3.12". The dependency audit therefore resolves and checks packages on a different Python version than the one the code is linted and tested on. Vulnerabilities specific to 3.14 (or deps compiled for 3.14) won't be caught. Evidence: .github/workflows/lint.yaml (3.14), .github/workflows/tests.yaml (3.14), .github/workflows/dependency-audit.yaml (3.12).
bulk_delete folder preflight path validation has dead code
The sanitize_path() call inside the folder size estimation loop has a # sanitize_path() rejects paths containing ... comment but no error handling: on failure it just calls continue without logging. Evidence: app.py ~lines 445-448: if not sanitize_path(rel_path): # continue.
health.py defines routes via Blueprint but app.py registers them redundantly via app.add_url_rule
health.py creates a health_bp Blueprint with routes like /health/storage, but app.py registers the same routes again via app.add_url_rule("/health", ...) etc. The Blueprint in health.py is never registered on the app; only the add_url_rule registrations in app.py take effect. This is confusing, and the Blueprint is dead code. Evidence: health.py: health_bp = Blueprint("health", __name__) with @health_bp.route("/health/storage"), but app.py registers via app.add_url_rule("/health/storage", ...).
P3 — Low
APP_VERSION default is "0.1.18" — should match semantic release
The hardcoded default in app.py has not been bumped since the last release. While the env var override handles deployments, the source-of-truth default is stale. This creates confusion when reading the source.
conftest.py uses pathlib.Path for ROOT but re-adds str(ROOT) to sys.path
Minor: sys.path entries are expected to be strings (the import system ignores other types), so converting ROOT with str() before sys.path.insert() is required rather than redundant. No change is needed here.
Trash restore uses shutil.copytree for directories (non-atomic) instead of rename
restore_from_trash() uses shutil.copytree() for directory restore while move_to_trash() uses rename. Directory restores are therefore not atomic and can leave partial state on failure.
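One way to make a directory restore all-or-nothing is to try a rename first and, only when the trash lives on another filesystem, copy into a staging directory and rename that into place. The sketch below is illustrative only: restore_dir and the ".restoring" staging suffix are hypothetical names, not code from trash.py.

```python
import errno
import os
import shutil
from pathlib import Path


def restore_dir(trashed: Path, destination: Path) -> None:
    """Move a trashed directory back so the destination appears all at once."""
    if destination.exists():
        raise FileExistsError(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    try:
        # A rename within one filesystem is atomic: no partial tree is visible.
        os.rename(trashed, destination)
        return
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
    # Cross-device fallback: copy into a staging directory next to the
    # destination, then rename the finished copy into place.
    staging = destination.with_name(destination.name + ".restoring")
    try:
        shutil.copytree(trashed, staging)
        os.rename(staging, destination)
    except BaseException:
        # Discard the partial copy; the trashed original is still intact.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    shutil.rmtree(trashed)
```

With this shape, a failure during the copy leaves the trashed original untouched and nothing half-written at the destination path.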
docs/runbook.md release section references npm version — miso-gallery has no package.json
The runbook shows "npm version 0.1.x --no-git-tag-version" for bumping APP_VERSION, but the repo is Python-only with no npm.
Evidence: Files and Observations
Code Duplication — iter_gallery_items vs iter_gallery_media vs iter_gallery_folders
app.py lines 165-210: three separate functions doing DATA_FOLDER.rglob("*") bounded by GALLERY_SCAN_LIMIT
iter_gallery_items() was added later as a unified function but iter_gallery_media() and iter_gallery_folders() remain deployed
Callers: find_duplicate_media() uses iter_gallery_media(); iter_gallery_folders() is called by llm_folders; llm_images uses iter_gallery_media(); iter_gallery_items() currently has no callers
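Tag storage — no-op
/tag route (app.py ~line 590): logs event and returns {"status": "ok"}
/api/llm/tags (app.py ~line 810): logs event and returns {"status": "ok", "updated": [...], "tags": [...]}
The updated list in the API response is misleading: it shows paths that would be tagged, but the tags are never persisted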
Dead Blueprint in health.py
    # health.py creates this:
    health_bp = Blueprint("health", __name__)

    @health_bp.route("/health/storage")
    # ... but this Blueprint is never imported/registered in app.py

    # Instead app.py uses:
    app.add_url_rule("/health/storage", "storage_health", storage_health, methods=["GET"])
Duplicate thumbnail cleanup
Pattern repeated at app.py lines:
410: remove_thumbnail_cache_for(rel_path) in delete()
455, 465: in bulk_delete() (for files and folders)
841: in llm_delete()
868: in llm_bulk_delete()
896: in llm_dedup()
Recommended Issue Breakdown
P1 — Consolidate gallery iterators: replace iter_gallery_media() and iter_gallery_folders() with unified iter_gallery_items()
Migrate all callers (llm_images, llm_folders, llm_recent, llm_tags, find_duplicate_media) to use the single bounded iterator. Remove legacy duplicates.
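A minimal sketch of what a single bounded iterator with a kind filter could look like. The constants, the is_excluded helper, and the kind parameter are assumptions made for illustration; the real iter_gallery_items() and is_excluded_gallery_path in app.py may differ.

```python
from collections.abc import Iterator
from pathlib import Path

# Assumed values; the real app defines its own limit and exclusion rules.
GALLERY_SCAN_LIMIT = 10_000
MEDIA_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
EXCLUDED_DIR_NAMES = {".thumbnails", ".trash"}


def is_excluded(path: Path, root: Path) -> bool:
    """True if any path component below root is an excluded directory name."""
    return any(part in EXCLUDED_DIR_NAMES for part in path.relative_to(root).parts)


def iter_items(root: Path, kind: str = "all", limit: int = GALLERY_SCAN_LIMIT) -> Iterator[Path]:
    """Yield media files and/or folders under root, examining at most `limit` entries."""
    if kind not in {"all", "media", "folders"}:
        raise ValueError(f"unknown kind: {kind!r}")
    for scanned, path in enumerate(root.rglob("*")):
        if scanned >= limit:
            break  # the bound applies to entries examined, not entries yielded
        if is_excluded(path, root):
            continue  # one exclusion check shared by every caller
        if path.is_dir():
            if kind in {"all", "folders"}:
                yield path
        elif path.suffix.lower() in MEDIA_SUFFIXES and kind in {"all", "media"}:
            yield path
```

Callers that currently use the media or folder iterators would pass kind="media" or kind="folders", so the exclusion rule lives in exactly one place.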
P1 — Implement tag persistence: store tags as sidecar files or SQLite
Add actual storage backend for tags (JSON sidecar per image or flat SQLite DB). Update /tag and /api/llm/tags to persist and /api/llm/images to include tags in metadata response.
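As a rough illustration of the JSON-sidecar option, the sketch below stores tags next to each image and writes them atomically. The ".tags.json" naming scheme and the helper names are hypothetical; tag normalization (strip and lowercase) is also an assumption.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def tag_sidecar(image: Path) -> Path:
    """Hypothetical naming scheme: photo.png -> photo.png.tags.json."""
    return image.with_name(image.name + ".tags.json")


def load_tags(image: Path) -> list[str]:
    """Return stored tags, or an empty list if the sidecar is missing or invalid."""
    try:
        data = json.loads(tag_sidecar(image).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    if not isinstance(data, list):
        return []
    return [tag for tag in data if isinstance(tag, str)]


def add_tags(image: Path, new_tags: list[str]) -> list[str]:
    """Merge new tags into the sidecar and return the full sorted tag list."""
    cleaned = {tag.strip().lower() for tag in new_tags if tag.strip()}
    merged = sorted(set(load_tags(image)) | cleaned)
    sidecar = tag_sidecar(image)
    # Write to a temp file in the same directory, then replace atomically,
    # so readers never see a half-written sidecar.
    fd, tmp_name = tempfile.mkstemp(dir=sidecar.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(merged, handle)
        os.replace(tmp_name, sidecar)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return merged
```

A sidecar keeps tags alongside the image when files are moved by hand, at the cost of extra files that the gallery scan would need to ignore; SQLite makes filtering by tag cheaper.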
P2 — Extract thumbnail cache cleanup into a single batch function
Replace 5+ inline remove_thumbnail_cache_for() calls with a single batch_remove_thumbnails(paths: list[str]) function that takes a list of rel_paths and does one directory walk.
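A sketch of the batch form, assuming cache file names begin with a per-image prefix. The audit does not say how that prefix is derived, so thumbnail_prefix below (a truncated SHA-256 of the relative path) is a stand-in, and cache_dir is passed explicitly instead of reading THUMBNAIL_CACHE_DIR.

```python
import hashlib
from pathlib import Path


def thumbnail_prefix(rel_path: str) -> str:
    """Stand-in key scheme; the real prefix derivation is not shown in the audit."""
    return hashlib.sha256(rel_path.encode("utf-8")).hexdigest()[:16] + "_"


def batch_remove_thumbnails(cache_dir: Path, rel_paths: list[str]) -> int:
    """Delete cached thumbnails for all rel_paths in one directory walk."""
    prefixes = tuple(thumbnail_prefix(path) for path in rel_paths)
    if not prefixes or not cache_dir.is_dir():
        return 0
    removed = 0
    for entry in cache_dir.iterdir():
        # str.startswith accepts a tuple, so one pass covers every prefix.
        if entry.is_file() and entry.name.startswith(prefixes):
            entry.unlink(missing_ok=True)
            removed += 1
    return removed
```

The single-item delete routes could call the same function with a one-element list, which removes the need for a separate per-path helper.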
P2 — Add tests for _load_route_overrides()
Cover valid JSON, malformed JSON, empty string, non-dict JSON, and boundary values.
P1 — Fix release workflow: restore :latest tag publishing and eliminate double build
Add type=raw,value=latest to the Release workflow's metadata-action tags, and consolidate release/publish workflows to avoid building the image twice per release.
P2 — Fix file_sha256() file handle safety
Add explicit finally block or wrap the generator in a context manager to guarantee file handle release on error.
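For reference, a minimal sketch of the context-manager form of a chunked SHA-256 helper. It is not the code in app.py; the chunk size follows the 1MB figure in the finding. The with block closes the handle whether the loop finishes or raises.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # 1 MiB, matching the chunk size named in the finding


def file_sha256(path: Path) -> str:
    """Hash a file in fixed-size chunks without loading it into memory."""
    digest = hashlib.sha256()
    # The handle is closed on normal exit and when an exception propagates.
    with path.open("rb") as handle:
        # iter(callable, sentinel) keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()
```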
P2 — Strengthen APP_VERSION extraction in release workflow
Replace grep -oP with a Python one-liner that imports the module and reads the constant directly.
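Importing app.py in CI runs the module's top-level code, so an alternative is to parse the file instead. The sketch below uses the standard ast module rather than an import; it assumes the assignment has the shape APP_VERSION = <something> or "<default>", which is inferred from the grep pattern and not confirmed. default_app_version is a hypothetical helper name.

```python
import ast


def default_app_version(source: str) -> str:
    """Return the string default assigned to APP_VERSION at module level."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        if not any(isinstance(t, ast.Name) and t.id == "APP_VERSION" for t in node.targets):
            continue
        value = node.value
        # `x or "0.1.0"` parses as BoolOp(Or, [x, Constant]); take the last operand.
        if isinstance(value, ast.BoolOp) and isinstance(value.op, ast.Or):
            value = value.values[-1]
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            return value.value
    raise ValueError("APP_VERSION default not found")
```

Because the parser only sees real assignments, mentions of APP_VERSION in comments or docstrings cannot match. In a workflow step this could be called on the text of app.py and the result printed for the shell to capture.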
P2 — Standardize dir_size() and _dir_size() in trash.py
Either make both skip symlinks or eliminate the private variant and use the public one consistently (including in move_to_trash()).
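A sketch of a single symlink-skipping size function that both call sites could share. It is an illustration under the assumption that sizes are summed over regular files only; the actual dir_size() in trash.py may be written differently.

```python
import os
from pathlib import Path


def dir_size(root: Path) -> int:
    """Sum file sizes under root without following or counting symlinks."""
    total = 0
    # followlinks=False keeps os.walk from descending into symlinked directories.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # do not count the target of a file symlink
            try:
                total += os.path.getsize(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total
```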
P3 — Align dependency-audit Python version with CI Python version
Change dependency-audit.yaml from python-version: "3.12" to "3.14" to match lint/tests workflows.
P3 — Fix dead comment/logging in bulk_delete folder preflight
Remove dead continue with no logging and add proper security event logging for sanitize_path failures.
P2 — Remove dead health_bp Blueprint from health.py
Either register the Blueprint in app.py or remove it to eliminate confusion.
P3 — Remove npm version reference from docs/runbook.md
Replace with sed -i ... or Python script instruction.
Not Worth Doing Yet
The dead Blueprint issue is cosmetic but low impact. The health.py routes work fine via app.add_url_rule(). Only fix this if the file is already being touched.
Python 3.14 in CI vs 3.12 for dep audit — the dependency audit runs weekly and checks for known CVEs; Python 3.14 vs 3.12 doesn't change the vulnerability surface for pure-Python deps significantly. Only fix if the audit CI starts false-positive flagging 3.12-only vulnerabilities.
conftest.py str(ROOT) call: sys.path expects strings, so there is nothing to fix and no runtime impact.
The tag UI is visible and the API exists, but neither persists anything. Adding a full tag store may be more work than the feature value justifies at this point. Consider whether tags are worth keeping as a user-facing feature before investing.
npm version in runbook — low priority since the manual-release workflow doesn't actually run npm; the runbook instructions are stale but harmless.
Summary
Overall Risk Level: Moderate