Release v0.3.0 by frapercan · Pull Request #7 · frapercan/PROTEA

frapercan · 2026-03-25T13:14:51Z

Release v0.3.0 — Re-ranker, evaluation pipeline, annotate workflow, UI overhaul

Features

Re-ranker neural model: temporal holdout training pipeline with LightGBM, feature engineering (alignments, taxonomy), and scoring configs
CAFA evaluation pipeline: automated evaluation with multiple metrics (Fmax, Smin, AUPR)
Annotate workflow: end-to-end functional annotation from FASTA upload to GO term prediction
Scoring engine: configurable scoring configs with evidence weights
Connection pool, DLQ, structured logging, health probes, stale job reaper
Full i18n: 5 locales (EN/ES/DE/PT/ZH) via next-intl
Frontend overhaul: scoring config UI, support page, evaluation views, human-readable labels
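
The CAFA evaluation item above lists Fmax among its metrics. As a rough illustration only (this is not PROTEA's implementation), protein-centric Fmax can be read as the best harmonic mean of precision and recall over a sweep of score thresholds. The function name `fmax`, the input shapes, and the evenly spaced threshold grid are assumptions made for this sketch; ontology propagation is deliberately left out.

```python
from typing import Dict, Set


def fmax(
    predictions: Dict[str, Dict[str, float]],  # protein -> {GO term: score}
    truth: Dict[str, Set[str]],                # protein -> true GO terms
    steps: int = 100,                          # hypothetical threshold grid size
) -> float:
    """Protein-centric Fmax over evenly spaced thresholds (illustrative)."""
    # Proteins without any true term cannot contribute to recall; skip them.
    targets = {p: terms for p, terms in truth.items() if terms}
    if not targets:
        return 0.0
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in targets.items():
            scored = predictions.get(protein, {})
            predicted = {t for t, s in scored.items() if s >= threshold}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over every target protein.
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(targets)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Averaging precision only over proteins that have at least one prediction above the threshold, while averaging recall over all targets, follows the usual CAFA convention; the other listed metrics (Smin, AUPR) would need additional inputs not shown here.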

Tests

Coverage expanded from 65% to 88% (283 → 831 tests)

Docs

ADRs, operational runbook, re-ranker design spec
Full Sphinx documentation update

CI

Bump GitHub Actions to v6 (checkout, setup-python)
Fix all ruff, flake8, and mypy lint errors

Commits since v0.2.0

2033e0d feat(infra): connection pool, DLQ, structured logging, health probes, stale reaper
7ee749f test: expand coverage from 65% to 88% (283 -> 831 tests)
096823e docs: ADRs, operational runbook, and re-ranker design spec
092f110 release: v0.3.0 — re-ranker, evaluation pipeline, annotate workflow, UI overhaul
8b92868 fix(lint): resolve ruff errors
53e62c9 fix(lint): resolve flake8 E501 and mypy type errors
c244d25 ci: bump actions to v6, drop FORCE_JAVASCRIPT_ACTIONS_TO_NODE24

… stale reaper

- DB connection pool (pool_size=20, max_overflow=40, pool_recycle=3600)
- Publisher: thread-local connection reuse, exponential backoff (5 attempts)
- Consumer: dead letter queue (protea.dlx -> protea.dead-letter) on all queues
- Consumer: OperationConsumer emit writes JobEvent to parent job
- Health endpoints: /health (liveness) and /health/ready (DB + RabbitMQ)
- Structured JSON logging with --log-format flag in worker
- StaleJobReaper: marks RUNNING jobs as FAILED after 1h timeout
- BaseWorker: adaptive backoff for RetryLaterError (capped at 600s)
- Cancel endpoint: cancels both QUEUED and RUNNING children
- Composite indexes on (annotation_set_id, accession) and (prediction_set_id, accession)
- Taxonomy DB warmup at worker startup for prediction queues
- Multi-stage Dockerfile with healthcheck
- docker-compose: all 11 services with memory limits
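
The notes above mention exponential backoff with 5 attempts in the publisher and a 600 s cap on the worker's adaptive backoff. The sketch below folds both ideas into one small helper to show the general pattern; it is not taken from the PROTEA code. The names `backoff_delay` and `retry`, the 1 s base delay, and the doubling factor are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 600.0) -> float:
    """Delay after failed attempt number `attempt` (0-based), capped at `cap`."""
    # base and the factor of 2 are hypothetical; only the cap comes from the notes.
    return min(cap, base * (2 ** attempt))


def retry(
    fn: Callable[[], T],
    attempts: int = 5,
    base: float = 1.0,
    cap: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn` up to `attempts` times, sleeping between failures."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Re-raise on the final attempt instead of sleeping again.
            if attempt == attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. A production version would likely catch only connection-related exceptions rather than `Exception`.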

New test files:

- test_logging.py (15 tests): JSONFormatter, configure_logging
- test_evaluation.py (+35 tests): load_children_map, build_negative_keys, compute_evaluation_data
- test_run_cafa_evaluation.py (50 tests): full operation coverage
- test_load_goa_annotations.py (+45 tests): GAF parsing, store buffer, execute
- test_load_quickgo_annotations.py (+33 tests): TSV parsing, pagination, ECO mapping
- test_annotations_router.py (70 tests): all 23 endpoints
- test_embeddings_router.py (+44 tests): configs, predict, prediction sets, CAFA TSV
- test_proteins_router.py (17 tests): stats, list, detail, annotations
- test_admin_router.py (4 tests): reset-db
- test_scoring_router.py (+7 tests): scored TSV, metrics

Extended test files:

- test_queue.py (+20 tests): OperationConsumer on_message, QueueConsumer retry, DLQ
- test_base_worker.py (+24 tests): parent cancel, two-session, publish, reaper, warmup
- test_core.py (+11 tests): fetch_uniprot_metadata paths
- test_infrastructure.py (+7 tests): health endpoints, app factory, pool config
- test_insert_proteins.py (+22 tests): FASTA parsing, store_records, pagination
- test_load_ontology_snapshot.py (+17 tests): OBO parsing, relationships, backfill
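
The test list names a JSONFormatter and a configure_logging function without showing them. For readers unfamiliar with the pattern, a minimal standard-library sketch of structured JSON logging might look like the following. Only the two names come from the notes; the field names, the `log_format` parameter, and the text fallback format are assumptions and may differ from the real module.

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(log_format: str = "text", level: int = logging.INFO) -> logging.Handler:
    """Install a single root handler; `log_format` mirrors a --log-format flag."""
    handler = logging.StreamHandler()
    if log_format == "json":
        handler.setFormatter(JSONFormatter())
    else:
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
    root = logging.getLogger()
    root.handlers = [handler]  # replace any handlers installed earlier
    root.setLevel(level)
    return handler
```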

Architecture Decision Records (6 ADRs):

- 001: KNN on CPU, not pgvector or GPU
- 002: Two-session worker pattern
- 003: QueueConsumer vs OperationConsumer
- 004: Dead letter queue and retry strategy
- 005: Thread-local RabbitMQ connections
- 006: Sequence deduplication by MD5

Operational runbook covering: start/stop, health checks, scaling, stuck jobs, batch failures, CUDA OOM, DLQ inspection, DB maintenance

RERANKER.md: formal spec for temporal holdout re-ranker (cross-attention architecture, LambdaRank loss, WebDataset pipeline, LightGBM baseline)
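
ADR 006 refers to sequence deduplication by MD5. The idea, sketched here under assumptions rather than copied from the ADR, is to key each protein sequence by a digest so that identical sequences under different accessions are stored and embedded once. The helper names and the strip/upper-case normalization are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def sequence_md5(sequence: str) -> str:
    """MD5 hex digest of a sequence after simple normalization (assumed)."""
    normalized = sequence.strip().upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def deduplicate(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group accessions by the digest of their sequence.

    `records` yields (accession, sequence) pairs; each key in the result
    stands for one unique sequence shared by the listed accessions.
    """
    by_hash: Dict[str, List[str]] = {}
    for accession, sequence in records:
        by_hash.setdefault(sequence_md5(sequence), []).append(accession)
    return by_hash
```

MD5 is used here as a content fingerprint, not for security, so its known collision weaknesses matter little for this purpose.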

…UI overhaul

Major features:

- Neural re-ranker: train_reranker operation, ReRankerModel ORM, reranker UI page
- Expanded CAFA evaluation pipeline with scoring router and detailed metrics
- Annotate router and showcase router for streamlined user workflows
- Floating jobs widget, breadcrumbs, context banner, tooltip components
- Frontend overhaul: redesigned pages, improved navigation, i18n updates
- Thesis PDF served from frontend

Infrastructure:

- 4 new Alembic migrations for re-ranker schema
- API deps module, extended scoring endpoints
- Experiment and evaluation helper scripts
- Updated documentation (results, evaluation architecture)
- Version bump to 0.3.0

Tests:

- New test suites: reranker, train_reranker, annotate router, showcase router, integration
- Expanded: predict_go_terms, compute_embeddings, scoring router, embeddings router

codecov · 2026-03-25T13:18:50Z

Codecov Report

❌ Patch coverage is 60.28834% with 606 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.07%. Comparing base (2eeb474) to head (c244d25).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
protea/core/operations/train_reranker.py	31.32%	410 Missing ⚠️
protea/core/operations/run_cafa_evaluation.py	20.35%	90 Missing ⚠️
protea/core/operations/predict_go_terms.py	41.66%	42 Missing ⚠️
protea/api/routers/scoring.py	79.12%	38 Missing ⚠️
protea/api/routers/annotate.py	92.92%	7 Missing ⚠️
protea/core/reranker.py	93.54%	6 Missing ⚠️
protea/core/operations/compute_embeddings.py	61.53%	5 Missing ⚠️
protea/infrastructure/queue/consumer.py	94.73%	2 Missing ⚠️
protea/infrastructure/queue/publisher.py	92.30%	2 Missing ⚠️
protea/api/routers/embeddings.py	92.85%	1 Missing ⚠️
... and 3 more

Additional details and impacted files

@@             Coverage Diff             @@
##             main       #7       +/-   ##
===========================================
+ Coverage   65.01%   82.07%   +17.06%     
===========================================
  Files          55       63        +8     
  Lines        4550     5959     +1409     
===========================================
+ Hits         2958     4891     +1933     
+ Misses       1592     1068      -524
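
The percentages in the diff can be checked from the hit and line counts. One detail worth noting: 4891 / 5959 is about 82.0775%, which would round to 82.08, so the reported 82.07% suggests the figure is truncated to two decimals rather than rounded. The helper below assumes that truncation behavior; `truncated_pct` is a name made up for this check.

```python
def truncated_pct(hits: int, lines: int) -> str:
    """Coverage percentage truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating-point surprises near the cut-off.
    basis_points = hits * 10000 // lines
    return f"{basis_points // 100}.{basis_points % 100:02d}"


base = truncated_pct(2958, 4550)   # base commit: hits / lines
head = truncated_pct(4891, 5959)   # head commit: hits / lines
print(base, head)

# Misses are simply lines minus hits on each side of the diff.
print(4550 - 2958, 5959 - 4891)
```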


frapercan added 7 commits March 18, 2026 13:34

fix(lint): resolve ruff errors — unused imports, semicolons, import order

8b92868

fix(lint): resolve flake8 E501 and mypy type errors

53e62c9

ci: bump actions to v6, drop FORCE_JAVASCRIPT_ACTIONS_TO_NODE24

c244d25

frapercan merged commit cd433b8 into main Mar 25, 2026
9 of 10 checks passed

frapercan merged 7 commits into main from develop
