… stale reaper

- DB connection pool (pool_size=20, max_overflow=40, pool_recycle=3600)
- Publisher: thread-local connection reuse, exponential backoff (5 attempts)
- Consumer: dead letter queue (protea.dlx -> protea.dead-letter) on all queues
- Consumer: OperationConsumer emit writes JobEvent to parent job
- Health endpoints: /health (liveness) and /health/ready (DB + RabbitMQ)
- Structured JSON logging with --log-format flag in worker
- StaleJobReaper: marks RUNNING jobs as FAILED after 1h timeout
- BaseWorker: adaptive backoff for RetryLaterError (capped at 600s)
- Cancel endpoint: cancels both QUEUED and RUNNING children
- Composite indexes on (annotation_set_id, accession) and (prediction_set_id, accession)
- Taxonomy DB warmup at worker startup for prediction queues
- Multi-stage Dockerfile with healthcheck
- docker-compose: all 11 services with memory limits
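The retry cap and the reaper timeout listed above can be illustrated with a small sketch. This is not the project's code: it assumes the backoff doubles on each attempt (the commit only says "adaptive" and "capped at 600s"), and the names `backoff_delay`, `is_stale`, `base_s`, and `started_at` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_DELAY_S = 600.0               # cap stated in the commit message
STALE_AFTER = timedelta(hours=1)  # reaper timeout stated in the commit message


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = MAX_DELAY_S) -> float:
    """Delay before retry `attempt` (0-based): base_s * 2**attempt, never above cap_s.

    Doubling is an assumption made for illustration only.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(cap_s, base_s * (2 ** attempt))


def is_stale(started_at: datetime, now: datetime, timeout: timedelta = STALE_AFTER) -> bool:
    """True when a RUNNING job has been running longer than `timeout`."""
    return now - started_at > timeout


if __name__ == "__main__":
    # Delays for the first few retries, then the cap once doubling exceeds it.
    print([backoff_delay(n) for n in range(5)], backoff_delay(20))
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(is_stale(start, start + timedelta(minutes=61)))
```

Keeping both rules as pure functions of their inputs makes them easy to unit-test without a broker or a database, which fits the test expansion described in this pull request.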
New test files:

- test_logging.py (15 tests): JSONFormatter, configure_logging
- test_evaluation.py (+35 tests): load_children_map, build_negative_keys, compute_evaluation_data
- test_run_cafa_evaluation.py (50 tests): full operation coverage
- test_load_goa_annotations.py (+45 tests): GAF parsing, store buffer, execute
- test_load_quickgo_annotations.py (+33 tests): TSV parsing, pagination, ECO mapping
- test_annotations_router.py (70 tests): all 23 endpoints
- test_embeddings_router.py (+44 tests): configs, predict, prediction sets, CAFA TSV
- test_proteins_router.py (17 tests): stats, list, detail, annotations
- test_admin_router.py (4 tests): reset-db
- test_scoring_router.py (+7 tests): scored TSV, metrics

Extended test files:

- test_queue.py (+20 tests): OperationConsumer on_message, QueueConsumer retry, DLQ
- test_base_worker.py (+24 tests): parent cancel, two-session, publish, reaper, warmup
- test_core.py (+11 tests): fetch_uniprot_metadata paths
- test_infrastructure.py (+7 tests): health endpoints, app factory, pool config
- test_insert_proteins.py (+22 tests): FASTA parsing, store_records, pagination
- test_load_ontology_snapshot.py (+17 tests): OBO parsing, relationships, backfill
Architecture Decision Records (6 ADRs):

- 001: KNN on CPU, not pgvector or GPU
- 002: Two-session worker pattern
- 003: QueueConsumer vs OperationConsumer
- 004: Dead letter queue and retry strategy
- 005: Thread-local RabbitMQ connections
- 006: Sequence deduplication by MD5

Operational runbook covering: start/stop, health checks, scaling, stuck jobs, batch failures, CUDA OOM, DLQ inspection, DB maintenance

RERANKER.md: formal spec for temporal holdout re-ranker (cross-attention architecture, LambdaRank loss, WebDataset pipeline, LightGBM baseline)
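As a rough illustration of what ADR 006 (sequence deduplication by MD5) could look like, here is a minimal sketch. It is not taken from the repository: the helper names `sequence_md5` and `deduplicate` are hypothetical, and normalizing by stripping whitespace and upper-casing is an assumption about how equivalent sequences are recognized. It also assumes Python 3.9+ for the `usedforsecurity` argument.

```python
import hashlib


def normalize(sequence: str) -> str:
    """Assumed normalization: drop all whitespace and upper-case the residues."""
    return "".join(sequence.split()).upper()


def sequence_md5(sequence: str) -> str:
    """Hex MD5 digest of the normalized sequence, used as a dedup key (not for security)."""
    data = normalize(sequence).encode("ascii")
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


def deduplicate(records):
    """Collapse (accession, sequence) pairs onto unique sequences.

    Returns (unique, accession_to_hash) where `unique` maps digest -> sequence
    and `accession_to_hash` maps each accession to its sequence digest.
    """
    unique = {}
    accession_to_hash = {}
    for accession, sequence in records:
        digest = sequence_md5(sequence)
        unique.setdefault(digest, normalize(sequence))  # store each sequence once
        accession_to_hash[accession] = digest
    return unique, accession_to_hash


if __name__ == "__main__":
    records = [("P1", "MKTAYIAK"), ("P2", "mktay\nIAK"), ("P3", "MSSHEGGK")]
    unique, mapping = deduplicate(records)
    print(len(unique), mapping["P1"] == mapping["P2"])
```

Under this scheme, expensive per-sequence work such as embedding computation would run once per digest rather than once per accession.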
…UI overhaul

Major features:

- Neural re-ranker: train_reranker operation, ReRankerModel ORM, reranker UI page
- Expanded CAFA evaluation pipeline with scoring router and detailed metrics
- Annotate router and showcase router for streamlined user workflows
- Floating jobs widget, breadcrumbs, context banner, tooltip components
- Frontend overhaul: redesigned pages, improved navigation, i18n updates
- Thesis PDF served from frontend

Infrastructure:

- 4 new Alembic migrations for re-ranker schema
- API deps module, extended scoring endpoints
- Experiment and evaluation helper scripts
- Updated documentation (results, evaluation architecture)
- Version bump to 0.3.0

Tests:

- New test suites: reranker, train_reranker, annotate router, showcase router, integration
- Expanded: predict_go_terms, compute_embeddings, scoring router, embeddings router
Codecov Report

@@             Coverage Diff              @@
##             main       #7       +/-   ##
===========================================
+ Coverage   65.01%   82.07%   +17.06%
===========================================
  Files          55       63        +8
  Lines        4550     5959     +1409
===========================================
+ Hits         2958     4891     +1933
+ Misses       1592     1068      -524
Release v0.3.0 — Re-ranker, evaluation pipeline, annotate workflow, UI overhaul
Features
Tests
Docs
CI
Commits since v0.2.0
- 2033e0d feat(infra): connection pool, DLQ, structured logging, health probes, stale reaper
- 7ee749f test: expand coverage from 65% to 88% (283 -> 831 tests)
- 096823e docs: ADRs, operational runbook, and re-ranker design spec
- 092f110 release: v0.3.0 — re-ranker, evaluation pipeline, annotate workflow, UI overhaul
- 8b92868 fix(lint): resolve ruff errors
- 53e62c9 fix(lint): resolve flake8 E501 and mypy type errors
- c244d25 ci: bump actions to v6, drop FORCE_JAVASCRIPT_ACTIONS_TO_NODE24