refactor orion-server#2034
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 37a09a6bec
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| let _ = | ||
| BuildEvent::update_build_complete_result(&build_id, exit_code, &state.conn) | ||
| .await; |
There was a problem hiding this comment.
Persist v1 build completion in
builds table
This unified completion branch now handles both TaskBuildCompleteV2 and TaskBuildComplete, but it only calls BuildEvent::update_build_complete_result, which writes to callisto::build_events. V1 tasks created via /task are inserted into entity::builds (handle_immediate_task_dispatch uses BuildRepository::insert_build), so their end_at/exit_code are never updated on completion. In practice, completed v1 builds can remain stuck as running/pending in task/build status APIs because the authoritative builds row is left unchanged.
Useful? React with 👍 / 👎.
| let _ = TargetRepository::update_state( | ||
| &state.conn, | ||
| target_id, |
There was a problem hiding this comment.
Write v2 target completion to the correct target store
This completion path always updates target state through TargetRepository::update_state, which operates on the targets table, but v2 dispatch creates/stores target IDs in callisto::build_targets (BuildTarget::insert_default_target in handle_immediate_task_dispatch_v2). For v2 builds, this means completion updates are applied to the wrong table (or no-op), so target completion/error state is not persisted correctly for v2 status consumers.
Useful? React with 👍 / 👎.
1658ef6
There was a problem hiding this comment.
Pull request overview
Refactors orion-server by splitting the previously monolithic api.rs into focused service modules, introducing an explicit entity/ + repository/ layering, and relocating OpenAPI configuration—while also cleaning up dead code allowances and unused code in adjacent crates.
Changes:
- Extracted API logic into
service/api_v2_service.rsand WebSocket handling intoservice/ws_service.rs; introducedapp_state.rsandapi_doc.rs. - Introduced
orion-server/src/entity/*(SeaORM entities) andorion-server/src/repository/*(DB operations), removing oldermodel/*entity modules andorion_common. - Removed various unused/dead-code items and dependencies across
orion-server,ceres,mono,saturn, andapi-model.
Reviewed changes
Copilot reviewed 41 out of 43 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| saturn/src/entitystore.rs | Removes #[allow(dead_code)] on as_entities. |
| saturn/src/context.rs | Removes #[allow(dead_code)] from error enums. |
| orion-server/src/service/ws_service.rs | New WebSocket worker protocol handling extracted from API. |
| orion-server/src/service/target_status_cache_service.rs | New target-status in-memory cache + query helpers. |
| orion-server/src/service/target_build_status_service.rs | Refactors ActiveModel creation to use an input struct. |
| orion-server/src/service/mod.rs | Exposes new service modules. |
| orion-server/src/service/api_v2_service.rs | New extracted API implementation (task/build/log/worker endpoints). |
| orion-server/src/server.rs | Switches to new AppState/ApiDoc placement and queue manager startup. |
| orion-server/src/scheduler.rs | Updates imports and routes target operations through repository. |
| orion-server/src/repository/tasks.rs | New task DB helper repository. |
| orion-server/src/repository/targets.rs | New target DB helper repository (find-or-create + update state). |
| orion-server/src/repository/orion_tasks.rs | Removes DTO definitions (moved to model/dto) and dead-code allows. |
| orion-server/src/repository/mod.rs | New repository module root. |
| orion-server/src/repository/builds.rs | New build DB helper repository. |
| orion-server/src/repository/build_targets.rs | Updates TargetState import; keeps unused method behind allow. |
| orion-server/src/repository/build_events.rs | New build-event repository helper. |
| orion-server/src/orion_common/model.rs | Deleted (common paging types moved to api-model). |
| orion-server/src/orion_common/mod.rs | Deleted (module removed). |
| orion-server/src/model/tasks.rs | Deleted old SeaORM entity module (moved to entity/). |
| orion-server/src/model/task_status.rs | New task-status enum module for API responses. |
| orion-server/src/model/targets.rs | Deleted old SeaORM entity module (moved to entity/). |
| orion-server/src/model/mod.rs | Narrows model module to DTOs + task status only. |
| orion-server/src/model/dto.rs | New consolidated DTOs used by API + OpenAPI. |
| orion-server/src/model/build_events.rs | Deleted old DTO/build-event helper (moved elsewhere). |
| orion-server/src/main.rs | Converts binary to call into library crate (orion_server::...). |
| orion-server/src/lib.rs | Exposes new modules (api_doc, app_state, entity, repository). |
| orion-server/src/entity/tasks.rs | New SeaORM tasks entity module. |
| orion-server/src/entity/targets.rs | New SeaORM targets entity module (+ TargetState + DTO helper). |
| orion-server/src/entity/mod.rs | New entity module root. |
| orion-server/src/entity/builds.rs | Removes model-layer helper methods from entity module. |
| orion-server/src/buck2.rs | Removes unused download/exec helpers; keeps mono base URL setter. |
| orion-server/src/app_state.rs | New shared application state (scheduler/conn/log/cache/shutdown). |
| orion-server/src/api_doc.rs | New OpenAPI (utoipa) doc configuration module. |
| orion-server/src/api.rs | Router refactor: delegates endpoint implementations to service modules. |
| orion-server/Cargo.toml | Removes unused deps (tokio-retry, reqwest, jupiter). |
| mono/src/git_protocol/ssh.rs | Removes dead-code allow from ClientMap. |
| mono/src/api/guard/cedar_guard.rs | Removes unused Path import and dead helper function. |
| io-orbit/src/bin/migrate_local_to_s3.rs | Minor control-flow simplification in task bounding logic. |
| ceres/src/pack/monorepo.rs | Removes unused buck-search/diff helpers and unused imports. |
| ceres/src/model/change_list.rs | Removes dead VerifyClPayload allow and deletes unused BuckFile. |
| api-model/src/buck2/ws.rs | Removes #[allow(dead_code)] from WS protocol enum. |
| api-model/src/buck2/types.rs | Removes #[allow(dead_code)] from TaskPhase enum. |
| api-model/src/buck2/api.rs | Removes #[allow(dead_code)] from API request/response structs. |
| return ControlFlow::Continue(()); | ||
| } | ||
|
|
||
| let current_worker_id = worker_id.as_ref().unwrap_or(&String::new()).clone(); |
There was a problem hiding this comment.
worker_id.as_ref().unwrap_or(&String::new()).clone() borrows a temporary String, which won't compile (temporary doesn't live long enough). Use an owned default (e.g., clone the Option<String> and unwrap_or_default()), or restructure to avoid referencing a temporary.
| Message::Close(_) => { | ||
| if let Some(id) = worker_id.take() | ||
| && let Some(mut worker) = state.scheduler.workers.get_mut(&id) | ||
| { | ||
| worker.status = WorkerStatus::Lost; | ||
| } | ||
| return ControlFlow::Break(()); |
There was a problem hiding this comment.
On Message::Close, worker_id.take() clears the id, so handle_socket's cleanup path treats the connection as unregistered and won't remove the worker entry. That keeps the UnboundedSender alive and can leave the send task blocked on rx.recv() indefinitely for lost workers. Avoid take() here (keep a separate immutable registered id for cleanup), and/or explicitly remove the worker entry / close the channel on close.
| callisto::build_events::Entity::update_many() | ||
| .filter(callisto::build_events::Column::Id.eq(build_id.parse::<Uuid>().unwrap())) |
There was a problem hiding this comment.
build_id.parse::<Uuid>().unwrap() can panic on malformed input (this id comes from external sources like WebSocket messages). Return a proper DbErr/API error instead of panicking, or validate/parse the UUID at the call site and pass a Uuid into this method.
| callisto::build_events::Entity::update_many() | |
| .filter(callisto::build_events::Column::Id.eq(build_id.parse::<Uuid>().unwrap())) | |
| let build_uuid = build_id | |
| .parse::<Uuid>() | |
| .map_err(|e| DbErr::Custom(format!("invalid build_id UUID `{build_id}`: {e}")))?; | |
| callisto::build_events::Entity::update_many() | |
| .filter(callisto::build_events::Column::Id.eq(build_uuid)) |
| while let Some(msg) = rx.recv().await { | ||
| let msg_str = serde_json::to_string(&msg).unwrap_or_default(); | ||
| if sender | ||
| .send(Message::Text(Utf8Bytes::from(msg_str))) | ||
| .await |
There was a problem hiding this comment.
serde_json::to_string(&msg).unwrap_or_default() will silently send an empty/invalid payload on serialization failure, which makes debugging client/server protocol issues harder. Prefer logging the serialization error and either skipping the send or closing the connection.
| let build_models = builds::Entity::find() | ||
| .filter(builds::Column::TaskId.eq(task.id)) | ||
| .all(&state.conn) | ||
| .await | ||
| .unwrap_or_else(|_| vec![]); |
There was a problem hiding this comment.
This query also swallows DB errors by converting them to an empty vec. That can make builds disappear from the API response without surfacing an error to callers. Prefer returning an error response (or at least logging the error) instead of defaulting to empty.
| tokio::select! { | ||
| _ = send_task => {}, | ||
| result = recv_task => { | ||
| if let Ok(final_worker_id) = result { | ||
| worker_id = final_worker_id; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
tokio::select! exits early if send_task finishes first, which drops recv_task and loses the final worker_id (so a registered worker may never get cleaned up). Consider coordinating the tasks so you always know the registered worker id before cleanup (e.g., share the id via Arc<Mutex<Option<String>>>, abort the other task after one completes, and then remove the worker deterministically).
| .filter( | ||
| builds::Column::Id.eq(build_id | ||
| .parse::<uuid::Uuid>() | ||
| .unwrap_or_else(|_| Uuid::nil())), | ||
| ) |
There was a problem hiding this comment.
When updating builds.retry_count, an invalid build_id falls back to Uuid::nil(), which can silently update the wrong row or no-op without signaling an error. Handle the parse failure explicitly (log + skip, or return/break) rather than using a sentinel UUID.
| let target_models = targets::Entity::find() | ||
| .filter(targets::Column::TaskId.eq(task.id)) | ||
| .all(&state.conn) | ||
| .await | ||
| .unwrap_or_else(|_| vec![]); |
There was a problem hiding this comment.
These queries swallow DB errors by converting them to empty vectors (unwrap_or_else(|_| vec![])), which can return partial/incorrect task info with a 200 response while hiding the underlying failure. Consider propagating the DB error up so the handler can return 5xx (or at least log the error).
| pub async fn get_builds_by_task_id( | ||
| task_id: Uuid, | ||
| db: &sea_orm::DatabaseConnection, | ||
| ) -> Option<Vec<Uuid>> { | ||
| match crate::entity::builds::Entity::find() | ||
| .filter(crate::entity::builds::Column::TaskId.eq(task_id)) | ||
| .all(db) | ||
| .await | ||
| { | ||
| Ok(builds) => Some(builds.into_iter().map(|build| build.id).collect()), | ||
| Err(e) => { | ||
| tracing::error!("Failed to fetch builds for task_id {}: {}", task_id, e); | ||
| None | ||
| } | ||
| } |
There was a problem hiding this comment.
get_builds_by_task_id queries only the builds table, so a non-existent task is indistinguishable from an existing task with zero builds (it returns Some(vec![]) in both cases). This makes it impossible for /task-build-list/{id} to reliably return 404 for missing tasks as implied by the handler logic; consider checking task existence (tasks table) or returning a richer result type.
No description provided.