Skip to content

refactor orion-server#2034

Merged
genedna merged 1 commit intoweb3infra-foundation:mainfrom
benjamin-747:main
Mar 24, 2026
Merged

refactor orion-server#2034
genedna merged 1 commit intoweb3infra-foundation:mainfrom
benjamin-747:main

Conversation

@benjamin-747
Copy link
Collaborator

No description provided.

@benjamin-747 benjamin-747 requested a review from genedna March 24, 2026 12:00
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 37a09a6bec

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +243 to +245
let _ =
BuildEvent::update_build_complete_result(&build_id, exit_code, &state.conn)
.await;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Persist v1 build completion in builds table

This unified completion branch now handles both TaskBuildCompleteV2 and TaskBuildComplete, but it only calls BuildEvent::update_build_complete_result, which writes to callisto::build_events. V1 tasks created via /task are inserted into entity::builds (handle_immediate_task_dispatch uses BuildRepository::insert_build), so their end_at/exit_code are never updated on completion. In practice, completed v1 builds can remain stuck as running/pending in task/build status APIs because the authoritative builds row is left unchanged.

Useful? React with 👍 / 👎.

Comment on lines +268 to +270
let _ = TargetRepository::update_state(
&state.conn,
target_id,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Write v2 target completion to the correct target store

This completion path always updates target state through TargetRepository::update_state, which operates on the targets table, but v2 dispatch creates/stores target IDs in callisto::build_targets (BuildTarget::insert_default_target in handle_immediate_task_dispatch_v2). For v2 builds, this means completion updates are applied to the wrong table (or no-op), so target completion/error state is not persisted correctly for v2 status consumers.

Useful? React with 👍 / 👎.

let per_page = pagination.per_page.clamp(1u64, 100);
let offset = (page - 1) * per_page;
let mut total: u64 = 0;
let mut items: Vec<OrionClientInfo> = Vec::with_capacity(per_page as usize);

Check failure

Code scanning / CodeQL

Uncontrolled allocation size High

This allocation size is derived from a
user-provided value
and could allocate arbitrary amounts of memory.
@genedna genedna added this pull request to the merge queue Mar 24, 2026
Merged via the queue into web3infra-foundation:main with commit 1658ef6 Mar 24, 2026
9 of 10 checks passed
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors orion-server by splitting the previously monolithic api.rs into focused service modules, introducing an explicit entity/ + repository/ layering, and relocating OpenAPI configuration—while also cleaning up dead code allowances and unused code in adjacent crates.

Changes:

  • Extracted API logic into service/api_v2_service.rs and WebSocket handling into service/ws_service.rs; introduced app_state.rs and api_doc.rs.
  • Introduced orion-server/src/entity/* (SeaORM entities) and orion-server/src/repository/* (DB operations), removing older model/* entity modules and orion_common.
  • Removed various unused/dead-code items and dependencies across orion-server, ceres, mono, saturn, and api-model.

Reviewed changes

Copilot reviewed 41 out of 43 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
saturn/src/entitystore.rs Removes #[allow(dead_code)] on as_entities.
saturn/src/context.rs Removes #[allow(dead_code)] from error enums.
orion-server/src/service/ws_service.rs New WebSocket worker protocol handling extracted from API.
orion-server/src/service/target_status_cache_service.rs New target-status in-memory cache + query helpers.
orion-server/src/service/target_build_status_service.rs Refactors ActiveModel creation to use an input struct.
orion-server/src/service/mod.rs Exposes new service modules.
orion-server/src/service/api_v2_service.rs New extracted API implementation (task/build/log/worker endpoints).
orion-server/src/server.rs Switches to new AppState/ApiDoc placement and queue manager startup.
orion-server/src/scheduler.rs Updates imports and routes target operations through repository.
orion-server/src/repository/tasks.rs New task DB helper repository.
orion-server/src/repository/targets.rs New target DB helper repository (find-or-create + update state).
orion-server/src/repository/orion_tasks.rs Removes DTO definitions (moved to model/dto) and dead-code allows.
orion-server/src/repository/mod.rs New repository module root.
orion-server/src/repository/builds.rs New build DB helper repository.
orion-server/src/repository/build_targets.rs Updates TargetState import; keeps unused method behind allow.
orion-server/src/repository/build_events.rs New build-event repository helper.
orion-server/src/orion_common/model.rs Deleted (common paging types moved to api-model).
orion-server/src/orion_common/mod.rs Deleted (module removed).
orion-server/src/model/tasks.rs Deleted old SeaORM entity module (moved to entity/).
orion-server/src/model/task_status.rs New task-status enum module for API responses.
orion-server/src/model/targets.rs Deleted old SeaORM entity module (moved to entity/).
orion-server/src/model/mod.rs Narrows model module to DTOs + task status only.
orion-server/src/model/dto.rs New consolidated DTOs used by API + OpenAPI.
orion-server/src/model/build_events.rs Deleted old DTO/build-event helper (moved elsewhere).
orion-server/src/main.rs Converts binary to call into library crate (orion_server::...).
orion-server/src/lib.rs Exposes new modules (api_doc, app_state, entity, repository).
orion-server/src/entity/tasks.rs New SeaORM tasks entity module.
orion-server/src/entity/targets.rs New SeaORM targets entity module (+ TargetState + DTO helper).
orion-server/src/entity/mod.rs New entity module root.
orion-server/src/entity/builds.rs Removes model-layer helper methods from entity module.
orion-server/src/buck2.rs Removes unused download/exec helpers; keeps mono base URL setter.
orion-server/src/app_state.rs New shared application state (scheduler/conn/log/cache/shutdown).
orion-server/src/api_doc.rs New OpenAPI (utoipa) doc configuration module.
orion-server/src/api.rs Router refactor: delegates endpoint implementations to service modules.
orion-server/Cargo.toml Removes unused deps (tokio-retry, reqwest, jupiter).
mono/src/git_protocol/ssh.rs Removes dead-code allow from ClientMap.
mono/src/api/guard/cedar_guard.rs Removes unused Path import and dead helper function.
io-orbit/src/bin/migrate_local_to_s3.rs Minor control-flow simplification in task bounding logic.
ceres/src/pack/monorepo.rs Removes unused buck-search/diff helpers and unused imports.
ceres/src/model/change_list.rs Removes dead VerifyClPayload allow and deletes unused BuckFile.
api-model/src/buck2/ws.rs Removes #[allow(dead_code)] from WS protocol enum.
api-model/src/buck2/types.rs Removes #[allow(dead_code)] from TaskPhase enum.
api-model/src/buck2/api.rs Removes #[allow(dead_code)] from API request/response structs.

return ControlFlow::Continue(());
}

let current_worker_id = worker_id.as_ref().unwrap_or(&String::new()).clone();
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worker_id.as_ref().unwrap_or(&String::new()).clone() borrows a temporary String, which won't compile (temporary doesn't live long enough). Use an owned default (e.g., clone the Option<String> and unwrap_or_default()), or restructure to avoid referencing a temporary.

Copilot uses AI. Check for mistakes.
Comment on lines +308 to +314
Message::Close(_) => {
if let Some(id) = worker_id.take()
&& let Some(mut worker) = state.scheduler.workers.get_mut(&id)
{
worker.status = WorkerStatus::Lost;
}
return ControlFlow::Break(());
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Message::Close, worker_id.take() clears the id, so handle_socket's cleanup path treats the connection as unregistered and won't remove the worker entry. That keeps the UnboundedSender alive and can leave the send task blocked on rx.recv() indefinitely for lost workers. Avoid take() here (keep a separate immutable registered id for cleanup), and/or explicitly remove the worker entry / close the channel on close.

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +13
callisto::build_events::Entity::update_many()
.filter(callisto::build_events::Column::Id.eq(build_id.parse::<Uuid>().unwrap()))
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build_id.parse::<Uuid>().unwrap() can panic on malformed input (this id comes from external sources like WebSocket messages). Return a proper DbErr/API error instead of panicking, or validate/parse the UUID at the call site and pass a Uuid into this method.

Suggested change
callisto::build_events::Entity::update_many()
.filter(callisto::build_events::Column::Id.eq(build_id.parse::<Uuid>().unwrap()))
let build_uuid = build_id
.parse::<Uuid>()
.map_err(|e| DbErr::Custom(format!("invalid build_id UUID `{build_id}`: {e}")))?;
callisto::build_events::Entity::update_many()
.filter(callisto::build_events::Column::Id.eq(build_uuid))

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +46
while let Some(msg) = rx.recv().await {
let msg_str = serde_json::to_string(&msg).unwrap_or_default();
if sender
.send(Message::Text(Utf8Bytes::from(msg_str)))
.await
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serde_json::to_string(&msg).unwrap_or_default() will silently send an empty/invalid payload on serialization failure, which makes debugging client/server protocol issues harder. Prefer logging the serialization error and either skipping the send or closing the connection.

Copilot uses AI. Check for mistakes.
Comment on lines +611 to +615
let build_models = builds::Entity::find()
.filter(builds::Column::TaskId.eq(task.id))
.all(&state.conn)
.await
.unwrap_or_else(|_| vec![]);
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query also swallows DB errors by converting them to an empty vec. That can make builds disappear from the API response without surfacing an error to callers. Prefer returning an error response (or at least logging the error) instead of defaulting to empty.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +77
tokio::select! {
_ = send_task => {},
result = recv_task => {
if let Ok(final_worker_id) = result {
worker_id = final_worker_id;
}
}
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tokio::select! exits early if send_task finishes first, which drops recv_task and loses the final worker_id (so a registered worker may never get cleaned up). Consider coordinating the tasks so you always know the registered worker id before cleanup (e.g., share the id via Arc<Mutex<Option<String>>>, abort the other task after one completes, and then remove the worker deterministically).

Copilot uses AI. Check for mistakes.
Comment on lines +213 to +217
.filter(
builds::Column::Id.eq(build_id
.parse::<uuid::Uuid>()
.unwrap_or_else(|_| Uuid::nil())),
)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When updating builds.retry_count, an invalid build_id falls back to Uuid::nil(), which can silently update the wrong row or no-op without signaling an error. Handle the parse failure explicitly (log + skip, or return/break) rather than using a sentinel UUID.

Copilot uses AI. Check for mistakes.
Comment on lines +606 to +610
let target_models = targets::Entity::find()
.filter(targets::Column::TaskId.eq(task.id))
.all(&state.conn)
.await
.unwrap_or_else(|_| vec![]);
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These queries swallow DB errors by converting them to empty vectors (unwrap_or_else(|_| vec![])), which can return partial/incorrect task info with a 200 response while hiding the underlying failure. Consider propagating the DB error up so the handler can return 5xx (or at least log the error).

Copilot uses AI. Check for mistakes.
Comment on lines +11 to +25
pub async fn get_builds_by_task_id(
task_id: Uuid,
db: &sea_orm::DatabaseConnection,
) -> Option<Vec<Uuid>> {
match crate::entity::builds::Entity::find()
.filter(crate::entity::builds::Column::TaskId.eq(task_id))
.all(db)
.await
{
Ok(builds) => Some(builds.into_iter().map(|build| build.id).collect()),
Err(e) => {
tracing::error!("Failed to fetch builds for task_id {}: {}", task_id, e);
None
}
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_builds_by_task_id queries only the builds table, so a non-existent task is indistinguishable from an existing task with zero builds (it returns Some(vec![]) in both cases). This makes it impossible for /task-build-list/{id} to reliably return 404 for missing tasks as implied by the handler logic; consider checking task existence (tasks table) or returning a richer result type.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants