
feat: plug_plabels #18

Merged
dman-os merged 10 commits into main from feat/plug/plabels
Apr 1, 2026

Conversation

@dman-os (Owner) commented Apr 1, 2026

Summary by CodeRabbit

  • New Features

    • CI now builds/imports plugin OCI artifacts; blob staging preserves filenames/extensions.
    • Added Dayledger (ledger/accounting) and Pseudo‑Label (image/text labeling + candidate learning) plugins.
    • Runtime init system now supports dependency‑aware init sequencing with multiple run modes and gated dispatch activation.
  • Refactor

    • Consolidated test support harness for isolated runtime testing.
  • Documentation

    • Updated contributor/test guidance and preferred Rust test runner; increased CI test retry count.

coderabbitai bot commented Apr 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2787abae-4246-4b46-afd0-06f92539b55a

📥 Commits

Reviewing files that changed from the base of the PR and between 0d385a2 and 46d5c38.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock and included by **/*
📒 Files selected for processing (2)
  • src/daybook_core/Cargo.toml
  • src/xtask/Cargo.toml
✅ Files skipped from review due to trivial changes (2)
  • src/daybook_core/Cargo.toml
  • src/xtask/Cargo.toml

📝 Walkthrough


Adds OCI plug build/import tooling and two new plug crates; implements an InitRepo with dependency-aware init dispatching; adds blob staging materialization and cleanup; migrates manifest types into daybook_types; extracts shared test support; refactors dispatch lifecycle and wash-plugin/config-doc wiring; moves pseudo-label workflows into plug_plabels.

Changes

Cohort / File(s) Summary
CI & Workspace
Cargo.toml, .github/workflows/checks.yml, flake.nix
Added workspace members src/plug_dayledger, src/plug_plabels; added semver and OCI deps; CI now runs two cargo run --manifest-path …/manifest.rs xtask steps to build OCI artifacts and increases nextest retries; maestro added to devTools; removed openssl from dioxus build inputs.
Xtask / OCI tooling
src/xtask/main.rs, src/xtask/Cargo.toml
New xtask subcommand to build plug OCI artifacts: runs plug manifest binary, optionally builds WASM components, writes blobs (blobs/sha256/<digest>), rewrites componentUrls to oci://sha256:<digest>, and emits OCI layout/index under <out_root>/<manifest.id()>.
New plug crates
src/plug_dayledger/..., src/plug_plabels/...
Added plug_dayledger and plug_plabels crates (Cargo.toml, lib, manifest binaries, types). plug_plabels introduces pseudo-label types, label engine, wflows, WIT deps, and e2e helpers.
Blob staging & repo init
src/daybook_core/blobs.rs, src/daybook_core/repo.rs
Added BlobMaterializeRequest, BlobsRepo::materialize/materialize_with_meta_extension/cleanup_staging; repo startup now calls staging cleanup before DB/sql setup.
Init subsystem & Rt integration
src/daybook_core/rt/init.rs, src/daybook_core/rt.rs, src/daybook_core/app.rs, src/daybook_core/lib.rs
New InitRepo/InitStore with per-install/per-node/per-boot completion tracking, change notification loop, is_done/mark_done/running-dispatch APIs; Rt gains init_repo, boot accepts SQL pool and enqueues init dispatches; app reconciler sets InitStore prop.
Dispatch lifecycle & coordination
src/daybook_core/rt/dispatch.rs, src/daybook_core/rt.rs
Introduced DispatchStatus (Waiting/Active/Succeeded/Failed/Cancelled), persisted dispatches map, waiting/dependency APIs (activate/list/remove/set_waiting_failed), on-success hooks, deferred activation, and revised completion semantics (complete replacing remove).
Plugs & OCI import
src/daybook_core/plugs.rs, related files
Switched manifest imports to daybook_types::manifest; system plugs include inits; PlugsRepo gains per-plug config-doc ID management, ConfigDocsChanged events, and OCI import APIs that validate/rewrite componentUrls. ACL and facet-ref validation switched to config_facet_acl and daybook_types::reference.
Wash plugin / ML tooling
src/daybook_core/rt/wash_plugin.rs, .../caps.rs, .../mltools.rs
Removed pseudo-label conversions from core; DaybookPlugin now holds PlugsRepo; facet token resolution uses per-plug config docs; blob hash extraction separated and image materialization uses BlobsRepo.materialize; MIME→extension mapping added.
Test support extraction
src/daybook_core/test_support.rs, src/daybook_core/e2e.rs
Extracted DaybookTestContext and constructors into test_support; e2e.rs now pub use crate::test_support::*; and drops local harness impls.
Core pseudo-label removal / relocation
src/daybook_wflows/*, src/daybook_types/*
Removed core pseudo-label/classify/learn workflows and related PseudoLabel* types from core/daybook_types; corresponding logic reimplemented in plug_plabels with new types, wflows, and tests.
daybook_types API & macros
src/daybook_types/lib.rs, src/daybook_types/macros.rs, src/daybook_types/manifest.rs
Added manifest feature/module; moved/exported macros into macros.rs; added InitManifest/InitRunMode and PlugManifest.inits; renamed ACL accessors from config_prop_acl to config_facet_acl; conditionalized reconcile derives.
WIT / wasi deps for plug_plabels
src/plug_plabels/wit/...
Added many WIT package files and dependency pointers (wasi:io, http, filesystem, clocks, keyvalue, logging, random, postgres, townframe bundle) needed by plug_plabels WASM components.
Docs & small changes
AGENTS.md, CONTRIBUTING.md, src/daybook_sdk/Cargo.toml, src/daybook_core/build.rs, src/daybook_core/Cargo.toml
Docs: updated test guidance and prefer cargo nextest; added daybook_sdk manifest; minor formatting in build.rs; added test-support feature, OCI deps, and workspace semver/sha2/oci libs in Cargo.tomls.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(2,117,216,0.5)
    participant CLI as Xtask CLI
    participant FS as Filesystem
    participant Cargo as Cargo
    end
    rect rgba(34,197,94,0.5)
    participant Rt as Rt
    participant InitRepo as InitRepo
    participant DispatchRepo as DispatchRepo
    participant SQL as SQLitePool
    end

    CLI->>Cargo: run plug manifest binary (manifest.rs)
    Cargo->>FS: produce manifest JSON / build WASM (if needed)
    CLI->>FS: write blobs/sha256/<digest>, manifest.json, index.json, oci-layout
    Rt->>InitRepo: InitRepo::load(big_repo, app_doc_id, actor, sql_pool)
    InitRepo->>SQL: ensure init_per_node table
    InitRepo-->>Rt: spawn notification loop
    Rt->>DispatchRepo: ensure_plug_init_dispatches(plug_id)
    alt unresolved dependencies
        DispatchRepo->>DispatchRepo: create dispatch(status=Waiting), record waiting_on_dispatch_ids
    else dependencies resolved
        DispatchRepo->>DispatchRepo: create dispatch(status=Active) -> schedule wflow job
    end
    DispatchRepo->>DispatchRepo: on completion -> complete(id, Succeeded/Failed) -> release waiting dispatches
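The dispatch lifecycle the walkthrough describes (Waiting/Active/Succeeded/Failed/Cancelled statuses, with waiting dispatches released once their dependencies complete) can be sketched roughly as below. Only the DispatchStatus variants come from the PR summary; the Dispatch/DispatchTable shapes and the `complete` method are illustrative, not the crate's actual API:

```rust
use std::collections::HashMap;

// Variants taken from the PR summary; everything else here is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DispatchStatus {
    Waiting,
    Active,
    Succeeded,
    Failed,
    Cancelled,
}

struct Dispatch {
    status: DispatchStatus,
    // ids of dispatches this one waits on before activating
    waiting_on: Vec<String>,
}

struct DispatchTable {
    dispatches: HashMap<String, Dispatch>,
}

impl DispatchTable {
    /// Record a terminal status for `id` (mirroring "complete replacing
    /// remove"), then activate any waiter whose dependencies have all
    /// succeeded.
    fn complete(&mut self, id: &str, status: DispatchStatus) {
        if let Some(d) = self.dispatches.get_mut(id) {
            d.status = status;
        }
        let done: Vec<String> = self
            .dispatches
            .iter()
            .filter(|(_, d)| d.status == DispatchStatus::Succeeded)
            .map(|(k, _)| k.clone())
            .collect();
        for d in self.dispatches.values_mut() {
            if d.status == DispatchStatus::Waiting
                && d.waiting_on.iter().all(|dep| done.contains(dep))
            {
                d.status = DispatchStatus::Active;
            }
        }
    }
}
```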

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I nibble bytes and spin a thread,

I pack each plug in blobs of red,
Inits queued and dispatches wait,
Manifests sail through OCI's gate,
Hopping tests—builds look fed!



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

🧹 Nitpick comments (10)
Cargo.toml (1)

61-61: Consider adding plug_dayledger to workspace dependencies for consistency.

plug_plabels is added to [workspace.dependencies] but plug_dayledger (also added as a workspace member) is not. If plug_dayledger might be used as a dependency by other crates, consider adding it here for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` at line 61, The workspace currently lists plug_plabels under
[workspace.dependencies] but omits plug_dayledger; add plug_dayledger = { path =
"./src/plug_dayledger", default-features = false } to the
[workspace.dependencies] section in Cargo.toml so plug_dayledger is available as
a workspace dependency (mirror the entry style used for plug_plabels) to keep
dependency declarations consistent and allow other crates to reference it.
.github/workflows/checks.yml (1)

75-78: LGTM!

The new CI step correctly builds OCI artifacts for both plug crates using the established nix develop .#ci-rust pattern. The sequential builds are appropriate for verification purposes.

Consider uploading the generated OCI artifacts as workflow artifacts (similar to the gradle test reports on line 84-88) if they're useful for downstream testing or deployment validation. This is entirely optional and depends on your artifact consumption needs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/checks.yml around lines 75 - 78, Add an artifacts upload
step after the "build plug OCI artifacts" step to persist the generated OCI
files: after the two nix/cargo commands in the job (the step named "build plug
OCI artifacts"), add an actions/upload-artifact@v4 step that references the
output paths produced by the xtask target (the files/directories created by the
cargo run -p xtask -- build-plug-oci commands for ./src/plug_dayledger and
./src/plug_plabels) and give sensible artifact names (e.g., plug_dayledger_oci,
plug_plabels_oci); ensure the upload step runs on success and uses the correct
path patterns so downstream workflows or reviewers can download the built OCI
artifacts.
src/daybook_wflows/lib.rs (1)

71-122: Consider removing commented-out helper functions.

These functions (row_text, row_i64, row_blob, embedding_bytes_to_f32) are now dead code. If they're needed in plug_plabels, they should be relocated there; otherwise, they should be removed to keep the codebase clean.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_wflows/lib.rs` around lines 71 - 122, The commented-out helper
functions row_text, row_i64, row_blob and embedding_bytes_to_f32 are dead code;
either delete these commented blocks from src/daybook_wflows/lib.rs or move the
implementations into the module that needs them (e.g., plug_plabels) and restore
as active functions; if moving, ensure signatures and return types
(Res<Vec<f32>> for embedding_bytes_to_f32) match existing callers and update
imports/visibility (pub(crate) as needed) so callers compile.
src/daybook_types/Cargo.toml (1)

52-54: Remove the commented-out time dependency.

If the time dependency is no longer needed, remove it entirely rather than leaving it commented out.

🧹 Suggested cleanup
 garde = { workspace = true, optional = true }
 semver = { workspace = true, optional = true }
-# time = { workspace = true, features = ["serde"] }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_types/Cargo.toml` around lines 52 - 54, Remove the commented-out
dependency line for time in Cargo.toml (the line starting with "# time = {
workspace = true, features = [\"serde\"] }") so the file no longer contains
dead/commented dependency entries; simply delete that commented line to clean up
the manifest.
src/plug_plabels/wflows/label_image.rs (2)

56-74: Silent early returns for unsupported configurations.

The multiple Ok(()) returns for unsupported embedding types/dimensions/sources are intentional no-ops, allowing other processors to potentially handle these cases. Consider adding brief comments explaining why these conditions skip processing, to aid future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/label_image.rs` around lines 56 - 74, The sequence of
silent early returns in functions checking embedding properties (the
embedding.dtype/compression guard, the embedding.dim and embedding.model_tag
check against NOMIC_VISION_MODEL_ID, the parse_facet_ref match, and the
parsed_ref.facet_key.tag check for WellKnownFacetTag::Blob) should be annotated
with brief comments stating they are intentional no-ops for unsupported
embeddings so other processors can handle them; add one-line comments above each
conditional explaining which unsupported case is being skipped (e.g., "skip
non-f32 or compressed embeddings — allow other processors to handle"), "skip
wrong dimension or model", "skip invalid facet ref", and "skip non-blob facets",
referring to the exact symbols (embedding.dtype, embedding.compression,
embedding.dim, embedding.model_tag, daybook_types::url::parse_facet_ref,
parsed_ref.facet_key.tag) so maintainers understand the rationale.

76-80: Consistent validation pattern, but consider consolidating if dimension validation is redundant.

Lines 76-80 convert embedding bytes → f32 slice → bytes → JSON, which mirrors the standard pattern used in label_engine.rs::embedding_vec_to_json() and elsewhere in the codebase. Each step serves validation: embedding_bytes_to_f32 ensures valid f32 data (length multiple of 4), and embedding_f32_bytes_to_json validates dimensions match 768. Since you've already validated embedding.dim == 768 at line 60, the dimension check at line 79 is redundant. If that validation is guaranteed by the facet structure, the conversion sequence could be simplified or extracted to a reusable helper that avoids the round-trip validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/label_image.rs` around lines 76 - 80, You are doing
duplicate validation when converting embedding bytes to JSON: after checking
embedding.dim == 768 earlier, you still round-trip via embedding_bytes_to_f32 ->
embedding_f32_slice_to_le_bytes -> embedding_f32_bytes_to_json which repeats the
dimension check; to fix, consolidate by calling the shared helper used in
label_engine.rs (embedding_vec_to_json) or extract a new helper that takes the
raw embedding.vector and the known dim (768) and returns JSON without
re-checking dimensions, or remove the redundant dimension validation call inside
this flow and rely on the prior embedding.dim check; locate and update uses of
embedding_bytes_to_f32 and embedding_f32_bytes_to_json in this file
(label_image.rs) to use the shared helper or a simplified conversion path.
src/plug_plabels/wflows/label_note.rs (1)

56-66: Inconsistent error handling vs. label_image.rs.

In label_image.rs, mismatched model/dimensions result in a silent Ok(()) return, while here it's a Terminal error. This may be intentional since label_note generates its own embeddings (and should control the model), whereas label_image processes pre-existing embeddings that may vary. If intentional, consider adding a comment explaining this design difference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/label_note.rs` around lines 56 - 66, The check in
label_note.rs that returns a wflow_sdk::JobErrorX::Terminal when
embed_result.model_id or dimensions mismatch is inconsistent with label_image.rs
(which silently returns Ok(())); if this difference is intentional (because
label_note generates its own embeddings and must enforce model/dimension), add a
concise comment above the if block explaining that rationale and referencing
that label_image.rs intentionally tolerates varying incoming embeddings, so the
Terminal error here is deliberate; otherwise, change the behavior to match
label_image.rs (silent Ok(()))—the relevant symbols to update are embed_result,
NOMIC_TEXT_MODEL_ID, and the wflow_sdk::JobErrorX::Terminal return.
src/daybook_core/blobs.rs (1)

1052-1078: Add a reference-blob materialize regression test.

These new tests only exercise put(), so they won't catch the put_path_reference() case where staging semantics are different. A test that materializes a referenced blob and then mutates/removes the original file would lock the intended snapshot behavior. Based on learnings, "When working on Rust test code, use tools like snapshot tests, TDD, and repository macros to improve test longevity and usefulness."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/blobs.rs` around lines 1052 - 1078, The tests only cover
repo.put() and miss the reference-blob code path; add a new async test that uses
setup() then repo.put_path_reference(...) to create a referenced blob, call
repo.materialize(...) (using BlobMaterializeRequest::Filename or ::Extension) to
produce the staged file, then mutate or delete the original referenced file and
assert the materialized output remains unchanged/existing (and that
repo.cleanup_staging() still removes it); reference the functions
put_path_reference, materialize, cleanup_staging, and BlobMaterializeRequest to
locate and implement the test.
src/plug_plabels/lib.rs (1)

298-341: Use a routine contract that matches the learner workflows.

Both learn-*label-candidates manifests request a writable pseudo-label working facet, but their entrypoints only read note/blob/embedding data and update config/local state. That extra capability is unused today and makes these commands depend on a synthetic working-facet contract that the workflows themselves do not need.

Also applies to: 345-383

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/lib.rs` around lines 298 - 341, The manifests for
"learn-image-label-candidates" (RoutineManifest with RoutineImpl::Wflow)
incorrectly request a writable working facet PlabelFacetTag::PseudoLabel and
include write=true in the facet_acl, even though the workflow only reads
blob/embedding/notes and updates config/local state; change the
working_facet_tag/facet_acl entries for PlabelFacetTag::PseudoLabel to read-only
(set write: false or remove write permission) or remove the working facet
entirely so the routine contract matches actual usage, and apply the same change
to the other learn-*label-candidates manifest that mirrors this pattern (update
RoutineManifest.deets -> working_facet_tag and RoutineFacetAccess for
PlabelFacetTag::PseudoLabel, and verify config_facet_acl and local_state_acl
remain as-is).
src/daybook_core/rt/init.rs (1)

381-396: Unconditional replace() call even when no change is made.

In clear_running_dispatch, the map is always replaced even when the dispatch_id doesn't match. This creates unnecessary version churn in the CRDT.

♻️ Suggested optimization
     pub async fn clear_running_dispatch(&self, init_id: &str, dispatch_id: &str) -> Res<()> {
         let init_id = init_id.to_string();
         let dispatch_id = dispatch_id.to_string();
         self.store
             .mutate_sync(move |store| {
                 let mut map = store.running_dispatches.val.0.clone();
                 if map.get(&init_id) == Some(&dispatch_id) {
                     map.remove(&init_id);
+                    store
+                        .running_dispatches
+                        .replace(self.local_actor_id.clone(), ThroughJson(map));
                 }
-                store
-                    .running_dispatches
-                    .replace(self.local_actor_id.clone(), ThroughJson(map));
             })
             .await?;
         Ok(())
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt/init.rs` around lines 381 - 396, The code in
clear_running_dispatch always calls store.running_dispatches.replace(...) even
when nothing changed; modify the mutate_sync closure in clear_running_dispatch
so it only calls replace when the map was actually mutated: clone or obtain
current map into a local variable (as now), check if map.get(&init_id) ==
Some(&dispatch_id) and only then remove(&init_id) and call
replace(self.local_actor_id.clone(), ThroughJson(map)); otherwise leave store
unmodified (do not call replace) to avoid unnecessary CRDT version churn.
Reference: clear_running_dispatch, self.store.mutate_sync,
running_dispatches.replace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/daybook_core/blobs.rs`:
- Around line 419-441: The materialize() implementation currently uses
get_path(), which can point at mutable live files; change it to materialize from
the immutable store/object file instead: resolve the blob's immutable object
path (e.g. the stored "objects/.../*.blob" location or via an export_from_store
method) rather than get_path(), ensure you export the object into the store if
not present, then hard-link or atomic_copy_file from that immutable object path
into the staging_dir; update references in materialize() and use self.root and
the repository/store export API (or a new helper like
export_object_if_missing(hash) returning the object PathBuf) so staging always
uses a stable snapshot instead of the live source.
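The snapshot-style staging this comment asks for can be sketched with std::fs only: hard-link from the immutable object file, falling back to a copy where links aren't possible. The function name, the `file_name` parameter, and the assumption that `object_path` already points at the exported content-addressed file are all illustrative; the real export_object_if_missing/BlobsRepo wiring is not shown:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Stage a blob for external consumers from the immutable object file
/// rather than the (possibly mutable) live source path (hypothetical
/// helper; `object_path` is assumed to be the content-addressed store file).
fn stage_from_object(
    object_path: &Path,
    staging_dir: &Path,
    file_name: &str,
) -> io::Result<PathBuf> {
    fs::create_dir_all(staging_dir)?;
    let dest = staging_dir.join(file_name);
    // Hard links are cheap and share the immutable bytes; fall back to a
    // full copy on filesystems or cross-device setups that refuse them.
    if fs::hard_link(object_path, &dest).is_err() {
        fs::copy(object_path, &dest)?;
    }
    Ok(dest)
}
```

Either way the staged file survives later mutation or removal of the source, which is the regression the review is guarding against.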

In `@src/daybook_core/plugs.rs`:
- Around line 1189-1226: The OCI client is created with
Client::new(oci_client::client::ClientConfig::default()) inside
import_from_oci_registry which leaves reqwest timeouts unlimited; update the
ClientConfig construction used for Client::new to build a config that sets
explicit connect/read/request timeouts (e.g., via the underlying reqwest::Client
builder) so pulls like client.pull_blob use bounded timeouts; ensure the chosen
timeouts are reasonable for large layer downloads and propagate the new
ClientConfig into Client::new where referenced.

In `@src/daybook_core/rt.rs`:
- Around line 1119-1126: The current ActiveDispatchDeets::Wflow construction
reserves a wflow_job_id for waiting dispatches but has no entry_id until
start_waiting_dispatch enqueues the job, so cancel_dispatch must not forward
cancellations for dispatches that were never enqueued; update cancel_dispatch
(and the other similar cancel path around the other block) to only call
wflow_ingress.cancel_job(...) when the dispatch actually has an
entry_id/enqueued (e.g., check that entry_id.is_some() or that the dispatch is
not in the "waiting/reserved" state) and otherwise skip/short-circuit the
cancel_job call, using the ActiveDispatchDeets::Wflow fields (wflow_job_id,
entry_id) and start_waiting_dispatch semantics to determine enqueued vs
reserved.
- Around line 585-595: The code sets the init as running via
InitRepo::set_running_dispatch(&init_id, &dispatch_id) before dispatching, but
when a waiting init is later marked failed (the block around where the waiting
dispatch is marked failed — see the logic that marks the waiting dispatch failed
between lines ~856–868), it never clears that running marker so
get_running_dispatch(init_id) returns a dead id; to fix this, after marking the
waiting dispatch failed add a call to clear the init running marker (use the
InitRepo API — e.g. self.init_repo.clear_running_dispatch(&init_id) or an
equivalent method on InitRepo) so the running marker is removed, and ensure the
same cleanup is applied in the error/early-exit paths (and keep the existing
InitMarkDone cleanup in handle_wflow_entry intact).
- Around line 482-492: The dep_base_id function must be made fallible: change
its signature to return eyre::Result<String> (or Result<String, eyre::Report>)
and replace the unwrap()/expect() calls by validating inputs and calling
eyre::bail!("<offending id>") on malformed ids; specifically, in
dep_base_id(const dep_id_full) use dep_id_full.strip_prefix('@').ok_or_else(||
eyre::eyre!(...)) or check the result before splitting, ensure parts.get(0)
exists before formatting and bail! with dep_id_full if missing, and in the non-@
branch ensure split('@').next() returns Some and otherwise bail! with the
offending dep_id_full. Use eyre::bail! so callers can handle the error instead
of panicking.
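A minimal sketch of the fallible dep_base_id the comment describes, assuming ids shaped like `@scope/name@version` or `name@version` (strip the version suffix, preserve a leading `@scope` marker). The review suggests eyre::bail!; plain String errors are used here only to keep the sketch dependency-free:

```rust
/// Hypothetical fallible rewrite of `dep_base_id`: returns Err on
/// malformed ids instead of panicking via unwrap()/expect().
fn dep_base_id(dep_id_full: &str) -> Result<String, String> {
    let malformed = || format!("malformed dep id: {dep_id_full}");
    if let Some(rest) = dep_id_full.strip_prefix('@') {
        // "@org/pkg@1.2.3" -> base part before the version separator
        let base = rest
            .split('@')
            .next()
            .filter(|s| !s.is_empty())
            .ok_or_else(malformed)?;
        Ok(format!("@{base}"))
    } else {
        // "pkg@1.2.3" -> "pkg"; reject an empty base outright
        dep_id_full
            .split('@')
            .next()
            .filter(|s| !s.is_empty())
            .map(str::to_string)
            .ok_or_else(malformed)
    }
}
```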

In `@src/daybook_core/rt/wash_plugin/mltools.rs`:
- Around line 119-129: The function extension_for_blob_mime currently matches
the raw token from mime.split(';').next() and so rejects mixed-case values and
never hits the intended empty check; update extension_for_blob_mime to first
extract the token, trim it, convert it to lowercase, then explicitly handle an
empty token (returning the "empty image mime" Err) before matching against
lowercase mime types like "image/jpeg", "image/png", etc.; reference the
existing token extraction (mime.split(';').next()) and perform
token.trim().to_lowercase() (or equivalent) and then match on that normalized
string to return the correct Ok("jpg"/"png"/...) or the unsupported error.
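The normalization the comment asks for can be sketched as follows: take the first `;`-separated token, trim, lowercase, and check for empty before matching. The jpeg/png arms come from the review text; the webp arm and the exact error strings are assumptions about the real mapping:

```rust
/// Sketch of a normalized MIME -> extension lookup (the real
/// extension_for_blob_mime may map more types and use different errors).
fn extension_for_blob_mime(mime: &str) -> Result<&'static str, String> {
    // Normalize before matching: drop parameters, trim, lowercase.
    let token = mime
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    if token.is_empty() {
        return Err("empty image mime".to_string());
    }
    match token.as_str() {
        "image/jpeg" => Ok("jpg"),
        "image/png" => Ok("png"),
        "image/webp" => Ok("webp"),
        other => Err(format!("unsupported image mime: {other}")),
    }
}
```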

In `@src/daybook_core/test_support.rs`:
- Around line 228-246: The test harness currently calls
plugs_repo.ensure_system_plugs().await? after crate::rt::Rt::boot(...).await? so
Rt::boot() misses system plugs and queued init dispatches are skipped; move the
call to plugs_repo.ensure_system_plugs().await? to run before
crate::rt::Rt::boot(...) (i.e., ensure system plugs are present in plugs_repo
prior to invoking Rt::boot) so Rt::boot() will see the plugs and enqueue their
startup dispatches.

In `@src/plug_plabels/wflows/label_engine.rs`:
- Around line 253-288: The code currently queries and clears the "is_current"
label set globally instead of scoped to a facet_key, causing label sets to be
mixed across facets; update the SELECT that populates current_version_rows to
include "WHERE facet_key = ? AND is_current = 1" and bind config_facet_key, and
change the later UPDATE/clear statement (the one clearing is_current around
lines ~386-400) to only clear rows for that facet (e.g. "UPDATE
image_label_label_set_versions SET is_current = 0 WHERE facet_key = ?") so all
operations in this flow (the SELECT used to compute current_version_id and the
UPDATE that clears current flags) are filtered by config_facet_key (refer to the
variables/config: current_version_rows, current_version_id, config_facet_key and
the UPDATE that clears is_current).

In `@src/plug_plabels/wflows/learn_image_label_candidates.rs`:
- Around line 285-313: The code in the learn_image_label_candidates.rs block
that reads cache_rows (using row_text(row, "model_tag"), row_i64(row, "dim"),
row_blob(row, "vector"), and embedding_bytes_to_f32) should not turn malformed
or stale cache rows into a terminal JobErrorX; instead treat them as
cache-misses: on missing/invalid model_tag/dim/vector, on embedding_bytes_to_f32
errors, or on dim != vector.len() log or wrap the error as a warning and
continue as if the row wasn't present (so the code will recompute the embedding
and overwrite the cache row), rather than returning Err; apply the identical
change to the duplicate helper in learn_note_label_candidates.rs and ensure rows
are updated when a fresh embedding is computed.
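The miss-not-error policy above can be sketched by returning Option from the row decode, so any malformed field makes the caller recompute and overwrite the cache row. The (model_tag, dim, vector) shape mirrors the columns named in the comment; the function name and the extra expected-model check are assumptions:

```rust
/// Hypothetical tolerant decode of a cached embedding row: any
/// missing/stale/corrupt field yields None (a cache miss) rather than
/// a terminal job error.
fn cached_embedding(
    model_tag: Option<&str>,
    expected_model: &str,
    dim: Option<i64>,
    vector: Option<&[u8]>,
) -> Option<Vec<f32>> {
    let tag = model_tag?;
    if tag != expected_model {
        return None; // stale row written by another model
    }
    let dim = dim?;
    let bytes = vector?;
    if bytes.len() % 4 != 0 {
        return None; // corrupt blob: not a whole number of f32s
    }
    let v: Vec<f32> = bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    if v.len() as i64 != dim {
        return None; // dim column disagrees with vector length
    }
    Some(v)
}
```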

In `@src/plug_plabels/wflows/learn_note_label_candidates.rs`:
- Around line 70-71: The prompt currently injects note.content verbatim into
build_note_prompt (see the llm_text call) which can exceed model context/cost;
truncate or snapshot the note before building the prompt (e.g., implement a
helper like get_note_snippet(content, MAX_NOTE_PROMPT_LEN) or a constant
MAX_NOTE_PROMPT_LEN and pass get_note_snippet(&note.content) to
build_note_prompt), replacing direct uses of note.content in the llm_text call
and the other occurrences around the 108–136 range so all LLM prompt
construction uses the bounded excerpt.
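The suggested helper could look roughly like this; both `get_note_snippet` and `MAX_NOTE_PROMPT_LEN` are names the review proposes, not existing code, and the 4096 value is an arbitrary placeholder. Truncation backs up to a char boundary so multi-byte UTF-8 note content never splits mid-character:

```rust
/// Assumed bound on note text injected into LLM prompts.
const MAX_NOTE_PROMPT_LEN: usize = 4096;

/// Hypothetical helper: return at most `max_len` bytes of `content`,
/// never slicing inside a UTF-8 character.
fn get_note_snippet(content: &str, max_len: usize) -> &str {
    if content.len() <= max_len {
        return content;
    }
    let mut end = max_len;
    while !content.is_char_boundary(end) {
        end -= 1;
    }
    &content[..end]
}
```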

In `@src/plug_plabels/wit/deps/wasi-random-0.2.6/package.wit`:
- Around line 1-7: The package currently only defines the random interface; add
the missing insecure-seed and insecure interfaces from the official
wasi:random@0.2.6 spec to this package.wit so the package exports all three
interfaces: insecure-seed (with the insecure-seed tuple<u64, u64> symbol) and
insecure (with get-insecure-random-bytes and get-insecure-random-u64 functions)
alongside the existing random interface (get-random-bytes, get-random-u64);
insert these new interface blocks into the same package declaration to match the
official spec.

In `@src/xtask/main.rs`:
- Around line 670-687: The static component path (static_name) may start with a
leading slash which makes Path::join treat it as absolute; before creating
wasm_name and joining into wasm_path in this block (variables static_name,
wasm_name, wasm_path), normalize the path by trimming leading '/' (e.g., use
static_name.trim_start_matches('/') when deriving wasm_name or before
strip_suffix) so that joining with wasm_target_dir/wasm32-wasip2/release yields
the intended relative path—match the same normalization used in
read_build_component_bytes.
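The Path::join pitfall here is that joining an absolute path replaces the whole base, so a `static_name` with a leading `/` silently escapes the release directory. A minimal sketch of the fix, with the function name and suffix handling assumed (only the wasm32-wasip2/release layout comes from the comment):

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve a component's built wasm path, trimming
/// any leading '/' so Path::join keeps the path relative to the release
/// directory instead of treating it as absolute.
fn component_wasm_path(wasm_target_dir: &Path, static_name: &str) -> PathBuf {
    let rel = static_name.trim_start_matches('/');
    wasm_target_dir
        .join("wasm32-wasip2")
        .join("release")
        .join(rel)
}
```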

---

Nitpick comments:
In @.github/workflows/checks.yml:
- Around line 75-78: Add an artifacts upload step after the "build plug OCI
artifacts" step to persist the generated OCI files: after the two nix/cargo
commands in the job (the step named "build plug OCI artifacts"), add an
actions/upload-artifact@v4 step that references the output paths produced by the
xtask target (the files/directories created by the cargo run -p xtask --
build-plug-oci commands for ./src/plug_dayledger and ./src/plug_plabels) and
give sensible artifact names (e.g., plug_dayledger_oci, plug_plabels_oci);
ensure the upload step runs on success and uses the correct path patterns so
downstream workflows or reviewers can download the built OCI artifacts.

In `@Cargo.toml`:
- Line 61: The workspace currently lists plug_plabels under
[workspace.dependencies] but omits plug_dayledger; add plug_dayledger = { path =
"./src/plug_dayledger", default-features = false } to the
[workspace.dependencies] section in Cargo.toml so plug_dayledger is available as
a workspace dependency (mirror the entry style used for plug_plabels) to keep
dependency declarations consistent and allow other crates to reference it.

In `@src/daybook_core/blobs.rs`:
- Around line 1052-1078: The tests only cover repo.put() and miss the
reference-blob code path; add a new async test that uses setup() then
repo.put_path_reference(...) to create a referenced blob, call
repo.materialize(...) (using BlobMaterializeRequest::Filename or ::Extension) to
produce the staged file, then mutate or delete the original referenced file and
assert the materialized output remains unchanged/existing (and that
repo.cleanup_staging() still removes it); reference the functions
put_path_reference, materialize, cleanup_staging, and BlobMaterializeRequest to
locate and implement the test.

In `@src/daybook_core/rt/init.rs`:
- Around line 381-396: clear_running_dispatch always calls
store.running_dispatches.replace(...) even when nothing changed. Modify its
mutate_sync closure so it only calls replace when the map was actually mutated:
obtain the current map into a local variable (as now), and only if
map.get(&init_id) == Some(&dispatch_id) call remove(&init_id) and
replace(self.local_actor_id.clone(), ThroughJson(map)); otherwise leave the
store unmodified (do not call replace) to avoid unnecessary CRDT version churn.
Reference: clear_running_dispatch, self.store.mutate_sync,
running_dispatches.replace.
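A std-only sketch of the guarded mutation (the real code works through the CRDT store's mutate_sync; the plain HashMap and the bool return are illustrative stand-ins):

```rust
use std::collections::HashMap;

// Only report a change (and thus trigger `replace`) when the running-dispatch
// entry actually matched; otherwise leave the map untouched so no new CRDT
// version is produced.
fn clear_running_dispatch(
    map: &mut HashMap<String, String>,
    init_id: &str,
    dispatch_id: &str,
) -> bool {
    if map.get(init_id).map(String::as_str) == Some(dispatch_id) {
        map.remove(init_id);
        true // caller performs store.running_dispatches.replace(...) here
    } else {
        false // no churn: skip the replace entirely
    }
}
```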

In `@src/daybook_types/Cargo.toml`:
- Around line 52-54: Delete the commented-out dependency line `# time = {
workspace = true, features = ["serde"] }` so the manifest no longer carries
dead entries.

In `@src/daybook_wflows/lib.rs`:
- Around line 71-122: The commented-out helper functions row_text, row_i64,
row_blob and embedding_bytes_to_f32 are dead code; either delete these commented
blocks from src/daybook_wflows/lib.rs or move the implementations into the
module that needs them (e.g., plug_plabels) and restore as active functions; if
moving, ensure signatures and return types (Res<Vec<f32>> for
embedding_bytes_to_f32) match existing callers and update imports/visibility
(pub(crate) as needed) so callers compile.

In `@src/plug_plabels/lib.rs`:
- Around line 298-341: The manifests for "learn-image-label-candidates"
(RoutineManifest with RoutineImpl::Wflow) incorrectly request a writable working
facet PlabelFacetTag::PseudoLabel and include write=true in the facet_acl, even
though the workflow only reads blob/embedding/notes and updates config/local
state; change the working_facet_tag/facet_acl entries for
PlabelFacetTag::PseudoLabel to read-only (set write: false or remove write
permission) or remove the working facet entirely so the routine contract matches
actual usage, and apply the same change to the other learn-*label-candidates
manifest that mirrors this pattern (update RoutineManifest.deets ->
working_facet_tag and RoutineFacetAccess for PlabelFacetTag::PseudoLabel, and
verify config_facet_acl and local_state_acl remain as-is).

In `@src/plug_plabels/wflows/label_image.rs`:
- Around line 56-74: The sequence of silent early returns in functions checking
embedding properties (the embedding.dtype/compression guard, the embedding.dim
and embedding.model_tag check against NOMIC_VISION_MODEL_ID, the parse_facet_ref
match, and the parsed_ref.facet_key.tag check for WellKnownFacetTag::Blob)
should be annotated with brief comments stating they are intentional no-ops for
unsupported embeddings so other processors can handle them; add one-line
comments above each conditional explaining which unsupported case is being
skipped (e.g., "skip non-f32 or compressed embeddings — allow other processors
to handle"), "skip wrong dimension or model", "skip invalid facet ref", and
"skip non-blob facets", referring to the exact symbols (embedding.dtype,
embedding.compression, embedding.dim, embedding.model_tag,
daybook_types::url::parse_facet_ref, parsed_ref.facet_key.tag) so maintainers
understand the rationale.
- Around line 76-80: You are doing duplicate validation when converting
embedding bytes to JSON: after checking embedding.dim == 768 earlier, you still
round-trip via embedding_bytes_to_f32 -> embedding_f32_slice_to_le_bytes ->
embedding_f32_bytes_to_json which repeats the dimension check; to fix,
consolidate by calling the shared helper used in label_engine.rs
(embedding_vec_to_json) or extract a new helper that takes the raw
embedding.vector and the known dim (768) and returns JSON without re-checking
dimensions, or remove the redundant dimension validation call inside this flow
and rely on the prior embedding.dim check; locate and update uses of
embedding_bytes_to_f32 and embedding_f32_bytes_to_json in this file
(label_image.rs) to use the shared helper or a simplified conversion path.
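One possible consolidation, sketched with std only (embedding_vec_to_json and the surrounding SDK types are not reproduced here; the helper name and the String error type are illustrative):

```rust
// Convert raw little-endian f32 bytes straight to a JSON array string,
// validating only byte alignment; the caller is assumed to have already
// checked embedding.dim, so no second dimension check happens here.
fn embedding_bytes_to_json(bytes: &[u8]) -> Result<String, String> {
    if bytes.len() % 4 != 0 {
        return Err(format!(
            "embedding byte length {} is not a multiple of 4",
            bytes.len()
        ));
    }
    let vals: Vec<String> = bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]).to_string())
        .collect();
    Ok(format!("[{}]", vals.join(",")))
}
```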

In `@src/plug_plabels/wflows/label_note.rs`:
- Around line 56-66: The check in label_note.rs that returns a
wflow_sdk::JobErrorX::Terminal when embed_result.model_id or dimensions mismatch
is inconsistent with label_image.rs (which silently returns Ok(())); if this
difference is intentional (because label_note generates its own embeddings and
must enforce model/dimension), add a concise comment above the if block
explaining that rationale and referencing that label_image.rs intentionally
tolerates varying incoming embeddings, so the Terminal error here is deliberate;
otherwise, change the behavior to match label_image.rs (silent Ok(()))—the
relevant symbols to update are embed_result, NOMIC_TEXT_MODEL_ID, and the
wflow_sdk::JobErrorX::Terminal return.
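For the silent-skip side, the suggested one-line annotations could look like this sketch (the Embedding struct and the should_skip shape are simplified stand-ins for the real daybook types):

```rust
struct Embedding {
    dtype: String,
    compression: Option<String>,
    dim: u32,
    model_tag: String,
}

// Returns true when this embedding should be silently ignored so that
// other processors get a chance to handle it.
fn should_skip(e: &Embedding, expected_model: &str, expected_dim: u32) -> bool {
    // skip non-f32 or compressed embeddings — allow other processors to handle
    if e.dtype != "f32" || e.compression.is_some() {
        return true;
    }
    // skip wrong dimension or model
    if e.dim != expected_dim || e.model_tag != expected_model {
        return true;
    }
    false
}
```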

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5275b91c-635a-4dcf-8837-1ac2b52779a9

📥 Commits

Reviewing files that changed from the base of the PR and between ee86a06 and 478e126.

⛔ Files ignored due to path filters (4)
  • Cargo.lock is excluded by !**/*.lock and included by **/*
  • docs/DEVDOC/log.md is excluded by !docs/DEVDOC/** and included by **/*
  • docs/DEVDOC/todo.md is excluded by !docs/DEVDOC/** and included by **/*
  • flake.lock is excluded by !**/*.lock and included by **/*
📒 Files selected for processing (82)
  • .github/workflows/checks.yml
  • AGENTS.md
  • CONTRIBUTING.md
  • Cargo.toml
  • flake.nix
  • src/daybook_cli/main.rs
  • src/daybook_core/Cargo.toml
  • src/daybook_core/app.rs
  • src/daybook_core/blobs.rs
  • src/daybook_core/build.rs
  • src/daybook_core/config.rs
  • src/daybook_core/drawer.rs
  • src/daybook_core/e2e.rs
  • src/daybook_core/index/doc_blobs.rs
  • src/daybook_core/index/facet_ref.rs
  • src/daybook_core/lib.rs
  • src/daybook_core/plugs.rs
  • src/daybook_core/repo.rs
  • src/daybook_core/rt.rs
  • src/daybook_core/rt/dispatch.rs
  • src/daybook_core/rt/init.rs
  • src/daybook_core/rt/switch.rs
  • src/daybook_core/rt/triage.rs
  • src/daybook_core/rt/wash_plugin.rs
  • src/daybook_core/rt/wash_plugin/caps.rs
  • src/daybook_core/rt/wash_plugin/mltools.rs
  • src/daybook_core/test_support.rs
  • src/daybook_ffi/repos/config.rs
  • src/daybook_sdk/Cargo.toml
  • src/daybook_types/Cargo.toml
  • src/daybook_types/doc.rs
  • src/daybook_types/lib.rs
  • src/daybook_types/macros.rs
  • src/daybook_types/manifest.rs
  • src/daybook_types/reference.rs
  • src/daybook_types/test.rs
  • src/daybook_types/wit.rs
  • src/daybook_wflows/lib.rs
  • src/daybook_wflows/wflows/classify_image_label.rs
  • src/daybook_wflows/wflows/learn_image_label_proposals.rs
  • src/daybook_wflows/wflows/mod.rs
  • src/daybook_wflows/wflows/pseudo_labeler.rs
  • src/plug_dayledger/Cargo.toml
  • src/plug_dayledger/lib.rs
  • src/plug_dayledger/manifest.rs
  • src/plug_dayledger/types.rs
  • src/plug_plabels/Cargo.toml
  • src/plug_plabels/e2e.rs
  • src/plug_plabels/e2e/common.rs
  • src/plug_plabels/e2e/image_label_nomic_wflow.rs
  • src/plug_plabels/e2e/image_label_screenshot_meme_wflow.rs
  • src/plug_plabels/e2e/learned_image_label_proposals_wflow.rs
  • src/plug_plabels/lib.rs
  • src/plug_plabels/manifest.rs
  • src/plug_plabels/types.rs
  • src/plug_plabels/wflows/label_engine.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/plug_plabels/wflows/label_note.rs
  • src/plug_plabels/wflows/learn_algo.rs
  • src/plug_plabels/wflows/learn_image_label_candidates.rs
  • src/plug_plabels/wflows/learn_note_label_candidates.rs
  • src/plug_plabels/wflows/mod.rs
  • src/plug_plabels/wit/deps/api-utils
  • src/plug_plabels/wit/deps/daybook
  • src/plug_plabels/wit/deps/daybook-types
  • src/plug_plabels/wit/deps/mltools
  • src/plug_plabels/wit/deps/townframe-sql
  • src/plug_plabels/wit/deps/utils
  • src/plug_plabels/wit/deps/wasi-cli-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasi-clocks-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasi-config-0.2.0-draft/package.wit
  • src/plug_plabels/wit/deps/wasi-filesystem-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasi-http-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasi-io-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasi-keyvalue-0.2.0-draft/package.wit
  • src/plug_plabels/wit/deps/wasi-logging-0.1.0-draft/package.wit
  • src/plug_plabels/wit/deps/wasi-random-0.2.6/package.wit
  • src/plug_plabels/wit/deps/wasmcloud-postgres-0.1.1-draft/package.wit
  • src/plug_plabels/wit/deps/wflow
  • src/plug_plabels/wit/main.wit
  • src/xtask/Cargo.toml
  • src/xtask/main.rs
💤 Files with no reviewable changes (7)
  • src/daybook_types/test.rs
  • src/daybook_wflows/wflows/mod.rs
  • src/daybook_wflows/wflows/pseudo_labeler.rs
  • src/daybook_types/wit.rs
  • src/daybook_types/doc.rs
  • src/daybook_wflows/wflows/learn_image_label_proposals.rs
  • src/daybook_wflows/wflows/classify_image_label.rs

Comment thread src/daybook_core/blobs.rs
Comment thread src/daybook_core/plugs.rs
Comment thread src/daybook_core/rt.rs Outdated
Comment thread src/daybook_core/rt.rs
Comment thread src/daybook_core/rt.rs
Comment thread src/plug_plabels/wflows/learn_image_label_candidates.rs Outdated
Comment on lines +285 to +313
if let Some(row) = cache_rows.first() {
    let model_tag = row_text(row, "model_tag").ok_or_else(|| {
        JobErrorX::Terminal(ferr!(
            "malformed learned label embedding cache row: missing/invalid field 'model_tag'"
        ))
    })?;
    let dim = row_i64(row, "dim").ok_or_else(|| {
        JobErrorX::Terminal(ferr!(
            "malformed learned label embedding cache row: missing/invalid field 'dim'"
        ))
    })?;
    let vector_bytes = row_blob(row, "vector").ok_or_else(|| {
        JobErrorX::Terminal(ferr!(
            "malformed learned label embedding cache row: missing/invalid field 'vector'"
        ))
    })?;
    let vector = embedding_bytes_to_f32(&vector_bytes)
        .map_err(|err| JobErrorX::Terminal(err.wrap_err("invalid cached embedding bytes")))?;
    if dim != vector.len() as i64 {
        return Err(JobErrorX::Terminal(ferr!(
            "malformed learned label embedding cache row: dim {} does not match vector len {}",
            dim,
            vector.len()
        )));
    }
    if model_tag.eq_ignore_ascii_case(NOMIC_TEXT_MODEL_ID) {
        return Ok(vector);
    }
}

⚠️ Potential issue | 🟠 Major

Treat invalid embedding-cache rows as cache misses.

This table is just derivable local state, but a bad vector payload or stale dim currently turns future learns for that prompt into a terminal error until someone manually cleans SQLite. Recompute and overwrite the row instead of aborting. The duplicate helper in learn_note_label_candidates.rs needs the same change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/learn_image_label_candidates.rs` around lines 285 -
313, The code in the learn_image_label_candidates.rs block that reads cache_rows
(using row_text(row, "model_tag"), row_i64(row, "dim"), row_blob(row, "vector"),
and embedding_bytes_to_f32) should not turn malformed or stale cache rows into a
terminal JobErrorX; instead treat them as cache-misses: on missing/invalid
model_tag/dim/vector, on embedding_bytes_to_f32 errors, or on dim !=
vector.len() log or wrap the error as a warning and continue as if the row
wasn't present (so the code will recompute the embedding and overwrite the cache
row), rather than returning Err; apply the identical change to the duplicate
helper in learn_note_label_candidates.rs and ensure rows are updated when a
fresh embedding is computed.
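The suggested cache-miss behavior, reduced to a std-only sketch (row access is modeled with Options; the real code reads SQLite via row_text/row_i64/row_blob and should log a warning where this sketch simply returns None):

```rust
// Any malformed or stale cache row yields None (a cache miss) so the caller
// recomputes the embedding and overwrites the row, instead of failing the job.
fn cached_embedding(
    model_tag: Option<&str>,
    dim: Option<i64>,
    vector_bytes: Option<&[u8]>,
    expected_model: &str,
) -> Option<Vec<f32>> {
    let model_tag = model_tag?; // missing field: miss
    let bytes = vector_bytes?;
    if bytes.len() % 4 != 0 {
        return None; // undecodable payload: miss, not a terminal error
    }
    let vector: Vec<f32> = bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    if dim? != vector.len() as i64 {
        return None; // stale dim: miss
    }
    if !model_tag.eq_ignore_ascii_case(expected_model) {
        return None; // different model: recompute for the expected one
    }
    Some(vector)
}
```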

Comment thread src/plug_plabels/wflows/learn_note_label_candidates.rs Outdated
Comment thread src/plug_plabels/wit/deps/wasi-random-0.2.6/package.wit
Comment thread src/xtask/main.rs Outdated

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daybook_core/rt.rs (1)

1247-1279: ⚠️ Potential issue | 🟠 Major

wait_for_dispatch_end() still waits for deletion, not completion.

The runtime now finishes dispatches via dispatch_repo.complete(...), so normal success/failure/cancel no longer guarantees a DispatchDeleted event. Calls that start while a dispatch is active will now sit here until timeout for ordinary completions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt.rs` around lines 1247 - 1279, The function
wait_for_dispatch_end currently waits only for DispatchDeleted events
(DispatchEvent::DispatchDeleted) which misses normal completion paths; update it
to (1) after fetching the dispatch via self.dispatch_repo.get(dispatch_id).await
check the dispatch's status/terminal flag and return Ok if already
completed/cancelled/failed, and (2) when listening on listener_handle (from
self.dispatch_repo.subscribe), handle the appropriate events emitted on
completion—e.g., DispatchEvent variants such as DispatchCompleted /
DispatchFinished or a DispatchUpdated event whose payload indicates a terminal
status—returning Ok when the event refers to dispatch_id and its status is
terminal; keep the existing timeout and error mapping.
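The control flow the prompt describes, as a std-only sketch (these DispatchStatus/DispatchEvent variants are simplified stand-ins for the real repo types; the actual code subscribes via dispatch_repo and keeps its timeout and error mapping):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum DispatchStatus { Waiting, Active, Completed, Failed, Cancelled }

enum DispatchEvent {
    Deleted(String),
    Updated(String, DispatchStatus),
}

fn is_terminal(s: DispatchStatus) -> bool {
    matches!(
        s,
        DispatchStatus::Completed | DispatchStatus::Failed | DispatchStatus::Cancelled
    )
}

// (1) Return immediately if the dispatch is already gone or terminal;
// (2) otherwise treat both deletion and terminal status updates as "ended".
fn dispatch_ended(id: &str, current: Option<DispatchStatus>, events: &[DispatchEvent]) -> bool {
    match current {
        None => return true,
        Some(s) if is_terminal(s) => return true,
        _ => {}
    }
    events.iter().any(|ev| match ev {
        DispatchEvent::Deleted(eid) => eid == id,
        DispatchEvent::Updated(eid, s) => eid == id && is_terminal(*s),
    })
}
```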
♻️ Duplicate comments (3)
src/daybook_core/plugs.rs (1)

1674-1685: ⚠️ Potential issue | 🟠 Major

Make dependency-id parsing fallible everywhere in validation.

These strings come from imported/manually supplied plug manifests, so the unwrap() / expect() paths here can panic the repo on one malformed dependency id. Please route both call sites through a shared fallible helper and bail! with the offending id instead.

Based on learnings: when validating data that comes from external inputs (such as automerge documents and related metadata like dmeta), use eyre::bail!() to return errors rather than panicking.

Also applies to: 1846-1878

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/plugs.rs` around lines 1674 - 1685, Create a fallible helper
function (e.g. parse_dep_base_id(dep_id: &str) -> eyre::Result<String>) that
parses a full dependency id into its base id and returns Err via eyre::bail!
with the offending dep_id on any malformed input; replace the inline logic in
the loop that computes dep_base_id (which currently uses unwrap()) to call
parse_dep_base_id(dep_id_full) and propagate errors, and apply the same
replacement at the other usage mentioned (around lines 1846-1878) so both call
sites use the shared fallible helper instead of panicking.
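A sketch of the shared fallible helper, with String standing in for eyre::Report (the real code would use eyre::bail!; the "base@version" id shape is an assumption, since the actual format is defined by the plug manifests):

```rust
// Parse "base@version" into its base id, erroring (instead of panicking)
// on malformed input so one bad manifest cannot take down the repo.
fn parse_dep_base_id(dep_id: &str) -> Result<&str, String> {
    match dep_id.split_once('@') {
        Some((base, version)) if !base.is_empty() && !version.is_empty() => Ok(base),
        _ => Err(format!("malformed dependency id: '{dep_id}'")),
    }
}
```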
src/plug_plabels/wflows/learn_image_label_candidates.rs (1)

282-309: ⚠️ Potential issue | 🟠 Major

Treat malformed embedding-cache rows as cache misses.

This table is derivable local state, but missing fields, bad bytes, or a stale dim currently turn one bad SQLite row into a terminal workflow failure until someone manually cleans the DB. Ignore/log invalid rows here and fall through to recompute so the existing upsert path can heal the cache instead.

src/plug_plabels/wflows/learn_note_label_candidates.rs (1)

235-262: ⚠️ Potential issue | 🟠 Major

Don't let a bad cache row wedge future note-learning runs.

learned_label_text_embedding_cache is derivable local state, but any malformed row here becomes a terminal error instead of a cache miss. Warn/skip invalid model_tag / dim / vector data and recompute so the upsert path can overwrite the bad row.

🧹 Nitpick comments (1)
src/xtask/main.rs (1)

411-412: Source the OCI media types from one place.

These strings are duplicated from the importer side in src/daybook_core/plugs.rs. Hoisting them into a shared crate/module, or reusing the existing exports, keeps the OCI builder and importer from silently drifting apart.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/xtask/main.rs` around lines 411 - 412, The two OCI media-type constants
(OCI_PLUG_ARTIFACT_TYPE and OCI_PLUG_MANIFEST_LAYER_MEDIA_TYPE) are duplicated;
remove the local copies and source them from the canonical place (export them as
pub consts in the daybook_core::plugs module or a new shared module) and update
xtask's use sites to reference daybook_core::plugs::OCI_PLUG_ARTIFACT_TYPE and
daybook_core::plugs::OCI_PLUG_MANIFEST_LAYER_MEDIA_TYPE; ensure the constants
are public, update any imports/usages in xtask::main (or related functions) to
use the re-exported symbols, and run cargo build/tests to confirm no remaining
duplicates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/daybook_core/plugs.rs`:
- Around line 1434-1455: The function rewrite_oci_component_urls currently
leaves non-OCI entries in component_urls untouched which allows imported
manifests to reference local or preexisting blobs; update the loop in
rewrite_oci_component_urls so that instead of continuing on non-oci entries it
returns an error (eyre::bail!) rejecting any component URL that does not start
with "oci://sha256:"; use the existing context symbols (component_urls,
oci_digest_to_repo_hash, digest_key, and crate::blobs::BLOB_SCHEME) and provide
a clear error message like "componentUrls entries must be OCI digests:
'{url_str}'" so imports fail early before self.add(...) is called.

In `@src/daybook_core/rt.rs`:
- Around line 583-603: The current reuse logic in ensure_plug_init_dispatches()
is incorrectly reusing a non-init dispatch for an init because dispatch lookups
only key on dispatch_id; update the dispatch reuse check in
dispatch_no_gate()/its caller so that init identity is part of the reuse key (or
refuse reuse) — specifically, ensure that when you intend to attach an
InitMarkDone hook (symbol InitMarkDone) you only reuse an existing dispatch
whose hook set includes the same InitMarkDone for that init_id, otherwise create
a fresh dispatch; after creating or selecting the correct dispatch, continue to
call init_repo.set_running_dispatch(&init_id, &dispatch_id) and push into
unresolved_init_dispatch_ids as currently done.
- Around line 874-885: When a dependency fails in dispatch_no_gate the code
calls self.dispatch_repo.set_waiting_failed(&waiting_id) but never records a
failed/completed progress update in ProgressRepo, leaving the spawned waiting
task permanently “waiting on dependencies”; fix this by calling the ProgressRepo
finalization method after set_waiting_failed (e.g.
self.progress_repo.finalize_failed(&waiting_id) or
self.progress_repo.mark_failed(&waiting_id)) so the progress entry for
waiting_id is marked failed/completed before continuing—place this call
immediately after self.dispatch_repo.set_waiting_failed(&waiting_id) (keeping
the existing loop over waiting_dispatch.on_success_hooks and
DispatchOnSuccessHook::InitMarkDone logic).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c38f34d8-2126-47cf-a893-96fb696865a9

📥 Commits

Reviewing files that changed from the base of the PR and between 478e126 and f986382.

📒 Files selected for processing (19)
  • src/daybook_cli/main.rs
  • src/daybook_core/blobs.rs
  • src/daybook_core/plugs.rs
  • src/daybook_core/rt.rs
  • src/daybook_core/rt/init.rs
  • src/daybook_core/rt/wash_plugin/mltools.rs
  • src/daybook_core/test_support.rs
  • src/daybook_ffi/rt.rs
  • src/daybook_types/Cargo.toml
  • src/plug_plabels/e2e/image_label_nomic_wflow.rs
  • src/plug_plabels/e2e/image_label_screenshot_meme_wflow.rs
  • src/plug_plabels/lib.rs
  • src/plug_plabels/types.rs
  • src/plug_plabels/wflows/label_engine.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/plug_plabels/wflows/label_note.rs
  • src/plug_plabels/wflows/learn_image_label_candidates.rs
  • src/plug_plabels/wflows/learn_note_label_candidates.rs
  • src/xtask/main.rs
✅ Files skipped from review due to trivial changes (1)
  • src/daybook_core/rt/init.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/daybook_core/blobs.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/daybook_core/test_support.rs

Comment thread src/daybook_core/plugs.rs
Comment thread src/daybook_core/rt.rs
Comment thread src/daybook_core/rt.rs

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daybook_core/rt/dispatch.rs (1)

523-557: ⚠️ Potential issue | 🟠 Major

Make local dispatch events mirror events_for_patch().

Patch-derived events now model new dispatches entries as DispatchAdded and later vtag/status transitions as DispatchUpdated. The local fast paths diverge from that: waiting dispatches emit no local DispatchAdded, and complete() / set_waiting_failed() still emit DispatchDeleted even though the record remains in dispatches. Same-process subscribers will miss waiting jobs and can observe a different lifecycle than remote peers.

Also applies to: 561-601, 751-777

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt/dispatch.rs` around lines 523 - 557, The local event
paths must match events_for_patch(): in add (function add) always emit a
DispatchAdded event for the new store.dispatches entry (regardless of
DispatchStatus) using registry.notify with DispatchAdded and the created heads;
update the completion and waiting-failure handlers (functions complete() and
set_waiting_failed()) to emit DispatchUpdated transitions (not DispatchDeleted)
when the record stays in store.dispatches and only emit DispatchDeleted when the
record is truly removed; mirror the same head/vtag construction logic used by
events_for_patch() when building DispatchAdded/DispatchUpdated and stop removing
entries from store.dispatches when only the status/vtag changes (ensure
cancelled_dispatches and store.active_dispatches are updated consistently).
🧹 Nitpick comments (3)
src/plug_plabels/wflows/label_engine.rs (1)

712-740: SQL JOIN pattern for cosine similarity could be optimized.

The self-join FROM image_label_prompt_vec v1 JOIN image_label_prompt_vec v2 ON v1.rowid = ?1 AND v2.rowid = ?2 is unusual—the rowid conditions are in the ON clause rather than WHERE. While this works, a cleaner approach would be:

FROM image_label_prompt_vec v1, image_label_prompt_vec v2
WHERE v1.rowid = ?1 AND v2.rowid = ?2

This is a minor stylistic point; the current query functions correctly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/label_engine.rs` around lines 712 - 740, The SQL
self-join in sqlite_vec_rowid_cosine_similarity uses an ON clause to impose
fixed rowid conditions which is stylistically odd; change the query string to
select from two table references (image_label_prompt_vec v1,
image_label_prompt_vec v2) and move the rowid predicates into a WHERE clause
(v1.rowid = ?1 AND v2.rowid = ?2) so the intent is clearer while keeping the
same parameters and subsequent row handling unchanged.
src/plug_plabels/wflows/label_note.rs (1)

37-39: SQLite connection fallback is pragmatic.

The fallback to args.sqlite_connections.first() when the specific key isn't found provides reasonable behavior when the exact key isn't available. However, this could silently use an unexpected connection. Consider logging which connection is used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plug_plabels/wflows/label_note.rs` around lines 37 - 39, The code
silently falls back to the first sqlite connection when
tuple_list_get(&args.sqlite_connections, LOCAL_STATE_KEY) fails; update the
logic in label_note.rs so that when the fallback branch is taken you log which
connection/key is being used (include the resolved key/name and token or
identifier) before returning/using it; reference the existing tuple_list_get
call, args.sqlite_connections, and LOCAL_STATE_KEY and add a log via the
existing logger/context used in this module (or wflow/job logger) to record the
chosen connection, then keep the same ok_or_else Terminal error behavior if no
connection is found.
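A sketch of the logged fallback (connections are modeled as key/value tuples; the real code uses tuple_list_get over args.sqlite_connections and whatever logger this module already has in place of eprintln!):

```rust
// Prefer the exact key; if absent, fall back to the first connection but
// record which one was chosen so the substitution is visible in logs.
fn pick_connection<'a>(conns: &'a [(String, String)], key: &str) -> Option<&'a str> {
    if let Some((_, conn)) = conns.iter().find(|(k, _)| k == key) {
        return Some(conn.as_str());
    }
    let (fallback_key, conn) = conns.first()?;
    eprintln!("sqlite connection '{key}' not found; falling back to '{fallback_key}'");
    Some(conn.as_str())
}
```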
src/daybook_core/blobs.rs (1)

445-457: Make the supported blob modes explicit for materialize_with_meta_extension().

Line 450 delegates to preferred_extension_from_meta(), but Line 522 is the default for every owned blob created through put(), put_path_copy(), or put_from_store(): build_meta() never persists mime, and those ingest paths clear source_paths. Either preserve extension provenance on ingest or narrow this API to the reference-only case instead of leaving callers to hit a runtime error.

As per coding guidelines: "If a requested change requires changing interfaces, break and change the interfaces instead of trying to shim around them; shims break abstraction boundaries and make code confusing to read".

Also applies to: 504-523

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/blobs.rs` around lines 445 - 457,
materialize_with_meta_extension currently assumes metadata contains a reliable
extension via preferred_extension_from_meta, but owned blobs created by
put/put_path_copy/put_from_store (which call build_meta) do not persist mime or
source_paths so callers can hit runtime errors; fix by making supported modes
explicit: either (A) restrict materialize_with_meta_extension to reference-only
blobs by checking that the blob metadata has source_paths/mime and returning a
clear error if absent (update materialize_with_meta_extension and its call
sites), or (B) change the ingest path
(build_meta/put/put_path_copy/put_from_store) to preserve extension provenance
(persist mime and/or source_paths) so preferred_extension_from_meta will always
work—pick one approach and update function names/docs and error messages
accordingly (refer to materialize_with_meta_extension,
preferred_extension_from_meta, build_meta, put, put_path_copy, put_from_store).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/daybook_core/blobs.rs`:
- Around line 419-443: The materialize() call to put_from_store(hash) mutates
persisted blob metadata (rebuilding BlobMetaV1 as OwnedCopy with empty
mime/source_paths) and thus drops provenance needed by
materialize_with_meta_extension(); replace this mutating call with a
non-mutating availability check or copy: e.g., add/use a read-only helper (name
suggestions: ensure_blob_available_readonly, ensure_local_copy_no_meta_rewrite,
or put_from_store_preserve_meta) that ensures the blob file is present locally
without rewriting BlobMetaV1, then use self.object_paths(hash)?.blob and proceed
to hard_link/atomic_copy as before; update materialize() to call that
non-mutating helper instead of put_from_store() and keep
materialize_with_meta_extension() intact.
- Around line 539-547: The current validation in sanitize_requested_stem and
sanitize_requested_filename allows drive-prefixed or other prefixed paths (e.g.,
"C:foo") because it only checks separators and "."/ ".."; replace the checks by
using std::path::Path and inspect Path::new(input).components() to ensure there
is exactly one component and that the single component matches
Component::Normal(_), returning an error otherwise; also add use
std::path::{Path, Component} to imports so both sanitize_requested_stem and
sanitize_requested_filename reject prefixed paths and only allow simple
filenames.
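The Path::components check from the prompt, as a sketch (note that Component::Prefix only exists on Windows, so an input like "C:foo" is rejected there but parses as an ordinary filename on Unix):

```rust
use std::path::{Component, Path};

// Accept exactly one Normal component: rejects separators, "", ".", "..",
// and (on Windows) drive-prefixed inputs such as "C:foo".
fn is_simple_filename(input: &str) -> bool {
    let mut comps = Path::new(input).components();
    matches!((comps.next(), comps.next()), (Some(Component::Normal(_)), None))
}
```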

In `@src/daybook_core/rt.rs`:
- Around line 951-974: The code calls wflow_ingress.add_job(...) (producing
entry_id) before making the dispatch state durable via
dispatch_repo.activate_waiting(...), which can leave an orphaned running job if
the repo update fails; change the flow so you persist the dispatch first (using
ActiveDispatchDeets with entry_id and wflow_partition_id as None), then call
add_job to get entry_id and patch the persistent dispatch (or call
dispatch_repo.activate_waiting again to update those fields), and on any failure
after add_job invoke wflow_ingress.cancel_job(entry_id) (best-effort) to avoid
orphaned work; update code paths around add_job, activate_waiting,
ActiveDispatchDeets, and handle_wflow_entry to rely on the dispatch having
durable state before enqueuing the job.
- Around line 776-786: After a successful merge via merge_from_branch(), the
subsequent call to self.drawer.delete_branch(doc_id, staging_branch_path,
None).await currently treats a BranchNotFound error as fatal; change this
post-merge cleanup to be best-effort by catching the error from delete_branch,
matching for the BranchNotFound case (same handling as the failure-cleanup
branch), log a debug/info that the staging branch was already removed (including
?doc_id and ?staging_branch_path), and continue without returning an error; for
any other error, return or propagate it as before. Use the same error-matching
logic/symbols used in the failure-cleanup path so behavior is consistent.

In `@src/daybook_core/rt/dispatch.rs`:
- Around line 469-489: The current change made get(&self, id: &str) return only
Active dispatches (hiding Waiting/terminal states) which breaks callers like
Rt::cancel_dispatch() that expect to find and operate on non-active dispatches;
revert get to return the total dispatch (restore previous semantics) and
introduce a new get_active (or get_only_active) method that performs the Active
filter, updating call sites accordingly (e.g., have Rt::cancel_dispatch() call
get() to locate and cancel Waiting/terminal dispatches, and only use
get_active() where callers truly need ActiveDispatch); ensure the new method
names (get, get_active/get_only_active) and types (ActiveDispatch, Waiting) are
used consistently across the codebase.
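
The split the comment asks for can be sketched like this; the `Repo`/`Dispatch` types are simplified stand-ins for the real store, which is async and keyed differently:

```rust
// Sketch of the suggested accessor split: get() is the total lookup,
// get_active() applies the Active filter for callers that need it.

use std::collections::HashMap;
use std::sync::Arc;

#[derive(Clone, Copy, PartialEq)]
enum DispatchStatus {
    Waiting,
    Active,
    Done,
}

struct Dispatch {
    status: DispatchStatus,
}

struct Repo {
    dispatches: HashMap<String, Arc<Dispatch>>,
}

impl Repo {
    // Total lookup: cancel paths can still find Waiting/terminal dispatches.
    fn get(&self, id: &str) -> Option<Arc<Dispatch>> {
        self.dispatches.get(id).cloned()
    }
    // Filtered lookup for callers that truly need an active dispatch.
    fn get_active(&self, id: &str) -> Option<Arc<Dispatch>> {
        self.get(id).filter(|d| d.status == DispatchStatus::Active)
    }
}

fn main() {
    let mut dispatches = HashMap::new();
    dispatches.insert("w".into(), Arc::new(Dispatch { status: DispatchStatus::Waiting }));
    dispatches.insert("a".into(), Arc::new(Dispatch { status: DispatchStatus::Active }));
    dispatches.insert("t".into(), Arc::new(Dispatch { status: DispatchStatus::Done }));
    let repo = Repo { dispatches };
    assert!(repo.get("w").is_some()); // cancel can find the waiting dispatch
    assert!(repo.get_active("w").is_none()); // but it is not active
    assert!(repo.get_active("a").is_some());
    assert!(repo.get_active("t").is_none()); // terminal dispatches filtered too
}
```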

---

Outside diff comments:
In `@src/daybook_core/rt/dispatch.rs`:
- Around line 523-557: The local event paths must match events_for_patch(): in
add (function add) always emit a DispatchAdded event for the new
store.dispatches entry (regardless of DispatchStatus) using registry.notify with
DispatchAdded and the created heads; update the completion and waiting-failure
handlers (functions complete() and set_waiting_failed()) to emit DispatchUpdated
transitions (not DispatchDeleted) when the record stays in store.dispatches and
only emit DispatchDeleted when the record is truly removed; mirror the same
head/vtag construction logic used by events_for_patch() when building
DispatchAdded/DispatchUpdated and stop removing entries from store.dispatches
when only the status/vtag changes (ensure cancelled_dispatches and
store.active_dispatches are updated consistently).

---

Nitpick comments:
In `@src/daybook_core/blobs.rs`:
- Around line 445-457: materialize_with_meta_extension currently assumes
metadata contains a reliable extension via preferred_extension_from_meta, but
owned blobs created by put/put_path_copy/put_from_store (which call build_meta)
do not persist mime or source_paths so callers can hit runtime errors; fix by
making supported modes explicit: either (A) restrict
materialize_with_meta_extension to reference-only blobs by checking that the
blob metadata has source_paths/mime and returning a clear error if absent
(update materialize_with_meta_extension and its call sites), or (B) change the
ingest path (build_meta/put/put_path_copy/put_from_store) to preserve extension
provenance (persist mime and/or source_paths) so preferred_extension_from_meta
will always work—pick one approach and update function names/docs and error
messages accordingly (refer to materialize_with_meta_extension,
preferred_extension_from_meta, build_meta, put, put_path_copy, put_from_store).

In `@src/plug_plabels/wflows/label_engine.rs`:
- Around line 712-740: The SQL self-join in sqlite_vec_rowid_cosine_similarity
uses an ON clause to impose fixed rowid conditions which is stylistically odd;
change the query string to select from two table references
(image_label_prompt_vec v1, image_label_prompt_vec v2) and move the rowid
predicates into a WHERE clause (v1.rowid = ?1 AND v2.rowid = ?2) so the intent
is clearer while keeping the same parameters and subsequent row handling
unchanged.
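
The suggested rewrite can be shown as query text. The column name `embedding` and the distance function are assumptions about the sqlite-vec schema, not taken from the actual label_engine.rs; only the FROM/WHERE shape is the point:

```rust
// Hypothetical query string illustrating the suggested style: two table
// references in FROM, rowid predicates moved into WHERE instead of an
// ON clause with fixed conditions.
const SIMILARITY_SQL: &str = "SELECT vec_distance_cosine(v1.embedding, v2.embedding) \
     FROM image_label_prompt_vec v1, image_label_prompt_vec v2 \
     WHERE v1.rowid = ?1 AND v2.rowid = ?2";

fn main() {
    // Same two parameters as before; only the predicate placement changed.
    assert!(SIMILARITY_SQL.contains("WHERE v1.rowid = ?1 AND v2.rowid = ?2"));
    assert!(!SIMILARITY_SQL.contains(" ON "));
}
```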

In `@src/plug_plabels/wflows/label_note.rs`:
- Around line 37-39: The code silently falls back to the first sqlite connection
when tuple_list_get(&args.sqlite_connections, LOCAL_STATE_KEY) fails; update the
logic in label_note.rs so that when the fallback branch is taken you log which
connection/key is being used (include the resolved key/name and token or
identifier) before returning/using it; reference the existing tuple_list_get
call, args.sqlite_connections, and LOCAL_STATE_KEY and add a log via the
existing logger/context used in this module (or wflow/job logger) to record the
chosen connection, then keep the same ok_or_else Terminal error behavior if no
connection is found.
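
A minimal sketch of the logged fallback, assuming a simplified connection-tuple shape; `tuple_list_get`, `LOCAL_STATE_KEY`, and the connection list here are stand-ins for the real label_note.rs types:

```rust
// Sketch: when the keyed lookup misses, log which connection the fallback
// picks before using it, instead of choosing silently.

fn tuple_list_get<'a>(list: &'a [(String, String)], key: &str) -> Option<&'a String> {
    list.iter().find(|(k, _)| k == key).map(|(_, v)| v)
}

const LOCAL_STATE_KEY: &str = "local_state";

fn resolve_connection<'a>(conns: &'a [(String, String)]) -> Option<&'a String> {
    if let Some(conn) = tuple_list_get(conns, LOCAL_STATE_KEY) {
        return Some(conn);
    }
    // Fallback branch: make the silent choice visible.
    let first = conns.first()?;
    eprintln!(
        "warn: no sqlite connection for key {LOCAL_STATE_KEY:?}, falling back to first: {}",
        first.0
    );
    Some(&first.1)
}

fn main() {
    let conns = vec![("other".to_string(), "conn-0".to_string())];
    assert_eq!(resolve_connection(&conns), Some(&"conn-0".to_string()));
    assert_eq!(resolve_connection(&[]), None); // keeps the no-connection error path
}
```

If no connection exists at all, the caller can keep the same `ok_or_else` Terminal error behavior as before.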

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5c4e2591-1760-4174-9952-3b0b310b5c3b

📥 Commits

Reviewing files that changed from the base of the PR and between f986382 and 75dac83.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock and included by **/*
📒 Files selected for processing (11)
  • src/daybook_core/blobs.rs
  • src/daybook_core/plugs.rs
  • src/daybook_core/rt.rs
  • src/daybook_core/rt/dispatch.rs
  • src/daybook_core/rt/init.rs
  • src/plug_plabels/lib.rs
  • src/plug_plabels/wflows/label_engine.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/plug_plabels/wflows/label_note.rs
  • src/xtask/Cargo.toml
  • src/xtask/main.rs
✅ Files skipped from review due to trivial changes (3)
  • src/xtask/Cargo.toml
  • src/plug_plabels/lib.rs
  • src/daybook_core/plugs.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/plug_plabels/wflows/label_image.rs
  • src/xtask/main.rs

Comment thread src/daybook_core/blobs.rs
Comment thread src/daybook_core/blobs.rs
Comment thread src/daybook_core/rt.rs
Comment thread src/daybook_core/rt.rs Outdated
Comment thread src/daybook_core/rt/dispatch.rs Outdated
Comment on lines 469 to 489
pub async fn get(&self, id: &str) -> Option<Arc<ActiveDispatch>> {
self.store
.query_sync(|store| {
let dispatch = store.dispatches.get(id)?;
if dispatch.status != DispatchStatus::Active {
return None;
}
Some(Arc::clone(&dispatch.val.0))
})
.await
}

pub async fn get_any(&self, id: &str) -> Option<Arc<ActiveDispatch>> {
self.store
.query_sync(|store| {
store
.active_dispatches
.dispatches
.get(id)
.map(|versioned| Arc::clone(&versioned.val.0))
})
.await
}

⚠️ Potential issue | 🟠 Major

get() becoming active-only is already leaking into call-site bugs.

This accessor now hides Waiting and terminal dispatches behind get_any(), but callers still have pending-dispatch behavior to implement. Rt::cancel_dispatch() is the concrete breakage: a dependency-gated dispatch is effectively “not found” until it becomes active, so it cannot be cancelled before enqueue. Prefer a clear split like get() = total / get_active() = filtered, or add an explicit waiting-cancel API instead of this ambiguous pair.

As per coding guidelines, "**/*.rs: If a requested change requires changing interfaces, break and change the interfaces instead of trying to shim around them; shims break abstraction boundaries and make code confusing to read."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt/dispatch.rs` around lines 469 - 489, The current change
made get(&self, id: &str) return only Active dispatches (hiding Waiting/terminal
states) which breaks callers like Rt::cancel_dispatch() that expect to find and
operate on non-active dispatches; revert get to return the total dispatch
(restore previous semantics) and introduce a new get_active (or get_only_active)
method that performs the Active filter, updating call sites accordingly (e.g.,
have Rt::cancel_dispatch() call get() to locate and cancel Waiting/terminal
dispatches, and only use get_active() where callers truly need ActiveDispatch);
ensure the new method names (get, get_active/get_only_active) and types
(ActiveDispatch, Waiting) are used consistently across the codebase.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
.github/workflows/checks.yml (1)

92-92: Consider documenting the rationale for increasing test retries.

Increasing --retries from 1 to 2 provides more resilience against transient failures but can also mask genuinely flaky tests. If this change is in response to specific flaky behavior, consider adding a brief comment explaining the reason, or tracking the flaky tests for future investigation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/checks.yml at line 92, The workflow line that sets test
retries now uses "--retries 2" which can mask flaky tests; update the checks.yml
entry that invokes "nix develop .#ci-rust --command cargo ... nextest run" to
include a brief inline comment explaining why retries were increased (e.g.,
reference to flaky test(s) or a ticket/issue ID), and if applicable add a TODO
to track or file an issue to investigate/remove the extra retry later so the
change is documented and traceable.
src/daybook_core/rt/dispatch.rs (1)

546-546: Consider a more descriptive assertion message.

The assertion assert!(old.is_none(), "fishy") guards against duplicate job IDs but provides minimal context for debugging. A message like "duplicate wflow_job_id insertion: {wflow_job_id}" would help diagnose the invariant violation.

💡 Suggested improvement
-                    assert!(old.is_none(), "fishy");
+                    assert!(
+                        old.is_none(),
+                        "duplicate wflow_job_id insertion: {wflow_job_id}"
+                    );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt/dispatch.rs` at line 546, Replace the vague assertion
message on the duplicate-job check so failures give useful context: locate the
assertion `assert!(old.is_none(), "fishy")` (the duplicate wflow job-id guard
using `old` and `wflow_job_id`) and change the assertion string to include the
offending id, e.g. `"duplicate wflow_job_id insertion: {wflow_job_id}"` (or use
formatting with `format!`/`panic!` as appropriate in that scope) so test/runtime
failures show the actual `wflow_job_id` value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/daybook_core/rt.rs`:
- Around line 1165-1222: The non-waiting path can orphan a workflow job if
add_job(...) succeeds but dispatch_repo.add(...) fails; modify the code around
add_job (where entry_id is set) and dispatch_repo.add to detect a failure and,
if entry_id.is_some(), call
self.wflow_ingress.cancel_job(entry_id.unwrap()).await (or the appropriate
cancel method) to roll back the scheduled job, handle any cancel errors
gracefully (log/ignore) and then return the original dispatch_repo.add error so
the failure is propagated; ensure you reference the same entry_id variable and
the wflow_ingress.cancel_job API when implementing the rollback.

---

Nitpick comments:
In @.github/workflows/checks.yml:
- Line 92: The workflow line that sets test retries now uses "--retries 2" which
can mask flaky tests; update the checks.yml entry that invokes "nix develop
.#ci-rust --command cargo ... nextest run" to include a brief inline comment
explaining why retries were increased (e.g., reference to flaky test(s) or a
ticket/issue ID), and if applicable add a TODO to track or file an issue to
investigate/remove the extra retry later so the change is documented and
traceable.

In `@src/daybook_core/rt/dispatch.rs`:
- Line 546: Replace the vague assertion message on the duplicate-job check so
failures give useful context: locate the assertion `assert!(old.is_none(),
"fishy")` (the duplicate wflow job-id guard using `old` and `wflow_job_id`) and
change the assertion string to include the offending id, e.g. `"duplicate
wflow_job_id insertion: {wflow_job_id}"` (or use formatting with
`format!`/`panic!` as appropriate in that scope) so test/runtime failures show
the actual `wflow_job_id` value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 71cdbda4-5433-4d41-9435-4184c9d3f722

📥 Commits

Reviewing files that changed from the base of the PR and between 75dac83 and 0d385a2.

📒 Files selected for processing (8)
  • .github/workflows/checks.yml
  • flake.nix
  • src/daybook_core/blobs.rs
  • src/daybook_core/rt.rs
  • src/daybook_core/rt/dispatch.rs
  • src/plug_plabels/wflows/label_engine.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/plug_plabels/wflows/label_note.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • flake.nix
  • src/plug_plabels/wflows/label_note.rs

Comment thread src/daybook_core/rt.rs
Comment on lines +1165 to 1222
let mut entry_id = None;
let mut wflow_partition_id = None;
if !is_waiting {
let bundle_man = plug_man.wflow_bundles.get(bundle_name).ok_or_else(|| {
ferr!(
"bundle not found in plug manifest: routine={plug_id}/{routine_name} bundle={bundle_name} key={key}"
)
})?;
let _workload_id = ensure_bundle_workload_running(
&self.wcx,
&self.wash_host,
&self.blobs_repo,
plug_id.into(),
bundle_name.0.clone(),
bundle_man,
)
.await?;

entry_id = Some(
self.wflow_ingress
.add_job(
job_id.clone().into(),
key,
serde_json::to_string(&()).expect(ERROR_JSON),
None,
)
.await
.wrap_err_with(|| {
format!("error scheduling job for {plug_id}/{routine_name}")
})?,
);
wflow_partition_id = Some(self.local_wflow_part_id.clone());
}

ActiveDispatchDeets::Wflow {
wflow_partition_id,
entry_id,
plug_id: plug_id.into(),
bundle_name: bundle_name.0.clone(),
wflow_key: key.0.clone(),
wflow_job_id: Some(job_id),
}
}
};
let active_dispatch = Arc::new(ActiveDispatch {
args,
deets,
status: if is_waiting {
dispatch::DispatchStatus::Waiting
} else {
dispatch::DispatchStatus::Active
},
waiting_on_dispatch_ids,
on_success_hooks,
});
self.dispatch_repo
.add(dispatch_id.clone(), Arc::clone(&active_dispatch))
.await?;

⚠️ Potential issue | 🟠 Major

Job may be orphaned if dispatch_repo.add() fails after add_job().

In the non-waiting path, add_job() (Lines 1183-1195) succeeds and returns an entry_id, but if dispatch_repo.add() (Lines 1220-1222) subsequently fails, no rollback cancels the job. This leaves an orphaned wflow job that handle_wflow_entry will ignore (no matching dispatch).

Consider wrapping dispatch_repo.add() in an error handler that calls wflow_ingress.cancel_job() on failure, similar to the rollback in start_waiting_dispatch (Lines 1006-1012).

🛠️ Suggested fix
         }
         };
-        let active_dispatch = Arc::new(ActiveDispatch {
+        let active_dispatch = Arc::new(ActiveDispatch { ... });
+        if let Err(err) = self.dispatch_repo.add(dispatch_id.clone(), Arc::clone(&active_dispatch)).await {
+            // Rollback: cancel the job if we failed to persist
+            if let ActiveDispatchDeets::Wflow { wflow_job_id: Some(job_id), entry_id: Some(_), .. } = &active_dispatch.deets {
+                let _ = self.wflow_ingress.cancel_job(
+                    Arc::from(job_id.as_ref()),
+                    format!("rollback add for dispatch {dispatch_id}"),
+                ).await;
+            }
+            return Err(err);
+        }
-        self.dispatch_repo
-            .add(dispatch_id.clone(), Arc::clone(&active_dispatch))
-            .await?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/daybook_core/rt.rs` around lines 1165 - 1222, The non-waiting path can
orphan a workflow job if add_job(...) succeeds but dispatch_repo.add(...) fails;
modify the code around add_job (where entry_id is set) and dispatch_repo.add to
detect a failure and, if entry_id.is_some(), call
self.wflow_ingress.cancel_job(entry_id.unwrap()).await (or the appropriate
cancel method) to roll back the scheduled job, handle any cancel errors
gracefully (log/ignore) and then return the original dispatch_repo.add error so
the failure is propagated; ensure you reference the same entry_id variable and
the wflow_ingress.cancel_job API when implementing the rollback.
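
The rollback described in this prompt (and the persist-first ordering asked for in the earlier rt.rs comment) can be sketched as below. Everything here, `DispatchRepo`, `WflowIngress`, and the synchronous signatures, is a simplified stand-in for the real async daybook_core types:

```rust
// Sketch: persist the dispatch first, enqueue second, and cancel the job
// (best-effort) if the final persist fails, so no work is left orphaned.

use std::cell::RefCell;
use std::collections::HashMap;

#[derive(Default)]
struct DispatchRepo {
    // dispatch_id -> wflow entry_id (None until the job is enqueued)
    dispatches: RefCell<HashMap<String, Option<String>>>,
    fail_patch: bool, // simulate a repo failure after enqueue
}

impl DispatchRepo {
    fn upsert(&self, id: &str, entry_id: Option<String>) -> Result<(), String> {
        if entry_id.is_some() && self.fail_patch {
            return Err("simulated repo failure".into());
        }
        self.dispatches.borrow_mut().insert(id.into(), entry_id);
        Ok(())
    }
}

#[derive(Default)]
struct WflowIngress {
    cancelled: RefCell<Vec<String>>,
}

impl WflowIngress {
    fn add_job(&self) -> Result<String, String> {
        Ok("entry-1".into())
    }
    fn cancel_job(&self, entry_id: &str) {
        self.cancelled.borrow_mut().push(entry_id.into());
    }
}

fn dispatch(repo: &DispatchRepo, ingress: &WflowIngress, id: &str) -> Result<(), String> {
    // 1. Durable dispatch record exists before any job does.
    repo.upsert(id, None)?;
    // 2. Enqueue the workflow job.
    let entry_id = ingress.add_job()?;
    // 3. Patch the record; on failure, best-effort cancel so the job
    //    is not orphaned, then propagate the original error.
    if let Err(err) = repo.upsert(id, Some(entry_id.clone())) {
        ingress.cancel_job(&entry_id);
        return Err(err);
    }
    Ok(())
}

fn main() {
    let repo = DispatchRepo { fail_patch: true, ..Default::default() };
    let ingress = WflowIngress::default();
    assert!(dispatch(&repo, &ingress, "d1").is_err());
    // The enqueued job was cancelled rather than orphaned.
    assert_eq!(ingress.cancelled.borrow().len(), 1);
    assert_eq!(ingress.cancelled.borrow()[0], "entry-1");
}
```

Errors from the cancel itself would be logged and ignored in the real code, since the rollback is best-effort.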

@dman-os dman-os merged commit 83465e9 into main Apr 1, 2026
7 of 9 checks passed
@dman-os dman-os deleted the feat/plug/plabels branch April 1, 2026 23:04