feat(infra): PGMQ workers — processAsync CF<ProcessOutcome>, no .join#1305
Conversation
- 28 real blocking sites across module-infra + module-external-api - original plan (docs/superpowers + ADR) was based on wrong code assumptions - 5 recommended sub-PRs for future work - no code changes
…ed for runBlocking
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 00a746f888
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| handleFailure(jobId, cause) | ||
| ProcessOutcome.Nack(retryable = true) |
There was a problem hiding this comment.
Preserve successful retry scheduling as an ack
When an external API failure reaches this branch and handleFailure() successfully schedules a replacement retry message via retryExternalApiJob() (or marks the job failed), the previous process() path returned that Boolean so the current PGMQ message was archived. This now discards the result and always returns Nack, so while message.readCount is still retryable the base worker leaves the original message to reappear even though a replacement was already enqueued, causing duplicate external-api work for the same job on transient failures.
Useful? React with 👍 / 👎.
|
@codex review |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary
Adds async-returning
processAsync(): CF<ProcessOutcome>to PGMQ workers. Eliminates 8 blocking sites (.join()× 4,runBlocking× 3,Task.join()× 1) across 6 PGMQ workers +ResultReadyProjectionWorker+PgmqWorker.processSequentialBatch. Syncprocess(): Booleankept@Deprecatedfor module-app legacy.Spec / Audit
docs/05_Reports/2026-06-18-blocking-audit.mddocs/superpowers/plans/2026-06-18-pgmq-process-async.mdDecisions (grill-me Q1-Q8)
ProcessOutcome(Ack | Nack(retryable, visibilityReset) | DeadLetter)processAsync()+ sync@Deprecated— revised in execution: processAsync isopenwith default impl (not abstract) to keep build green; syncprocess()carries@Deprecatedshim.whenCompleteCompletableFuture.allOfbounded by workerPool/cpuExecutor.join()+ 1runBlocking)Nack.visibilityReset: Duration?enables per-cancel redelivery timingChanges (12 commits)
module-infra/pgmq/ProcessOutcome.kt: new sealed class — Ack | Nack(retryable, visibilityReset) | DeadLetterPgmqWorker.kt: addedopen fun processAsync()with default impl; markedprocess()@Deprecated; migratedprocessSequentialBatchfromrunBlockingtoCompletableFuture.allOf+ per-messagesupplyAsyncmodule-infra/worker/processAsyncchainspipeline.processAsync().thenCompose(publisher).thenApply(Ack).exceptionally(Nack). CPU section migrated fromrunBlocking(Dispatchers.Default)toCompletableFuture.supplyAsync(expectationComputeCpuExecutor).processAsyncchainsexpectationPort.calculateExpectationAsync().handle { value, ex -> Ack/Nack/DeadLetter }.processAsyncwrapsprocess()insupplyAsync(cpuExecutor). Pure sync — no internal blocks.onProcessingFailedkept UNCHANGED — still needed for legacyprocess()path until base class orchestrator is updated (follow-up PR).projectPgmqBatchreturnsCompletableFuture<Void>;.join()replaced with.thenCombine;runBlocking(Dispatchers.Default)replaced withCompletableFuture.allOf+ per-messagesupplyAsync.Test infra
*AsyncTest.ktfor each migrated worker +PgmqWorkerProcessAsyncTest+ProcessOutcomeTest*Asynctests use thecallProcessAsync/callProjectPgmqBatchinternal bridge pattern to accessprotectedmethodsLogicExecutorimpl to avoid NPE frommock<LogicExecutor>()(which would return null and NPE on Boolean unbox inside theCompletableFuture.supplyAsynclambda)CI gate
New test
PgmqBlockingPrimitiveGateTestgrepsmodule-infra/{pgmq,worker}/for.join(),runBlocking,Task.join(),Thread.sleep. Allowlist for@Deprecatedsync compat (line-level). ExternalApiWorker.kt + OcidResolveWorker.kt.join()allowlisted (documented keep-sites per audit + inline Javadoc ADRs).Verification
./gradlew compileKotlin compileJava --continue— clean./gradlew test(excludingmodule-cleanupwhich has a pre-existing compile error on develop unrelated to this PR) — 1216 PASSED, 0 FAILEDOut of scope (follow-up PR)
worker/OcidResolveWorker.ktandNexonApiWorker.kt(topic subscribers, different framework — Q7=B)module-applegacyprocess(): BooleanusersPgmqWorker.processSingleMessagestill callsprocess(): BooleanandonProcessingFailed. The full migration to honorProcessOutcome.Nack.visibilityResetrequires updating the base class to useprocessAsync(). Tracked as follow-up.