Skip to content

refactor(jd-client): robust error handling for initialization and retry logic#456

Open
nulllpc wants to merge 1 commit into
stratum-mining:mainfrom
nulllpc:npc/refactor-non-critical-jdc-errors
Open

refactor(jd-client): robust error handling for initialization and retry logic#456
nulllpc wants to merge 1 commit into
stratum-mining:mainfrom
nulllpc:npc/refactor-non-critical-jdc-errors

Conversation

@nulllpc
Copy link
Copy Markdown

@nulllpc nulllpc commented Apr 25, 2026

#168

This PR attempts to improve error handling for JDC by addressing some low-hanging improvements:

  1. Replaces several expect() and unwrap() calls in JobDeclaratorClient::start with proper error propagation using Result.
  2. Refactors the JDC error handling by making the upstream initialization logic "action-aware."

Changes:

  • Updated start() signature to return JDCError<JobDeclarationClient>.
  • Introduced InitializationError(String) variant to JDCErrorKind to preserve context from formerly panicking calls.
  • Updated main.rs and integration tests to handle the new Result type
  • try_initialize_single is updated to inspect the Action from sub-component errors and return a full JDCError with either a Shutdown or Fallback action.
  • The retry loop in initialize_jd now checks e.action. It immediately propagates fatal Shutdown errors up to start() while continuing to retry on Fallback errors.
  • The Action enum now derives PartialEq and Eq to enable this logic.
  • Function signatures now consistently use the JDCResult type alias.

@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch from 90ed53b to a541334 Compare April 25, 2026 12:52
@nulllpc nulllpc changed the title refactor(jd-client): propagate initialization errors instead of panicking refactor(jd-client): robust error handling for initialization and retry logic Apr 25, 2026
@nulllpc nulllpc marked this pull request as ready for review April 25, 2026 14:41
Copy link
Copy Markdown
Member

@Shourya742 Shourya742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall approach looks good.

Comment thread miner-apps/jd-client/src/lib/error.rs Outdated
@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch 4 times, most recently from 3d9b16b to aaece78 Compare May 1, 2026 16:48
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 1, 2026

Hi @Shourya742 I updated my PR accordingly, I reviewed the changes made in #467 and I think my changes still align well with changes there and the comment you made here

#467 (comment)

Copy link
Copy Markdown
Member

@Shourya742 Shourya742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look okay to me. As a rule of thumb, avoid adding side effects in combinators. Once we fix this, I will approve it.

Comment thread miner-apps/jd-client/src/lib/mod.rs Outdated
Comment thread miner-apps/jd-client/src/lib/mod.rs Outdated
Comment thread miner-apps/jd-client/src/lib/mod.rs Outdated
Comment thread miner-apps/jd-client/src/lib/mod.rs Outdated
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 1, 2026

I addressed all comments, please review again @Shourya742

Comment thread miner-apps/jd-client/src/main.rs Outdated
Copy link
Copy Markdown
Member

@Shourya742 Shourya742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits

Comment thread integration-tests/lib/mod.rs Outdated
@Shourya742
Copy link
Copy Markdown
Member

@nulllpc once you fix the clippy and fmt nits, I’ll approve. Nice work.

@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 2, 2026

All done! Let's get this merged, this is my first PR for sv2. Thanks for your reviews @Shourya742

@Shourya742
Copy link
Copy Markdown
Member

@nulllpc can you squash the commits.

@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch from 7979589 to f3c949c Compare May 2, 2026 15:29
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 2, 2026

Done! @Shourya742 I double-checked commit message as well

Copy link
Copy Markdown
Member

@Shourya742 Shourya742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! ACK

@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch from f3c949c to 42587c2 Compare May 4, 2026 14:23
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 4, 2026

Hi @Shourya742 can we merge this PR, are we waiting for sth?

@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch 3 times, most recently from 105844a to 7af9c03 Compare May 6, 2026 16:40
@GitGab19
Copy link
Copy Markdown
Member

@nulllpc CI is failing here, I guess you need to rebase it and fix issues.

@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch 2 times, most recently from aa97b02 to 7779c3f Compare May 11, 2026 17:08
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 11, 2026

Hi @GitGab19, I updated the PR, it's ready for review now

Comment thread integration-tests/lib/mod.rs Outdated
Comment thread miner-apps/jd-client/src/lib/mod.rs
Comment thread miner-apps/jd-client/src/lib/mod.rs
_ = tokio::time::sleep(Duration::from_secs(1)) => {}
}

if e.action == Action::Shutdown {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short comment here explaining why we need this if statement?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically try_initialize_single can return either a fallback or shutdown signal. The rationale here is to catch the shutdown signal and then fail fast, instead of retrying again. Before, it was passing only JDCErrorKind so we wouldn't be able to differentiate between a shutdown and a fallback signal. This is the same as @Shourya742 explained here: https://github.com/stratum-mining/sv2-apps/pull/456/changes#r3224976497

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, can you write a short comment explaining that though?

Comment thread miner-apps/jd-client/src/lib/mod.rs
Comment on lines 368 to 371
Err(e) => {
tracing::error!("Failed to initialize upstream: {:?}", e);
mode.set_solo_mining();
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the same check explained in https://github.com/stratum-mining/sv2-apps/pull/456/changes#r3225801560 should be done here?

What do you think @Shourya742 ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that when we fallback to solo mining here, we only set the mode but we do not set the upstream_state. Is this intentional or a bug? @Shourya742

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is already set in the fallback branch for the channel manager, so we don't need an assignment here

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see ChannelManager::new() initiates with upstream state as solo mining so technically it still works, but I think it's better to be explicit here. What do you think @Shourya742 @GitGab19

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. I’m going to refactor the startup and fallback code soon anyway to make it more streamlined, so it’s not really needed right now.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for the explanation

panicking

Replaces several `expect()` and `unwrap()` calls in
`JobDeclaratorClient::start`
with proper error propagation using `Result`.

- Updated `start()` signature to return
  `JDCError<JobDeclarationClient>`.
- Introduced `InitializationError(String)` variant to `JDCErrorKind` to
  preserve context from formerly panicking calls.
- Updated `main.rs` and integration tests to handle the new `Result`
  type
- Avoid blindly falling back to solo mining
- Return proper error when template receiver fail to start
- Fix clippy and fmt checks
@nulllpc nulllpc force-pushed the npc/refactor-non-critical-jdc-errors branch from 7779c3f to 3287d6c Compare May 12, 2026 17:31
@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 12, 2026

I resolved all comments @Shourya742 @GitGab19 Please check again when you have time

@nulllpc
Copy link
Copy Markdown
Author

nulllpc commented May 12, 2026

Tests are failing because they still expect setup to fallback to solo mining, I'll adjust the tests accordingly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants