Skip to content

serialization: fix double-read and type mismatch in exception_ptr load()#7278

Open
arpittkhandelwal wants to merge 1 commit into
TheHPXProject:masterfrom
arpittkhandelwal:fix/exception-ptr-serialization-double-read
Open

serialization: fix double-read and type mismatch in exception_ptr load()#7278
arpittkhandelwal wants to merge 1 commit into
TheHPXProject:masterfrom
arpittkhandelwal:fix/exception-ptr-serialization-double-read

Conversation

@arpittkhandelwal
Copy link
Copy Markdown
Contributor

Summary

Two bugs in hpx::serialization::detail::load() for std::exception_ptr in libs/core/serialization/src/exception_ptr.cpp.


Bug 1 — Double-read of archive fields (silent data corruption)

In the load() function, for hpx_exception and std_system_error types, the err_value and err_message fields were read from the archive twice — once via ar & ... and again via ar >> ...:

// Before (buggy):
if (hpx::util::exception_type::hpx_exception == type)
{
    ar & err_value;   // reads from archive (advances cursor)
    ar >> err_value;  // reads AGAIN — cursor double-advanced → garbage!
}
else if (... std_system_error ...)
{
    ar & err_value & err_message;     // reads both
    ar >> err_value >> err_message;   // reads both AGAIN
}

Since save() writes each field once (using ar <<), this caused the deserialization read cursor to advance by 2× what was expected, producing silently wrong err_value and err_message in any distributed HPX application that propagates hpx::exception or std::system_error across localities.

Fix: Remove the redundant ar & ... reads; keep only ar >> ... to match the ar << used in save().


Bug 2 — Type mismatch for throw_line_ (long saved, int loaded)

save() declares throw_line_ as long (8 bytes on 64-bit Linux), but load() declared it as int (4 bytes):

// save() — line 45:
long throw_line_ = 0;  // 8 bytes on 64-bit Linux

// load() — line 185 (before fix):
int throw_line_ = 0;   // only 4 bytes read!

On platforms where sizeof(long) != sizeof(int) (e.g. 64-bit Linux/macOS), the serializer writes 8 bytes but the deserializer reads only 4, shifting all subsequent field reads by 4 bytes — a cascading deserialization corruption.

Fix: Change int throw_line_ = 0long throw_line_ = 0 in load() to match save().


Changes

libs/core/serialization/src/exception_ptr.cpp
  • -3 / +1 lines net change

Testing

These bugs are triggered in distributed scenarios where exceptions cross locality boundaries. The fix is mechanical — the load path now mirrors the save path exactly.

@arpittkhandelwal arpittkhandelwal requested a review from hkaiser as a code owner May 19, 2026 16:42
Copilot AI review requested due to automatic review settings May 19, 2026 16:42
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 19, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes two deserialization bugs in hpx::serialization::detail::load() for std::exception_ptr: a double-read of archive fields and a long/int type mismatch for throw_line_. Both caused silent data corruption when exceptions crossed locality boundaries in distributed HPX applications. The load path now mirrors the save path exactly.

Changes:

  • Removed redundant ar & err_value / ar & err_value & err_message reads that were duplicating the subsequent ar >> ... reads (since operator& on input_archive is equivalent to operator>>).
  • Changed int throw_line_ = 0 to long throw_line_ = 0 in load() to match the type used in save().

hkaiser
hkaiser previously approved these changes May 19, 2026
Copy link
Copy Markdown
Contributor

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent catch! LGTM, thanks!

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented May 19, 2026

@arpittkhandelwal Could you please construct a test that verifies the fix?

@StellarBot
Copy link
Copy Markdown
Collaborator

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)(=)(=)

Info

PropertyBeforeAfter
HPX Commit0eeca865decf74
HPX Datetime2026-03-09T14:08:29+00:002026-05-19T16:42:14+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Datetime2026-03-09T09:15:24.034803-05:002026-05-19T21:11:33.411763-05:00

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch++

Info

PropertyBeforeAfter
HPX Commit0eeca865decf74
HPX Datetime2026-03-09T14:08:29+00:002026-05-19T16:42:14+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Datetime2026-03-09T09:17:15.638328-05:002026-05-19T21:13:41.524568-05:00

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)--
Stream Benchmark - Scale(=)----
Stream Benchmark - Triad(=)---
Stream Benchmark - Copy(=)+++++

Info

PropertyBeforeAfter
HPX Commitba89f5d5decf74
HPX Datetime2026-03-09T18:50:37+00:002026-05-19T16:42:14+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Datetime2026-03-09T17:49:10.837937-05:002026-05-19T21:14:26.875804-05:00

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@arpittkhandelwal arpittkhandelwal force-pushed the fix/exception-ptr-serialization-double-read branch from 7703618 to 6993160 Compare May 20, 2026 03:31
Two bugs in hpx::serialization::detail::load() for std::exception_ptr:

1. Double-read of archive fields (Bug TheHPXProject#1 - data corruption):
   In the load() function, for hpx_exception and std_system_error types,
   the err_value and err_message fields were read from the archive twice:
   once via 'ar & ...' and again via 'ar >> ...'. Since save() writes each
   field only once, this caused the read cursor to advance by 2x, silently
   producing garbled error codes and messages in any distributed HPX
   application that propagates these exception types across localities.

   Fix: Remove the redundant 'ar & ...' reads; keep only 'ar >> ...'
   which matches the 'ar << ...' used in save().

2. Type mismatch for throw_line_ (Bug TheHPXProject#3 - platform-specific corruption):
   save() declares throw_line_ as 'long' (8 bytes on 64-bit Linux), but
   load() declared it as 'int' (4 bytes). This caused the serializer to
   write 8 bytes and the deserializer to read only 4, shifting all
   subsequent field reads by 4 bytes on affected platforms.

   Fix: Change 'int throw_line_ = 0' to 'long throw_line_ = 0' in load()
   to match the type used in save().

Additionally, added a regression test to verify that serialization of
hpx::exception, std::system_error, std::runtime_error, and sequential
round-tripping behaves correctly.

Signed-off-by: arpittkhandelwal <arpitkhandelwal810@gmail.com>
@arpittkhandelwal arpittkhandelwal force-pushed the fix/exception-ptr-serialization-double-read branch from 6993160 to 0be64ba Compare May 20, 2026 03:32
@arpittkhandelwal
Copy link
Copy Markdown
Contributor Author

@arpittkhandelwal Could you please construct a test that verifies the fix?

Yes I have constructed and added a comprehensive regression unit test under libs/core/serialization/tests/unit/serialization_exception_ptr.cpp to verify these fixes.

@StellarBot
Copy link
Copy Markdown
Collaborator

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)(=)(=)

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-05-20T03:32:29+00:00
HPX Commit0eeca86094caac
Datetime2026-03-09T09:15:24.034803-05:002026-05-19T22:41:57.886090-05:00
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch++

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-05-20T03:32:29+00:00
HPX Commit0eeca86094caac
Datetime2026-03-09T09:17:15.638328-05:002026-05-19T22:44:07.143095-05:00
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)=--
Stream Benchmark - Scale(=)----
Stream Benchmark - Triad(=)---
Stream Benchmark - Copy(=)+++++

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T18:50:37+00:002026-05-20T03:32:29+00:00
HPX Commitba89f5d094caac
Datetime2026-03-09T17:49:10.837937-05:002026-05-19T22:44:53.173799-05:00
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Envfile
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

{
// Construct an hpx::exception whose internal throw_line will be stored
// as a long; the exact value is set by the HPX_THROW_EXCEPTION macro but
// we can still verify the type and error code survive.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could still use hpx::detail::throw_exception directly and pass in a large line number for testing, see:

hpx::detail::throw_exception( \
errcode, hpx::util::format(__VA_ARGS__), f, __FILE__, __LINE__) /**/

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented May 20, 2026

The new test looks good, it however doesn't compile :/

Also, your LLM use caught up with you again:

/libs/core/serialization/tests/unit/serialization_exception_ptr.cpp:
    Non-ASCII: line 10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants