serialization: fix double-read and type mismatch in exception_ptr load() #7278
Pull request overview
Fixes two deserialization bugs in hpx::serialization::detail::load() for std::exception_ptr: a double-read of archive fields and a long/int type mismatch for throw_line_. Both caused silent data corruption when exceptions crossed locality boundaries in distributed HPX applications. The load path now mirrors the save path exactly.
Changes:
- Removed redundant 'ar & err_value' / 'ar & err_value & err_message' reads that were duplicating the subsequent 'ar >> ...' reads (since operator& on input_archive is equivalent to operator>>).
- Changed 'int throw_line_ = 0' to 'long throw_line_ = 0' in load() to match the type used in save().
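The double-read described above can be illustrated with a minimal sketch. The toy_output_archive and toy_input_archive types below are hypothetical stand-ins written for this example, not the HPX archive classes; the only property they borrow from the description is that '&' on an input archive forwards to '>>', so each use advances the read cursor.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical toy archives for illustration only; NOT the HPX classes.
struct toy_output_archive
{
    std::vector<unsigned char> bytes;

    toy_output_archive& operator<<(std::int32_t v)
    {
        unsigned char raw[sizeof v];
        std::memcpy(raw, &v, sizeof v);
        bytes.insert(bytes.end(), raw, raw + sizeof v);
        return *this;
    }
};

struct toy_input_archive
{
    std::vector<unsigned char> const& bytes;
    std::size_t cursor;

    explicit toy_input_archive(std::vector<unsigned char> const& b)
      : bytes(b), cursor(0)
    {
    }

    toy_input_archive& operator>>(std::int32_t& v)
    {
        std::memcpy(&v, bytes.data() + cursor, sizeof v);
        cursor += sizeof v;
        return *this;
    }

    // On an input archive, '&' forwards to '>>' and also moves the cursor.
    toy_input_archive& operator&(std::int32_t& v)
    {
        return *this >> v;
    }
};

// Buggy pattern: the same field is read once with '&' and again with '>>'.
std::int32_t load_err_value_buggy(toy_input_archive& ar)
{
    std::int32_t err_value = 0;
    ar & err_value;     // consumes the real err_value
    ar >> err_value;    // consumes whatever field was written next
    return err_value;
}

// Fixed pattern: one read per field, mirroring the single write in save.
std::int32_t load_err_value_fixed(toy_input_archive& ar)
{
    std::int32_t err_value = 0;
    ar >> err_value;
    return err_value;
}
```

If the writer emits err_value followed by one more field, the buggy loader returns the second field's value and leaves the cursor one field too far, which is the silent corruption the overview describes.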
hkaiser
left a comment
Excellent catch! LGTM, thanks!
@arpittkhandelwal Could you please construct a test that verifies the fix?
Performance test report: HPX Performance Comparison
7703618 to 6993160
Two bugs in hpx::serialization::detail::load() for std::exception_ptr:

1. Double-read of archive fields (Bug #1 - data corruption):
   In the load() function, for hpx_exception and std_system_error types, the err_value and err_message fields were read from the archive twice: once via 'ar & ...' and again via 'ar >> ...'. Since save() writes each field only once, this caused the read cursor to advance by 2x, silently producing garbled error codes and messages in any distributed HPX application that propagates these exception types across localities.
   Fix: Remove the redundant 'ar & ...' reads; keep only 'ar >> ...' which matches the 'ar << ...' used in save().

2. Type mismatch for throw_line_ (Bug #3 - platform-specific corruption):
   save() declares throw_line_ as 'long' (8 bytes on 64-bit Linux), but load() declared it as 'int' (4 bytes). This caused the serializer to write 8 bytes and the deserializer to read only 4, shifting all subsequent field reads by 4 bytes on affected platforms.
   Fix: Change 'int throw_line_ = 0' to 'long throw_line_ = 0' in load() to match the type used in save().

Additionally, added a regression test to verify that serialization of hpx::exception, std::system_error, std::runtime_error, and sequential round-tripping behaves correctly.

Signed-off-by: arpittkhandelwal <arpitkhandelwal810@gmail.com>
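The width mismatch in the second bug can be sketched without HPX. The example below is an assumption-laden illustration: std::int64_t and std::int32_t stand in for long and int on a platform where long is 8 bytes, and put, get, save_fields and the two loaders are hypothetical helpers over a raw byte buffer, not HPX API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using byte_buffer = std::vector<unsigned char>;

// Hypothetical helper: append the raw bytes of a value to the buffer.
template <typename T>
void put(byte_buffer& buf, T v)
{
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &v, sizeof(T));
    buf.insert(buf.end(), raw, raw + sizeof(T));
}

// Hypothetical helper: read sizeof(T) bytes at the cursor and advance it.
template <typename T>
T get(byte_buffer const& buf, std::size_t& cursor)
{
    T v{};
    std::memcpy(&v, buf.data() + cursor, sizeof(T));
    cursor += sizeof(T);
    return v;
}

// Save side: an 8-byte line number followed by a 4-byte error code.
byte_buffer save_fields()
{
    byte_buffer buf;
    put<std::int64_t>(buf, 1234);    // stand-in for 'long throw_line_'
    put<std::int32_t>(buf, 77);      // the field written right after it
    return buf;
}

// Mismatched load: reads only 4 bytes for the line number, so the next
// read starts in the middle of the 8-byte value instead of at the code.
std::int32_t load_code_mismatched(byte_buffer const& buf)
{
    std::size_t cursor = 0;
    (void) get<std::int32_t>(buf, cursor);    // wrong width
    return get<std::int32_t>(buf, cursor);
}

// Matched load: the same widths as the save side, in the same order.
std::int32_t load_code_matched(byte_buffer const& buf)
{
    std::size_t cursor = 0;
    (void) get<std::int64_t>(buf, cursor);
    return get<std::int32_t>(buf, cursor);
}
```

With the mismatched loader the second read returns the other half of the 8-byte line number rather than 77, regardless of byte order, which mirrors the 4-byte shift described in the commit message.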
6993160 to 0be64ba
Yes, I have constructed and added a comprehensive regression unit test under libs/core/serialization/tests/unit/serialization_exception_ptr.cpp to verify these fixes.
{
    // Construct an hpx::exception whose internal throw_line will be stored
    // as a long; the exact value is set by the HPX_THROW_EXCEPTION macro but
    // we can still verify the type and error code survive.
You could still use hpx::detail::throw_exception directly and pass in a large line number for testing, see:
hpx/libs/core/errors/include/hpx/errors/macros.hpp
Lines 105 to 106 in ed201ce
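The value of testing with a large line number can be sketched with a toy round trip. Everything below is hypothetical and independent of HPX: toy_exception_info, write_pod, read_pod, save and load are made-up names, and std::int64_t stands in for a 64-bit long. The idea is that a line number above the 32-bit range only survives if both sides agree on the wider type, and that a second record placed after the first detects any cursor drift.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using byte_buffer = std::vector<unsigned char>;

// Hypothetical record mirroring the fields discussed in this PR.
struct toy_exception_info
{
    std::int64_t throw_line = 0;    // stand-in for a 64-bit 'long'
    std::int32_t err_value = 0;
    std::string err_message;
};

template <typename T>
void write_pod(byte_buffer& buf, T const& v)
{
    unsigned char raw[sizeof(T)];
    std::memcpy(raw, &v, sizeof(T));
    buf.insert(buf.end(), raw, raw + sizeof(T));
}

template <typename T>
void read_pod(byte_buffer const& buf, std::size_t& cursor, T& v)
{
    std::memcpy(&v, buf.data() + cursor, sizeof(T));
    cursor += sizeof(T);
}

// Each field is written exactly once, in a fixed order.
void save(byte_buffer& buf, toy_exception_info const& e)
{
    write_pod(buf, e.throw_line);
    write_pod(buf, e.err_value);
    std::uint64_t const len = e.err_message.size();
    write_pod(buf, len);
    buf.insert(buf.end(), e.err_message.begin(), e.err_message.end());
}

// Each field is read exactly once, with the same types and order as save.
void load(byte_buffer const& buf, std::size_t& cursor, toy_exception_info& e)
{
    read_pod(buf, cursor, e.throw_line);
    read_pod(buf, cursor, e.err_value);
    std::uint64_t len = 0;
    read_pod(buf, cursor, len);
    e.err_message.assign(
        reinterpret_cast<char const*>(buf.data() + cursor),
        static_cast<std::size_t>(len));
    cursor += static_cast<std::size_t>(len);
}
```

A test in this style would save two records back to back, give the first one a line number such as 5000000000, load both, and compare every field plus the final cursor position against the buffer size.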
The new test looks good; however, it doesn't compile :/ Also, your LLM use caught up with you again:
Summary
Two bugs in hpx::serialization::detail::load() for std::exception_ptr in libs/core/serialization/src/exception_ptr.cpp.

Bug 1: Double-read of archive fields (silent data corruption)
In the load() function, for hpx_exception and std_system_error types, the err_value and err_message fields were read from the archive twice, once via 'ar & ...' and again via 'ar >> ...'.
Since save() writes each field once (using 'ar <<'), this caused the deserialization read cursor to advance by 2× what was expected, producing silently wrong err_value and err_message in any distributed HPX application that propagates hpx::exception or std::system_error across localities.
Fix: Remove the redundant 'ar & ...' reads; keep only 'ar >> ...' to match the 'ar <<' used in save().

Bug 2: Type mismatch for throw_line_ (long saved, int loaded)
save() declares throw_line_ as long (8 bytes on 64-bit Linux), but load() declared it as int (4 bytes).
On platforms where sizeof(long) != sizeof(int) (e.g. 64-bit Linux/macOS), the serializer writes 8 bytes but the deserializer reads only 4, shifting all subsequent field reads by 4 bytes, a cascading deserialization corruption.
Fix: Change 'int throw_line_ = 0' to 'long throw_line_ = 0' in load() to match save().

Changes
-3 / +1 lines net change

Testing
These bugs are triggered in distributed scenarios where exceptions cross locality boundaries. The fix is mechanical: the load path now mirrors the save path exactly.