Based on the documentation and recent benchmarking efforts, we now have a holistic overview of the changes required and how we should move forward, both in terms of design and performance.
binary-sv2 has two layers. The first defines the primitive SV2 data types and interprets raw bytes. The second is a procedural macro which, much like Serde, derives the encoding and decoding between a binary frame and the corresponding message types.
Currently, the proc macro supports only very simple structs built from the SV2 types supported by binary-sv2. It does not support more complex or grouped data types such as enums, different struct variants, or self-describing structures. We may want to expand the macro’s capabilities so that these richer types become first-class citizens.
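To make the current scope concrete, here is a rough sketch of how the derive-style macro is used on a flat struct built from binary-sv2 primitives. The derive and type names (Encodable, Decodable, U256, Str0255) mirror the non-serde codec, but the struct itself is illustrative, not a real spec message:

```rust
use binary_sv2::{Decodable, Encodable, Str0255, U256};

// A flat struct over binary-sv2 primitives: this is the kind of shape
// the proc macro handles today (illustrative, not a real SV2 message).
#[derive(Encodable, Decodable)]
pub struct ExampleMessage<'decoder> {
    pub id: u32,
    pub hash: U256<'decoder>,
    pub tag: Str0255<'decoder>,
}

// By contrast, enums, struct variants, and self-describing layouts are
// not derivable today and would need hand-written encode/decode logic:
// pub enum ExampleVariant<'decoder> {
//     Flat(ExampleMessage<'decoder>),
//     Tagged { tag: Str0255<'decoder> },
// }
```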
Another important issue is the lifetime pollution in the SV2 primitives defined in binary-sv2. All types currently carry an associated lifetime, which leaks into user space. This lifetime exists to allow types to reference memory allocated per connection, avoiding additional allocations and improving performance. However, this approach becomes counterintuitive in scenarios where these types need to be stored for longer durations. In such cases, we end up holding onto shared memory that is intended to be reused across connections.
Although this design is efficient in avoiding allocations, it creates complications. When longer-lived storage is required, we still end up allocating memory on the heap while the frame's pointers keep references into the shared pool alive. This limits memory reuse and reduces the effectiveness of the pooling strategy.
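To illustrate how the lifetime propagates (the type names below are hypothetical, but real binary-sv2 types such as Str0255<'a> behave the same way):

```rust
// Decoded message borrowing from the per-connection pooled buffer;
// the 'decoder lifetime is visible to everything that touches it.
pub struct DecodedMessage<'decoder> {
    pub id: u32,
    pub payload: &'decoder [u8],
}

// Any longer-lived container is forced to carry the lifetime as well,
// and while it holds the message the pooled buffer cannot be reused.
pub struct JobStore<'decoder> {
    pub pending: Vec<DecodedMessage<'decoder>>,
}
```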
An alternative approach is to reference shared memory only during deserialization and then produce an owned copy of the message. While this introduces some allocation overhead, in practice, especially under higher message volumes, it can be more efficient than the current approach of pooling followed by cloning. The difference is small (on the order of milliseconds), but still favorable.
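As a rough sketch of that alternative (all names here are hypothetical, not the current binary-sv2 API): borrow from the pool only while decoding, then pay a single allocation to detach the message from it.

```rust
// Borrowed view produced during deserialization: `payload` points
// straight into the pooled receive buffer (zero-copy).
pub struct BorrowedMessage<'decoder> {
    pub id: u32,
    pub payload: &'decoder [u8],
}

// Owned copy with no ties to the pool; it can be stored indefinitely.
pub struct OwnedMessage {
    pub id: u32,
    pub payload: Vec<u8>,
}

impl<'decoder> BorrowedMessage<'decoder> {
    // The single allocation happens here; once the borrowed view is
    // consumed, the pooled buffer is free to serve the next frame.
    pub fn into_owned(self) -> OwnedMessage {
        OwnedMessage {
            id: self.id,
            payload: self.payload.to_vec(),
        }
    }
}
```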
As a way forward, we discussed deferring this decision to the end user. The idea is to provide both options: one where users can reference shared pool memory directly, and another where the pool is used only during deserialization, after which an owned copy is returned. This allows users to choose the approach that best fits their use case and lifetime requirements.
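A hedged sketch of what deferring the choice to the user could look like at the API level; the Decoder type and method names are hypothetical, not an existing binary-sv2 interface:

```rust
// Hypothetical decoder exposing both strategies side by side.
pub struct BorrowedMessage<'decoder> {
    pub payload: &'decoder [u8],
}

pub struct OwnedMessage {
    pub payload: Vec<u8>,
}

pub struct Decoder {
    pool: Vec<u8>, // stand-in for the per-connection buffer pool
}

impl Decoder {
    // Option 1: zero-copy; the returned borrow pins the pooled buffer
    // for as long as the caller holds the message.
    pub fn decode_borrowed(&mut self) -> BorrowedMessage<'_> {
        BorrowedMessage { payload: &self.pool[..] }
    }

    // Option 2: the pool is touched only while decoding; the caller
    // gets an owned copy and the buffer is immediately reusable.
    pub fn decode_owned(&mut self) -> OwnedMessage {
        let borrowed = self.decode_borrowed();
        OwnedMessage { payload: borrowed.payload.to_vec() }
    }
}
```

Whether this ends up as two methods, a type parameter, or a feature flag is still an open design question; the point is simply that both lifetime models would remain available to users.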
There are also additional optimizations and improvements we can make to binary-sv2, such as reducing unnecessary allocations, removing types that are not part of the specs, improving type representations, modularizing the codebase, and trimming unused or bloated components. These are not urgent issues and can be addressed incrementally as part of the ongoing refactor of binary-sv2.
For more detailed analysis, check out the benchmarking report: https://hackmd.io/@mG-mY9sEQie4e8K5Bp-V6g/HkVN77N5We