From 7788f2f6ec9053a5347a851f5d98074bdc7e792e Mon Sep 17 00:00:00 2001 From: Charlie Savage Date: Thu, 2 Apr 2026 22:25:56 -0700 Subject: [PATCH 1/3] Add planning docs for type sharing and packaging --- doc/plans/packaging.md | 188 +++++++++++++++++++++++++++++++ doc/plans/type_sharing.md | 228 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 416 insertions(+) create mode 100644 doc/plans/packaging.md create mode 100644 doc/plans/type_sharing.md diff --git a/doc/plans/packaging.md b/doc/plans/packaging.md new file mode 100644 index 00000000..52e26959 --- /dev/null +++ b/doc/plans/packaging.md @@ -0,0 +1,188 @@ +# Packaging Strategies For Rice Extensions + +## Summary + +Packaging is a separate concern from type sharing. + +Rice should define a cross-extension type-sharing mechanism that works regardless of gem topology. Binding authors should then be free to choose the packaging model that fits their library, audience, and maintenance goals. + +This document describes the main packaging shapes Rice should support and the tradeoffs between them. + +## Design Principles + +Packaging should not define type identity. + +Whether a binding ships as one gem, multiple gems, or one gem with multiple native extensions, shared type identity should still come from: + +* Ruby class path +* cached normalized C++ type key + +The packaging layer should remain policy chosen by the binding author. + +## Supported Packaging Shapes + +Rice should support at least these publication patterns. + +### One Gem, One Native Extension + +This is the simplest model. + +Characteristics: + +* one gemspec +* one native extension target +* one primary `require` + +Advantages: + +* easiest to explain to users +* simplest release process +* no cross-extension attach requirements inside the project itself + +Costs: + +* larger native binary +* larger build surface +* weak support for optional feature subsets + +### One Gem, Many Native Extensions + +One gem ships several native libraries and a Ruby loader coordinates them. + +Characteristics: + +* one gemspec +* one published artifact +* multiple native extension targets +* runtime can eager-load or lazy-load extension libraries + +Advantages: + +* keeps installation simple for users +* lets the project split build/runtime boundaries internally +* supports conditional builds based on available upstream modules + +Costs: + +* loader logic becomes more complex +* type sharing inside the gem still matters +* shipped source payload may remain large even if native binaries are split + +### Umbrella Gem Plus Module Gems + +A small top-level gem depends on one or more module gems. + +Characteristics: + +* one user-facing umbrella package +* real implementation split across subgems +* dependency graph controls which modules are installed by default + +Advantages: + +* can reduce install/build surface for optional modules +* keeps a simple top-level entrypoint +* versioning can stay coordinated while artifacts remain split + +Costs: + +* more release management +* more gemspecs and packaging metadata +* more chances for dependency skew if versions drift + +### Only Module Gems + +Users install the exact pieces they need. + +Characteristics: + +* no umbrella package required +* module boundaries are explicit in the package list + +Advantages: + +* smallest install footprint for targeted users +* packaging aligns closely with module boundaries + +Costs: + +* highest complexity for users +* weakest “just install this” story +* version compatibility between modules must be managed carefully + +## Runtime Loading + +Once a project has more than one native extension, it still needs a runtime loading policy. + +The common choices are: + +* eager-load everything that was built +* load a minimal foundation, then lazy-load feature modules +* load only core by default and require explicit module entrypoints + +Rice should not force one policy globally. + +## Wrapper Namespaces And Packaging + +Wrapper namespace choice affects packaging freedom. + +If generated helper/container types stay global in `Rice::*` and `Std::*`: + +* wrapper sharing becomes effectively process-global +* unrelated projects are more likely to collide +* packaging flexibility exists, but global namespace pressure remains + +If generated wrapper types move under the binding family's public namespace: + +* packaging becomes more independent across projects +* one library family can choose one gem while another chooses many gems +* unrelated projects are less likely to interfere with each other + +So packaging and wrapper namespaces are separate decisions, but they influence each other strongly. + +## Example: OpenCV + +`opencv-ruby` can reasonably choose: + +* one published gem +* multiple internal native module libraries +* CMake deciding which pieces can be built from the locally available OpenCV installation +* runtime lazy-loading of feature modules + +This is a good fit when: + +* the author values a simple install command +* shipped source volume is not a major concern +* local build variability is acceptable + +## Example: Qt + +A Qt binding author may reasonably choose: + +* multiple gems +* package boundaries closer to upstream Qt modules +* explicit dependencies between those gems + +This is a good fit when: + +* package boundaries are important to the author +* optional install surface matters +* upstream module structure is meaningful to users + +## Recommendation + +Rice should not prescribe one universal packaging model. + +The preferred Rice-level position is: + +* type sharing is a mechanism Rice provides +* packaging is a policy binding authors choose +* docs should present supported patterns and tradeoffs, not one mandatory topology + +## Non-Goals + +This document does not attempt to: + +* define a mandatory gem naming convention +* require one packaging shape for all Rice users +* make packaging decisions part of type identity diff --git a/doc/plans/type_sharing.md b/doc/plans/type_sharing.md new file mode 100644 index 00000000..edc0e6b1 --- /dev/null +++ b/doc/plans/type_sharing.md @@ -0,0 +1,228 @@ +# Type Sharing Across Rice Extensions + +## Summary + +Rice type sharing matters whenever more than one Rice extension is loaded into the same Ruby process. This includes libraries split into multiple native modules or gems, but it also includes independently authored Rice extensions that need to interoperate correctly. Modular packaging is only one instance of the broader cross-extension type-sharing problem. + +Rice should improve cross-extension correctness without trying to rescue extensions that define public classes in namespaces they do not own. Ruby modules and class paths remain the primary ownership mechanism. + +## Why This Matters + +There are at least two important use cases: + +* one project split into multiple native extensions, such as OpenCV or Qt bindings +* multiple independent Rice extensions from different authors loaded into the same Ruby process + +Both cases run into the same underlying problem: one extension may need to recognize, accept, or return Ruby classes and C++ types that were first bound by another extension. + +## Current Behavior + +Rice has two distinct layers of type state: + +* `Data_Type` local static binding state inside a compiled extension +* `TypeRegistry` local runtime lookup/cache state used for conversion, verification, and inheritance + +These are related but not the same thing. + +Historically, split Rice-based bindings could appear to work on Linux because default symbol visibility sometimes allowed Rice globals to be shared implicitly across `.so` files. In practice, the first loaded extension could become the master `TypeRegistry`. Issue [#355](https://github.com/ruby-rice/rice/issues/355) shows this clearly: split Qt bindings worked on Linux but failed on Windows. + +That accidental Linux behavior should not be treated as architecture. The default `CMakePresets.json` now tells GCC and Clang not to export symbols by default, so implicit Rice-global sharing should no longer be assumed even on Linux. + +The main current failure is not that `bind()` is too simple. The main failure is that lazy access to a locally unbound `Data_Type` fails too early: + +* `klass()` and `ruby_data_type()` call `check_is_bound()` and raise +* `is_descendant()` silently returns `false` if local state is missing + +## Identity And Compatibility + +The canonical identity for a shared binding is the Ruby class path. + +If two extensions define classes in a namespace they do not own and collide, that is developer error. Rice relies on normal Ruby namespace ownership rules rather than introducing a second ownership model. + +The compatibility check for a shared binding is: + +* same Ruby class path +* same normalized C++ type key cached on that Ruby class + +This same rule can also be used when discovering derived classes through Ruby subclass relationships. + +## Lazy Attach Design + +The primary change should be a non-throwing lazy attach path, such as `try_bind()`, not a redesign of `bind()` itself. + +`try_bind()` should: + +* return immediately if local static state is already attached +* look up the existing Ruby class for `T` +* validate the cached normalized C++ type key +* hydrate local `klass_` +* hydrate local `rb_data_type_` +* add the local `TypeRegistry` entry +* update any waiting `unbound_instances()` +* return success or failure without throwing + +Then: + +* `check_is_bound()` should call `try_bind()` before throwing +* `klass()` and `ruby_data_type()` continue to rely on `check_is_bound()` +* `is_descendant()` should call `try_bind()` before returning `false` + +Caching should be asymmetric: + +* cache success locally +* do not cache failure + +Load order will still matter. A miss can simply mean that the owning extension has not been loaded yet, so negative caching would be incorrect. + +## Inherited Types + +Arguments typed as a base class are already mostly fine. `From_Ruby` and similar conversions rely on Ruby subclass checks, so another extension does not need to know every concrete subclass just to accept a `Base`. + +The harder case is polymorphic returns typed as `Base*` or `Base&`. + +Today, `TypeRegistry::figureType()` already behaves like this: + +* return the most-derived locally known type if it is in the local registry +* otherwise fall back to the requested/base type + +That base fallback is acceptable as the baseline behavior for v1. + +The preferred enhancement is: + +* on a local derived-type miss +* walk the Ruby subclass tree of the known base class +* compare each subclass's cached normalized C++ type key to the runtime type +* import matching descendants into the local cache +* retry lookup before falling back to base + +This keeps `TypeRegistry` local while still improving cross-extension polymorphic returns for classes that are already loaded somewhere in Ruby. + +## Shared `TypeRegistry` As An Alternative + +The obvious alternative is to export or share one process-wide `TypeRegistry`. + +That explains historical Linux behavior, where the first loaded extension could effectively become the master registry. It also explains why split bindings could appear to work without any explicit attach mechanism. + +But this approach has major drawbacks: + +* it is load-order dependent +* it is not portable to Windows +* it no longer matches the default hidden-symbol build configuration +* it is still not sufficient by itself, because local `Data_Type` state may remain unbound +* it creates tighter native coupling between extensions + +So shared `TypeRegistry` is a real historical behavior, not a hypothetical one, but it should not be the preferred design. + +## Wrapper Namespace Options + +Rice auto-generates essential helper and container types for large bindings. For libraries such as OpenCV, manually naming every instantiated template type is not realistic. + +Examples include: + +* `Rice::Pointer` +* `Rice::Buffer` +* `Std::Vector` +* `Std::Map` +* `Std::SharedPtr` + +There are two viable namespace models. + +### Option 1: Global Rice-Owned Wrapper Namespaces + +Generated helper/container classes stay in global public namespaces: + +* `Rice::*` +* `Std::*` + +Under this model: + +* generated classes are process-global by Ruby class path +* sharing is mandatory for matching `Rice::*` and `Std::*` class paths +* one gem or many gems still share the same generated wrapper space + +Advantages: + +* closest to current Rice behavior and docs +* no new namespace configuration surface +* existing examples remain valid + +Costs: + +* unrelated Rice extensions can collide in the same wrapper namespace +* sharing is mandatory, not policy-controlled +* one global generated-wrapper universe is forced onto the whole process + +### Option 2: Binding-Family-Owned Wrapper Namespaces + +Generated helper/container classes live under the binding family's public namespace. + +For OpenCV, examples could be: + +* `Cv::Rice::Pointer` +* `Cv::Rice::Buffer` +* `Cv::Std::Vector` +* `Cv::Std::Vector` + +Qt could choose its own parallel structure. + +Under this model: + +* generated wrappers are shared within one binding family +* unrelated projects no longer collide just because both use Rice +* packaging choices become freer because wrapper identity is scoped to the family namespace + +Advantages: + +* much better isolation between unrelated extensions +* better fit for modular multi-gem or multi-extension libraries +* same Ruby class path is a stronger identity rule because the namespace is author-owned + +Costs: + +* larger public API change +* Rice needs configuration for generated wrapper namespace roots +* generated wrapper names become part of each binding family's public API + +Preferred long-term direction: + +* binding-family-owned wrapper namespaces + +## Packaging + +Packaging should remain author policy, not a Rice-level requirement. + +The type-sharing mechanism should work regardless of whether a project chooses: + +* one gem, one native extension +* one gem, many native extensions +* umbrella gem plus module gems +* only module gems + +Packaging tradeoffs and examples are documented separately in [packaging.md](packaging.md). + +## Load Order + +Load order will still matter. + +Before the owning extension is loaded: + +* `try_bind()` can miss +* hard-use sites such as `klass()` and `ruby_data_type()` will still raise +* soft-probe sites such as `is_descendant()` should simply return `false` + +After the owning extension is loaded: + +* the next `try_bind()` can succeed +* local state attaches once +* future accesses become cheap + +The goal is not to eliminate load-order sensitivity. The goal is to make it predictable and avoid poisoning local state with cached misses. + +## Non-Goals + +This design does not attempt to: + +* rescue extensions that define public classes in namespaces they do not own +* solve arbitrary unrelated collisions in the global namespace +* infer full native compatibility beyond Ruby class path plus normalized C++ type key +* force one packaging topology on all Rice users From 2c8c79a2718f726f391d94fbdaf0a18b517f0d69 Mon Sep 17 00:00:00 2001 From: Charlie Savage Date: Thu, 9 Apr 2026 01:05:02 -0700 Subject: [PATCH 2/3] Fix undefined behavior when deleting polymorphic classes with non-virtual destructors. The Wrapper destructor guard used is_abstract_v which only caught abstract classes. Concrete polymorphic classes (virtual methods but no virtual destructor) still triggered delete, causing UB. Use is_polymorphic_v instead to cover all polymorphic types. Co-Authored-By: Claude Opus 4.6 (1M context) --- rice/detail/Wrapper.ipp | 6 +++--- test/test_Constructor.cpp | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/rice/detail/Wrapper.ipp b/rice/detail/Wrapper.ipp index 718189ea..19a94c1c 100644 --- a/rice/detail/Wrapper.ipp +++ b/rice/detail/Wrapper.ipp @@ -121,10 +121,10 @@ namespace Rice::detail if constexpr (is_complete_v) { - // is_abstract_v requires a complete type, so nest inside is_complete_v. - // Deleting an abstract class through a non-virtual destructor is UB, + // is_polymorphic_v requires a complete type, so nest inside is_complete_v. + // Deleting a polymorphic class through a non-virtual destructor is UB, // but it is safe if the destructor is virtual. - if constexpr (std::is_destructible_v && (!std::is_abstract_v || std::has_virtual_destructor_v)) + if constexpr (std::is_destructible_v && (!std::is_polymorphic_v || std::has_virtual_destructor_v)) { if (this->isOwner_) { diff --git a/test/test_Constructor.cpp b/test/test_Constructor.cpp index 287003b8..e2720753 100644 --- a/test/test_Constructor.cpp +++ b/test/test_Constructor.cpp @@ -230,6 +230,41 @@ TESTCASE(constructor_move) //c.define_constructor(Constructor()); } +namespace +{ + // Polymorphic (has virtual methods) but non-virtual destructor. + // Deleting through a pointer is UB in this case. Rice's + // Wrapper::~Wrapper should not delete such objects. + class PolymorphicNonVirtualDtor + { + public: + PolymorphicNonVirtualDtor() : value_(42) {} + ~PolymorphicNonVirtualDtor() = default; + + virtual int value() + { + return value_; + } + + private: + int value_; + }; +} + +TESTCASE(constructor_polymorphic_non_virtual_dtor) +{ + Data_Type rb_cPolymorphicNonVirtualDtor(anonymous_class()); + rb_cPolymorphicNonVirtualDtor + .define_constructor(Constructor()) + .define_method("value", &PolymorphicNonVirtualDtor::value); + + Object o = rb_cPolymorphicNonVirtualDtor.call("new"); + ASSERT_EQUAL(rb_cPolymorphicNonVirtualDtor, o.class_of()); + + Object value = o.call("value"); + ASSERT_EQUAL(42, detail::From_Ruby().convert(value)); +} + namespace { class MoveOnlyValue From 637c248ef5cd6bf40de5b24fe51472dd8080c9dd Mon Sep 17 00:00:00 2001 From: Charlie Savage Date: Thu, 9 Apr 2026 01:05:49 -0700 Subject: [PATCH 3/3] Bump version to 4.12.0 and update changelog. Co-Authored-By: Claude Opus 4.6 (1M context) --- CHANGELOG.md | 21 ++++++++++++++++++++- lib/rice/version.rb | 2 +- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 25d5ea20..8caaab13 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,12 +1,31 @@ # Changelog -## Next Release +## 4.12.0 (2026-04-09) + +### Bug Fixes + +* Fix undefined behavior when deleting polymorphic classes with non-virtual destructors +* Fix overload resolution to prefer `T&` or `const T&` over `T&&` after perfect forwarding changes +* Fix `Reference` constructor overload resolution so forwarded arguments bind to the typed overload instead of falling through to `Reference(VALUE)` +* Reject incomplete `std::unique_ptr` bindings at compile time ### Enhancements +* Forward `Object::call` arguments without copying +* Forward Rice constructor rvalue arguments +* Forward `std::function` callback arguments +* Add support for `To_Ruby` for types missing it * Allow raw Ruby procs and lambdas to convert directly to `std::function` parameters while keeping the callable alive with `Pin` * Only define equality-based STL container methods when the contained type supports C++ equality comparison +## 4.11.5 (2026-03-24) + +### Bug Fixes + +* Fix Valgrind invalid reads caused by stale GC root addresses (#399) +* Fix `rb_gc_register_address()` triggering GC before Anchor stores heap object — use `RB_GC_GUARD` to keep VALUE alive +* Fix gem packaging warning + ## 4.11.4 (2026-03-13) ### Bug Fixes diff --git a/lib/rice/version.rb b/lib/rice/version.rb index 89e02b2d..e0de85a9 100644 --- a/lib/rice/version.rb +++ b/lib/rice/version.rb @@ -1,3 +1,3 @@ module Rice - VERSION = "4.11.4" + VERSION = "4.12.0" end