[STF] Refactor data_place to use polymorphic design #7976

Open
andralex wants to merge 10 commits into NVIDIA:main from caugonnet:stf_data_place_polymorphic
@andralex

Summary

Replaces the ad-hoc tagged union design in data_place with a clean polymorphic architecture.

Before: data_place used an int devid plus separate shared_ptr<composite_state> and shared_ptr<data_place_extension> members — an awkward mix of enum-based type tags with extension pointers.

After:

  • data_place_interface (new data_place_interface.cuh) — abstract virtual interface for all place types, with sensible defaults for hash(), equals(), less_than(), and mem_create()
  • data_place_impl.cuh — concrete implementations: data_place_host, data_place_managed, data_place_device, data_place_affine, data_place_device_auto, data_place_invalid
  • data_place_composite — defined in places.cuh after exec_place_grid (needed due to its dependency on that type)
  • data_place — now holds a shared_ptr<data_place_interface> and delegates all operations to it
  • Static singleton instances with no-op deleters for common types (host, managed, invalid) to avoid per-use allocations
  • Removes data_place_extension.cuh — custom extensions now inherit directly from data_place_interface
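
The handle-plus-interface layout described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `place_interface`, `host_place`, `device_place`, `place`, and `impl()` are hypothetical stand-ins for `data_place_interface`, its concrete implementations, and `data_place`, and the real interface has more virtual functions.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for data_place_interface.
class place_interface
{
public:
  virtual ~place_interface() = default;
  virtual std::string to_string() const = 0;
  virtual std::size_t hash() const      = 0;
};

// Hypothetical stand-in for data_place_host.
class host_place final : public place_interface
{
public:
  std::string to_string() const override { return "host"; }
  std::size_t hash() const override { return std::hash<int>{}(-1); }
};

// Hypothetical stand-in for data_place_device.
class device_place final : public place_interface
{
public:
  explicit device_place(int id) : id_(id) {}
  std::string to_string() const override { return "dev" + std::to_string(id_); }
  std::size_t hash() const override { return std::hash<int>{}(id_); }

private:
  int id_;
};

// Value-type handle: owns a shared_ptr to the interface and forwards every call.
class place
{
public:
  static place host()
  {
    // One static instance, wrapped once in a shared_ptr whose deleter does
    // nothing, so the static object is never freed. Later calls only copy the
    // shared_ptr and perform no allocation.
    static host_place instance;
    static const std::shared_ptr<const place_interface> shared(&instance, [](const place_interface*) {});
    return place(shared);
  }

  static place device(int id) { return place(std::make_shared<device_place>(id)); }

  std::string to_string() const { return pimpl_->to_string(); }
  std::size_t hash() const { return pimpl_->hash(); }
  const place_interface* impl() const { return pimpl_.get(); }

private:
  explicit place(std::shared_ptr<const place_interface> impl) : pimpl_(std::move(impl)) {}
  std::shared_ptr<const place_interface> pimpl_;
};
```

In this sketch, `place::host()` always hands back a handle to the same object, while `place::device(1)` allocates a fresh implementation on each call.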

Test plan

  • Build cudax.test.stf and verify no regressions

Made with Cursor

Replace the ad-hoc tagged union design in data_place with a clean
polymorphic architecture using data_place_interface as the abstract
base class.

Key changes:
- Add data_place_interface.cuh defining the virtual interface for all
  data place types with sensible defaults for hash/equals/less_than
- Add data_place_impl.cuh with concrete implementations: data_place_host,
  data_place_managed, data_place_device, data_place_affine, etc.
- Refactor data_place to hold shared_ptr<data_place_interface> and
  delegate all operations to the implementation
- Use static instances with no-op deleters for common singletons (host,
  managed, invalid) to avoid repeated allocations
- Remove data_place_extension.cuh - extensions now inherit directly
  from data_place_interface and provide get_affine_exec_impl()
- Update green_context to inherit from data_place_interface directly

This design enables cleaner extensibility and eliminates the awkward
mix of enum-based type tags with extension pointers.

Made-with: Cursor
@andralex andralex requested a review from a team as a code owner March 10, 2026 15:57
@andralex andralex requested a review from caugonnet March 10, 2026 15:57
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 10, 2026
@copy-pr-bot

copy-pr-bot bot commented Mar 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Mar 10, 2026
@andralex

pre-commit.ci autofix

Require each concrete data_place implementation to provide its own hash()
instead of relying on the default get_device_ordinal()-based implementation.
Also fix stray includes of the getStream/getDataStream rename and a duplicate
interpreted_execution_policy constructor that shouldn't have been in this PR.

Made-with: Cursor
- Make hash() pure virtual; each concrete class provides its own implementation
- Replace equals()/less_than() with a single pure virtual cmp() returning -1/0/1;
  data_place::operator== and operator< delegate to pimpl_->cmp() directly
- Move device ordinals into data_place_interface as a nested unscoped enum ord,
  accessible as data_place_interface::host, ::managed, etc.
- Cache data_place_device instances in a static array (same pattern as
  exec_place::device), avoiding per-call allocations
- Remove nop_deleter struct; inline the lambda directly in make_static_instance
- data_place::device() now asserts dev_id >= 0, dropping legacy negative-id
  special cases
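
A rough sketch of how a single `cmp()` can back both operators, together with a cached device table, is shown below. It assumes simplified, hypothetical types (`place_interface`, `simple_place`, `place`), an illustrative `max_devices` bound, and illustrative ordinal values for `host` and `managed`; none of these are taken from the PR itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

class place_interface
{
public:
  // Nested unscoped enum; the numeric values here are assumptions.
  enum ord : int { managed = -2, host = -1 };

  virtual ~place_interface() = default;
  virtual int ordinal() const = 0;
  // Three-way comparison: returns -1, 0, or 1.
  virtual int cmp(const place_interface& rhs) const = 0;
};

// Hypothetical place that is fully described by its ordinal.
class simple_place final : public place_interface
{
public:
  explicit simple_place(int value) : ordinal_(value) {}
  int ordinal() const override { return ordinal_; }
  int cmp(const place_interface& rhs) const override
  {
    const int r = rhs.ordinal();
    return (ordinal_ < r) ? -1 : (ordinal_ > r) ? 1 : 0;
  }

private:
  int ordinal_;
};

class place
{
public:
  static constexpr std::size_t max_devices = 16; // hypothetical bound

  static place host()
  {
    static const std::shared_ptr<const place_interface> impl =
      std::make_shared<simple_place>(place_interface::host);
    return place(impl);
  }

  static place device(int dev_id)
  {
    // Negative ids are rejected instead of being mapped to special places.
    assert(dev_id >= 0 && static_cast<std::size_t>(dev_id) < max_devices);
    // The table is built once; later calls only copy a shared_ptr.
    static const auto cache = [] {
      std::array<std::shared_ptr<const place_interface>, max_devices> a;
      for (std::size_t i = 0; i < max_devices; ++i)
      {
        a[i] = std::make_shared<simple_place>(static_cast<int>(i));
      }
      return a;
    }();
    return place(cache[static_cast<std::size_t>(dev_id)]);
  }

  // Both operators are thin wrappers over the same virtual cmp().
  friend bool operator==(const place& a, const place& b) { return a.pimpl_->cmp(*b.pimpl_) == 0; }
  friend bool operator<(const place& a, const place& b) { return a.pimpl_->cmp(*b.pimpl_) < 0; }

  const place_interface* impl() const { return pimpl_.get(); }

private:
  explicit place(std::shared_ptr<const place_interface> impl) : pimpl_(std::move(impl)) {}
  std::shared_ptr<const place_interface> pimpl_;
};
```

With one comparison function per implementation, equality and ordering cannot drift apart, which is a risk when `equals()` and `less_than()` are overridden separately.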

Made-with: Cursor
@andralex

pre-commit.ci autofix

@andralex

/ok to test d67f93f

@andralex

/ok to test 10c3877

…posite::cmp()

Replace ordered comparison of function pointers with std::less, which
provides a well-defined total order on pointers without triggering
Clang's -Wordered-compare-function-pointers diagnostic.
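
As an illustration of the technique, the sketch below builds a three-way comparison of function pointers on top of `std::less`. The `partition_fn` alias and the `blocked` and `cyclic` functions are hypothetical placeholders, not the partitioner type used by `data_place_composite`.

```cpp
#include <functional>

// Hypothetical function-pointer type standing in for a partitioner.
using partition_fn = int (*)(int);

// Three-way comparison built on std::less, whose pointer specializations
// yield a strict total order even where the built-in < is not portable.
inline int cmp_fn(partition_fn a, partition_fn b)
{
  const std::less<partition_fn> lt{};
  if (lt(a, b))
  {
    return -1;
  }
  if (lt(b, a))
  {
    return 1;
  }
  return 0;
}

// Two distinct placeholder functions to compare.
inline int blocked(int i)
{
  return i;
}

inline int cyclic(int i)
{
  return -i;
}
```

Which of `blocked` and `cyclic` sorts first depends on the implementation, but the result is consistent and antisymmetric within a program, which is what a `cmp()` used for ordered containers needs.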

Made-with: Cursor
@andralex

/ok to test 4d74379

@caugonnet caugonnet added the stf (Sequential Task Flow programming model) and places labels Mar 10, 2026
// For simple places, compare devid
return devid < rhs.devid;
EXPECT((!is_composite() && !rhs.is_composite()), "Ordering of composite places is not implemented.");

It is implemented.

…conversion

Use structural grid comparison (shape + element-by-element) in
data_place_composite::cmp(), matching the exec_place_grid comparison model
as Cedric suggested.

Add from_index() as the proper inverse of to_index(), and fix the
reduction plan code that was incorrectly using data_place::device(n-2)
for all node indices, which crashed when n < 2 (host/managed nodes).
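
The bug and its fix can be illustrated with a small sketch of an index mapping and its inverse. The `place` struct, the `place_kind` enum, and the specific layout (managed at 0, host at 1, device d at d + 2) are assumptions chosen for illustration; the commit message only implies that indices below 2 denote host and managed nodes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified place: a kind tag plus a device id for devices.
enum class place_kind { managed, host, device };

struct place
{
  place_kind kind;
  int dev_id; // meaningful only when kind == place_kind::device
};

// Assumed layout: managed -> 0, host -> 1, device d -> d + 2.
inline std::size_t to_index(const place& p)
{
  switch (p.kind)
  {
    case place_kind::managed:
      return 0;
    case place_kind::host:
      return 1;
    default:
      assert(p.dev_id >= 0);
      return static_cast<std::size_t>(p.dev_id) + 2;
  }
}

// Proper inverse of to_index(): indices 0 and 1 are handled explicitly
// instead of being turned into a device with a negative id.
inline place from_index(std::size_t n)
{
  if (n == 0)
  {
    return place{place_kind::managed, -1};
  }
  if (n == 1)
  {
    return place{place_kind::host, -1};
  }
  return place{place_kind::device, static_cast<int>(n - 2)};
}
```

Under this layout, computing a device id as `n - 2` for every index would produce a negative id for the first two indices, which a `device()` factory that asserts `dev_id >= 0` rejects.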

Made-with: Cursor
@andralex

/ok to test

@copy-pr-bot

copy-pr-bot bot commented Mar 10, 2026

/ok to test

@andralex, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@andralex

/ok to test 536826a

@andralex

/ok to test 3542521

The friend definition inside data_place is only found via ADL when a
data_place argument is present. Calls like from_index(n) in logical_data.cuh
have no data_place argument, so add a matching declaration at namespace
scope so the identifier is visible to unqualified lookup.
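
The lookup rule involved can be reproduced in a few lines. The sketch below uses a hypothetical `demo` namespace and a stripped-down `place` class, and `round_trip` is an illustrative helper; only the shape of the problem mirrors the PR.

```cpp
namespace demo
{
class place;

// Namespace-scope declaration. Without it, the friend defined inside the
// class is invisible to the unqualified call from_index(n) in round_trip(),
// because that call has no argument of type place to trigger ADL.
place from_index(int n);

class place
{
public:
  explicit place(int value) : value_(value) {}

  // Friend defined in the class body; matches the declaration above.
  friend place from_index(int n) { return place(n); }

  // Friend found through ADL alone, since it takes a place argument.
  friend int to_index(const place& p) { return p.value_; }

private:
  int value_;
};

// from_index(n) resolves thanks to the namespace-scope declaration;
// to_index(...) resolves through ADL on its place argument.
inline int round_trip(int n)
{
  return to_index(from_index(n));
}
} // namespace demo
```

Removing the declaration of `from_index` before the class makes `round_trip` fail to compile, while `to_index` keeps working because its argument brings `demo::place` into the lookup.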

Made-with: Cursor
@andralex andralex force-pushed the stf_data_place_polymorphic branch from 3542521 to ede252b Compare March 11, 2026 01:09
@andralex

/ok to test ede252b

@github-actions

😬 CI Workflow Results

🟥 Finished in 31m 33s: Pass: 93%/48 | Total: 12h 15m | Max: 25m 03s | Hits: 68%/21910

See results here.
