Skip to content

Bringing all default options to one place#1028

Open
nirandaperera wants to merge 14 commits into
rapidsai:mainfrom
nirandaperera:default-options
Open

Bringing all default options to one place#1028
nirandaperera wants to merge 14 commits into
rapidsai:mainfrom
nirandaperera:default-options

Conversation

@nirandaperera
Copy link
Copy Markdown
Contributor

@nirandaperera nirandaperera commented May 12, 2026

[C++/Python] Centralize option keys and defaults via OptionDescriptor

Option keys and default values were scattered as raw string literals across C++ and Python call sites, making it easy for the two languages to drift and for typos to go unnoticed. This PR consolidates every recognised configuration option into a single source of truth in cpp/include/rapidsmpf/config.hpp and exposes the same keys and defaults to Python through a new rapidsmpf.config_defaults module that pulls the values directly from the C++ header at import time.

  • Add OptionDescriptor{key, default_val} (non-templated) in config.hpp, with both fields stored as std::string_view. Defaults are written as string literals (e.g. "false", "16", "1ms", "WARN") since options are always parsed from their string representation — the descriptor's default flows through the same factory the call site uses for user-supplied values.
  • Replace ad-hoc key/default literals across statistics.cpp, buffer_resource.cpp, pinned_memory_resource.cpp, communicator.cpp, ucxx.cpp, coro_executor.cpp, and memory_reserve_or_wait.cpp with the descriptor-based API. Typed defaults (bool / size_t / uint32_t) are parsed via parse_string<T> so the canonical string form is the single source of truth.
  • Widen Options::get to accept std::string_view so descriptor keys can be passed directly (existing std::string / const char* callers still work via implicit conversion).
  • Widen parse_string<T> to accept std::string_view so descriptor defaults flow through unchanged.
  • Introduce rapidsmpf/config_defaults.{pyx,pyi} exposing one Final[str] constant per option (e.g. BUFFER_RESOURCE_NUM_STREAMS) plus a read-only DEFAULTS: Mapping[str, str] (MappingProxyType) sourced from the C++ header via cdef extern. A single RMPF_OPT macro re-exposes key.data() and default_val.data() as null-terminated const char* aliases — safe because both views are initialised from string literals.
  • Migrate Python consumers (memory/buffer_resource.pyx, streaming/core/memory_reserve_or_wait.pyx, integrations/core.py, tests/test_config.py) to import from config_defaults instead of using raw string keys / hard-coded defaults. Consumers that need typed values parse the string default on-site (e.g. int(_OPTION_DEFAULTS[BUFFER_RESOURCE_NUM_STREAMS]), parse_boolean(_OPTION_DEFAULTS[STREAMING_ALLOW_OVERBOOKING_BY_DEFAULT])), mirroring the C++ side.
  • Promote the previously unnamed streaming option to streaming::AllowOverbookingByDefaultOption and drop the Factor suffix from the pinned-pool size descriptors.

Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera requested review from a team as code owners May 12, 2026 23:21
@nirandaperera nirandaperera added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels May 12, 2026
@nirandaperera nirandaperera marked this pull request as draft May 13, 2026 00:08
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 13, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera marked this pull request as ready for review May 13, 2026 07:26
Comment thread cpp/include/rapidsmpf/config.hpp
@nirandaperera nirandaperera requested a review from madsbk May 13, 2026 17:20
@nirandaperera nirandaperera changed the title Bringing all default options in one place Bringing all default options to one place May 13, 2026
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Copy link
Copy Markdown
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove CMakeFiles/test_sources.dir/test_host_buffer.cpp.o

Comment thread cpp/include/rapidsmpf/config.hpp Outdated
*/
template <typename T>
T const& get(std::string const& key, OptionFactory<T> factory) {
T const& get(std::string_view key, OptionFactory<T> factory) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just keep std::string const& key?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wanted to preserve the constexpr for OptionDescriptor

Comment thread cpp/include/rapidsmpf/config.hpp Outdated
Comment on lines +402 to +412
* Both `key` and `default_val` are stored as `std::string_view`. Options are
* always parsed from their string representation at runtime, so the default
* is expressed as a string and fed through the same factory the call site
* uses for user-supplied values. Descriptors must be initialized from string
* literals so that `key.data()` and `default_val.data()` yield
* null-terminated `char const*` pointers, which are consumed directly by the
* Cython bindings.
*/
struct OptionDescriptor {
std::string_view key; ///< Lookup key passed to `Options::get`.
std::string_view default_val; ///< String form of the value used when unset.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just use std::string to avoid this complication, I don't think it will make any difference performance wise?

Comment on lines +242 to +243
T parse_string(std::string_view text) {
std::stringstream sstream{std::string{text}};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not std::string const& text and avoid a copy?

Signed-off-by: niranda perera <niranda.perera@gmail.com>
Copy link
Copy Markdown
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nirandaperera could you reconsider the string-handling complexity here?

I would like to keep a plain std::string const& key in Options::get, and drop the StringHash / transparent-lookup machinery along with it.

Tracing it back, the whole apparatus seems to exist to serve one decision: making the OptionDescriptors inline constexpr. That forces std::string_view members, which then forces the heterogeneous-hashing dance in get() to avoid a temporary allocation per call.

But nothing in the codebase actually consumes the descriptors in a constexpr context. If we switch the descriptors to inline const with std::string members, the chain collapses:

options.get(EnabledOption.key, ...)

binds directly to std::string const&, with no temporary and no transparent hashing.

The only thing that matters at runtime here is the performance of Options::get() itself.

Signed-off-by: niranda perera <niranda.perera@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants