Bringing all default options to one place #1028
Signed-off-by: niranda perera <niranda.perera@gmail.com>
madsbk left a comment
Remove CMakeFiles/test_sources.dir/test_host_buffer.cpp.o
   */
  template <typename T>
- T const& get(std::string const& key, OptionFactory<T> factory) {
+ T const& get(std::string_view key, OptionFactory<T> factory) {
Why not just keep std::string const& key?
I wanted to keep OptionDescriptor usable as constexpr.
 * Both `key` and `default_val` are stored as `std::string_view`. Options are
 * always parsed from their string representation at runtime, so the default
 * is expressed as a string and fed through the same factory the call site
 * uses for user-supplied values. Descriptors must be initialized from string
 * literals so that `key.data()` and `default_val.data()` yield
 * null-terminated `char const*` pointers, which are consumed directly by the
 * Cython bindings.
 */
struct OptionDescriptor {
    std::string_view key;  ///< Lookup key passed to `Options::get`.
    std::string_view default_val;  ///< String form of the value used when unset.
Maybe just use std::string to avoid this complication? I don't think it will make any difference performance-wise.
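For readers following the thread, the null-termination constraint in the doc comment can be illustrated with a minimal sketch. The descriptor name `NumStreamsOption`, its key string, and the helper `is_c_string_safe` are hypothetical and not taken from the PR; the sketch assumes C++17.

```cpp
#include <cstring>
#include <string_view>

// Hypothetical mirror of the descriptor under review; names are illustrative.
struct OptionDescriptor {
    std::string_view key;
    std::string_view default_val;
};

// Initialized from string literals, so the viewed storage is null-terminated
// and lives for the whole program.
inline constexpr OptionDescriptor NumStreamsOption{"num_streams", "16"};

// Usable at compile time because both members are literal types.
static_assert(NumStreamsOption.key == "num_streams");
static_assert(NumStreamsOption.default_val.size() == 2);

// Hypothetical check: true only when the byte just past the view is '\0',
// i.e. when `data()` can be handed to a C API as a `char const*`.
// Only meaningful for views into storage that is null-terminated somewhere.
inline bool is_c_string_safe(std::string_view sv) {
    return std::strlen(sv.data()) == sv.size();
}
```

A view created with `substr` over a longer literal would fail this check, which is why the comment requires descriptors to be initialized from whole string literals.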
T parse_string(std::string_view text) {
    std::stringstream sstream{std::string{text}};
Why not std::string const& text and avoid a copy?
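To make the trade-off concrete, here is a hedged sketch of what a `std::string_view`-based `parse_string` might look like. The error handling is an assumption and not taken from the PR. Note that `std::stringstream` keeps its own buffer, so a `std::string const&` parameter would skip the temporary `std::string` but the characters would still be copied into the stream.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for the helper under review: parse `text` as a `T`
// using stream extraction, and throw if extraction fails.
template <typename T>
T parse_string(std::string_view text) {
    // The stream owns its buffer, so the characters are materialized as a
    // std::string here regardless of the parameter type.
    std::stringstream sstream{std::string{text}};
    T value{};
    sstream >> value;
    if (sstream.fail()) {
        throw std::invalid_argument("cannot parse \"" + std::string{text} + "\"");
    }
    return value;
}
```

With this signature, string literals, `std::string` objects, and descriptor defaults can all be passed without an explicit conversion at the call site.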
madsbk left a comment
@nirandaperera could you reconsider the string-handling complexity here?
I would like to keep a plain std::string const& key in Options::get, and drop the StringHash / transparent-lookup machinery along with it.
Tracing it back, the whole apparatus seems to exist to serve one decision: making the OptionDescriptors inline constexpr. That forces std::string_view members, which then forces the heterogeneous-hashing dance in get() to avoid a temporary allocation per call.
But nothing in the codebase actually consumes the descriptors in a constexpr context. If we switch the descriptors to inline const with std::string members, the chain collapses:
options.get(EnabledOption.key, ...) binds directly to std::string const&, with no temporary and no transparent hashing.
The only thing that matters at runtime here is the performance of Options::get() itself.
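The proposed alternative can be sketched as follows. This is a minimal illustration, not the actual rapidsmpf code: the key string `"enabled"` and the helpers `is_set` and `received_key` are hypothetical, and a plain `std::unordered_map` stands in for the storage behind `Options`.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical descriptor with owning std::string members (not constexpr).
struct OptionDescriptor {
    std::string key;
    std::string default_val;
};

// `inline const` yields a single program-wide instance, constructed once
// during program startup.
inline const OptionDescriptor EnabledOption{"enabled", "false"};

// Stand-in for a lookup keyed by std::string const&: no transparent hasher
// is needed because the argument already has the map's key type.
inline bool is_set(
    std::unordered_map<std::string, std::string> const& options,
    std::string const& key
) {
    return options.find(key) != options.end();
}

// Returns the address of the string the callee received, to show that
// passing `EnabledOption.key` binds the reference without a temporary.
inline std::string const* received_key(std::string const& key) {
    return &key;
}
```

Because `received_key(EnabledOption.key)` returns the address of the descriptor's own member, the call creates no temporary `std::string`, which is the property the comment relies on.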
[C++/Python] Centralize option keys and defaults via `OptionDescriptor`

Option keys and default values were scattered as raw string literals across C++ and Python call sites, making it easy for the two languages to drift and for typos to go unnoticed. This PR consolidates every recognised configuration option into a single source of truth in `cpp/include/rapidsmpf/config.hpp` and exposes the same keys and defaults to Python through a new `rapidsmpf.config_defaults` module that pulls the values directly from the C++ header at import time.

- `OptionDescriptor{key, default_val}` (non-templated) in `config.hpp`, with both fields stored as `std::string_view`. Defaults are written as string literals (e.g. `"false"`, `"16"`, `"1ms"`, `"WARN"`) since options are always parsed from their string representation; the descriptor's default flows through the same factory the call site uses for user-supplied values.
- `statistics.cpp`, `buffer_resource.cpp`, `pinned_memory_resource.cpp`, `communicator.cpp`, `ucxx.cpp`, `coro_executor.cpp`, and `memory_reserve_or_wait.cpp` now use the descriptor-based API. Typed defaults (bool / size_t / uint32_t) are parsed via `parse_string<T>` so the canonical string form is the single source of truth.
- `Options::get` now accepts `std::string_view` so descriptor keys can be passed directly (existing `std::string` / `const char*` callers still work via implicit conversion).
- `parse_string<T>` now accepts `std::string_view` so descriptor defaults flow through unchanged.
- New `rapidsmpf/config_defaults.{pyx,pyi}` exposing one `Final[str]` constant per option (e.g. `BUFFER_RESOURCE_NUM_STREAMS`) plus a read-only `DEFAULTS: Mapping[str, str]` (`MappingProxyType`) sourced from the C++ header via `cdef extern`. A single `RMPF_OPT` macro re-exposes `key.data()` and `default_val.data()` as null-terminated `const char*` aliases, which is safe because both views are initialised from string literals.
- Python call sites (`memory/buffer_resource.pyx`, `streaming/core/memory_reserve_or_wait.pyx`, `integrations/core.py`, `tests/test_config.py`) now import from `config_defaults` instead of using raw string keys / hard-coded defaults. Consumers that need typed values parse the string default on-site (e.g. `int(_OPTION_DEFAULTS[BUFFER_RESOURCE_NUM_STREAMS])`, `parse_boolean(_OPTION_DEFAULTS[STREAMING_ALLOW_OVERBOOKING_BY_DEFAULT])`), mirroring the C++ side.
- `streaming::AllowOverbookingByDefaultOption` and drop the `Factor` suffix from the pinned-pool size descriptors.
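The idea that a descriptor's string default flows through the same parser as a user-supplied value can be sketched roughly as below. The descriptor `NumStreamsOption`, its key string, the helper `get_or_default`, and the use of a plain `std::unordered_map` in place of `Options` are assumptions made for illustration; only the default `"16"` appears in the description above.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical descriptor and option; the key string is illustrative.
struct OptionDescriptor {
    std::string_view key;
    std::string_view default_val;
};
inline constexpr OptionDescriptor NumStreamsOption{"num_streams", "16"};

// Minimal parser used for both user-supplied and default strings.
template <typename T>
T parse_string(std::string_view text) {
    std::stringstream sstream{std::string{text}};
    T value{};
    sstream >> value;
    return value;
}

// Picks the user's string when the key is set, otherwise the descriptor's
// default string, then runs whichever was chosen through the same parser.
template <typename T>
T get_or_default(
    std::unordered_map<std::string, std::string> const& raw,
    OptionDescriptor const& desc
) {
    // Builds a temporary std::string key; a plain map has no transparent lookup.
    auto it = raw.find(std::string{desc.key});
    std::string_view text =
        (it != raw.end()) ? std::string_view{it->second} : desc.default_val;
    return parse_string<T>(text);
}
```

Since the default only ever exists as a string, there is no separate typed default that could drift from what the parser would produce for the same text.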