Redfish OEM Extension for SONiC BMC with Schema Mapping to Redis & D-Bus #3

Open
chinmoy-nexthop wants to merge 14 commits into sonic-net:master from nexthop-ai:oem/bmcweb-with-dbus-bridge

@chinmoy-nexthop chinmoy-nexthop commented Mar 31, 2026

Why I did it

The SONiC switch BMC (running bmcweb + sonic-dbus-bridge) had no standard channel for a rack manager to push structured alerts (leak detected, liquid pressure deviation, shutdown triggers, etc.) or periodic telemetry (inlet temperature, flow rate, energy valve state, glycol concentration, etc.) to the switch. Operators had no visibility into cooling or power-delivery anomalies that originate at the rack manager. This change introduces a SONiC-specific OEM Redfish extension — Manager.Oem.SONiC.RackManager — that gives the rack manager two fire-and-forget POST actions (SubmitAlert, SubmitTelemetry) and persists the received data to Redis STATE_DB so that existing SONiC tooling (CLI, telemetry daemons) can consume it without any further protocol bridging.

How I did it

The work is layered across four areas, each building on the previous:

  1. Schema

Added oem-extension/schema/json-schema/SonicManager.v1_0_0.json — the authoritative DMTF-style OEM schema defining Manager, RackManager, SubmitAlert, SubmitTelemetry, AlertEntry, Alarms, and the SonicSeverity enum (Normal / Minor / Major / Critical).
Added an unversioned alias SonicManager.json that points to v1_0_0.
Expanded AlertEntry to support both the flat alert form (Redfish..) and the ShutdownAlert-wrapped form, where RscmPosition lives on the parent wrapper and applies to all nested leaf alerts.
Expanded the Alarms definition to be an open container so generic telemetry fields (any scalar sensor value) pass through to the bridge without schema changes.

  2. bmcweb route handlers

Added five header-only files under oem-extension/sonic/:
sonic_rack_manager.hpp — GET handler returning the OEM sub-resource with action target URIs.
sonic_submit_alert.hpp — POST handler: validates managerId, rejects bodies > 64 KiB (413), validates JSON, enforces the Redfish envelope key, forwards the raw JSON string to D-Bus.
sonic_submit_telemetry.hpp — POST handler: validates managerId, rejects oversized bodies, enforces the Alarms envelope key, forwards to D-Bus.
sonic_oem_constants.hpp — single source of truth for the D-Bus triple (com.sonic.RackManager, object path, interface) and the 64 KiB body cap. Both action headers include this to prevent drift.
sonic_oem_redfish.hpp — single requestRoutesSonicOem() entry point that calls all three requestRoutes* functions; this is the only symbol patched into redfish.cpp.
HTTP status mapping: 204 No Content on success (DMTF fire-and-forget convention), 503 + Retry-After: 1 for D-Bus transport errors (bridge down/timeout), operationFailed (4xx) when the bridge explicitly rejects the payload. Previously both failure modes collapsed into a generic 500.
A bmcweb patch (patches/0003-Integrate-SONiC-OEM-extension.patch) wires the OEM schema into meson.build and calls requestRoutesSonicOem from redfish.cpp.
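
The request-side checks described for the two POST handlers can be sketched as follows. This is a minimal illustration in Python, not the C++ handler code from this PR: the function name is hypothetical, managerId validation is left out, and the choice of 400 for a missing envelope key is an assumption (the description only says the key is enforced).

```python
import json
from typing import Optional

MAX_REQUEST_BODY_BYTES = 64 * 1024  # the 64 KiB cap named in the description


def validate_action_body(raw: bytes, envelope_key: str) -> Optional[int]:
    """Return None if the body may be forwarded, else an HTTP status code."""
    # The size check runs first so an oversized body is never parsed.
    if len(raw) > MAX_REQUEST_BODY_BYTES:
        return 413
    try:
        doc = json.loads(raw)
    except ValueError:
        # Covers malformed JSON, undecodable bytes, and an empty body.
        return 400
    # Assumption: a missing envelope key is reported as 400 as well.
    if not isinstance(doc, dict) or envelope_key not in doc:
        return 400
    return None
```

With this shape, SubmitAlert would call the check with the `Redfish` envelope key and SubmitTelemetry with `Alarms`; only a `None` result lets the raw string continue to D-Bus.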

  3. sonic-dbus-bridge

Added field_mapping.hpp — a header-only, declarative table of FieldMapping structs. Each entry maps a dot-separated JSON path (e.g. Redfish.LiquidPressureDeviation.LiquidPressure) to a Redis hash key (RSCM_ALERT|LiquidPressureDeviation) and field (liquid_pressure). Adding or removing a field from the bridge requires only a one-line change to this table; no logic changes.
Added RackManagerReceiver class (rack_manager_receiver.hpp / .cpp) claiming the com.sonic.RackManager D-Bus bus name. Threading model: D-Bus dispatch thread parses JSON and enqueues a pre-extracted (key, field, value) job in O(1); a dedicated worker thread owns the hiredis connection and pipelines HSET commands to Redis STATE_DB (DB index 6). This keeps the sdbusplus asio loop free even when Redis is slow.
Wired RackManagerReceiver into BridgeApp alongside the existing request handler.
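
To illustrate the declarative mapping idea, here is a small Python sketch of a mapping table and the walk that turns a payload into (key, field, value) triples. The real table is the C++ header field_mapping.hpp; the Python names, the second table row, and the decision to skip absent paths and stringify values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    json_path: str    # dot-separated path into the payload
    redis_key: str    # STATE_DB hash key
    redis_field: str  # field inside that hash


ALERT_MAPPINGS = (
    # Example row taken from the description above.
    FieldMapping("Redfish.LiquidPressureDeviation.LiquidPressure",
                 "RSCM_ALERT|LiquidPressureDeviation", "liquid_pressure"),
    # Hypothetical second row, showing that a new field is one more line.
    FieldMapping("Redfish.LiquidPressureDeviation.Severity",
                 "RSCM_ALERT|LiquidPressureDeviation", "severity"),
)

_MISSING = object()


def lookup(doc, dotted_path):
    """Follow a dot-separated path through nested dicts."""
    node = doc
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def extract_jobs(doc, table):
    """Return (key, field, value) triples for every mapped path present."""
    jobs = []
    for mapping in table:
        value = lookup(doc, mapping.json_path)
        if value is not _MISSING:
            jobs.append((mapping.redis_key, mapping.redis_field, str(value)))
    return jobs
```

Each triple corresponds to one HSET against STATE_DB. Because the walk is driven entirely by the table, supporting another field means adding a row rather than touching `extract_jobs`.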

  4. Tests and build

Added 22 JSON-driven integration test cases (tests/redfish-api/cases/oem_manager.json) covering: action discovery (GET), end-to-end POST -> STATE_DB (redis_validations), and 400/401/405 error-body assertions.
Extended the test framework (test_runner.py, validator.py) to support POST request bodies (body), Redis pre-conditions (redis_setup), post-call Redis field validation (redis_validations), and response-body assertions for error paths (expected_response). Added a guard that rejects expected_status: 204 combined with expected_response (contradictory by definition).
Added a field_mapping gtest suite (tests/unit-tests/field_mapping_test.cpp) with a header-only test shim so the mapping tables can be tested without linking the full bridge.
Added the OEM extension copy step to the top-level Makefile so make produces a bmcweb image that includes the OEM headers and schema.
Updated oem-extension/README.md, tests/README.md, and README.md.
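
The guard against contradictory test cases can be sketched like this. The keys `body`, `redis_validations`, `expected_status` and `expected_response` come from the description; the function name, the `name`/`method`/`url` keys, the placeholder URL, and the inner shape of `redis_validations` are hypothetical and may differ from what test_runner.py and validator.py actually use.

```python
def check_case(case: dict) -> None:
    """Reject a self-contradictory test case before any request is sent."""
    if case.get("expected_status") == 204 and "expected_response" in case:
        raise ValueError(
            f"{case.get('name', '<unnamed>')}: 204 No Content carries no body, "
            "so expected_response cannot be asserted"
        )


# Hypothetical end-to-end case: POST an alert, then check STATE_DB.
example_case = {
    "name": "submit_alert_persists_liquid_pressure",
    "method": "POST",
    "url": "<SubmitAlert action target>",
    "body": {"Redfish": {"LiquidPressureDeviation": {"LiquidPressure": 42.5}}},
    "expected_status": 204,
    "redis_validations": [
        {"key": "RSCM_ALERT|LiquidPressureDeviation",
         "field": "liquid_pressure",
         "expected": "42.5"},
    ],
}

check_case(example_case)  # passes: no expected_response alongside 204
```

Failing at load time keeps a mistyped case from silently passing, since a 204 response has no body to compare against.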

How to verify it

make 
make test
make unit-test

@mssonicbld

/azp run

@azure-pipelines

No pipelines are associated with this pull request.

@mssonicbld

/azp run

@azure-pipelines

No pipelines are associated with this pull request.

@chinmoy-nexthop chinmoy-nexthop marked this pull request as ready for review April 11, 2026 17:02
Comment thread oem-extension/schema/json-schema/SonicManager.json Outdated
Comment thread oem-extension/schema/json-schema/SonicManager.v1_0_0.json Outdated
Comment thread sonic-dbus-bridge/include/rack_manager_receiver.hpp Outdated
Comment thread sonic-dbus-bridge/src/rack_manager_receiver.cpp Outdated
@chinmoy-nexthop chinmoy-nexthop force-pushed the oem/bmcweb-with-dbus-bridge branch from efd6cc8 to ce080d2 Compare April 16, 2026 10:00
@mssonicbld

/azp run

@azure-pipelines

No pipelines are associated with this pull request.

@chinmoy-nexthop chinmoy-nexthop force-pushed the oem/bmcweb-with-dbus-bridge branch from ce080d2 to 0fd6089 Compare April 16, 2026 10:17
@mssonicbld

/azp run

@azure-pipelines

No pipelines are associated with this pull request.

Comment thread sonic-dbus-bridge/include/field_mapping.hpp
Comment thread sonic-dbus-bridge/src/rack_manager_receiver.cpp Outdated
Introduce DMTF-style Redfish OEM JSON schemas for the SONiC namespace:
- SonicManager.json (unversioned alias)
- SonicManager.v1_0_0.json (typed properties: RackManager, SubmitAlert,
  SubmitTelemetry action references)
- meson.build entry listing the OEM schemas so the bmcweb build system
  installs them under /usr/share/www/redfish/v1/JsonSchemas.

Reference: sonic-net#3
Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Add header-only C++ entry points consumed by bmcweb to expose the
SONiC OEM resources:
- sonic_oem_redfish.hpp     : top-level route registrar
- sonic_rack_manager.hpp    : /redfish/v1/Managers/RackManager resource
- sonic_submit_alert.hpp    : SubmitAlert action handler
- sonic_submit_telemetry.hpp: SubmitTelemetry action handler

These files are copied into bmcweb/redfish-core/lib/sonic/ by the build
system and referenced from redfish.cpp via the integration patch.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Implement the runtime backend for the SONiC OEM rack-manager actions:

- field_mapping.hpp: declarative mapping between JSON dotted paths in
  the incoming D-Bus payload and (table, key, field) tuples in Redis
  STATE_DB.

- rack_manager_receiver.{hpp,cpp}: registers the
  com.sonic.RackManager service at
  /xyz/openbmc_project/sonic/rack_manager exposing SubmitAlert and
  SubmitTelemetry methods. Parses the JSON payload, walks the
  FieldMapping table, and HSETs the resolved fields into STATE_DB
  (DB 6). Connection logic tries TCP first then falls back to known
  Unix-socket paths, and transparently reconnects on HSET failure.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
- Add rack_manager_receiver.cpp to the meson sources list.
- Hold a std::unique_ptr<RackManagerReceiver> in BridgeApp and
  construct it after createStateObjects() using the configured
  STATE_DB host/port from ConfigManager.
- On successful initialize(), register the RACK_MANAGER object path
  and interface with the ObjectMapper so other bridge consumers can
  discover it. Failure is non-fatal and logged as a warning.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Add quilt patch 0003-Integrate-SONiC-OEM-extension.patch and append
it to patches/series. The patch hooks sonic_oem_redfish.hpp into
bmcweb/redfish-core/src/redfish.cpp so the SONiC OEM routes
(RackManager, SubmitAlert, SubmitTelemetry) are registered alongside
the upstream Redfish service handlers.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
- Add OEM_EXT_DIR variable and copy-oem-extension PHONY target that
  stages oem-extension/sonic/*.hpp into bmcweb/redfish-core/lib/sonic/
  and the OEM JSON schemas + meson.build into
  bmcweb/redfish-core/schema/oem/sonic/.
- Generate bmcweb/subprojects/stdexec.wrap on demand: sdbusplus pulls
  stdexec as a nested subproject but bmcweb does not ship a top-level
  wrap for it, so a clean build would fail to resolve the dependency.
- Make build-in-docker, apply-patches, build-bmcweb,
  build-bmcweb-native and the sonic-buildimage $(BMCWEB) target all
  depend on copy-oem-extension so the OEM files are present before
  patches are applied.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Adds tests/unit-tests/field_mapping_test.cpp covering the declarative
alert/telemetry mapping tables that drive RackManagerReceiver:

  * Tables are non-empty and return the same static instance per call
  * Every entry has non-empty jsonPath / redisKey / redisField and a
    valid FieldType
  * jsonPath uses well-formed dot notation (no spaces, no leading or
    trailing '.')
  * Telemetry keys live under 'RSCM_TELEMETRY|' and alert keys under
    'RSCM_ALERT|' so STATE_DB namespaces stay distinct
  * No two rows in either table target the same (redisKey, redisField)
    pair -- catches silent HSET overwrites when the schema evolves
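
The table invariants listed above can be expressed compactly. The sketch below is in Python rather than gtest and uses plain (jsonPath, redisKey, redisField) tuples; the function name is hypothetical and the FieldType check is omitted because the enum's values are not shown here.

```python
from collections import Counter


def table_problems(table, key_prefix):
    """Return a list of human-readable violations for one mapping table."""
    problems = []
    if not table:
        problems.append("table is empty")
    for json_path, redis_key, redis_field in table:
        if not (json_path and redis_key and redis_field):
            problems.append(f"empty component in {(json_path, redis_key, redis_field)}")
        # An empty segment means a leading, trailing, or doubled '.'.
        if " " in json_path or "" in json_path.split("."):
            problems.append(f"malformed jsonPath: {json_path!r}")
        if not redis_key.startswith(key_prefix):
            problems.append(f"{redis_key!r} is outside namespace {key_prefix!r}")
    # Two rows writing the same hash field would silently overwrite each other.
    targets = Counter((key, field) for _, key, field in table)
    problems += [f"duplicate target {t}" for t, n in targets.items() if n > 1]
    return problems
```

An empty result means the table passes; running it once with `"RSCM_ALERT|"` and once with `"RSCM_TELEMETRY|"` mirrors the namespace split the tests enforce.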

field_mapping.hpp is header-only, so the existing unit-test runner --
which only compiled <foo>_test.cpp when sonic-dbus-bridge/src/<foo>.cpp
also existed -- silently skipped any header-only target. The Makefile
unit-test recipe is updated to compile such tests standalone, keeping
the existing two-file compilation path intact for tests with a matching
.cpp source.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
…utes

Adds tests/redfish-api/cases/oem_manager.json covering the OEM
RackManagerInterface exposed under /redfish/v1/Managers/bmc:

GET-side coverage:
  * /redfish/v1/Managers/ and /redfish/v1/Managers/bmc are reachable
  * The BMC Manager payload carries the Oem.SONiC.RackManagerInterface
    sub-resource with the expected @odata.type, @odata.id, Id and Name
  * Actions.#SONiC.SubmitAlert.target and Actions.#SONiC.SubmitTelemetry
    .target are advertised at the canonical action URIs

POST-side coverage (negative tests, no body fixtures needed):
  * Unauthenticated POSTs to SubmitAlert / SubmitTelemetry return 401
  * Authenticated POSTs with an empty body return 400 (malformed JSON)
  * GET on the action target returns 405 (POST-only)

All cases run against the live test stack started by
tests/redfish-api/framework/start_services.sh (dbus-daemon, redis,
sonic-dbus-bridge, bmcweb) and exercise the full bmcweb -> D-Bus ->
bridge path for the OEM routes.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
… I/O to a worker thread

Splits the rack manager receiver off the Inventory connection onto its
own well-known bus name, and moves the (potentially blocking) Redis
writes off the sdbusplus dispatch thread.

D-Bus surface
  * Add RACK_MANAGER_BUSNAME (com.sonic.RackManager) as a meson conf
    constant and have BridgeApp open a fifth system-bus connection
    that requests it via a dedicated rackManagerConn_ /
    rackManagerServer_. The receiver is registered against this server
    instead of the Inventory one, so the bridge advertises a single,
    purpose-specific name to bmcweb instead of folding alert/telemetry
    methods into the Inventory.Manager namespace.
  * Move the object path under the matching namespace
    (/com/sonic/RackManager) so the bus name, object path and interface
    are aligned. bmcweb's sonic_oem_constants.hpp references the same
    triple.
  * Ship dbus/com.sonic.RackManager.conf -- D-Bus policy allowing root
    to own the name and the bmcweb user to send to it -- and install it
    into /etc/dbus-1/system.d via meson + the debian .install file.

Threading model
  * RackManagerReceiver now runs a single worker thread that owns the
    Redis connection. SubmitAlert / SubmitTelemetry handlers parse the
    JSON inline (cheap, bounded), build a Job of (key, field, value)
    triples, and enqueue it on a bounded std::deque (kMaxQueueDepth =
    1024). The reply 'true' means accepted, not persisted -- alert /
    telemetry ingestion is fire-and-forget, the rack manager retries
    on its own cadence.
  * Worker drains jobs and pipelines HSETs via redisAppendCommand /
    redisGetReply, replacing the previous one-HSET-per-method blocking
    path. Connection loss drops the in-flight job with an ERROR log;
    the next job triggers a lazy reconnect.
  * Destructor sets a stopping_ flag, notifies the cv, and joins the
    worker before freeing redisCtx_ so there is no race on shutdown.
  * Redis SELECT now uses the configured STATE_DB index from
    ConfigManager (getStateDbIndex) instead of a hard-coded 6, matching
    how the rest of the bridge picks its DB index.
  * If the queue is full (slow Redis / wedged worker) newer submissions
    are dropped with a WARN log and the D-Bus method returns false, so
    memory cannot grow unbounded under back-pressure.

initialize() no longer blocks on Redis; the worker connects lazily so
the bridge starts cleanly even if STATE_DB is not yet up at boot.
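
The back-pressure contract (accept quickly, drop when full, write on a separate thread) can be sketched with Python's standard library. This is an illustration only: the class and method names are hypothetical, `write_job` stands in for the pipelined HSETs, and shutdown uses a sentinel value where the commit describes a stopping_ flag plus condition variable.

```python
import queue
import threading

MAX_QUEUE_DEPTH = 1024  # kMaxQueueDepth in the commit message


class Receiver:
    def __init__(self, write_job, depth=MAX_QUEUE_DEPTH):
        self._jobs = queue.Queue(maxsize=depth)
        self._write_job = write_job  # stand-in for the Redis writes
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, job) -> bool:
        """Called on the dispatch thread; never blocks."""
        try:
            self._jobs.put_nowait(job)
            return True   # accepted, not yet persisted
        except queue.Full:
            return False  # queue full: the new submission is dropped

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:  # shutdown sentinel
                return
            try:
                self._write_job(job)
            except Exception:
                pass  # drop the in-flight job and carry on with the next one

    def close(self):
        """Drain remaining jobs, then stop and join the worker."""
        self._jobs.put(None)
        self._worker.join()
```

A `True` return mirrors the "accepted, not persisted" reply: the caller learns only that the job was queued, which is why the rack manager is expected to retry on its own cadence.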

All touched source files carry the project-standard SPDX / Nexthop AI /
SONiC Project header.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
…atus mapping

Tightens the bmcweb-side OEM action handlers (SubmitAlert /
SubmitTelemetry), adds a single source of truth for the D-Bus
coordinates they share with the bridge, hardens the bmcweb build glue
in the top-level Makefile, and aligns the integration tests with the
new route shape.

New header sonic_oem_constants.hpp
  * Defines redfish::sonic_oem::rackManagerBusName /
    rackManagerObjectPath / rackManagerInterface in one place so the
    triple cannot drift between SubmitAlert and SubmitTelemetry, and
    matches the values claimed by the bridge in
    sonic-dbus-bridge/include/rack_manager_receiver.hpp.
  * Defines kMaxRequestBodyBytes (64 KiB) used by both action routes.

sonic_submit_alert.hpp / sonic_submit_telemetry.hpp
  * Drop the per-file alertDbus* / telemetryDbus* constants in favour
    of the centralized sonic_oem:: symbols.
  * Reject oversized payloads with 413 Payload Too Large before
    nlohmann::json::parse runs, bounding worst-case CPU and the
    raw-string copy forwarded over D-Bus.
  * Differentiate failure modes on the D-Bus async reply:
      - transport-level errors (bridge down, name not owned, timeout)
        -> 503 serviceTemporarilyUnavailable (Retry-After: 1)
      - bridge returned false (payload rejected by receiver)
        -> operationFailed (4xx) instead of generic 500
    The previous code collapsed both into internalError, hiding the
    'service unavailable' case from operators.
  * Return 204 No Content on success instead of a synthetic 200 +
    messages::success body, which matches the DMTF convention for
    fire-and-forget action invocations that produce no result resource.
  * Drop the trailing slash from the BMCWEB_ROUTE pattern so the
    canonical action URI exactly matches the action target advertised
    on the Manager resource (no implicit redirect).
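
The reply-side mapping can be sketched as a single decision function. This is a Python illustration with hypothetical names, not the bmcweb code: the rejected case carries only the Redfish message name, because the commit message gives the status for operationFailed as "4xx" without a specific code.

```python
def map_dbus_reply(transport_error: bool, accepted: bool) -> dict:
    """Translate the async D-Bus outcome into an HTTP-level result."""
    if transport_error:
        # Bridge down, name not owned, or timeout: tell the caller to retry.
        return {"status": 503, "headers": {"Retry-After": "1"}}
    if not accepted:
        # The bridge answered but returned false for this payload.
        return {"redfish_message": "operationFailed", "headers": {}}
    # Fire-and-forget success: no body to return.
    return {"status": 204, "headers": {}}
```

Checking the transport error before the boolean keeps "the bridge could not be reached" distinct from "the bridge refused the payload", which is the distinction the previous internalError path lost.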

sonic_oem_redfish.hpp
  * Adds a top-of-file design note covering: why an OEM action was
    chosen over EventService / TelemetryService, why a JSON blob over
    D-Bus instead of typed signals or per-field arguments, and the
    threading / back-pressure contract with the bridge. The header
    itself only includes the route registrars and stays code-light.

Makefile (bmcweb integration glue)
  * Pin stdexec via new STDEXEC_REVISION / STDEXEC_URL variables and
    always overwrite subprojects/stdexec.wrap so the nested sdbusplus
    subproject cannot float against upstream HEAD or be left on a
    stale revision from a previous build.
  * Use 'cp -u' in copy-oem-extension so an older OEM file cannot
    clobber a newer in-tree copy a developer iterated on inside
    bmcweb/, while clean-tree behaviour stays identical.
  * Add inline path-coupling comments documenting that
    bmcweb/redfish-core/lib/sonic/ and
    bmcweb/redfish-core/schema/oem/sonic/json-schema/ are referenced
    by patch 0003 and the schema registration logic, and that renaming
    either side requires updating the patch.

tests/redfish-api/cases/oem_manager.json
  * Remove trailing slashes from the four existing SubmitAlert /
    SubmitTelemetry cases so they exercise the canonical (no-redirect)
    route advertised on the Manager resource.
  * Add test_json_schemas_collection_lists_sonic_manager and
    test_sonic_manager_schema_resource_fetchable to catch the most
    common bmcweb OEM-schema regression -- headers wired but the
    schema directory not picked up by meson schema registration --
    without asserting specific JSON-schema content.

All touched C++ files carry the project-standard SPDX / Nexthop AI /
SONiC Project header.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
- New SonicSeverity, AlertsByType generic support
- The versioned schema (SonicManager.v1_0_0.json) remains the
single source of truth for any resource-level constraints.

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
- Framework now supports POST request bodies, Redis pre-conditions
  (redis_setup), and post-call Redis validation (redis_validations)

- Added Redfish response-body validation for 4xx/5xx paths, with a guard
  that rejects expected_status: 204 + expected_response combos

Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
Signed-off-by: Chinmoy Dey <chinmoy@nexthop.ai>
@chinmoy-nexthop chinmoy-nexthop force-pushed the oem/bmcweb-with-dbus-bridge branch from 0fd6089 to 15f0086 Compare June 3, 2026 13:01
@mssonicbld

/azp run

@azure-pipelines

No pipelines are associated with this pull request.

3 participants