diff --git a/doc/component-stats/component-stats-framework-hld.md b/doc/component-stats/component-stats-framework-hld.md new file mode 100644 index 00000000000..6031a200816 --- /dev/null +++ b/doc/component-stats/component-stats-framework-hld.md @@ -0,0 +1,404 @@ +# SONiC Component Statistics — Framework HLD + +## Table of Content + +- [Revision](#1-revision) +- [Scope](#2-scope) +- [Definitions/Abbreviations](#3-definitionsabbreviations) +- [Overview](#4-overview) +- [Requirements](#5-requirements) +- [Architecture Design](#6-architecture-design) +- [High-Level Design](#7-high-level-design) +- [SAI API](#8-sai-api) +- [Configuration and management](#9-configuration-and-management) +- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) +- [Memory Consumption](#11-memory-consumption) +- [Restrictions/Limitations](#12-restrictionslimitations) +- [Testing Requirements/Design](#13-testing-requirementsdesign) +- [Open/Action items](#14-openaction-items) + +### 1. Revision + +| Rev | Date | Author | Change Description | +|-----|------------|---------------|------------------------------------------------------| +| 0.1 | 2026-04-28 | Yutong Zhang | Initial revision | + +### 2. Scope + +This HLD specifies a reusable producer-side mechanism for **service-level (control-plane software) counters** in SONiC containers. It introduces: + +1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. +2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. + +The library publishes counters into `COUNTERS_DB` so that: + +- on-box diagnostic tooling (`redis-cli`, `show ... stats`) keeps working with no new transport, and +- off-box telemetry consumers can pick the counters up via the reporting pipeline described in the companion HLD. 
+ +**This HLD owns the producer side only**: the library, the facade pattern, the hot-path / threading / memory-ordering design, and warmboot / memory / testing concerns for the library itself. The reporting pipeline (how counters travel from `COUNTERS_DB` to Geneva or other off-box telemetry systems) is specified in the companion HLD: + +- [Component Statistics — Reporting HLD](./component-stats-reporting-hld.md) + +### 3. Definitions/Abbreviations + +| Term | Definition | +|-----------------|---------------------------------------------------------------------------------------------| +| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | +| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | +| Metric | A named `uint64` counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). Stored as a Redis hash field on the producer side; surfaces downstream as a wire metric named `<COMPONENT>_STATS_<METRIC>`; see the [Reporting HLD §7.2](./component-stats-reporting-hld.md#72-swss-metric-design) for the wire schema. | +| Label | A key/value attribute attached to a wire metric by telegraf. The entity name (the part after the `:` in the Redis key) is surfaced as a component-specific label such as `swss.table`. | +| ComponentStats | The new shared library in `sonic-swss-common` providing the producer mechanism. | +| SwssStats | A SWSS-specific facade over `ComponentStats` (lives in `sonic-swss`). | +| DB sink | The output path that mirrors counters into `COUNTERS_DB`. | + +### 4. Overview + +SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` -> `syncd` -> `COUNTERS_DB`). What is missing is **service-level** counters: software-side events such as orchagent task throughput, gNMI request rate, and BMP message error counts. 
Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. + +A naive implementation would put this plumbing (atomic counters, dirty tracking, a 1-second writer thread, and a Redis-side schema) directly inside each container. That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. + +This HLD specifies a single, reusable producer that: + +1. accumulates counters in process-local atomic state with negligible hot-path cost, +2. mirrors them to `COUNTERS_DB` so `redis-cli`, `show ... stats` CLIs, and any other on-box tooling continue to work, +3. exposes a stable public API so each container only needs to write a thin facade (see §7.8 for sizing guidance). + +How the `COUNTERS_DB` rows then reach Geneva or any other off-box system is the responsibility of the [Reporting HLD](./component-stats-reporting-hld.md). + +### 5. Requirements + +**Functional** + +- R1. A reusable C++ library shall accumulate per-component, per-entity, per-metric `uint64` counters. +- R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `<COMPONENT>_STATS:<entity>` (Redis hash, fields = metric names, values = decimal `uint64`). The exact key/field contract is normatively defined in the Reporting HLD. +- R3. The library shall be usable by any SONiC container by writing a thin facade that owns only the container-specific metric vocabulary. +- R4. The first consumer of the library is the SWSS-specific facade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. +- R5. 
The `SwssStats` facade shall write into `COUNTERS_DB` under keys `SWSS_STATS:<table>` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. + +**Non-functional** + +- R6. The hot path (`increment` / `setValue`) shall be lock-free and constant-time after the first use of a given (entity, metric) pair. +- R7. Construction of a `ComponentStats` instance shall not crash the host process if Redis is not yet reachable; the sink shall connect lazily and retry independently. +- R8. A failure in the sink (Redis down) shall not affect the hot path. After recovery, no monotonic data point shall be lost beyond intermediate samples (the next successful flush carries the latest cumulative value). +- R9. Idle systems shall produce zero outbound traffic on the sink (driven by per-entity dirty tracking). + +**Out of scope** + +- The reporting pipeline that consumes the `COUNTERS_DB` rows (telegraf, mdm, Geneva, etc.); see the [Reporting HLD](./component-stats-reporting-hld.md). +- Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). +- Defining the metric vocabulary for non-swss containers (`gnmi`, `bmp`, `telemetry`, …); this is left as future work. + +### 6. Architecture Design + +The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific facade (its first consumer) is added in `sonic-swss`; future containers may add their own facades using the same library. + +``` ++---------------------------- SONiC switch ------------------------------+ +| | +| orchagent (sonic-swss) gnmi / bmp / telemetry / ... | +| +----------------------+ +----------------------+ | +| | orch.cpp + SwssStats | ... 
| gnmistats / bmpstats | | +| +----------+-----------+ +----------+-----------+ | +| | instrument | | +| v v | +| +----------------------------------------------------------+ | +| | swss::ComponentStats (in libswsscommon) | | +| | +---------------------------------------------------+ | | +| | | atomic counters + dirty tracking + writer thread | | | +| | +-------------------------+-------------------------+ | | +| | | | | +| | DB sink | | +| | (Redis HSET via swss::Table) | | +| +-----------------------------+----------------------------+ | +| | | +| v | +| +-------------------------+ | +| | COUNTERS_DB | | +| | SWSS_STATS:PORT_TABLE | | +| | GNMI_STATS:/iface/... | | +| | BMP_STATS:... | | +| | | | +| | used by: | | +| | - redis-cli | | +| | - show stats CLI | | +| | - reporting pipeline | --> see Reporting HLD | +| +-------------------------+ | ++------------------------------------------------------------------------+ +``` + +**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own facade plus `swss::ComponentStats`. New containers get the sink for free by writing a thin wrapper (see §7.8 for sizing guidance). + +**Sink design properties.** + +- *One source of truth.* The sink consumes the atomic-counter snapshot inside `ComponentStats`. +- *No new transport for local debugging.* The `COUNTERS_DB` layout follows the existing convention so `redis-cli`, `show ... stats` CLIs, and any in-band tooling keep working. +- *Sink isolation from hot path.* Failures in the sink (Redis unreachable) do not affect the hot path; they are logged and retried. + +### 7. High-Level Design + +#### 7.1 Repositories changed + +| Repository | What changes | +|--------------------------------|-----------------------------------------------------------------------------| +| `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). 
| +| `sonic-net/sonic-swss` | New `SwssStats` thin facade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | + +No platform-specific code is added. No SAI changes. No syncd changes. + +#### 7.2 `swss::ComponentStats` — public API + +```cpp +#include <cstdint> +#include <map> +#include <memory> +#include <string> + +namespace swss { + +class ComponentStats { +public: + // Metric name -> current value for one entity. + using CounterSnapshot = std::map<std::string, uint64_t>; + + // Sink configuration. The DB sink is on by default; additional + // sinks (e.g. OTLP) may be added by future revisions and are kept + // off by default. + struct SinkConfig { + bool enableDb = true; // mirror to COUNTERS_DB + }; + + static std::shared_ptr<ComponentStats> create( + const std::string& componentName, + const std::string& dbName = "COUNTERS_DB", + uint32_t intervalSec = 1, + const SinkConfig& sinks = SinkConfig{}); + + void increment(const std::string& entity, const std::string& metric, uint64_t n = 1); + void setValue (const std::string& entity, const std::string& metric, uint64_t value); + + uint64_t get (const std::string& entity, const std::string& metric); + CounterSnapshot getAll(const std::string& entity); + + void setEnabled(bool on); + bool isEnabled() const; + void stop(); +}; + +} // namespace swss +``` + +`create()` consults a process-wide registry keyed by `componentName`. A second call with the same name returns the existing instance, ensuring containers cannot accidentally start multiple writer threads against the same Redis prefix. + +#### 7.3 Internal state + +Per instance: +- `m_entities`: a `std::map` keyed by entity name. `std::map` is used (not `unordered_map`) so that both references and iterators to entries returned via `getOrCreateEntity` remain valid after later inserts; an `unordered_map` rehash would invalidate iterators. +- `EntityStats` holds a map from metric name to a heap-allocated atomic counter (heap-allocated because `std::atomic` is not movable) plus a per-entity atomic `version`. +- `m_mutex` guards only the **structure** of the maps (insert/find). Hot-path reads/writes of counter values use `std::atomic` and skip the mutex after the first use. +- `m_running`, `m_enabled`: atomic flags. 
+- `m_cv`: wakes the writer thread immediately on `stop()` instead of waiting up to `intervalSec`. +- `m_thread`: owns the writer. + +Process-wide: +- `registry`: a `std::map` from component name to `std::weak_ptr<ComponentStats>` (`weak_ptr` so a fully released instance can be destroyed). + +#### 7.4 Hot path + +After the first use of a given `(entity, metric)` pair, `increment()` does +exactly two atomic RMWs and nothing else: + +1. **Relaxed `fetch_add`** on the counter value: accumulates the event. +2. **Release `fetch_add`** on the per-entity *version*: marks the entity + dirty and publishes the new counter value to the writer thread. + Pairs with the writer's acquire-load (see §7.6). + +No mutex acquisition, no allocation, no syscall on the hot path. The +structural mutex is taken only the first time a given `(entity, metric)` +pair is seen, to insert it into the per-entity map. + +#### 7.5 Writer thread + +Runs at `intervalSec` (default 1 s) and flushes the snapshot to the DB sink: + +``` ++---------------------------------------------------------------+ +| Phase A - connect the DB sink (run once, with retry) | +| loop until m_running == false: | +| if !dbConnected: try connect Redis | +| if connected: break | +| else cv.wait_for(intervalSec, predicate=!m_running) | ++---------------------------------------------------------------+ ++---------------------------------------------------------------+ +| Phase B - flush loop | +| loop: | +| cv.wait_for(intervalSec, predicate=!m_running) | +| if !m_running: break | +| | +| # SNAPSHOT (under lock) | +| for each entity e in m_entities: | +| v = e.version.load(acquire) <- pairs (2) | +| if lastVersion[e.name] == v: continue (skip clean) | +| lastVersion[e.name] = v | +| row = [(metric, c.value.load(relaxed)) for c in e] | +| enqueue(name, row) | +| | +| # FAN-OUT (lock released) | +| for (name, row) in queue: | +| try: m_table->set(name, stringify(row)) | +| catch: log warn, continue | ++---------------------------------------------------------------+ +``` + +Three 
properties: + +1. *Lock released before any I/O.* Round-trips under the structural lock would briefly stall every concurrent `increment()`. +2. *Idle systems generate zero outbound traffic.* When no entity has changed, the queue is empty and the sink is not touched. +3. *Hot-path isolation.* A sink failure is logged and skipped; the hot path is never blocked. + +#### 7.6 Memory ordering correctness + +The release/acquire pair (step 2 in §7.4 ↔ the acquire-load in §7.5) guarantees: + +> If the writer reads `version == N`, then every counter mutation that contributed to bumping the version up to `N` happens-before that read and is visible to the writer's subsequent counter loads. + +Without it, on weakly ordered architectures (ARM, POWER) the writer could see the new version but read an old counter value, recording a stale snapshot. + +#### 7.7 `SwssStats` thin facade + +`SwssStats` (in `sonic-swss/orchagent/`) is a ~130-line translation layer +that owns only the SWSS-specific vocabulary and the global +`gSwssStatsRecord` enable flag consumed by `orch.cpp`. Every call +delegates directly to `swss::ComponentStats::increment()`: + +| `SwssStats` call | Delegates to | Reports as (see Reporting HLD §7.2) | +|-----------------------------|-----------------------------------|--------------------------------------| +| `recordTask(t, "SET")` | `increment(t, "SET")` | `SWSS_STATS_SET{swss.table=t}` | +| `recordTask(t, "DEL")` | `increment(t, "DEL")` | `SWSS_STATS_DEL{swss.table=t}` | +| `recordComplete(t, n)` | `increment(t, "COMPLETE", n)` | `SWSS_STATS_COMPLETE{swss.table=t}` | +| `recordError(t, n)` | `increment(t, "ERROR", n)` | `SWSS_STATS_ERROR{swss.table=t}` | + +The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, +`recordTask` / `recordComplete` / `recordError`) and the on-the-wire +`SWSS_STATS:<table>` Redis layout are deliberately kept narrow and +stable so the SWSS vocabulary remains independent of future evolution +of the underlying `ComponentStats` library. + +The full SWSS metric design (metric names, labels, descriptions) and +the exact `SWSS_STATS:<table>` Redis schema are owned by the +[Reporting HLD §7.2](./component-stats-reporting-hld.md#72-swss-metric-design), +which is the contract with downstream consumers. + +#### 7.8 Adopting the library in a new container + +A new component `C` adopts the framework by: + +1. Picking an uppercase component name `C`. Counters automatically land in + `COUNTERS_DB` under `C_STATS:*` and surface downstream as metrics + named `C_STATS_<METRIC>` with one label per entity. +2. **Designing a finite vocabulary** of verb-style metric names for the + events the component cares about. Anything high-cardinality + (interface name, neighbour IP, gNMI path, BMP peer) **must** go into + the entity (the part after the `:` in the Redis key) rather than the + metric name, so that dashboards can pivot on the label without + explosion in metric count. See + [Reporting HLD §7.3](./component-stats-reporting-hld.md#73-conventions-for-future-components) + for the rationale. +3. Documenting that vocabulary as a Metric Name | Label List | + Description table in the component's own HLD, identical in shape to + the SWSS table in Reporting HLD §7.2. +4. Writing a thin facade that calls + `swss::ComponentStats::increment()` for each event. + A minimal facade needs only ~30 LoC. The SwssStats facade is larger + (~130 LoC) because it also integrates a `gSwssStatsRecord` enable flag + and a singleton into orchagent's existing `orch.cpp` infrastructure; + new containers that do not need that extra plumbing stay near ~30 LoC. + +No new threads, no new Redis client management, no new test harness +needed. Reporting picks the metrics up automatically via the +`*_STATS:*` pattern match. + +Illustrative future vocabulary for `gnmi` (to be finalised when the +gNMI facade lands): + +| Metric Name | Label List | Description | +|-------------------------|--------------|--------------------------------------------------------------| +| `GNMI_STATS_SUBSCRIBE` | `gnmi.path` | Number of `Subscribe` requests received on the path. 
| +| `GNMI_STATS_GET` | `gnmi.path` | Number of `Get` RPCs handled on the path. | +| `GNMI_STATS_SET` | `gnmi.path` | Number of `Set` RPCs handled on the path. | +| `GNMI_STATS_ERROR` | `gnmi.path` | Number of RPCs that returned an error on the path. | + +### 8. SAI API + +No SAI API changes are required for this feature. This design measures control-plane software events inside SONiC containers; it does not query or modify any SAI state. + +### 9. Configuration and management + +Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Existing CLIs that already read `COUNTERS_DB` (e.g. `redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `<COMPONENT>_STATS:<entity>` keys for free. + +### 10. Warmboot and Fastboot Design Impact + +Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. + +#### Warmboot and Fastboot Performance Impact + +- The library does **not** add any stalls, sleeps, or I/O operations to the boot critical chain. Construction is non-blocking; the writer thread connects to Redis lazily and retries in the background, so a not-yet-ready dependency cannot delay container start. +- No CPU-heavy processing (Jinja templates, etc.) is added in the boot path. +- No third-party dependency is updated by this HLD. +- The library does not delay any service or Docker container. + +No measurable boot-time degradation is expected. + +### 11. Memory Consumption + +- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS facade. +- When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. + +### 12. 
Restrictions/Limitations + +- Counters reset to zero on container restart by design. Consumers must compute rate-of-change rather than rely on absolute values across restarts. +- The library does not retain history; it relies on downstream consumers (`COUNTERS_DB` readers, the reporting pipeline) for retention. +- The structural mutex (`m_mutex`) is acquired only on the *first* use of a given (entity, metric) pair. Workloads that constantly mint new entity names will see one mutex acquisition per new name; this is not the expected pattern for SONiC containers. + +### 13. Testing Requirements/Design + +#### 13.1 Unit Test cases + +Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: + +| # | Test | What it proves | +|---|----------------------------|---------------------------------------------------------------------------------------------| +| 1 | BasicIncrement | `increment` + `get` round-trip | +| 2 | MultipleMetrics | metric isolation within an entity | +| 3 | MultipleEntities | entity isolation within a component | +| 4 | SetValueOverwrites | gauge semantics | +| 5 | DisabledIsNoOp | `setEnabled(false)` makes hot path inert | +| 6 | GetAllReturnsSnapshot | bulk read returns the right shape | +| 7 | ConcurrentIncrements | 8 threads × 10 000 increments → exactly 80 000 (no torn writes, no lost updates) | +| 8 | SingletonSameName | `create("X")` returns the same instance | +| 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | + +A facade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. + +Run: + +``` +cd sonic-swss-common && ./autogen.sh && ./configure && make check +./tests/tests --gtest_filter='ComponentStats*' +``` + +#### 13.2 System Test cases + +- Boot a `sonic-vs` image built with the two companion PRs. +- Exercise orchagent (e.g. 
`config vlan add`, `config interface ip add`). +- Verify on-box DB sink: + ``` + redis-cli -n 2 KEYS "SWSS_STATS:*" + redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" + ``` + Counters increment in proportion to operations; idle dwell shows zero further writes (dirty tracking working). +- Confirm warmboot and fastboot are unaffected (no boot-time regression, no service startup ordering change). + +End-to-end validation of the reporting path (telegraf → mdm → Geneva) is covered in the [Reporting HLD](./component-stats-reporting-hld.md). + +### 14. Open/Action items + +- Phase 1 (this HLD's two PRs) lands the `ComponentStats` library and the `SwssStats` facade with the DB sink fully active. +- Phase 2 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own facades. Each is a self-contained PR in the relevant repository. +- Phase 3 (future) may add direct OTLP export from the library to a local agent for components that need lower reporting latency than the DB → telegraf path provides. Out of scope for this HLD. 
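As an illustration of the hot path and dirty tracking described in §7.4 to §7.6 of the Framework HLD, the following is a minimal C++ sketch and not the library's actual implementation. `EntitySketch`, `incrementSketch`, and `snapshotIfDirty` are hypothetical names introduced only for this example; the sketch assumes a single metric per entity and leaves out the structural mutex, the registry, and the Redis sink.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for one entity holding one metric.
struct EntitySketch {
    std::atomic<uint64_t> value{0};    // the counter itself
    std::atomic<uint64_t> version{0};  // bumped on every mutation (dirty marker)
};

// Hot path: exactly two atomic read-modify-writes, no lock, no allocation.
inline void incrementSketch(EntitySketch& e, uint64_t n = 1) {
    e.value.fetch_add(n, std::memory_order_relaxed);    // (1) accumulate the event
    e.version.fetch_add(1, std::memory_order_release);  // (2) publish and mark dirty
}

// Writer side: returns true and fills `out` only when the entity changed
// since `lastVersion`; a clean entity is skipped so nothing is flushed.
inline bool snapshotIfDirty(const EntitySketch& e, uint64_t& lastVersion, uint64_t& out) {
    const uint64_t v = e.version.load(std::memory_order_acquire);  // pairs with (2)
    if (v == lastVersion) {
        return false;  // clean since the previous cycle
    }
    lastVersion = v;
    out = e.value.load(std::memory_order_relaxed);  // sees at least the published value
    return true;
}
```

In this sketch a writer loop would call `snapshotIfDirty` once per cycle and issue a write only when it returns `true`, which is what keeps an idle entity from generating sink traffic. Because the version bump is a release operation and the writer's load is an acquire, a writer that observes a given version also observes every counter update made before that bump.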
+ + + diff --git a/doc/component-stats/component-stats-reporting-hld.md b/doc/component-stats/component-stats-reporting-hld.md new file mode 100644 index 00000000000..90a5ef46216 --- /dev/null +++ b/doc/component-stats/component-stats-reporting-hld.md @@ -0,0 +1,302 @@ +# SONiC Component Statistics — Reporting HLD + +## Table of Content + +- [Revision](#1-revision) +- [Scope](#2-scope) +- [Definitions/Abbreviations](#3-definitionsabbreviations) +- [Overview](#4-overview) +- [Requirements](#5-requirements) +- [Architecture Design](#6-architecture-design) +- [High-Level Design](#7-high-level-design) +- [SAI API](#8-sai-api) +- [Configuration and management](#9-configuration-and-management) +- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) +- [Memory Consumption](#11-memory-consumption) +- [Restrictions/Limitations](#12-restrictionslimitations) +- [Testing Requirements/Design](#13-testing-requirementsdesign) +- [Open/Action items](#14-openaction-items) + +### 1. Revision + +| Rev | Date | Author | Change Description | +|-----|------------|---------------|-----------------------| +| 0.1 | 2026-05-12 | Yutong Zhang | Initial revision | + +### 2. Scope + +This HLD specifies how the service-level component counters produced by `swss::ComponentStats` (see the [Framework HLD](./component-stats-framework-hld.md)) are **reported** from a SONiC switch to off-box telemetry systems. + +For the initial revision there is exactly one reporting path: + +``` +component (swss/gnmi/...) + -> ComponentStats library + -> COUNTERS_DB (Redis) + -> telegraf (Geneva mdm pipeline) + -> Geneva +``` + +This HLD owns the **schema contract** between the producer (`ComponentStats`) and the consumer (telegraf). The deployment, configuration, and operation of the telegraf and mdm containers themselves are owned by the NDM "Geneva integration with SONiC" HLD; this document references them but does not duplicate them. + +Direct application-side OTLP export (e.g. 
the `OpenTelemetry SDK -> mdm` path described in the NDM HLD §4) is **not** part of this revision; it is listed as future work in §14. + +### 3. Definitions/Abbreviations + +| Term | Definition | +|-----------------|---------------------------------------------------------------------------------------------| +| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | +| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | +| Metric | A named `uint64` counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). Stored as a Redis hash field on the producer side; surfaces downstream as a wire metric named `_STATS_` (see §7.2 for the SWSS instance). | +| Label | A key/value attribute attached to a wire metric. The entity name (the part after the `:` in the Redis key) becomes the label value; the label name is component-specific (e.g. `swss.table` for SWSS — see §7.2). | +| ComponentStats | The reusable producer library specified in the Framework HLD. | +| `COUNTERS_DB` | The existing SONiC Redis database (logical DB 2) holding counter rows. | +| telegraf | The off-box-friendly metric agent running on the switch; configured and operated by NDM. | +| mdm | Geneva metric agent that consumes telegraf output and forwards it to Geneva. | +| NDM HLD | "Geneva integration with SONiC" HLD, owned by the NDM team. | + +### 4. Overview + +The Framework HLD specifies a producer that writes each component's service-level counters into `COUNTERS_DB` under a uniform key layout. To make those counters useful off-box, we need a stable contract between that producer and whatever agent harvests Redis on the switch and forwards data to Geneva. + +NDM has already designed and is rolling out a telegraf-based pipeline for harvesting `COUNTERS_DB` and forwarding to Geneva (see NDM HLD §5 "Existing stats collecting from Database via mdm"). 
This HLD therefore does **not** introduce a new transport. Instead it: + +1. **Defines the Redis schema** that the producer writes and that telegraf consumes (key layout, hash fields, types, dirty-tracking semantics). +2. **Specifies the SWSS-specific vocabulary** (`SWSS_STATS:<table>` with `SET` / `DEL` / `COMPLETE` / `ERROR`). +3. **States the conventions** that future components must follow so that telegraf can pick them up by pattern match without a per-component configuration change. + +The result is a thin, declarative contract between two teams: SONiC owns what is written; NDM owns how it is harvested and forwarded. + +### 5. Requirements + +**Functional** + +- R1. Every SONiC container that integrates `ComponentStats` shall expose its counters in `COUNTERS_DB` under the uniform key layout defined in §7.1. +- R2. The schema shall be discoverable by pattern match (`<COMPONENT>_STATS:*`) so that a single telegraf input definition can pick up all current and future components without code or configuration changes. +- R3. The SWSS facade (`SwssStats`) shall publish counters under `SWSS_STATS:<table>` with hash fields `SET`, `DEL`, `COMPLETE`, `ERROR` (decimal `uint64`). +- R4. The schema shall include a per-entity *update marker* (the version-bump in the producer; observable to telegraf as the row's hash value changing) so that idle rows are not re-emitted to Geneva every cycle. + +**Non-functional** + +- R5. The reporting path shall not require changes to the SONiC dataplane, syncd, SAI, or the existing Flex-Counter pipeline. +- R6. The reporting path shall not impose any on-the-wire dependency between SONiC and a specific off-box telemetry system. SONiC writes Redis; whatever consumes Redis is replaceable. +- R7. A failure of telegraf, mdm, or Geneva shall not affect the producer or any other SONiC service. + +**Out of scope** + +- Telegraf container packaging, lifecycle, and configuration. See NDM HLD §5.2 ("telegraf design"). +- mdm container deployment, KubeSonic rollout. See NDM HLD §3 and §6. +- Geneva endpoint, authentication, dashboards, alerting. +- Direct OTLP export from the application (see future work, §14). + +### 6. Architecture Design + +``` ++-------------------------- SONiC switch ---------------------------+ +| | +| +-- container (e.g. swss) -----------------------------------+ | +| | application -> ComponentStats library | | +| +------------------------+-----------------------------------+ | +| | HSET | +| v | +| +-------------------------+ | +| | COUNTERS_DB (Redis DB 2)| | +| | SWSS_STATS:PORT_TABLE | | +| | GNMI_STATS:/iface/... | | +| | BMP_STATS:... | | +| +-----------+-------------+ | +| | HSCAN / HGETALL | +| v | +| +-------------------------+ | +| | telegraf | (owned by NDM HLD §5.2) | +| +-----------+-------------+ | +| | | +| v | +| +-------------------------+ | +| | mdm | (owned by NDM HLD §4) | +| +-----------+-------------+ | +| | | ++--------------------------|----------------------------------------+ + v + +--------+ + | Geneva | + +--------+ +``` + +The boundary owned by this HLD is the box labelled `COUNTERS_DB`. 
Everything above it (the producer) is specified in the Framework HLD; everything below it (telegraf, mdm, Geneva) is specified in the NDM HLD. This HLD owns the **interface between the two**. + +### 7. High-Level Design + +#### 7.1 `COUNTERS_DB` key layout (the contract) + +For a component named `C` (case-insensitive at the API; rendered uppercase on the wire) and an entity `E`: + +``` +db: COUNTERS_DB (logical DB 2) +key: "<C>_STATS:<E>" +type: Redis hash +fields: each metric name -> decimal uint64 string +``` + +Properties guaranteed by the producer: + +- **Stable suffix `_STATS`.** Every component writes under `<C>_STATS:*` and only there, so telegraf can match `*_STATS:*` (or a per-component pattern such as `SWSS_STATS:*`) to discover all rows for that component without an allow-list. +- **Hash, never string.** Field names are metric names; values are decimal `uint64`. Telegraf can call `HGETALL` and produce one measurement per (key, field) pair. +- **Idle suppression.** A row is `HSET` only when at least one of its metrics changed during the producer's 1 s cycle. Rows that did not change are not rewritten. Therefore an idle SONiC produces zero extra Redis traffic and telegraf, when configured to detect "no change since last poll", produces no upstream traffic either. +- **No TTL.** Keys are not expired; their lifetime is the producer process. On container restart they are recreated by the next 1 s flush. +- **No deletion in v1.** Entities that disappear at the application layer leave their last `HSET` in Redis until the container restarts. Garbage collection is left to the application; the framework does not delete keys (this keeps the contract simple). 
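The key layout in §7.1 can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `makeStatsKey`, `parseStatsKey`, and `toHashFields` are hypothetical helpers that are not part of `swss::ComponentStats`, and the sketch assumes the component name is simply uppercased and joined to the entity with the `_STATS:` marker.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: builds "<C>_STATS:<E>" with the component uppercased.
inline std::string makeStatsKey(std::string component, const std::string& entity) {
    std::transform(component.begin(), component.end(), component.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return component + "_STATS:" + entity;
}

// Hypothetical helper: splits a key at the first "_STATS:" marker.
// Returns false when the key does not follow the layout.
inline bool parseStatsKey(const std::string& key, std::string& component, std::string& entity) {
    static const std::string marker = "_STATS:";
    const std::string::size_type pos = key.find(marker);
    if (pos == std::string::npos || pos == 0) {
        return false;
    }
    component = key.substr(0, pos);
    entity = key.substr(pos + marker.size());
    return true;
}

// Hypothetical helper: renders metric values as the decimal strings
// stored in the Redis hash fields.
inline std::map<std::string, std::string> toHashFields(
        const std::map<std::string, uint64_t>& metrics) {
    std::map<std::string, std::string> fields;
    for (const auto& kv : metrics) {
        fields[kv.first] = std::to_string(kv.second);
    }
    return fields;
}
```

A consumer following the same convention could use something like `parseStatsKey` to recover the component and the label value from a scanned key, while each hash field supplies the metric name and its decimal value.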
+ +Example for `componentName="SWSS"`, entity `PORT_TABLE`: + +``` +redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" +1) "SET" +2) "1283" +3) "DEL" +4) "17" +5) "COMPLETE" +6) "1300" +7) "ERROR" +8) "0" +``` + +The shape mirrors the existing `COUNTERS:*` keys produced by the Flex-Counter pipeline so that on-box tooling (`redis-cli`, `show ... stats`) needs no changes. + +#### 7.2 SWSS metric design + +When telegraf reads a `SWSS_STATS:<table>` hash from `COUNTERS_DB` and +forwards it via mdm, each `(key, field)` pair surfaces downstream as a +single metric with one label carrying the orchagent table name. The +SWSS facade emits the following four metrics: + +| Metric Name | Label List | Description | +|-----------------------|---------------|-----------------------------------------------------------------------------------| +| `SWSS_STATS_SET` | `swss.table` | Count of `SET` operations enqueued on the orchagent table named by the label. | +| `SWSS_STATS_DEL` | `swss.table` | Count of `DEL` operations enqueued on the orchagent table named by the label. | +| `SWSS_STATS_COMPLETE` | `swss.table` | Count of operations that finished successfully on the table. | +| `SWSS_STATS_ERROR` | `swss.table` | Count of operations that finished with error on the table. | + +Notes: + +- All values are monotonically increasing `uint64` counters. Consumers + compute rate-of-change; absolute values reset on container restart + (see §10). +- The label value (`swss.table`) is the orchagent table identifier + verbatim (e.g. `PORT_TABLE`, `VLAN_TABLE`, `ROUTE_TABLE`), so + dashboards can filter on a specific table without parsing the Redis + key. +- Mapping back to `COUNTERS_DB`: the metric `SWSS_STATS_<METRIC>` + corresponds to Redis key `SWSS_STATS:<table>` with hash field + `<METRIC>`; the label value is the `<table>` part of the key. See §7.1 + for the key layout and §7.4 for the dirty-tracking semantics that + guarantee idle entities do not produce reporting traffic. + +#### 7.3 Conventions for future components + +When onboarding a new component (`gnmi`, `bmp`, `telemetry`, …) using +the framework: + +1. Pick a stable, uppercase component name `C`. Counters land under + `C_STATS:*` automatically and surface downstream as metrics named + `C_STATS_<METRIC>`. +2. Define a short, finite vocabulary of `<METRIC>` names that describe + the event classes the component cares about (e.g. `SUBSCRIBE`, + `GET`, `SET`, `ERROR`). 
Avoid putting cardinality-heavy values + (interface name, neighbour IP, gNMI path) inside the metric name; + put them in the entity (`E`) so they become the label value + downstream. Telegraf reads the entity from the Redis key and the + metric from the hash field, so dashboards can pivot freely without + metric-name explosion. +3. Document the vocabulary in the component's own HLD as a Metric + Name | Label List | Description table, identical in shape to §7.2. + A typical label name is `c.entity` for a generic component, or a + domain-specific synonym such as `gnmi.path` / `bmp.peer` / + `swss.table` when that reads better on dashboards. + +No telegraf configuration change is required to onboard a new +component, provided telegraf is configured to scan `*_STATS:*` patterns +(NDM HLD §5.2.1). + +#### 7.4 Interaction with the producer + +The producer (specified in the Framework HLD) maintains a per-entity *version* counter that is bumped on every `increment()` / `setValue()`. The 1 s writer thread snapshots only entities whose version changed since the last cycle and issues one `HSET` per dirty entity. As a result: + +- A row that has not changed since the previous cycle is **not** rewritten — telegraf and Redis monitoring both see this as no activity. +- A row that has changed even once is rewritten with the latest cumulative values, so the next `HGETALL` always returns the latest snapshot. +- There is no risk of telegraf reading a half-written row: each `HSET` is atomic on the Redis side, and a single `HSET` writes all fields of the entity together. + +#### 7.5 Telegraf interface (consumed, not specified here) + +Telegraf is expected to: + +- Run on the switch alongside the SONiC containers (NDM HLD §5.2.2 "telegraf container"). +- Scan `COUNTERS_DB` for keys matching `*_STATS:*`. 
+- Convert each `(key, field)` pair into a metric in the schema defined + by §7.2 / §7.3 of this HLD: the metric name is + `<COMPONENT>_STATS_<metric>`, the entity part of the Redis key + becomes the label value, and the label name is component-specific + (e.g. `swss.table` for SWSS; see §7.2). The hostname is attached as + an additional label by telegraf itself. +- Forward to mdm. + +The exact telegraf configuration (input plugin, polling interval, output to mdm) is owned by the NDM HLD §5.2.1. This HLD only commits to the schema described in §7.1 / §7.2 / §7.3. + +### 8. SAI API + +No SAI API changes are required. This HLD covers a Redis schema and an interface to a consumer agent; SAI is not involved. + +### 9. Configuration and management + +Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Operator-facing configuration of telegraf / mdm is documented in the NDM HLD. + +### 10. Warmboot and Fastboot Design Impact + +The Redis schema is process-local: keys live in `COUNTERS_DB` for the duration of the producer container. On warmboot / fastboot the producer container restarts, the keys are recreated at the next 1 s flush, and counters start again from zero (see Framework HLD §10). Telegraf treats the appearance of fresh keys as new measurements; consumers compute rate-of-change and tolerate the reset. + +No boot-critical-chain dependency is added. + +### 11. Memory Consumption + +The reporting path adds no new in-container state beyond what the Framework HLD already describes for the DB sink (one Redis client per producer instance). Redis-side memory is bounded by the number of `(component, entity)` rows × the number of fields × the size of a `uint64` ASCII string; for the SWSS facade this is on the order of tens of rows × four fields. + +Telegraf and mdm memory are owned by the NDM HLD. + +### 12. Restrictions/Limitations + +- The schema is hash-only.
 Field values are decimal `uint64` strings; non-numeric fields are not supported. Components that need richer types must use a different reporting path (out of scope). +- The schema does not encode metric units. Units are implicit in the metric name (events) for v1; if a future component needs to report bytes / seconds / etc. it should put the unit in the metric name (e.g. `BYTES_RX`) until a more elaborate schema is introduced. +- Entity names are opaque strings. They must be safe for use as a Redis key suffix and for use as an attribute value downstream; in practice all SONiC table names already satisfy this. +- No deletion in v1 (see §7.1). Stale rows accumulate until container restart. + +### 13. Testing Requirements/Design + +#### 13.1 Unit / library tests + +The library-level invariants (`HSET` on dirty entities, idle suppression, field naming) are covered by the Framework HLD unit-test suite (`componentstats_ut.cpp`). No additional unit tests are introduced by this HLD. + +#### 13.2 System tests + +- Boot a `sonic-vs` image that includes the Framework HLD's two companion PRs. +- Exercise orchagent so that the SWSS facade increments counters (e.g. `config vlan add`, `config interface ip add`). +- Verify the schema directly in Redis: + ``` + redis-cli -n 2 KEYS "SWSS_STATS:*" + redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" + ``` + Confirm that: + - The key shape matches §7.1. + - All four SWSS fields (`SET`, `DEL`, `COMPLETE`, `ERROR`) are present and are decimal integers. + - After a quiescent dwell, no `HSET` traffic is observed (idle suppression). +- End-to-end with telegraf (on a testbed configured per the NDM HLD): + exercise orchagent and confirm the four metrics defined in §7.2 + (`SWSS_STATS_SET` / `SWSS_STATS_DEL` / `SWSS_STATS_COMPLETE` / + `SWSS_STATS_ERROR`) arrive in Geneva carrying the `swss.table` label + for the exercised orchagent tables. + +### 14. Open/Action items + +- The single reporting path in this revision is `COUNTERS_DB -> telegraf -> mdm -> Geneva`. Direct OTLP export from the application (the `OpenTelemetry SDK -> mdm` path described in NDM HLD §4) is a possible future addition; it would be specified in a future revision of this document if and when SONiC components need lower reporting latency than 1 s polling can provide. +- Garbage collection of stale `*_STATS:` keys on long-lived containers is left for a future revision. The current behaviour (cleared on container restart) is sufficient for the planned consumers. +- When additional components (`gnmi`, `bmp`, `telemetry`, …) adopt the framework, each one should add its vocabulary table (in the `Metric Name | Label List | Description` shape of §7.2) to **its own component's HLD**, following the conventions in §7.3. A cross-reference to that table may optionally be added to §7.3 of this HLD so that all known vocabularies are discoverable from one place. + +
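
For illustration only, the key-to-metric mapping described in §7.2 and §7.5 can be sketched in a few lines of Python. This is not telegraf's implementation, which is owned by the NDM HLD. The names `WireMetric` and `to_wire_metrics` are hypothetical, the label name is passed in by the caller because it is component-specific, and the hostname label that telegraf attaches is left out. The sample row is the `SWSS_STATS:PORT_TABLE` example from §7.1.

```python
from typing import Dict, List, NamedTuple

UINT64_MAX = 2**64 - 1


class WireMetric(NamedTuple):
    """Hypothetical in-memory form of one downstream metric."""
    name: str
    labels: Dict[str, str]
    value: int


def to_wire_metrics(redis_key: str, fields: Dict[str, str],
                    label_name: str) -> List[WireMetric]:
    """Turn one `<COMPONENT>_STATS:<entity>` hash into one metric per field."""
    # Split at the first ':' only, so the entity may itself contain ':'.
    prefix, sep, entity = redis_key.partition(":")
    if not sep or not entity or not prefix.endswith("_STATS"):
        raise ValueError(f"not a *_STATS:* key: {redis_key!r}")

    metrics = []
    for field, raw in fields.items():
        # Field values are decimal uint64 strings (see §12).
        value = int(raw)
        if not 0 <= value <= UINT64_MAX:
            raise ValueError(f"{field}={raw!r} is outside the uint64 range")
        # Metric name is the key prefix plus the hash field;
        # the entity becomes the value of the single label.
        metrics.append(WireMetric(f"{prefix}_{field}", {label_name: entity}, value))
    return metrics


# The HGETALL result from §7.1, as a field -> value mapping.
row = {"SET": "1283", "DEL": "17", "COMPLETE": "1300", "ERROR": "0"}
metrics = to_wire_metrics("SWSS_STATS:PORT_TABLE", row, "swss.table")
for m in metrics:
    print(m.name, m.labels, m.value)
```

Because the entity travels as a label value rather than as part of the metric name, adding more tables increases the number of label values but leaves the set of metric names fixed at the size of the component's vocabulary.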