Add reusable ComponentStats library for per-container statistics.#1180
Add reusable ComponentStats library for per-container statistics.#1180yutongzhang-microsoft wants to merge 8 commits into
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
feb05e7 to
bc9b47c
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Temporarily repoint src/sonic-swss and src/sonic-swss-common to the forks that carry the ComponentStats refactor so CI can build an end-to-end image and exercise the new SwssStats thin wrapper. - sonic-swss-common -> yutongzhang-microsoft/sonic-swss-common@bc9b47c (feature/component-stats, PR sonic-net/sonic-swss-common#1180) - sonic-swss -> yutongzhang-microsoft/sonic-swss@f77ef29 (swss-stats-use-componentstats, PR sonic-net/sonic-swss#4516) This commit will be revised once both PRs merge into their respective upstream repos. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Introduces a new reusable ComponentStats facility in sonic-swss-common to provide container-agnostic in-memory counters with periodic Redis flushing, intended to be shared by multiple SONiC components (orchagent/gnmi/bmp/telemetry, etc.).
Changes:
- Added
swss::ComponentStatsimplementation with singleton-by-component registry, atomic counters, dirty tracking, and a background writer thread that flushes to Redis. - Added Google Test unit/integration coverage for core behaviors (increments, singleton semantics, concurrency, Redis flushing, dirty-tracking, shutdown, and connect-retry).
- Wired new sources into the autotools build/test targets.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
common/componentstats.h |
Declares the new ComponentStats API and documents threading/behavior contracts. |
common/componentstats.cpp |
Implements singleton registry, atomic counter storage, and Redis writer thread with connect-retry and periodic flush. |
common/Makefile.am |
Adds componentstats.cpp to the common library build. |
tests/componentstats_ut.cpp |
Adds unit + Redis integration tests for ComponentStats. |
tests/Makefile.am |
Adds the new test file to the test binary build. |
| * | ||
| * Each instance writes counters to a Redis hash of the form: | ||
| * <COMPONENT>_STATS:<entity> | ||
| * where each field is a user-defined metric name (e.g. SET, DEL, GET, ERROR…). |
There was a problem hiding this comment.
The comment uses a Unicode ellipsis character ("…") inside the public API documentation. This repository otherwise appears to use ASCII-only source, and non-ASCII punctuation can cause tooling/encoding issues. Consider replacing it with "...".
| * where each field is a user-defined metric name (e.g. SET, DEL, GET, ERROR…). | |
| * where each field is a user-defined metric name (e.g. SET, DEL, GET, ERROR...). |
| /// setValue() become no-ops; existing in-memory counters are preserved | ||
| /// and the writer thread keeps running (it just has nothing new to | ||
| /// flush, so previously-flushed values stay in Redis as-is). |
There was a problem hiding this comment.
The setEnabled() documentation says the writer thread will have "nothing new to flush" and Redis values will stay "as-is" while disabled, but writerThread() does not check m_enabled and can still flush updates that were made just before disabling (or flush after re-enabling without a clear policy). Either update the doc to match the behavior, or gate flushing on m_enabled (and decide whether dirty state should be preserved for a later flush).
| /// setValue() become no-ops; existing in-memory counters are preserved | |
| /// and the writer thread keeps running (it just has nothing new to | |
| /// flush, so previously-flushed values stay in Redis as-is). | |
| /// setValue() become no-ops and existing in-memory counters are preserved. | |
| /// The writer thread continues to run; disabling does not guarantee that | |
| /// counters already marked dirty before setEnabled(false) will not still | |
| /// be flushed to Redis. Redis therefore remains unchanged only if there | |
| /// was no pending dirty state at the time recording was disabled. |
| { | ||
| SWSS_LOG_WARN("ComponentStats[%s]: DB connect failed: %s. Retrying…", | ||
| m_component.c_str(), e.what()); | ||
| std::unique_lock<std::mutex> lock(m_mutex); | ||
| m_cv.wait_for(lock, std::chrono::seconds(m_intervalSec), | ||
| [this]{ return !m_running.load(std::memory_order_relaxed); }); | ||
| } |
There was a problem hiding this comment.
The log message contains a Unicode ellipsis character ("…"). For consistency and to avoid build/log encoding issues in environments that assume ASCII source files, prefer "...".
| try | ||
| { | ||
| m_table->set(keys[i], values[i]); | ||
| } | ||
| catch (const std::exception& e) | ||
| { | ||
| SWSS_LOG_WARN("ComponentStats[%s]: write failed for %s: %s", | ||
| m_component.c_str(), keys[i].c_str(), e.what()); | ||
| } | ||
| } |
There was a problem hiding this comment.
On Redis write failure, the code only logs and continues using the existing DBConnector/Table. If the underlying Redis connection is broken (e.g., Redis restart), subsequent flushes will likely fail forever and stats will never recover. Consider resetting m_table/m_db and re-entering the connect-retry logic (or otherwise rebuilding the connection) after repeated write errors.
| // Wait for at least two flush intervals to give the writer thread a | ||
| // chance to re-flush. With dirty-tracking it must stay at "999". | ||
| std::this_thread::sleep_for(std::chrono::milliseconds(2500)); | ||
|
|
||
| std::string value; | ||
| ASSERT_TRUE(tbl.hget("e", "X", value)); | ||
| EXPECT_EQ("999", value); |
There was a problem hiding this comment.
This test can pass even if the writer thread never performs another flush after the sentinel write (it only sleeps and then reads Redis). To make it a stronger regression test for the "skip unchanged" path, consider adding a separate change on a different entity/metric and waiting for it to flush, while asserting the sentinel value remains unchanged.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
972c3a1 to
d8d7dc6
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
banidoru
left a comment
There was a problem hiding this comment.
- Hot path is not lock-free — despite the design-goal claim, every
increment()/setValue()acquiresm_mutextwice (once ingetOrCreateEntity, once ingetOrCreateCounter), even when the entity+metric already exist. This is the critical perf gap. - No final flush on shutdown —
stop()wakes the writer thread, which breaks out of the loop without flushing dirty counters. Data written just before teardown is silently lost. - Singleton registry leaks entries — expired
weak_ptrslots are never pruned from the static registry map; long-lived processes cycling component names will accumulate dead entries. - Overall: solid Redis reconnect logic, clean API surface, good test coverage. The lock contention on the hot path needs fixing before this can deliver on the stated lock-free promise.
| * increment() / setValue() update the counter with std::atomic ops. | ||
| * The first reference to a new entity or metric still takes m_mutex | ||
| * to insert into the maps. | ||
| * - Deferred DB connection: the DBConnector is created inside the writer |
There was a problem hiding this comment.
The comment claims "Lock-free counter update: once an entity/metric pair has been seen, increment() / setValue() update the counter with std::atomic ops." — but the implementation in componentstats.cpp acquires m_mutex twice on every call (once in getOrCreateEntity, once in getOrCreateCounter), even when both already exist.
To actually deliver on this promise, increment()/setValue() need a lock-free fast path — e.g. cache the EntityStats& and Counter& at the call site and only fall back to the locked getOrCreate* on first use, or use a read-optimized pattern (shared_mutex with read-lock for lookups, write-lock only for inserts).
There was a problem hiding this comment.
Fixed in b21b74e. The hot path now goes through a single lookupOrCreate(entity, metric) that takes std::shared_mutex in shared_lock mode for the read-only lookup; concurrent increment() callers no longer serialize against each other. Only the first-ever insertion for a new (entity, metric) pair upgrades to an exclusive lock. The header doc and the design-goals section are updated to describe this accurately.
| { | ||
| if (!isEnabled() || n == 0) | ||
| { | ||
| return; |
There was a problem hiding this comment.
getOrCreateEntity and getOrCreateCounter each take m_mutex. For the hot path where the entity and metric already exist, this means two full mutex lock/unlock cycles per increment() call. With 8+ threads hammering counters, this serializes the "lock-free" path.
Consider at minimum combining them into a single lock acquisition, or better yet using std::shared_mutex (read-lock for lookup, write-lock for insert) so concurrent reads don't contend.
There was a problem hiding this comment.
Fixed in b21b74e. getOrCreateEntity + getOrCreateCounter have been merged into a single lookupOrCreate() that takes the lock at most once. The structural mutex is now std::shared_mutex, so the lookup uses a shared_lock (concurrent readers proceed in parallel) and only the first-time insert path upgrades to unique_lock.
| [this]{ return !m_running.load(std::memory_order_relaxed); }); | ||
| if (!m_running.load(std::memory_order_relaxed)) | ||
| { | ||
| break; |
There was a problem hiding this comment.
When m_running becomes false, the writer breaks out of the loop immediately without flushing dirty entities. Any counters updated since the last periodic flush are silently lost.
Add a final flush pass after the loop exits (while m_table is still valid) so that the last batch of dirty data lands in Redis before shutdown.
There was a problem hiding this comment.
Fixed in b21b74e. The flush logic is now factored into a private flushDirty() method, and writerThread() calls it one more time after the main loop exits — before joining — so any counter updates that landed after the last periodic flush are persisted to Redis on stop(). Best-effort: if Redis is already gone the final flush logs and moves on rather than blocking shutdown. New test FinalFlushOnShutdown covers this: uses intervalSec=60, dirties an entity, calls stop() immediately, and verifies the value lands in Redis.
| { | ||
| const std::string key = toUpper(componentName); | ||
| const uint32_t effectiveInterval = intervalSec == 0 ? 1 : intervalSec; | ||
|
|
There was a problem hiding this comment.
Expired weak_ptr entries are never removed from the registry. If a process creates and destroys many differently-named ComponentStats instances over its lifetime, this map grows without bound.
Easy fix: prune expired entries inside create() (e.g. iterate and erase expired slots before inserting), or erase the entry in the destructor under registryMutex().
There was a problem hiding this comment.
Fixed in b21b74e. create() now calls pruneExpiredRegistryLocked() (under registryMutex()) before inserting a new slot, so expired weak_ptr entries are reaped on each create() call. New test RegistryPrunesExpiredInstances releases one instance, creates 10 unrelated ones to force a prune cycle, then re-creates the original name and asserts it gets a fresh pointer.
| m_thread = std::make_unique<std::thread>(&ComponentStats::writerThread, this); | ||
| SWSS_LOG_NOTICE("ComponentStats[%s] initialized (db=%s, interval=%us)", | ||
| m_component.c_str(), m_dbName.c_str(), m_intervalSec); | ||
| } |
There was a problem hiding this comment.
intervalSec is clamped to 1 both here (line 87: m_intervalSec(intervalSec == 0 ? 1 : intervalSec)) and in create() (line 51: effectiveInterval = intervalSec == 0 ? 1 : intervalSec). The double-clamping is harmless but unnecessary — pick one location and document it there.
There was a problem hiding this comment.
Fixed in b21b74e. The ctor no longer clamps; create() is now the single chokepoint that ensures effectiveInterval >= 1, and the ctor's member-initializer just stores what it was passed.
banidoru
left a comment
There was a problem hiding this comment.
- The separation of
getOrCreateEntityandgetOrCreateCounterleads to inefficient double-locking on the hot path for new metrics. - Memory ordering for counter increments can be relaxed for better performance.
- Connection and reconnection logic for Redis could be more robust by using an exponential backoff strategy instead of a fixed interval.
- Individual Redis write failures are not retried, which could lead to data loss during transient network issues.
|
|
||
| // Returns a stable reference. Safe to keep after the lock is released | ||
| // because std::map does not invalidate element references on insert. | ||
| // INVARIANT: no code path may erase from m_entities or from |
There was a problem hiding this comment.
The two separate getOrCreate functions are called sequentially in increment() and setValue(), each taking the main mutex. This results in two separate lock/unlock cycles on the hot path for every new entity/metric pair, which is inefficient. Consider combining these into a single getOrCreateCounter(entity, metric) method that takes the lock once, creates the entity if needed, and then creates the counter if needed, returning a reference to the counter. This would halve the locking overhead for new metrics.
There was a problem hiding this comment.
Done in b21b74e — both methods are now a single lookupOrCreate(entity, metric) taking at most one lock acquisition (shared_lock on the hit path, unique_lock on the slow path that has to insert).
| { | ||
| std::lock_guard<std::mutex> lock(m_mutex); | ||
| auto it = e.metrics.find(metric); | ||
| if (it == e.metrics.end()) |
There was a problem hiding this comment.
Calling getOrCreateEntity and then getOrCreateCounter separately releases the mutex between the two operations. While std::map reference stability prevents a crash, there's a potential but unlikely ABA-style problem if an entity were to be created, deleted (if a delete API is ever added), and re-created. Combining them into one locked operation as suggested for componentstats.h would make this pattern more robust and future-proof.
There was a problem hiding this comment.
Resolved by the same change: lookupOrCreate() performs the entity lookup, the metric lookup, and (if needed) both inserts under a single lock acquisition. The lock is never released between the two halves, so there is no longer a window where another thread could race with an erase. As before, no erase API is exposed; if one is ever added, the layered structure stays consistent under a single unique_lock.
| if (it == e.metrics.end()) | ||
| { | ||
| // std::map::emplace constructs Counter in place; no move required. | ||
| it = e.metrics.emplace(std::piecewise_construct, |
There was a problem hiding this comment.
The memory order for this fetch_add can be relaxed to std::memory_order_relaxed. The subsequent fetch_add on the version with std::memory_order_release is sufficient to ensure that this counter update is visible to the writer thread before the version is updated. The release-acquire pair on the version number acts as the necessary memory barrier.
There was a problem hiding this comment.
Already the case — the existing code uses std::memory_order_relaxed on the counter fetch_add and std::memory_order_release on the version bump, exactly as you describe. The b21b74e refactor preserves that ordering; no change needed here.
| } | ||
|
|
||
| void ComponentStats::writerThread() | ||
| { |
There was a problem hiding this comment.
The retry logic here for the initial connection uses wait_for on the condition variable. This is fine, but it could be slightly more robust. If the Redis server is down during startup, this loop will retry every m_intervalSec. If m_intervalSec is large (e.g., 60s), the component might take a very long time to connect. A simple exponential backoff strategy (e.g., 1s, 2s, 4s, ... up to a max) for the retry delay would be more conventional and potentially recover faster without spamming on short intervals.
There was a problem hiding this comment.
Acked. On SONiC the writer connects to the local in-container Redis, which is typically ready within seconds of container start, so a fixed 1 s retry has been adequate in practice. Switching to exponential backoff is a reasonable hardening but I'd prefer to defer it to a follow-up so this PR stays focused on the correctness fixes (lock-free path, final flush, gauge race, registry leak). Filed mentally as a follow-up; will open a tracking issue if you prefer.
| values.push_back(std::move(row)); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
When a write failure occurs, there's no attempt to retry the failed write. The code logs the error and moves on. The consecutiveWriteFailures logic will eventually trigger a reconnect, but the specific updates that failed during this period are lost. A simple retry mechanism (e.g., retry the m_table->set call once or twice) for the individual write could improve data fidelity during transient network blips without adding much complexity.
There was a problem hiding this comment.
Acked. The current behavior is: a write failure logs and is skipped, and after kMaxConsecutiveWriteFailures=3 failures the writer drops the connection and reconnects (lastVersions.clear() ensures every entity gets re-flushed against the new connection with its latest cumulative value). So no monotonic data point is permanently lost — the next successful flush carries the up-to-date value. Adding per-call retry would help only for very-narrow transient blips and would complicate the failure-handling path; I'd prefer to leave the current "log + reconnect" policy as is. Happy to add per-call retry if you feel strongly.
| "dropping Redis connection and reconnecting", | ||
| m_component.c_str(), consecutiveWriteFailures); | ||
| m_table.reset(); | ||
| m_db.reset(); |
There was a problem hiding this comment.
The logic to re-establish the connection after failures is good. However, similar to the initial connection loop, it retries on the fixed m_intervalSec. Implementing an exponential backoff here could provide a more robust and responsive recovery behavior when Redis is down for an extended period.
There was a problem hiding this comment.
Same answer as the initial-connect case: deferring exponential backoff to a follow-up so this PR stays scoped to the correctness fixes. The reconnect path already triggers only after 3 consecutive failures, so the polling rate during outage is already bounded.
banidoru
left a comment
There was a problem hiding this comment.
- Overall: solid design — deferred DB connect, version-based dirty tracking, and stable
std::mapreferences are well thought out. setValuerace withincrement: concurrentstore+ version bump is not atomic relative tofetch_addinincrement()— an increment can land betweenstoreand the version bump, getting silently overwritten on the nextsetValuecall from another thread. Needscompare_exchangeloop or single-lock path forsetValue.- No Redis key cleanup: entities written to Redis are never deleted. Long-running containers with transient entity names will leak stale keys in COUNTERS_DB indefinitely.
- Test singleton leakage:
ComponentStats::createreturns process-lifetime singletons, so test order matters. Earlier tests' instances leak into later tests, risking flaky failures if component names collide. - Agree with R1/R2: the lock-free claim vs two-mutex-acquisition reality, missing final flush on shutdown, and registry leak are the highest-priority fixes.
| // Release ordering pairs with the acquire in writerThread() so that counter | ||
| // writes become visible before the version bump. | ||
| e.version.fetch_add(1, std::memory_order_release); | ||
| } |
There was a problem hiding this comment.
setValue has a data race with concurrent increment: thread A calls c.value.store(value) at line 163, then thread B's fetch_add lands, then thread A bumps e.version. The writer sees the new version but reads the post-increment value, which is correct — however, a second setValue call can silently discard the increment that happened between the store and the version bump, because the caller of setValue intended to set an absolute value but didn't account for concurrent increments.
This is inherent to mixing gauge-style setValue and counter-style increment on the same metric from different threads. At minimum, document this caveat prominently. If you want stronger guarantees, setValue should be done under the same lock or use a CAS loop.
There was a problem hiding this comment.
Fixed in b21b74e, partially. setValue() now takes m_structMutex in exclusive mode for the store + version bump, so its own observation is atomic and concurrent setValue() calls are serialized. However — as you point out — an increment() whose shared_lock lookup races ahead of setValue's exclusive lock can still land its fetch_add outside the lock, so the writer may observe value + n rather than value. Fully serializing that would require dragging every counter mutation under the lock, which defeats the lock-free hot path. The header doc now explicitly states this caveat: mixing gauge setValue and counter increment on the same metric from concurrent threads is best-effort, and callers should pick one style per metric for deterministic gauge semantics. New test ConcurrentIncrementAndSetValueDoNotCrash exercises the mixed path as a UB / safety check.
| // the version bump we're about to read. | ||
| uint64_t curVer = stats.version.load(std::memory_order_acquire); | ||
|
|
||
| auto vit = lastVersions.find(name); |
There was a problem hiding this comment.
Entities written to Redis (m_table->set(...)) are never cleaned up. If a container creates transient entity names (e.g., session IDs, task IDs), stale keys will accumulate in COUNTERS_DB indefinitely. Consider:
- Adding a
removeEntity()API that erases fromm_entitiesand callsm_table->del(). - Or documenting that entity names must be stable/bounded and callers are responsible for cleanup.
There was a problem hiding this comment.
Acked, declined for v1. The HLD (sonic-net/SONiC#2312, Reporting HLD §7.1 "No deletion in v1" and §12) explicitly states that entity GC is out of scope for the initial revision: stale rows live until the producer container restarts, and callers are expected to use stable / bounded entity names. The expected first consumer (SwssStats) has bounded entity cardinality = number of orchagent tables (~tens), so this is safe in practice. Adding removeEntity() is reserved for a future revision per the HLD's Open/Action items.
There was a problem hiding this comment.
Following up on this — wanted to lay out the actual design reason rather than just pointing at the HLD, since the HLD's "out of scope for v1" is a result of the reasoning below.
Why removeEntity() is non-trivial here
The lock-free hot path depends on std::map reference stability: lookupOrCreate() returns a Counter& / EntityStats&, releases the lock, and the caller then mutates atomics on those references concurrently:
cpp auto ref = lookupOrCreate(entity, metric); // shared_lock taken + released ref.counter.value.fetch_add(...); // lock-free atomic ref.entity.version.fetch_add(...);
If we expose removeEntity(), thread A could hold a Counter& while thread B erases the entity from the map — classic use-after-free.
Making erase safe requires one of:
shared_ptr<Counter>per metric: adds an atomic refcount op on every hot-path access (cache-line ping-pong is the dominant cost in busy counter workloads).- RCU / hazard pointers: significant code complexity and a quiescent-state mechanism.
- Take
unique_lockfor every mutation: kills the lock-free path entirely.
Why no current consumer needs it
All v1 / planned consumers have topology-bounded entity cardinality:
| Consumer | Entity | Cardinality |
|---|---|---|
| SwssStats (sonic-swss#4516) | orchagent table name | ~tens, stable for container lifetime |
| gnmi (future) | gNMI subscription path | bounded by subscriber config |
| bmp (future) | BMP peer address | bounded by BGP topology |
None of them mint short-lived entity names (no per-RPC IDs, per-session UUIDs, etc.), so growth is bounded and removeEntity() would not actually be exercised.
v1's "GC" mechanism
The Redis schema has no TTL; keys live for the container's process lifetime and are recreated on restart (HLD §10). Stale-key lifetime is bounded by container restart frequency, which is sufficient for the planned consumers.
When v2 would do this
If a future consumer needs truly short-lived entities (RPC IDs, session-scoped objects), the right move is to redesign the storage layer at the same time (e.g. switch to shared_ptr<Counter> or RCU) and document the new concurrency model. That's a non-trivial change deserving its own HLD revision, not a follow-up patch on top of this one.
So this is a deliberate design trade-off, not just deferred work. Happy to add the shared_ptr variant if you think the SwssStats / future consumers will need it sooner — let me know if you want me to spike that.
| TEST(ComponentStats, IncrementBasic) | ||
| { | ||
| auto s = ComponentStats::create("UT1"); | ||
| s->increment("entity_a", "SET"); |
There was a problem hiding this comment.
Each test creates a ComponentStats singleton keyed by component name, and these persist for the entire process lifetime (the registry holds weak_ptr, but the shared_ptr returned by create() is held until the test scope ends — and the writer thread may still be running). With unique names like UT1, UT2, etc., this works today, but:
- Each test leaks a writer thread until
stop()orshared_ptrdestruction — many tests = many threads alive simultaneously. - If anyone reuses a component name across tests, they'll silently share state.
Consider calling s->stop() at the end of each in-memory test to clean up the writer thread promptly.
There was a problem hiding this comment.
Fixed in b21b74e. Every test that constructs a ComponentStats instance now calls s->stop() at the end, so the writer thread is joined and the singleton is released before the next test runs. Distinct component names (UT1 … UT_*) are still used so that even if a release races with the registry prune, two tests cannot accidentally share state.
… leak This commit addresses banidoru's review comments on PR sonic-net#1180. * Hot path is now effectively lock-free. lookupOrCreate() takes a shared_lock on m_structMutex for the read-only lookup and only upgrades to an exclusive lock when a new (entity, metric) pair must be inserted. Concurrent callers no longer serialize against each other, matching the stated design goal. * getOrCreateEntity + getOrCreateCounter merged into a single lookupOrCreate(entity, metric) so the hot path acquires at most one lock instead of two. * writerThread now performs a final flush pass after the main loop exits, so counter updates that landed after the last periodic flush are persisted to Redis on stop() rather than being silently dropped. * setValue takes m_structMutex in exclusive mode for the store + version bump and the public doc explicitly warns that mixing setValue (gauge) and increment (counter) on the same metric from concurrent threads is best-effort -- callers must pick one style per metric for deterministic gauge semantics. * Singleton registry no longer leaks: create() prunes expired weak_ptr slots before insertion so long-running processes that cycle component names do not accumulate dead entries. * m_cv now pairs with a dedicated m_cvMutex (separate from the structural mutex) since std::shared_mutex is not compatible with std::condition_variable::wait_for. * intervalSec is now clamped once in create() and the constructor trusts the caller, removing the duplicate clamp. * Unit tests: every test that uses a writer thread now calls stop() to keep the per-component singletons from leaking writer threads across test cases. New tests cover the final-flush-on-shutdown guarantee, registry pruning of released instances, and the concurrent-increment-vs-setValue no-crash invariant. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…review fixes) Picks up review fixes from sonic-net/sonic-swss-common#1180 that address banidoru's feedback: * Lock-free hot path via shared_mutex + merged lookupOrCreate * Final flush on shutdown so no in-flight updates are dropped * Registry pruning of expired weak_ptr slots * setValue exclusive lock + documented gauge/counter mixing caveat Public API (increment/setValue/get/getAll/setEnabled/stop) is unchanged, so sonic-swss SwssStats facade does not need to move. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…+ sonic-swss-common#1180) Recreated on top of fresh upstream master to drop the conflicts from sonic-net#27456. Repoints two submodules to forks carrying the refactored library + thin-wrapper: | submodule | now | source PR | |---|---|---| | src/sonic-swss-common | b21b74e (feature/component-stats) | sonic-net/sonic-swss-common#1180 | | src/sonic-swss | 3710025 (swss-stats-use-componentstats) | sonic-net/sonic-swss#4516 | .gitmodules URLs are temporarily switched to https://github.com/yutongzhang-microsoft/... so CI can resolve the non-upstream SHAs. The swss-common SHA already includes banidoru's review fixes (lock-free hot path via shared_mutex + merged lookupOrCreate, final flush on shutdown, registry pruning, setValue exclusive lock + docs). Supersedes sonic-net#27456. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azpw run |
|
Retrying failed(or canceled) jobs... |
|
Retrying failed(or canceled) stages in build 1123777: ✅Stage Build:
|
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
[component-stats] Add HLD for SONiC component statistics ## What I'm doing Adding a new High-Level Design document at `doc/component-stats/component-stats-hld.md` that specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. The HLD introduces: 1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. 2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, as the first consumer. Counters are published to two sinks driven from a single in-process atomic snapshot: - **`COUNTERS_DB`** — for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (`redis-cli`, `show ... stats`). - **Local OpenTelemetry (OTLP) Collector sidecar** — so the same counters can be forwarded to off-box telemetry systems (e.g. Geneva mdm) that consume OTLP. ## Why SONiC already has dataplane counters (Flex-Counter / SAI), but no uniform mechanism for **service-level** counters such as orchagent task throughput, gNMI request rate, or BMP error counts. A naive per-container implementation would duplicate atomic counter management, dirty tracking, the writer thread, the Redis schema, and an OTLP exporter in every container — concurrency review, bug fixes, and on-the-wire schemas would all drift. This HLD specifies one reusable producer that any container can adopt with a ~100-line facade. ## Companion PRs - `sonic-net/sonic-swss-common` [#1180](sonic-net/sonic-swss-common#1180) — `swss::ComponentStats` library + unit tests. - `sonic-net/sonic-swss` [#4516](sonic-net/sonic-swss#4516) `SwssStats` thin facade over `ComponentStats` in `orchagent/`. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extracts the generic mechanism behind sonic-swss#4434 into a container-agnostic library living in sonic-swss-common, so that orchagent, gnmi, bmp, telemetry etc. can all share the same statistics plumbing. Key features: - Lock-free hot path (atomic counters, memory_order_relaxed increments) - Version-based dirty tracking (writer thread skips unchanged entities) - Deferred DB connection with retry (safe to construct before Redis is up) - Stable references via std::map (no rehash invalidation surprises) - condition_variable-based clean shutdown (no 1s teardown delay) - Per-component singleton keyed by name; Redis keys follow <COMPONENT>_STATS:<entity> - Runtime enable/disable knob (setEnabled) so CONFIG_DB can toggle at runtime Users define their own metric names; the library only provides storage and periodic Redis flushing. A companion PR in sonic-swss will refactor SwssStats to become a thin wrapper around ComponentStats. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Add Redis-backed tests covering the writer thread paths that the in-memory tests cannot reach: - WriterFlushesToRedis: verifies counters land in Redis through the writer thread (covers successful DB connect, flush loop, and Table::set call) - WriterSkipsUnchangedEntities: verifies version-based dirty tracking (covers the no-change skip branch) - DestructorStopsThread: verifies the shared_ptr destructor cleanly joins the writer thread - ConnectRetryOnBadDb: uses an unknown DB name to exercise the connect-retry catch block and confirm stop() promptly wakes the cv_wait Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: yutongzhang-microsoft <yutongzhang@microsoft.com>
…nter storage - create(): warn (don't silently ignore) when later callers pass a different dbName/intervalSec; document 'first call wins' in the header. - Drop unique_ptr<Counter> indirection: std::map::emplace constructs Counter in place and std::map never invalidates references on insert, so the extra heap allocation and dereference are unnecessary. - Header doc: clarify that 'lock-free hot path' applies to the counter update only (entity/metric creation still takes m_mutex), document intervalSec=0 clamp and ASCII-only component names, and capture the reference-stability invariant so future contributors don't break it. - setEnabled() doc: spell out that disabling preserves in-memory counters and previously-flushed Redis values. - writerThread(): comment why the post-retry !m_table guard is defensive. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: yutongzhang-microsoft <yutongzhang@microsoft.com>
…hanged test, doc/log fixes - componentstats.cpp: after N consecutive Redis write failures, drop the table/connector and re-enter the connect-retry loop so a Redis restart is recovered automatically instead of logging forever. - componentstats_ut.cpp: strengthen WriterSkipsUnchangedEntities by dirtying a second entity and asserting it flushes -- this fails fast if the writer thread is dead, instead of silently passing. - componentstats.h: clarify setEnabled() doc -- disabling does not abort an in-flight flush of already-dirty entities. - Replace Unicode U+2026 (...) with ASCII ... in header doc and log msg. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
… leak This commit addresses banidoru's review comments on PR sonic-net#1180. * Hot path is now effectively lock-free. lookupOrCreate() takes a shared_lock on m_structMutex for the read-only lookup and only upgrades to an exclusive lock when a new (entity, metric) pair must be inserted. Concurrent callers no longer serialize against each other, matching the stated design goal. * getOrCreateEntity + getOrCreateCounter merged into a single lookupOrCreate(entity, metric) so the hot path acquires at most one lock instead of two. * writerThread now performs a final flush pass after the main loop exits, so counter updates that landed after the last periodic flush are persisted to Redis on stop() rather than being silently dropped. * setValue takes m_structMutex in exclusive mode for the store + version bump and the public doc explicitly warns that mixing setValue (gauge) and increment (counter) on the same metric from concurrent threads is best-effort -- callers must pick one style per metric for deterministic gauge semantics. * Singleton registry no longer leaks: create() prunes expired weak_ptr slots before insertion so long-running processes that cycle component names do not accumulate dead entries. * m_cv now pairs with a dedicated m_cvMutex (separate from the structural mutex) since std::shared_mutex is not compatible with std::condition_variable::wait_for. * intervalSec is now clamped once in create() and the constructor trusts the caller, removing the duplicate clamp. * Unit tests: every test that uses a writer thread now calls stop() to keep the per-component singletons from leaking writer threads across test cases. New tests cover the final-flush-on-shutdown guarantee, registry pruning of released instances, and the concurrent-increment-vs-setValue no-crash invariant. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI on the previous commit failed because swss-common's autotools build uses -std=c++14 (see configure.ac CFLAGS_COMMON), and std::shared_mutex is a C++17 addition. std::shared_timed_mutex is the C++14 ancestor that provides the same shared/exclusive locking API; the timed-wait portion is not used. Drop-in replacement, no behavior change. Fixes the "'shared_mutex' is not a member of 'std'" errors at: common/componentstats.h:200 common/componentstats.cpp:157 (and other shared_lock/unique_lock sites) Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…g batch The previous version threw on the first failed m_table->set() within a batch. Because lastVersions[name]=curVer is set in the snapshot phase (under shared_lock, before any write happens), entities sequenced AFTER the failing one would have their lastVersions slot set but never get written -- the next periodic pass would then see them as "unchanged" and silently skip them. Switched to per-entity try/catch in the write loop: log + rollback lastVersions[failed_key] and keep going. After the loop, if any failure happened, throw a summary error so writerThread can still bump the consecutiveWriteFailures counter and trigger reconnect if Redis is fully gone. Restores the per-write isolation that the original loop in writerThread had. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous version of this test asserted that re-creating a component with the same name (after the first instance was released and a churn of unrelated names triggered a prune cycle) would return a *different* raw pointer. That assumption is wrong: glibc's allocator is free to recycle the same address for the second instance, and in CI it does. Log from the failing run shows firstPtr=0x...49a0 == s2.get()=0x...49a0. Replaced the raw-pointer compare with a behavioural check: assert that the re-created instance starts with a zero counter for an entity the old instance had incremented to 7. If the registry had failed to prune (and somehow kept the old instance alive), the new get() would return 7 instead of 0. This tests the actual singleton-recycling invariant without relying on the allocator returning a fresh address. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
42ee6fe to
4575804
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
[component-stats] Add HLD for SONiC component statistics ## What I'm doing Adding a new High-Level Design document at `doc/component-stats/component-stats-hld.md` that specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. The HLD introduces: 1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. 2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, as the first consumer. Counters are published to two sinks driven from a single in-process atomic snapshot: - **`COUNTERS_DB`** — for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (`redis-cli`, `show ... stats`). - **Local OpenTelemetry (OTLP) Collector sidecar** — so the same counters can be forwarded to off-box telemetry systems (e.g. Geneva mdm) that consume OTLP. ## Why SONiC already has dataplane counters (Flex-Counter / SAI), but no uniform mechanism for **service-level** counters such as orchagent task throughput, gNMI request rate, or BMP error counts. A naive per-container implementation would duplicate atomic counter management, dirty tracking, the writer thread, the Redis schema, and an OTLP exporter in every container — concurrency review, bug fixes, and on-the-wire schemas would all drift. This HLD specifies one reusable producer that any container can adopt with a ~100-line facade. ## Companion PRs - `sonic-net/sonic-swss-common` [sonic-net#1180](sonic-net/sonic-swss-common#1180) — `swss::ComponentStats` library + unit tests. - `sonic-net/sonic-swss` [#4516](sonic-net/sonic-swss#4516) `SwssStats` thin facade over `ComponentStats` in `orchagent/`. Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
What I did
Added a new reusable library
swss::ComponentStatsunderlibswsscommon, together with its unit tests. The library lets any SONiC container publish service-level counters intoCOUNTERS_DBunder a uniform<COMPONENT>_STATS:<entity>key layout.This is a purely additive change — no existing API is modified or removed, and no current consumer of
libswsscommoncalls into the new code, so behaviour of every existing SONiC binary is unchanged.Why I did it
SONiC already has dataplane counters (Flex-Counter → syncd →
COUNTERS_DB). What we are missing is service-level counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without them we cannot answer questions like "is orchagent draining tasks?", "is gNMI seeing subscribe failures?", "is one container dropping more events than its peers?".If each container rolled its own plumbing for this, every container would carry its own concurrency review, the bug fixes would drift, and the on-the-wire schemas would diverge. A single shared library is the only way to keep these properties consistent across SONiC.
Full motivation and design are in the HLD: sonic-net/SONiC#2312.
How I did it
tests/componentstats_ut.cpp, 9 cases) covering basic round-trip, gauge semantics, runtime enable/disable, bulk read, singleton behaviour, and 8-thread × 10 000-increment concurrency.Related