perf(nn): amortise CompileIP/CompileDNS via per-container matcher cache#42
perf(nn): amortise CompileIP/CompileDNS via per-container matcher cache#42entlein wants to merge 3 commits into
Conversation
Profile-checksum-invalidated cache of compiled networkmatch.IPMatcher /
DNSMatcher per (containerID, neighborIndex). The previous code path
re-compiled every NetworkNeighbor's entries on each CEL function-cache
miss; this PR builds each matcher at most once per profile-checksum
lifetime and reuses it across subsequent misses.
Design:
matcherCache (sync.Map) inside nnLibrary, zero-value safe so existing
test fixtures that construct nnLibrary{} directly continue to work
without changes.
Per-container entry tagged with the profile's SyncChecksumMetadataKey
annotation. On lookup: if checksum matches, reuse; else allocate a
fresh containerMatchers and store with LoadOrStore (concurrent-safe).
Per-neighbor matchers are nil-init and lazily compiled on first use,
so a profile with 10 egress entries that only ever fires through 2 of
them pays compile cost for only those 2.
Benchmarks (arm64, -benchtime=1s):
IP, realistic profile (5 neighbors x 3 entries, observation misses all):
Cold (per-call recompile): 1733 ns/op 1920 B/op 76 allocs/op
Hot (cached matchers) : 177 ns/op 32 B/op 2 allocs/op
~ -90% time, -98% bytes, -97% allocs
DNS, realistic profile:
Cold: 1219 ns/op 1800 B/op 41 allocs/op
Hot : 318 ns/op 272 B/op 7 allocs/op
~ -74% time, -85% bytes, -83% allocs
Churning profile (checksum flips every iteration — pathological):
1527 ns/op 1936 B/op 77 allocs/op
Matches cold path: cache overhead itself is negligible; the savings
come strictly from amortising compile across stable-checksum windows.
In production this stacks on top of the existing CEL functionCache
(which already absorbs same-(containerID,observed) cache hits). The
matcher cache catches what slips through: unique-observation cache
misses within a profile-checksum lifetime.
Touched:
- matcher_cache.go new file: cache impl
- matcher_cache_bench_test.go new file: comparison bench
- network.go use cached matchers in all 6 CEL fns
- nn.go matcherCache field on nnLibrary
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds a per-container matcher cache to the network-neighborhood CEL library with lazy, atomic per-neighbor compilation of IP/DNS matchers, integrates the cache into CEL IP/DNS/port functions, and includes benchmarks and tests exercising hot/cold and churn scenarios. ChangesNetwork Neighborhood Matcher Caching
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache.go`:
- Around line 61-68: The current use of c.m.LoadOrStore returns an existing
containerMatchers even when it is stale (not just checksum-mismatched), causing
stale shape entries to be reused; replace the LoadOrStore logic in
matcher_cache.go with explicit Load and validation: call c.m.Load(containerID),
if present cast to *containerMatchers (containerMatchers) and check both
checksum and the staleness condition used earlier (the neighbor-count/shape
validation used around Lines 49–50); if the existing cm is stale or checksum
differs, call c.m.Store(containerID, fresh) and return fresh; if no entry was
found, store and return fresh. Keep references to c.m.Load, c.m.Store,
containerMatchers and checksum in your changes.
- Around line 79-96: The ip and dns pointer fields are being lazily populated
without synchronization causing race-detector failures; add per-field sync.Once
members (e.g., ipOnce and dnsOnce) to the neighborMatchers struct and change
containerMatchers.ipMatcher and dnsMatcher to call the appropriate Once.Do
wrapper that runs the existing compile/assignment logic so compilation still
happens at most once per field while preserving lazy-init semantics and avoiding
races (keep the existing entries construction and
networkmatch.CompileIP/CompileDNS calls inside the Once.Do closures and assign
the result to nm.ip / nm.dns).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 665f1acd-13bb-4898-8755-4a4c83f90081
📒 Files selected for processing (4)
pkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache.gopkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache_bench_test.gopkg/rulemanager/cel/libraries/networkneighborhood/network.gopkg/rulemanager/cel/libraries/networkneighborhood/nn.go
…s replace (CR #42) Two findings from CodeRabbit round 1, both fixed: 1. Stale-entry shape race in getOrBuild (Major) Old code used LoadOrStore on the staleness path and only replaced on checksum mismatch — but a shape mismatch (neighbor count change) could leak the stale entry to a caller whose profile has a different shape, which then index-panics in ipMatcher/dnsMatcher. Fix: when staleness is detected (by checksum OR shape), always Store unconditionally. Worst-case contention: several goroutines build shape-correct fresh entries and one Store wins; all callers still see a shape-correct entry. Orphans get GC'd. 2. Unsynchronised lazy-init of per-neighbor matchers (Critical) neighborMatchers.ip / .dns were *Matcher with a non-atomic 'if nil then build then assign' pattern — a real data race. Fix: switched to atomic.Pointer[networkmatch.IPMatcher] (and DNS). First-build callers may race on Compile but only one pointer wins via CompareAndSwap; everyone returns the winning matcher. Pure functions (no shared state) so duplicate Compile work is wasteful but not incorrect. New tests in matcher_cache_test.go pin the contract: - TestMatcherCache_ConcurrentFirstBuild: 64 goroutines racing on the same slot, run under -race, asserts matchers are populated exactly once - TestMatcherCache_StaleEntryReplaced: shape-mismatch path returns a fresh containerMatchers, not the stale one - TestMatcherCache_ChecksumPreservedAcrossCalls: same checksum hits cache (no rebuild) Benchmarks re-run after atomic.Pointer switch — negligible impact (177 → 186 ns/op, still 8x faster than cold path). All headline savings preserved.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Performance Benchmark ResultsNode-Agent Resource Usage
Dedup EffectivenessNo data available. Event Counters
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache_test.go`:
- Around line 50-61: Add a start barrier so all goroutines begin the contested
calls at the same time: create a start channel (e.g., start := make(chan
struct{})) before launching the loop, have each goroutine block on <-start (or
<-start with a receive) immediately inside the goroutine before calling
lib.wasAddressInEgress and lib.isDomainInEgress, and after spawning all
goroutines close(start) to release them simultaneously; reference the test's
goroutines/WG setup and the calls to lib.wasAddressInEgress and
lib.isDomainInEgress when applying the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: f5a58d1c-711e-4226-a3e9-2b20ea7c79b2
📒 Files selected for processing (2)
pkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache.gopkg/rulemanager/cel/libraries/networkneighborhood/matcher_cache_test.go
…und 2) Without the barrier, goroutine launch jitter staggers first-call arrivals, hiding any unsynchronised-write data race during the first-build window. With the barrier, all 64 goroutines hit the contended path simultaneously when close(start) fires — much tighter race-detector coverage of the atomic.Pointer.CompareAndSwap path.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Performance Benchmark ResultsNode-Agent Resource Usage
Dedup EffectivenessNo data available. |
Performance Benchmark ResultsNode-Agent Resource Usage
Dedup Effectiveness (AFTER only)
Event Counters
|
Summary
Follow-up to PR #41 (network-wildcards). Caches the compiled
`networkmatch.IPMatcher` / `DNSMatcher` per `(containerID, neighborIndex)`,
invalidated by the profile's `SyncChecksumMetadataKey` annotation. Previously
every CEL function-cache miss re-compiled every `NetworkNeighbor`'s entries;
now each neighbor's matcher is built at most once per profile-checksum
lifetime and reused across subsequent misses.
This builds on top of the existing CEL `functionCache` (which already
absorbs same-`(containerID, observed)` cache hits). The matcher cache
catches what slips through: unique-observation misses within a
profile-checksum window.
Benchmark evidence
Worst-case realistic profile (5 neighbors x 3 IP entries, observation misses
every entry), arm64, `-benchtime=1s`:
The churning case (pathological worst-case where every call invalidates
the cache) matches the cold-path baseline. Confirms the cache layer has
no measurable overhead of its own; the wins come strictly from
amortising compile across stable-checksum windows.
Design
```
type matcherCache struct { m sync.Map } // containerID -> *containerMatchers
type containerMatchers struct {
checksum string
egress []neighborMatchers // lazy-built per-index
ingress []neighborMatchers
}
type neighborMatchers struct {
ip *networkmatch.IPMatcher // nil until first match call
dns *networkmatch.DNSMatcher
}
```
Zero-value safe: `matcherCache{}` is a valid empty cache, so existing test
fixtures that construct `nnLibrary{}` directly continue to work without
explicit init.
Per-neighbor matchers are nil-init and lazily compiled on first use, so a
profile with 10 egress entries that only ever fires through 2 of them
pays compile cost for only those 2.
Test plan
Base branch
This PR targets `feat/network-wildcards` (PR #41) because it depends on
the new `networkmatch.CompileIP` / `CompileDNS` API introduced there.
When #41 merges to main, the base of this PR will auto-rebase.