
Fail closed unusable userspace SNAT pools#1417

Merged
psaab merged 2 commits into master from codex/next-1377-snat-runtime
May 18, 2026


@psaab psaab commented May 18, 2026

Summary

  • preserve unusable source-NAT pool rules in the userspace snapshot with explicit pool_unusable reasons instead of dropping rule intent
  • split Rust source-NAT lookup into no-match, matched, and unavailable results, then fail closed at the four AF_XDP source-NAT decision sites before session creation/forwarding
  • record recent exceptions for missing/empty/invalid/wrong-family/exhausted pool failures, and update the #1377 docs (Feature gap: address-persistent SNAT pool mode silently degrades to round-robin in userspace-dp) plus the runtime contract guard

Validation

Copilot AI review requested due to automatic review settings May 18, 2026 04:06

Copilot AI left a comment

Pull request overview

This PR changes userspace AF_XDP source-NAT pool handling to preserve unusable pool rules in snapshots and fail closed at runtime instead of silently forwarding without SNAT.

Changes:

  • Adds pool_unusable metadata to Go/Rust source-NAT snapshots.
  • Splits Rust SNAT lookup into no-match, matched, and unavailable outcomes.
  • Updates AF_XDP SNAT decision sites, tests, contract guard, docs, and action log for the fail-closed contract.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| userspace-dp/tests/snat_contract_doc_guard.rs | Updates contract guard from fail-open to fail-closed expectations. |
| userspace-dp/src/protocol.rs | Adds unusable-pool fields to Rust snapshot protocol. |
| userspace-dp/src/nat.rs | Adds unavailable SNAT lookup results and fail reasons. |
| userspace-dp/src/nat_tests.rs | Updates/adds SNAT pool failure unit coverage. |
| userspace-dp/src/afxdp/poll_descriptor.rs | Applies fail-closed SNAT handling at AF_XDP decision sites. |
| userspace-dp/src/afxdp/mod.rs | Re-exports/imports new SNAT failure/result types. |
| userspace-dp/src/afxdp/forwarding/mod.rs | Adds forwarding helper returning structured SNAT lookup results. |
| README.md | Updates high-level userspace SNAT status. |
| pkg/dataplane/userspace/snapshot.go | Preserves unusable pool rules in Go snapshots. |
| pkg/dataplane/userspace/protocol.go | Adds unusable-pool fields to Go snapshot protocol. |
| pkg/dataplane/userspace/manager_test.go | Updates snapshot tests for preserved unusable rules. |
| docs/userspace-dataplane-gaps.md | Refreshes capability/gap status for fail-closed pool SNAT. |
| docs/userspace-dataplane-architecture.md | Updates architecture notes for SNAT fail-closed behavior. |
| docs/pr/1373-retire-ebpf-dataplane/README.md | Updates #1377 status summary. |
| docs/pr/1373-retire-ebpf-dataplane/plan.md | Updates retirement-plan dependency summary. |
| docs/pr/1373-retire-ebpf-dataplane/plan-1377-snat-pools.md | Rewrites #1377 runtime boundary from fail-open to fail-closed. |
| _Log.md | Records the PR action and validation commands. |
Comments suppressed due to low confidence (2)

userspace-dp/src/afxdp/poll_descriptor.rs:1237

  • This new fail-closed branch is only protected by the doc guard/source-text checks and the lower-level NAT lookup unit tests; I could not find an AF_XDP poll-path test that drives an unavailable SNAT pool through this branch and asserts the packet is recycled without creating/forwarding a session. Because these four branches are the production enforcement point for the fail-closed contract, please add a functional poll-path test (normal and pending-neighbor paths, or a shared helper-level test) that would fail if this continue were removed or the error were converted back to a default NAT decision.
                                                Err(failure) => {
                                                    record_source_nat_failure(
                                                        telemetry,
                                                        worker_ctx,
                                                        meta,
                                                        flow,
                                                        from_zone_id,
                                                        to_zone_id,
                                                        desc.len,
                                                        &failure,
                                                    );
                                                    binding.scratch.scratch_recycle.push(desc.addr);
                                                    continue;

userspace-dp/src/afxdp/poll_descriptor.rs:2155

  • This pending-neighbor fail-closed branch lacks a functional poll-path test that proves an unavailable SNAT pool prevents missing-neighbor seed session creation. The existing NAT lookup tests do not exercise this session-build retry path, so a regression here could still install an untranslated seed session while the source-text guard passes.
                                                Err(failure) => {
                                                    record_source_nat_failure(
                                                        telemetry,
                                                        worker_ctx,
                                                        meta,
                                                        flow,
                                                        from_zone_id,
                                                        to_zone_id,
                                                        desc.len,
                                                        &failure,
                                                    );
                                                    binding.scratch.scratch_recycle.push(desc.addr);
                                                    continue;
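
The kind of poll-path test requested in the two suppressed comments can be illustrated with a much smaller model. The following is a minimal sketch, not the real AF_XDP loop: `Desc`, `Outcome`, and `poll` are hypothetical stand-ins, and the NAT result is pre-computed per descriptor to keep the example self-contained. It only shows the shape of an assertion that would fail if the `continue` were removed, because the failing descriptor would then fall through to session creation.

```rust
/// Hypothetical descriptor: a buffer address plus a pre-computed NAT result.
/// In the real code the result would come from the source-NAT gate.
pub struct Desc {
    pub addr: u64,
    pub nat: Result<u32, &'static str>,
}

/// Hypothetical record of what one poll pass did.
#[derive(Default)]
pub struct Outcome {
    pub sessions: Vec<(u64, u32)>,
    pub recycled: Vec<u64>,
    pub exceptions: Vec<&'static str>,
}

/// Simplified poll loop: a failed NAT decision records an exception,
/// recycles the buffer, and skips session creation entirely.
pub fn poll(batch: &[Desc]) -> Outcome {
    let mut out = Outcome::default();
    for desc in batch {
        let decision = match desc.nat {
            Ok(decision) => decision,
            Err(reason) => {
                out.exceptions.push(reason);
                out.recycled.push(desc.addr);
                continue; // fail closed: no session, no forwarding
            }
        };
        out.sessions.push((desc.addr, decision));
    }
    out
}
```

A test against this model checks three things together for the failing descriptor: an exception was recorded, the buffer was recycled, and no session exists for it.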

Comment on lines +51 to +59
    record_exception(
        worker_ctx.recent_exceptions,
        &worker_ctx.ident,
        failure.exception_reason(),
        packet_length,
        Some(meta),
        Some(&debug),
        worker_ctx.forwarding,
    );

psaab commented May 18, 2026

Claude r1 review on 851c8111 — CLOSES THE #1377 RUNTIME FAIL-OPEN GAP

Verdict: MERGE-READY (pending Codex/Gemini)

This is the comprehensive #1377 closure. The architecture.md text we iterated on in prior PRs ("runtime remains fail-open at the poll_descriptor.rs source-NAT call sites") is now factually obsolete — this PR makes it fail-closed end-to-end.

Fail-closed gate verified (Rust side)

New nat.rs types:

pub(crate) enum SourceNatLookup {
    NoMatch,
    Matched(NatDecision),
    Unavailable(SourceNatFailure),
}

pub(crate) struct SourceNatFailure {
    pub(crate) rule_name: String,
    pub(crate) pool_name: String,
    pub(crate) reason: SourceNatFailureReason,
}

pub(crate) enum SourceNatFailureReason {
    MissingPool,
    EmptyPool,
    InvalidPool,
    InvalidPortRange,
    WrongAddressFamily,
    AllocatorExhausted,
}

New poll_descriptor.rs gate (source_nat_decision_for_flow): returns Result<NatDecision, SourceNatFailure>. The caller's match arms can now distinguish "no match" (default decision OK) from "unavailable" (fail-closed required).
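
As a rough illustration of how a three-way lookup can collapse into the `Result` the call sites match on, here is a minimal sketch. The enum and struct shapes mirror the types quoted above, but `NatDecision` is a hypothetical stand-in (its real fields are not shown in this thread) and `decision_from_lookup` is an illustrative name, not the actual `source_nat_decision_for_flow` implementation.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for the real NatDecision.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct NatDecision {
    pub rewrite_src: Option<IpAddr>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SourceNatFailureReason {
    MissingPool,
    EmptyPool,
    InvalidPool,
    InvalidPortRange,
    WrongAddressFamily,
    AllocatorExhausted,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceNatFailure {
    pub rule_name: String,
    pub pool_name: String,
    pub reason: SourceNatFailureReason,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SourceNatLookup {
    NoMatch,
    Matched(NatDecision),
    Unavailable(SourceNatFailure),
}

/// Illustrative gate: "no rule matched" is safe to treat as the default
/// (untranslated) decision, while "rule matched but pool unusable" must
/// surface as an error so the caller can drop the packet.
pub fn decision_from_lookup(lookup: SourceNatLookup) -> Result<NatDecision, SourceNatFailure> {
    match lookup {
        SourceNatLookup::NoMatch => Ok(NatDecision::default()),
        SourceNatLookup::Matched(decision) => Ok(decision),
        SourceNatLookup::Unavailable(failure) => Err(failure),
    }
}
```

The point of the split is that an `Option`-style lookup followed by `unwrap_or_default()` cannot tell the first and third cases apart, which is how the earlier fail-open behavior arose.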

Telemetry (record_source_nat_failure): increments counters.exception_packets, records to recent_exceptions with failure.exception_reason() string. Operator-visible AND categorized by reason.
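
A minimal sketch of the telemetry shape described above, under stated assumptions: only the strings `source_nat_pool_missing`, `source_nat_pool_empty`, and `source_nat_pool_invalid_port_range` appear in this thread; the other three reason strings, the `Telemetry` struct, the `RECENT_LIMIT` bound, and `record_failure` are hypothetical. In the PR, `exception_reason()` is called on the failure value; the sketch puts it on the reason enum for brevity.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceNatFailureReason {
    MissingPool,
    EmptyPool,
    InvalidPool,
    InvalidPortRange,
    WrongAddressFamily,
    AllocatorExhausted,
}

impl SourceNatFailureReason {
    pub fn exception_reason(self) -> &'static str {
        match self {
            // These three strings are quoted in the architecture doc excerpt.
            Self::MissingPool => "source_nat_pool_missing",
            Self::EmptyPool => "source_nat_pool_empty",
            Self::InvalidPortRange => "source_nat_pool_invalid_port_range",
            // Assumed spellings; the real strings are not shown in this thread.
            Self::InvalidPool => "source_nat_pool_invalid",
            Self::WrongAddressFamily => "source_nat_pool_wrong_family",
            Self::AllocatorExhausted => "source_nat_pool_exhausted",
        }
    }
}

/// Hypothetical bound on the recent-exception ring.
pub const RECENT_LIMIT: usize = 4;

/// Hypothetical telemetry holder: a packet counter plus a bounded ring.
#[derive(Default)]
pub struct Telemetry {
    pub exception_packets: u64,
    pub recent_exceptions: VecDeque<&'static str>,
}

/// Count the dropped packet and keep only the most recent reasons.
pub fn record_failure(telemetry: &mut Telemetry, reason: SourceNatFailureReason) {
    telemetry.exception_packets += 1;
    if telemetry.recent_exceptions.len() == RECENT_LIMIT {
        telemetry.recent_exceptions.pop_front();
    }
    telemetry.recent_exceptions.push_back(reason.exception_reason());
}
```

Keeping the counter unbounded and the ring bounded means a sustained failure stays visible in totals even after individual entries age out.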

Snapshot side (Go)

snapshot.go no longer SKIPS unusable pools — it marks them poolUnusable: true with poolUnusableReason so the rule reaches Rust which can fail-closed at decision time (rather than the rule disappearing from the snapshot and being silently bypassed).

This is a key design choice: marking-then-rejecting at Rust gives operator-visible telemetry on EACH dropped packet, vs silently omitting the rule and forwarding without SNAT.
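
The mark-then-reject idea can be sketched on the Rust side as a small parser from the snapshot fields to a failure reason. This is an illustration only: `"allocator_exhausted"` is the one reason string that appears in this thread (in the r3 test fixture); the other strings and the fallback for unknown reasons are assumptions, and `parse_unusable_reason` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceNatFailureReason {
    MissingPool,
    EmptyPool,
    InvalidPool,
    InvalidPortRange,
    WrongAddressFamily,
    AllocatorExhausted,
}

/// Map the snapshot's (pool_unusable, pool_unusable_reason) pair to a reason.
/// Returns None when the pool is usable, so the rule can translate normally.
pub fn parse_unusable_reason(pool_unusable: bool, reason: &str) -> Option<SourceNatFailureReason> {
    if !pool_unusable {
        return None;
    }
    Some(match reason {
        // Appears in the forwarding test fixture quoted in this thread.
        "allocator_exhausted" => SourceNatFailureReason::AllocatorExhausted,
        // Assumed spellings for the remaining snapshot reasons.
        "missing" => SourceNatFailureReason::MissingPool,
        "empty" => SourceNatFailureReason::EmptyPool,
        "invalid_port_range" => SourceNatFailureReason::InvalidPortRange,
        "wrong_family" => SourceNatFailureReason::WrongAddressFamily,
        // Assumption: an unrecognized reason still marks the rule unavailable,
        // so a newer control plane cannot accidentally reopen the fail-open path.
        _ => SourceNatFailureReason::InvalidPool,
    })
}
```

The design choice worth noting is the last arm: treating an unknown reason as usable would quietly restore the behavior this PR removes.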

Docs + guard test flipped

docs/userspace-dataplane-architecture.md:295-304 (new):

"Pool-mode rules with missing pools, empty pools, invalid port ranges, malformed addresses, or no address for the packet family fail-closed at the current poll_descriptor.rs source-NAT call sites before session creation or forwarding, and record recent-exception reasons such as source_nat_pool_missing, source_nat_pool_empty, and source_nat_pool_invalid_port_range."

The doc guard test was renamed from snat_contract_documents_current_fail_open_runtime to snat_contract_documents_current_fail_closed_runtime, actively pinning the new contract.

Hostile concerns for Codex

A. Are ALL four source-NAT call sites in poll_descriptor.rs (the ones we tracked through prior rounds) wired through source_nat_decision_for_flow? Grep match_source_nat_for_flow to verify the old path is gone.
B. AllocatorExhausted case: when does it fire? Pool full at allocation time — does telemetry distinguish exhaustion from missing-pool?
C. In-flight session behavior: a session created when pool was usable, then config change marks it unusable — does the in-flight session continue or get torn down?
D. Backward compat: an operator with a borderline-unusable pool in production — does commit check warn or does it silently apply and start dropping?
E. nat_tests.rs coverage — pins each SourceNatFailureReason variant?

Recommendation

MERGE-READY — multi-round saga concludes. The architecture is correct: snapshot marks → Rust decides fail-closed → telemetry records reason. Doc guard updated.

Awaiting Codex (task-mpaou1e4-wucm5a) + Gemini Pro 3 (task-mpaoufbi-fz5h20).

Not merging — author's decision.


psaab commented May 18, 2026

Round-1 triple-review synthesis on 851c8111: #1377 SAGA CLOSURE

| Reviewer | Verdict |
| --- | --- |
| Claude | MERGE-READY |
| Codex | MERGE-NEEDS-MINOR (runtime exception drops rule/pool names) |
| Gemini Pro 3 | MERGE-READY (with minor backward-compat note) |

r1 verifies the comprehensive #1377 closure

All three reviewers confirm the structural correctness:

Codex:

"The actual fail-closed runtime gate is in place before session creation/forwarding. All four poll_descriptor.rs source-NAT sites are gated. The old match_source_nat_for_flow(...).unwrap_or_default() pattern is gone."

Gemini:

"The legacy unwrap_or_default() fail-open calls were removed. All four source-NAT call sites in poll_descriptor.rs now match against source_nat_decision_for_flow(). On failure, they immediately recycle the buffer and break out of forwarding logic."

The SourceNatFailureReason enum covers exactly the architecture.md taxonomy: MissingPool, EmptyPool, InvalidPool, InvalidPortRange, WrongAddressFamily, AllocatorExhausted.

Codex MINOR — operator visibility incomplete

"record_source_nat_failure increments exception_packets and records a recent exception, but it drops failure.rule_name and failure.pool_name. There is no Rust runtime log line. Go snapshot build logs missing/empty/invalid-range pools with rule/pool, but runtime-only cases like malformed pool address, wrong family, and allocator failure are only visible as generic recent-exception reasons."

The exception_reason string identifies WHAT happened (e.g., source_nat_pool_missing), but not WHICH rule or pool. Operators triaging mixed-rule configs need both.

Gemini deliberate-break note

"Prior behavior allowed 'wrong-family' pools to be silently skipped so that a subsequent, compatible rule could match the packet (the fail-open bug pathway). This PR enforces strict top-down matching: if a rule matches the flow but the pool cannot translate the family, it now explicitly fails closed, shadowing any later rules."

This is correct firewall-matching precedence. Operators who relied on implicit fall-through will see behavior change — proven by the test renaming pool_snat_wrong_family_pool_fails_closed_before_later_rule. Worth a deployment-note callout.
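
The precedence change can be made concrete with a small first-match evaluator. This is a simplified sketch, not the project's matcher: `Rule`, `Lookup`, and `lookup` are hypothetical, each pool holds at most one address per family, and `matches_flow` stands in for the real zone/prefix match. The rule and pool naming in the usage note follows the forwarding test quoted in this thread ("wrong-family", a v6-only pool).

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical rule: whether it matches the flow, and one pool address per family.
pub struct Rule {
    pub name: &'static str,
    pub matches_flow: bool,
    pub pool_v4: Option<Ipv4Addr>,
    pub pool_v6: Option<Ipv6Addr>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Lookup {
    NoMatch,
    Matched { rule: &'static str, translated_src: IpAddr },
    WrongFamily { rule: &'static str },
}

/// Strict top-down matching: the first rule that matches the flow decides
/// the outcome, even if its pool has no address for the packet's family.
pub fn lookup(rules: &[Rule], src: IpAddr) -> Lookup {
    for rule in rules {
        if !rule.matches_flow {
            continue;
        }
        let pool_addr = match src {
            IpAddr::V4(_) => rule.pool_v4.map(IpAddr::V4),
            IpAddr::V6(_) => rule.pool_v6.map(IpAddr::V6),
        };
        // Returning here (rather than continuing on None) is what shadows later rules.
        return match pool_addr {
            Some(translated_src) => Lookup::Matched { rule: rule.name, translated_src },
            None => Lookup::WrongFamily { rule: rule.name },
        };
    }
    Lookup::NoMatch
}
```

With a v6-only rule listed ahead of a v4-capable rule, an IPv4 flow now stops at the first rule and fails closed; under the prior fall-through behavior it would have reached the second rule.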

Docs + guard test verified

Codex: "docs/userspace-dataplane-architecture.md removes the stale 'runtime remains fail-open' wording. snat_contract_doc_guard.rs was refreshed to require exactly four source_nat_decision_for_flow call sites and reject stale fail-open wording."

The test guard from prior PRs is now actively pinning the new contract.

In-flight session behavior verified

Gemini: "Active, in-flight sessions are unaffected as they rely entirely on the established fast-path reverse-session state map. The PR makes zero changes to session.rs or forwarding.rs active fast paths."

So config-change-while-sessions-active: existing flows continue, new flows after the change fail-closed.
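
A toy model of that ordering, assuming the established-session lookup runs before any source-NAT decision (as the Gemini quote describes). Everything here is hypothetical: `Dataplane`, integer flow ids, and the `pool_usable` flag are illustrative simplifications of the real session map and snapshot state.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Forwarded,
    Dropped,
}

/// Hypothetical dataplane: a session table keyed by flow id, plus a flag
/// standing in for "the matching rule's pool is usable in the current snapshot".
pub struct Dataplane {
    sessions: HashMap<u32, u32>,
    pub pool_usable: bool,
}

impl Dataplane {
    pub fn new() -> Self {
        Dataplane { sessions: HashMap::new(), pool_usable: true }
    }

    pub fn session_count(&self) -> usize {
        self.sessions.len()
    }

    pub fn handle(&mut self, flow: u32) -> Verdict {
        // Fast path: an established session never re-consults the NAT rules.
        if self.sessions.contains_key(&flow) {
            return Verdict::Forwarded;
        }
        // Slow path: a new flow needs a NAT decision; an unusable pool fails closed.
        if !self.pool_usable {
            return Verdict::Dropped;
        }
        self.sessions.insert(flow, flow + 1000); // placeholder translation
        Verdict::Forwarded
    }
}
```

In this model, flipping `pool_usable` to false after flow 1 is established leaves flow 1 forwarding while a new flow 2 is dropped without a session being created.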

Recommendation

Strongly consider in this PR (Codex MINOR): extend record_exception to carry rule_name + pool_name from SourceNatFailure. Adding these fields to the existing exception record makes operator triage immediate. Should be a single struct extension.

Defer: Rust runtime log line on each failure (debug log only — recent-exception telemetry is the right operator surface for now).

Document for operators: the deliberate wrong-family fail-closed break (Gemini note) — likely a release-notes / migration-guide entry.

#1377 closure

This PR concludes a multi-round saga. The architecture is correct: Go snapshot marks unusable rules → Rust evaluator decides fail-closed → telemetry records categorized reason. The doc guard pin and architecture.md updates close the documentation contract.

Codex task: task-mpaou1e4-wucm5a. Gemini task: task-mpaoufbi-fz5h20. Not merging — author's decision.


psaab commented May 18, 2026

Claude r2 review on 773754a3

Verdict: MERGE-READY (pending Codex/Gemini)

r1 MINOR closure: rule/pool names surfaced in SNAT exceptions

r2 diff shows the right shape — disposition.rs +104/-30 (main change), protocol.rs +4 (wire fields added), statusfmt.go +6 (operator render), forwarding/tests.rs +41 (test coverage), statusfmt_test.go +4.

The Rust+Go protocol.rs / protocol.go pattern from prior PRs (e.g. #1410 wire fields) suggests this PR adds rule_name + pool_name as new wire fields on the exception record.

Need Codex to verify:

  • Exception struct now carries rule_name + pool_name in disposition.rs
  • Both Rust protocol.rs AND Go protocol.go have matching new fields (the cross-language wire discipline)
  • statusfmt.go renders the new fields in show ... exceptions output
  • forwarding/tests.rs pins each named-failure case (missing/empty/invalid-range/malformed/wrong-family/exhausted) propagating rule+pool

Recommendation

MERGE-READY structurally — the closure pattern matches what I asked for. Awaiting Codex (task-mpapya9a-m8xuea) + Gemini Pro 3 (task-mpapyj0h-77y36a) for verification.

Not merging — author's decision.

@psaab psaab force-pushed the codex/next-1377-snat-runtime branch from 773754a to 9276a1a Compare May 18, 2026 04:45

psaab commented May 18, 2026

Round-2 triple-review synthesis on 773754a3

| Reviewer | Verdict |
| --- | --- |
| Claude | MERGE-READY |
| Codex | MERGE-NEEDS-MINOR (forwarding test only pins WrongAddressFamily; allocator-exhausted untested) |
| Gemini Pro 3 | MERGE-READY |

r1 MINOR closed — both reviewers verify

Codex:

"Yes. Rust protocol.rs exports rule_name / pool_name; Go protocol.go has matching RuleName string `json:'rule_name,omitempty'` and PoolName. record_source_nat_failure now calls record_source_nat_exception, which copies failure.rule_name and failure.pool_name into the exception before pushing it. statusfmt.go renders non-empty values as rule= and pool=, and statusfmt_test.go checks rule=snat-a / pool=pool-a."

Gemini: quoted verification of all three checks (exception fields + wire + statusfmt).

Codex MINOR — forwarding test scope

"forwarding/tests.rs adds one forwarding-level case, WrongAddressFamily, asserting rule_name='wrong-family' and pool_name='v6-only'. The other named cases are not pinned there. Lower-level nat_tests.rs already covers missing, empty, invalid range, malformed/invalid pool, and wrong-family with rule/pool identity, but I found no allocator-exhausted propagation test."

The cross-layer coverage:

  • nat_tests.rs: pins identity at the matcher layer for missing/empty/invalid/malformed/wrong-family
  • forwarding/tests.rs: pins identity at the forwarding-runtime layer ONLY for wrong-family
  • Allocator-exhausted: untested at either layer

Defensible to ship as-is since the nat_tests coverage proves the identity flows through SourceNatFailure::for_rule. Allocator-exhausted is a real gap worth a follow-up.

Gemini backward-compat verification

"This is additive JSON. Rust has no deny_unknown_fields, Go uses normal encoding/json decoding. Older peers should ignore the new keys and still parse ProcessStatus."

Recommendation

MERGE-READY. Strongly consider adding allocator-exhausted forwarding test (Codex MINOR) — easy to add now that the test fixture exists for wrong-family.

Codex task: task-mpapya9a-m8xuea. Gemini task: task-mpapyj0h-77y36a. Not merging — author's decision.

@psaab psaab force-pushed the codex/next-1377-snat-runtime branch from 9276a1a to 5ebd76a Compare May 18, 2026 05:40
Copilot AI review requested due to automatic review settings May 18, 2026 05:40

psaab commented May 18, 2026

Claude r3 review on 5ebd76a4

Verdict: MERGE-READY (pending Codex/Gemini)

r2 MINOR closure verified — allocator-exhausted forwarding test added

New test in userspace-dp/src/afxdp/forwarding/tests.rs:

#[test]
fn source_nat_allocator_exhausted_reports_rule_and_pool_identity() {
    let mut snapshot = nat_snapshot();
    snapshot.source_nat_rules = vec![SourceNATRuleSnapshot {
        name: "exhausted".to_string(),
        ...
        pool_name: "tiny-pool".to_string(),
        pool_unusable: true,
        pool_unusable_reason: "allocator_exhausted".to_string(),
        ..Default::default()
    }];
    ...
    assert_eq!(
        match_source_nat_for_flow_result(&state, &from_zone, &to_zone, 12, &flow),
        SourceNatLookup::Unavailable(SourceNatFailure {
            rule_name: "exhausted".to_string(),
            pool_name: "tiny-pool".to_string(),
            reason: SourceNatFailureReason::AllocatorExhausted,
        })
    );
}

Pins the propagation: pool_unusable_reason: "allocator_exhausted" flows through Go snapshot → Rust forwarding state → SourceNatLookup::Unavailable(SourceNatFailure) with the expected rule/pool/reason identity.

Coverage now complete

Combined with the prior r2 test for WrongAddressFamily, the forwarding-layer coverage now pins both runtime-only failure modes that aren't reachable through the Go snapshot validation path. The nat_tests.rs lower-layer coverage handles the snapshot-rejected cases (missing/empty/invalid-port-range/malformed).

Recommendation

MERGE-READY. The remaining Codex r1 MINOR ("rule_name + pool_name on exception telemetry") was closed in r2; r3 closes the final r2 MINOR on test coverage.

Awaiting Codex (task-mpas41zd-w4c93j) + Gemini Pro 3 (task-mpas4d3o-pw4ix7). Not merging — author's decision.


Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.

Comment on lines +133 to +134
    let Some(egress) = forwarding.egress.get(&egress_ifindex) else {
        return SourceNatLookup::NoMatch;
Comment on lines +1225 to +1237
    Err(failure) => {
        record_source_nat_failure(
            telemetry,
            worker_ctx,
            meta,
            flow,
            from_zone_id,
            to_zone_id,
            desc.len,
            &failure,
        );
        binding.scratch.scratch_recycle.push(desc.addr);
        continue;
psaab added 2 commits May 17, 2026 22:45
Add forwarding-level coverage for allocator-exhausted source NAT pools so exception identity is pinned for the pool exhaustion path, not only wrong-family or unavailable-pool cases.
@psaab psaab force-pushed the codex/next-1377-snat-runtime branch from 5ebd76a to 35cd30d Compare May 18, 2026 05:46
@psaab psaab merged commit b9b7910 into master May 18, 2026

psaab commented May 18, 2026

Round-3 triple-review synthesis on 5ebd76a4

| Reviewer | Verdict |
| --- | --- |
| Claude | MERGE-READY |
| Codex | MERGE-READY |
| Gemini Pro 3 | MERGE-READY |

All three converge. r2 MINOR (allocator-exhausted forwarding test) is closed.

Codex confirmation

"The new test pins SourceNatFailureReason::AllocatorExhausted plus rule_name = 'exhausted' and pool_name = 'tiny-pool'. The fixture sets pool_unusable: true and pool_unusable_reason: 'allocator_exhausted', which maps through parse_source_nat_rules to SourceNatFailureReason::AllocatorExhausted."

Codex non-blocking note

"Forwarding-layer tests still do not separately cover MissingPool, EmptyPool, InvalidPool, or InvalidPortRange. Those are covered at the NAT unit layer, and they share the same forwarding propagation path once they become SourceNatFailure, so I would not hold the PR for that."

Recommendation

Merge-ready. Both WrongAddressFamily (r2) and AllocatorExhausted (r3) — the runtime-only failure modes that can't be caught at the Go snapshot layer — now have explicit forwarding-test coverage.

Codex task: task-mpas41zd-w4c93j. Gemini task: task-mpas4d3o-pw4ix7. Not merging — author's decision.

psaab added a commit that referenced this pull request May 18, 2026
* userspace: fail closed unusable SNAT pools
* snat: include rule and pool identity in exceptions

Add forwarding-level coverage for allocator-exhausted source NAT pools so
exception identity is pinned for the pool exhaustion path, not only
wrong-family or unavailable-pool cases.