feat(infrastructure): defense-in-depth SSRF protection for outbound HTTP#24
Conversation
… paths Tenant operators supply URLs in Provider and Executor configurations that flow into outbound HTTP requests. Without protection these are CWE-918 SSRF sinks: cloud metadata exfiltration, internal lateral movement, and DNS rebinding bypass. This change adds in-process defense-in-depth across every outbound path that originates from tenant config. Architecture (3 in-process layers): 1. New pkg/safehttp helper centralizes SSRF policy: Options() caches the SSRF_ALLOW_PRIVATE env var, NewClient() wraps redirect checks, ValidateAndPin() does DNS pinning + IP validation + optional per-tenant host allowlist matching in a single TOCTOU-free step. PinResult carries PinnedURL (IP literal) plus Authority (original Host header) so virtual-host routing survives DNS pinning. 98.3 percent test coverage covering RFC1918, loopback, link-local, AWS IMDS, IPv4-mapped IPv6, suffix-confusion, redirect chains. 2. Write-time validation in create/update commands for ProviderConfig and ExecutorConfig rejects bad URLs before persisting. New error sentinels FLK-0317, FLK-0318, FLK-0319 map to HTTP 422. 3. Runtime validation in test connectivity commands and the generic HTTP runner (pkg/executors/http/runner.go) plus the OIDC discovery and token fetcher clients re-validates on every request. ProviderConfiguration now carries an optional AllowedHosts list (opt-in, default permissive for backward compatibility). Entries support exact match or suffix-with-label-boundary (.acme.com matches api.acme.com but not acme.com.attacker.io). Wave 2 wires it model-to-MongoDB; Waves 3-4 prepare the sentinel ErrExecutorConfigHostNotAllowed but cannot enforce the allowlist transitively today because ExecutorConfiguration has no parent ProviderConfiguration FK. Tracked as known gap in PROJECT_RULES. Verification: full test suite (1468 tests / 52 packages) green; lint shows only one pre-existing gocognit warning in bootstrap/config.go; make format clean. Test policy override via SetAllowPrivateForTest + TestMain in midaz/tracer/auth packages keeps loopback-httptest tests green without weakening production policy. Out of scope (SRE coordination): Kubernetes NetworkPolicy, IMDSv2 hop-limit enforcement, egress proxy (Smokescreen) — documented as network-layer defense in depth in docs/PROJECT_RULES.md.
Wave 1-5 substituted the hostname in the URL with an IP literal (PinResult.PinnedURL) and assigned the original authority to req.Host. That works for HTTP but breaks HTTPS: Go derives TLS SNI and certificate verification from req.URL.Host, so an IP literal there makes the cert verification fail with "no IP SANs". req.Host only affects the HTTP Host header (layer 7), never TLS. In production this would have broken every HTTPS provider on day one: Midaz core banking, Tracer, customer KYC/AML endpoints, OIDC discovery and token fetching. The unit tests missed it because httptest.NewTLSServer emits a cert that includes 127.0.0.1 as an IP SAN — masking the bug. This change moves SSRF IP enforcement to Transport.DialContext. The request is issued against the original URL (hostname intact); the transport resolves DNS at dial time, checks each resolved IP against libSSRF.IsBlockedAddr, and dials the first safe one. TLS handshake uses the correct SNI because req.URL.Host is unchanged; DNS rebinding is closed out by atomic resolve-and-dial in the same call. API changes: - PinResult struct removed; ValidateAndPin replaced by Validate which returns only an error. Used for write-time pre-flight validation. - SafeTransport() exposed for callers that need to compose with otelhttp.NewTransport (the HTTP runner does). - NewClient now unconditionally installs SafeTransport. A new pkg/safehttp/safehttp_tls_test.go adds the test that should have existed all along: it creates a TLS server with a DNS-only cert (no IP SAN), routes dials to the loopback server via a hostname-rewrite transport, and asserts that SafeTransport completes the handshake successfully. Verified the test catches the bug — running it against the old design produces "x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs", which is exactly the failure mode users would have seen in production. All call sites migrated: HTTP runner, both test-connectivity commands, OIDC discovery, token fetcher, create/update executor config. Full test suite (1472 unit + integration + e2e) green; lint shows only the pre-existing gocognit warning in bootstrap/config.go.
lerian-studio
left a comment
There was a problem hiding this comment.
Pull requests to main can only come from:
developrelease-candidatehotfix/*
Your source branch: feature/ssrf-protection-executor-config
Please change the base branch or create a PR from an allowed branch.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR implements comprehensive SSRF protections across provider and executor configuration management, HTTP execution, and authentication flows by introducing a new ChangesSSRF Protection & Allowlist Enforcement
|
📊 Unit Test Coverage Report:
|
| Metric | Value |
|---|---|
| Overall Coverage | 45.2% |
| Threshold | 85% |
Coverage by Package
| Package | Coverage |
|---|---|
github.com/LerianStudio/flowker/api |
100.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/audit |
0.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/catalog |
60.9% |
github.com/LerianStudio/flowker/internal/adapters/http/in/dashboard |
91.5% |
github.com/LerianStudio/flowker/internal/adapters/http/in/execution |
81.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/executor_configuration |
70.8% |
github.com/LerianStudio/flowker/internal/adapters/http/in/health |
100.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware/testutil |
0.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware |
44.4% |
github.com/LerianStudio/flowker/internal/adapters/http/in/provider_configuration |
33.1% |
github.com/LerianStudio/flowker/internal/adapters/http/in/readyz |
81.3% |
github.com/LerianStudio/flowker/internal/adapters/http/in/webhook |
87.5% |
github.com/LerianStudio/flowker/internal/adapters/http/in/workflow |
69.7% |
github.com/LerianStudio/flowker/internal/adapters/http/in |
33.3% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/dashboard |
43.0% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/execution |
64.8% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/executor_configuration |
36.6% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/provider_configuration |
63.0% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/workflow |
69.2% |
github.com/LerianStudio/flowker/internal/adapters/postgresql/audit |
42.2% |
github.com/LerianStudio/flowker/internal/bootstrap |
40.5% |
github.com/LerianStudio/flowker/internal/services/command |
37.2% |
github.com/LerianStudio/flowker/internal/services/query |
11.2% |
github.com/LerianStudio/flowker/internal/services |
3.2% |
github.com/LerianStudio/flowker/internal/testutil |
63.6% |
github.com/LerianStudio/flowker/pkg/circuitbreaker |
97.6% |
github.com/LerianStudio/flowker/pkg/clock |
0.0% |
github.com/LerianStudio/flowker/pkg/condition |
87.9% |
github.com/LerianStudio/flowker/pkg/contextutil |
0.0% |
github.com/LerianStudio/flowker/pkg/executor/base |
20.0% |
github.com/LerianStudio/flowker/pkg/executor |
39.5% |
github.com/LerianStudio/flowker/pkg/executors/http/auth |
58.2% |
github.com/LerianStudio/flowker/pkg/executors/http |
49.5% |
github.com/LerianStudio/flowker/pkg/executors/midaz |
83.2% |
github.com/LerianStudio/flowker/pkg/executors/s3 |
0.0% |
github.com/LerianStudio/flowker/pkg/executors/tracer |
72.2% |
github.com/LerianStudio/flowker/pkg/executors |
57.1% |
github.com/LerianStudio/flowker/pkg/model |
58.7% |
github.com/LerianStudio/flowker/pkg/net/http |
91.3% |
github.com/LerianStudio/flowker/pkg/pagination |
0.0% |
github.com/LerianStudio/flowker/pkg/safehttp |
90.3% |
github.com/LerianStudio/flowker/pkg/templates/tracer_midaz |
81.5% |
github.com/LerianStudio/flowker/pkg/templates |
0.0% |
github.com/LerianStudio/flowker/pkg/transformation |
67.2% |
github.com/LerianStudio/flowker/pkg/triggers/webhook |
0.0% |
github.com/LerianStudio/flowker/pkg/triggers |
0.0% |
github.com/LerianStudio/flowker/pkg/webhook |
100.0% |
github.com/LerianStudio/flowker/pkg |
90.3% |
Generated by Go PR Analysis workflow
🔒 Security Scan Results —
|
| Policy | Status |
|---|---|
| Default non-root user | ✅ Passed |
| No fixable critical/high CVEs | ✅ Passed |
| No high-profile vulnerabilities | ✅ Passed |
| No AGPL v3 licenses | ✅ Passed |
Pre-release Version Check
✅ No unstable version pins found.
There was a problem hiding this comment.
Pull request overview
Adds defense-in-depth SSRF protections for tenant-supplied outbound HTTP URLs, plus a provider-level host allowlist model/persistence path.
Changes:
- Introduces
pkg/safehttpwith validation, redirect guarding, and SSRF-aware transport. - Wires SSRF validation into executor/provider connectivity, HTTP runner, and OIDC discovery/token paths.
- Adds
allowedHoststo provider configuration domain, inputs, MongoDB model, constants, docs, and tests.
Reviewed changes
Copilot reviewed 33 out of 35 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
pkg/safehttp/safehttp.go |
Adds shared SSRF validation/client/transport helpers. |
pkg/safehttp/safehttp_tls_test.go |
Adds TLS/SNI and redirect regression tests. |
pkg/safehttp/safehttp_test.go |
Adds safehttp unit coverage. |
pkg/model/provider_configuration.go |
Adds provider allowlist domain state and validation. |
pkg/model/provider_configuration_test.go |
Tests allowlist domain behavior. |
pkg/model/provider_configuration_input.go |
Adds create/update allowlist input fields. |
pkg/model/mask_sensitive_config_test.go |
Updates constructor call signature. |
pkg/executors/tracer/main_test.go |
Enables loopback for tracer executor tests. |
pkg/executors/midaz/main_test.go |
Enables loopback for midaz executor tests. |
pkg/executors/http/runner.go |
Adds SSRF validation and safe transport to HTTP runner. |
pkg/executors/http/runner_test.go |
Adds HTTP runner SSRF tests. |
pkg/executors/http/auth/token_fetcher.go |
Adds safe client/validation to token fetches. |
pkg/executors/http/auth/oidc_discovery.go |
Adds safe client/validation to OIDC discovery. |
pkg/executors/http/auth/main_test.go |
Enables loopback for auth tests. |
pkg/constant/errors.go |
Adds new SSRF/allowlist error codes. |
internal/services/command/update_provider_config.go |
Wires allowlist updates into provider config update. |
internal/services/command/update_provider_config_test.go |
Tests provider allowlist update behavior. |
internal/services/command/update_executor_config.go |
Adds write-time SSRF validation to executor updates. |
internal/services/command/update_executor_config_test.go |
Tests executor update SSRF validation. |
internal/services/command/test_provider_config_connectivity.go |
Refactors provider connectivity checks to safehttp. |
internal/services/command/test_provider_config_connectivity_test.go |
Updates connectivity tests for safehttp behavior. |
internal/services/command/test_executor_connectivity.go |
Adds safe client and preflight SSRF checks. |
internal/services/command/test_executor_connectivity_test.go |
Updates executor connectivity SSRF test coverage. |
internal/services/command/create_provider_config_test.go |
Tests provider allowlist creation behavior. |
internal/services/command/create_executor_config.go |
Adds write-time SSRF validation to executor creation. |
internal/services/command/create_executor_config_test.go |
Tests executor create SSRF validation. |
internal/adapters/mongodb/provider_configuration/pagination_test.go |
Updates reconstruction calls. |
internal/adapters/mongodb/provider_configuration/model.go |
Persists provider allowlist in MongoDB. |
internal/adapters/mongodb/provider_configuration/model_test.go |
Tests MongoDB allowlist round trips. |
internal/adapters/http/in/readyz/sample_bodies_test.go |
Formatting-only alignment update. |
internal/adapters/http/in/readyz/checker_test.go |
Formatting-only alignment update. |
internal/adapters/http/in/provider_configuration/handler.go |
Maps invalid allowlist errors to HTTP 422. |
internal/adapters/http/in/executor_configuration/handler.go |
Maps executor SSRF errors to HTTP 422. |
internal/adapters/http/in/catalog/handler_test.go |
Updates constructor call signatures. |
docs/PROJECT_RULES.md |
Documents SSRF protection rules and gaps. |
Comments suppressed due to low confidence (1)
pkg/model/provider_configuration.go:423
- This comment says the sentinel is kept on
Errfor the unwrap chain, but the returnedpkg.ValidationErrorbelow does not setErr.errors.Ismay still match throughValidationError.Isby code, but the documentation is inaccurate and could lead future code to rely on an unwrap chain that is not present.
// newAllowedHostError returns a ValidationError wrapping the sentinel with a
// human-readable reason. The sentinel is kept on Err so callers can use
// errors.Is(err, ErrProviderConfigInvalidAllowedHost) via the Unwrap chain
// (and the Code-based Is on ValidationError still matches the sentinel form).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/constant/errors.go (1)
122-134:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDuplicate error code FLK-0300.
ErrProviderConfigIDRequiredon line 128 uses codeFLK-0300, which is already assigned toErrUnexpectedFieldsInTheRequeston line 18. This will cause ambiguous error identification in logs and API responses.🐛 Suggested fix: Use an unallocated code
- ErrProviderConfigIDRequired = errors.New("FLK-0300") + ErrProviderConfigIDRequired = errors.New("FLK-0316")Note: Verify FLK-0316 is not already in use elsewhere. The comment mentions "FLK-0316" but I don't see it defined in this file.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/constant/errors.go` around lines 122 - 134, ErrProviderConfigIDRequired currently reuses code "FLK-0300" which duplicates ErrUnexpectedFieldsInTheRequest; change the error constant value for ErrProviderConfigIDRequired to an unused code (e.g., replace "FLK-0300" with a new/unused code such as "FLK-0316" if it is free) so each error constant is unique, and update any tests or usages that reference ErrProviderConfigIDRequired to the new code; verify FLK-0316 (or whichever new code you choose) is not already defined elsewhere before committing.
♻️ Duplicate comments (1)
pkg/model/provider_configuration_input.go (1)
33-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSame
hostname_rfc1123issue as CreateProviderConfigurationInput.The
AllowedHostsfield here has the same validator tag issue that will reject leading-dot suffix entries.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/model/provider_configuration_input.go` around lines 33 - 40, The AllowedHosts validator on the AllowedHosts field in provider_configuration_input.go currently uses hostname_rfc1123 which rejects leading-dot suffix entries; update the struct tag for AllowedHosts to match the validator used by CreateProviderConfigurationInput (the one that permits entries starting with "."), i.e., replace the hostname_rfc1123 rule with the same validation rule used in CreateProviderConfigurationInput so leading-dot domain suffixes are accepted while preserving the existing max and dive constraints.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/PROJECT_RULES.md`:
- Around line 216-218: Update the SSRF guidance to remove the
ValidateAndPin/pinned-IP workflow and instead require a preflight call to
safehttp.Validate(...) and runtime enforcement via the safe transport
(constructed with safehttp.NewClient(...) / the safe Transport that enforces
checks in Transport.DialContext). Replace references instructing users to issue
requests against PinResult.PinnedURL and to set req.Host = PinResult.Authority
with guidance to preserve the original hostname in req.URL (so
SNI/virtual-hosting is preserved) while relying on the transport’s dial-time
validation; also ensure the CheckRedirect guidance still points to
safehttp.NewClient(...) so safehttp.CheckRedirectGuard re-validates redirect
targets. Apply the same edits to the other occurrence noted.
In `@internal/services/command/create_executor_config.go`:
- Around line 71-84: The validation error handling in the create executor config
flow incorrectly remaps context cancellation/timeouts to
ErrExecutorConfigSSRFBlocked; update the switch after safehttp.Validate in
create_executor_config.go (the block handling errors from safehttp.Validate and
calling libOtel.HandleSpanBusinessErrorEvent/span) to first detect
context.Canceled and context.DeadlineExceeded and return those errors (or
wrap/preserve them) rather than mapping them to
constant.ErrExecutorConfigSSRFBlocked, keeping the existing host-not-allowed and
other SSRF branches unchanged.
- Around line 75-84: The SSRF-related branches in the executor config validation
currently return sentinel errors (constant.ErrExecutorConfigHostNotAllowed /
constant.ErrExecutorConfigSSRFBlocked) but should return a typed validation
error; update the two branches that call libOtel.HandleSpanBusinessErrorEvent
(the cases for safehttp.ErrBlocked, safehttp.ErrDNSFailed,
safehttp.ErrInvalidURL and the default branch) to construct and return a
pkg.ValidationError that wraps the underlying err (preserving the error message)
instead of fmt.Errorf with the sentinel, while keeping the span error events
intact and retaining the same context string arguments passed to
libOtel.HandleSpanBusinessErrorEvent so callers receive a ValidationError for
input-validation failures.
In `@internal/services/command/test_provider_config_connectivity.go`:
- Around line 47-50: The connectivity check currently constructs the safe HTTP
client before loading provider config and validates base_url with nil
allowedHosts; update the flow to call FindByID first, capture allowed :=
providerConfig.AllowedHosts(), then construct the safe client with
safehttp.NewClient(httpClient, defaultProviderConfigTestTimeout, allowed) (or
the existing NewClient overload that accepts an allowlist) and pass allowed into
the pre-flight validation routine that validates base_url and into the runtime
redirect/dial enforcement (CheckRedirectGuard / dial allowlist) so the same
allowlist from providerConfig.AllowedHosts() is used for base_url validation,
redirect checks, and connection dialing (apply the same change to the other
occurrences noted around lines 88-92 and 166-188).
In `@pkg/executors/http/runner.go`:
- Around line 166-190: The function setupHTTPClient currently mutates the
caller's http.Client and discards any existing Transport; instead, make a
shallow copy of the incoming client (e.g., newClient := *client or if client ==
nil create a fresh one) and operate on that copy so the caller's client is not
mutated. Preserve the original transport by reading origTr := client.Transport;
if nil use http.DefaultTransport; then build a safe transport around it via
safeTr := safehttp.SafeTransport(origTr) and wrap with telemetry:
newClient.Transport = otelhttp.NewTransport(safeTr). Also set newClient.Timeout
when zero, wrap CheckRedirect via newClient.CheckRedirect =
safehttp.CheckRedirectGuard(client.CheckRedirect), and return &newClient. Use
the symbols setupHTTPClient, safehttp.SafeTransport, otelhttp.NewTransport, and
safehttp.CheckRedirectGuard to locate code.
In `@pkg/model/provider_configuration_input.go`:
- Around line 16-25: The DTO's AllowedHosts validation is too strict: remove the
hostname_rfc1123 validator from the AllowedHosts struct tag in
provider_configuration_input.go so leading-dot suffix entries (e.g.,
".acme.com") are not rejected at the DTO layer; keep the existing list-level
constraints (omitempty, max=100) and let the domain-level validateAllowedHosts
in provider_configuration.go perform the detailed per-entry validation for
suffix semantics.
In `@pkg/safehttp/safehttp.go`:
- Around line 192-205: NewClient currently mutates the provided baseClient by
setting baseClient.Transport and baseClient.CheckRedirect which can cause
surprising shared-client mutations; fix by creating a shallow clone of the
passed *http.Client (copy the value into a new http.Client variable and set its
Transport = SafeTransport() and CheckRedirect = CheckRedirectGuard(...)) and
return the clone instead of modifying the original, while preserving Timeout
behavior and using the original baseClient fields for other values;
alternatively, if you intentionally keep mutation, add a clear doc comment on
NewClient explaining it mutates the passed client so callers know to pass a
copy.
---
Outside diff comments:
In `@pkg/constant/errors.go`:
- Around line 122-134: ErrProviderConfigIDRequired currently reuses code
"FLK-0300" which duplicates ErrUnexpectedFieldsInTheRequest; change the error
constant value for ErrProviderConfigIDRequired to an unused code (e.g., replace
"FLK-0300" with a new/unused code such as "FLK-0316" if it is free) so each
error constant is unique, and update any tests or usages that reference
ErrProviderConfigIDRequired to the new code; verify FLK-0316 (or whichever new
code you choose) is not already defined elsewhere before committing.
---
Duplicate comments:
In `@pkg/model/provider_configuration_input.go`:
- Around line 33-40: The AllowedHosts validator on the AllowedHosts field in
provider_configuration_input.go currently uses hostname_rfc1123 which rejects
leading-dot suffix entries; update the struct tag for AllowedHosts to match the
validator used by CreateProviderConfigurationInput (the one that permits entries
starting with "."), i.e., replace the hostname_rfc1123 rule with the same
validation rule used in CreateProviderConfigurationInput so leading-dot domain
suffixes are accepted while preserving the existing max and dive constraints.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3bc2bdea-9fb9-4132-9592-5377205da3f5
📒 Files selected for processing (35)
docs/PROJECT_RULES.mdinternal/adapters/http/in/catalog/handler_test.gointernal/adapters/http/in/executor_configuration/handler.gointernal/adapters/http/in/provider_configuration/handler.gointernal/adapters/http/in/readyz/checker_test.gointernal/adapters/http/in/readyz/sample_bodies_test.gointernal/adapters/mongodb/provider_configuration/model.gointernal/adapters/mongodb/provider_configuration/model_test.gointernal/adapters/mongodb/provider_configuration/pagination_test.gointernal/services/command/create_executor_config.gointernal/services/command/create_executor_config_test.gointernal/services/command/create_provider_config_test.gointernal/services/command/test_executor_connectivity.gointernal/services/command/test_executor_connectivity_test.gointernal/services/command/test_provider_config_connectivity.gointernal/services/command/test_provider_config_connectivity_test.gointernal/services/command/update_executor_config.gointernal/services/command/update_executor_config_test.gointernal/services/command/update_provider_config.gointernal/services/command/update_provider_config_test.gopkg/constant/errors.gopkg/executors/http/auth/main_test.gopkg/executors/http/auth/oidc_discovery.gopkg/executors/http/auth/token_fetcher.gopkg/executors/http/runner.gopkg/executors/http/runner_test.gopkg/executors/midaz/main_test.gopkg/executors/tracer/main_test.gopkg/model/mask_sensitive_config_test.gopkg/model/provider_configuration.gopkg/model/provider_configuration_input.gopkg/model/provider_configuration_test.gopkg/safehttp/safehttp.gopkg/safehttp/safehttp_test.gopkg/safehttp/safehttp_tls_test.go
GHAS Code Scanning runs gosec standalone, which honors the native '#nosec' annotation but ignores '//nolint:gosec' (a golangci-lint convention). The PR currently fails its gosec check on token_fetcher.go:353 even though the same line passes locally under 'make lint' (which uses golangci-lint). Add both annotations on the same line so: - golangci-lint reads '//nolint:gosec' at the start of the trailing comment and suppresses the gosec rule. - gosec standalone reads '#nosec G704' anywhere in the trailing comment and suppresses G704 specifically. Also corrects the prose above the line to reference G704 (the actual rule reported by GHAS) instead of the incorrect G107 used before.
Resolved: PR base changed from main to develop.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/executors/http/auth/token_fetcher.go (1)
58-63:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEnsure injected
DiscoveryClientinherits the SSRF-protected transport.When a non-nil
discoveryClientis passed, it's reused without validation. If a caller constructsDiscoveryClientoutsideNewDiscoveryClient(bypassing the safehttp wrapping), issuer discovery would run on an unwrapped transport while token requests use the safe one. This violates the contract that all OIDC outbound traffic is SSRF-hardened.All current call sites pass
niland trigger the safe code path, but the constructor should not rely on caller discipline. Consider either rebuildingDiscoveryClientfrom the wrappedhttpClientto guarantee consistency, or requiring a safer interface that prevents this gap.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/executors/http/auth/token_fetcher.go` around lines 58 - 63, NewTokenFetcher currently wraps httpClient with safehttp.NewClient but reuses a non-nil discoveryClient as-is, risking an unwrapped transport for OIDC discovery; ensure the DiscoveryClient always uses the SSRF-protected transport by recreating or reinitializing it from the wrapped httpClient instead of accepting an external instance unchanged. Locate NewTokenFetcher and change the logic so after httpClient = safehttp.NewClient(httpClient, ...), you either (a) always call NewDiscoveryClient(httpClient) and ignore the passed discoveryClient, or (b) validate and replace the passed discoveryClient's internal HTTP transport/client with the wrapped httpClient (reconstructing it) so DiscoveryClient and TokenFetcher use the same safehttp client; use the symbols NewTokenFetcher, NewDiscoveryClient and DiscoveryClient to find and update the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@pkg/executors/http/auth/token_fetcher.go`:
- Around line 58-63: NewTokenFetcher currently wraps httpClient with
safehttp.NewClient but reuses a non-nil discoveryClient as-is, risking an
unwrapped transport for OIDC discovery; ensure the DiscoveryClient always uses
the SSRF-protected transport by recreating or reinitializing it from the wrapped
httpClient instead of accepting an external instance unchanged. Locate
NewTokenFetcher and change the logic so after httpClient =
safehttp.NewClient(httpClient, ...), you either (a) always call
NewDiscoveryClient(httpClient) and ignore the passed discoveryClient, or (b)
validate and replace the passed discoveryClient's internal HTTP transport/client
with the wrapped httpClient (reconstructing it) so DiscoveryClient and
TokenFetcher use the same safehttp client; use the symbols NewTokenFetcher,
NewDiscoveryClient and DiscoveryClient to find and update the code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 31c3fe84-c836-4c97-a9c7-43bafc6717be
📒 Files selected for processing (1)
pkg/executors/http/auth/token_fetcher.go
…ction-executor-config # Conflicts: # pkg/constant/errors.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/constant/errors.go`:
- Around line 164-167: The error constants ErrExecutorConfigSSRFBlocked and
ErrExecutorConfigHostNotAllowed are using codes in the Provider Configuration
range; change their error strings to use the Executor Connectivity Testing codes
FLK-0284 and FLK-0285 respectively so they reside within the Executor domain
range (update the values in the ErrExecutorConfigSSRFBlocked and
ErrExecutorConfigHostNotAllowed declarations and adjust the surrounding comment
to reflect the new allocation).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 97d2888d-403d-4e02-857e-a5d78af05586
📒 Files selected for processing (1)
pkg/constant/errors.go
The first gosec annotation attempt placed both directives on the same
line:
resp, err := f.httpClient.Do(req) //nolint:gosec // #nosec G704 -- ...
GHAS Code Scanning (gosec standalone) still flagged the line. Gosec's
nosec parser requires the `#nosec` token to be the FIRST token after
`//` in a comment, but Go treats everything after the first `//` as a
single comment, so `nolint:gosec` is the first token and `#nosec G704`
becomes mid-comment text that gosec ignores.
Split the two annotations onto separate lines:
//nolint:gosec
resp, err := f.httpClient.Do(req) // #nosec G704 -- ...
golangci-lint honors `//nolint:gosec` on the preceding line per its
directive rules; gosec honors `// #nosec G704` as a trailing comment
because `#nosec` is now the first token of that comment.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/executors/http/auth/token_fetcher.go`:
- Around line 358-359: Remove the stray leading whitespace before the method
call in the statement using f.httpClient.Do(req) in token_fetcher.go; edit the
expression so it reads exactly "resp, err := f.httpClient.Do(req) // ..." with
no extra spaces/tabs before ".Do", then run gofmt (or your editor's formatting)
to normalize separators and ensure golangci-lint's wsl_v5 (leading-whitespace)
warning is resolved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 58e43027-2a95-4265-b835-7ddfc43d576f
📒 Files selected for processing (1)
pkg/executors/http/auth/token_fetcher.go
Resolves correctness bugs surfaced by Copilot + CodeRabbit review on PR #24, plus the latent #nosec annotation issue from the prior pass. 1. SafeTransport no longer honours HTTP_PROXY / HTTPS_PROXY env vars. The SSRF guarantee is "DNS is re-resolved at dial time and every candidate IP is validated before connect". Honouring proxy env vars redirected the dial to the proxy host, bypassing the policy entirely. Document the trade-off: corporate egress proxies must be CONNECT-style (e.g. Smokescreen) configured at network layer, not via env vars. 2. dialSafe now enforces a minimum critical blocklist (metadata + link local, IPv4 169.254.0.0/16, IPv6 fe80::/10 and fd00:ec2::/64) that applies unconditionally — even with SSRF_ALLOW_PRIVATE=true. Before this change, dev mode silently lost protection against the Capital One scenario. Standard libSSRF policy (RFC1918, loopback) remains togglable for local development as before. 3. AllowedHosts DTO validator no longer rejects suffix entries like ".acme.com". hostname_rfc1123 enforces that hostnames start with an alphanumeric character, but the domain layer (validateAllowedHosts) explicitly supports leading-dot suffix matching. The DTO check was duplicating — and contradicting — domain validation. 4. ProviderConfig connectivity test now passes providerConfig.AllowedHosts() into safehttp.Validate instead of nil, so the tenant allowlist is actually enforced on the test path (the previous pass stored it but ignored it at runtime). 5. ProviderConfig Update validates that the effective allowlist still permits the configuration's existing base URLs before persisting. Without this, a tenant could PATCH the allowlist to a set that excludes the URL the runtime needs to call — leaving persisted state internally inconsistent. 6. ProviderConfig Create applies the same cross-field validation — the freshly persisted state cannot be born inconsistent. 7. safehttp.NewClient and runner.setupHTTPClient no longer mutate the caller-supplied *http.Client. They take a shallow copy first, then install SafeTransport / CheckRedirectGuard on the clone. Callers that share an http.Client across goroutines are no longer at risk of seeing surprise Transport / CheckRedirect swaps. 8. context.Canceled and context.DeadlineExceeded propagate through the command-layer error mapping instead of being misclassified as ErrExecutorConfigSSRFBlocked. Affected paths: create / update executor config, create / update provider config, test connectivity for both. Commands now return *pkg.ValidationError (per project convention) rather than fmt-wrapped sentinels. Supporting changes: - pkg.ValidationError.Is widened to match plain sentinel targets by Code, so handlers can keep using errors.Is(err, constant.ErrXxx) regardless of whether the error is value-form or fmt-wrapped. - New sentinel ErrProviderConfigHostNotInAllowlist (FLK-0320). - model.ValidateAllowedHosts exported so commands can dry-run per-entry validation before the cross-field check (preserves more-specific error messages like "IP literal not allowed" or "wildcard rejected"). Verification: make format clean; make lint shows only the pre-existing gocognit warning in bootstrap/config.go; 1504 unit tests across 52 packages all pass.
|
CodeRabbit chat interactions are restricted to organization members for this repository. Ask an organization member to interact with CodeRabbit, or set |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/services/command/create_executor_config_test.go`:
- Around line 129-130: Update the misleading comment in
create_executor_config_test.go that refers to safehttp.ValidateAndPin; replace
that reference with the actual function name safehttp.Validate so the comment
accurately reflects the implementation and avoids confusion when readers inspect
functions like safehttp.Validate used elsewhere.
In `@internal/services/command/test_executor_connectivity.go`:
- Around line 242-246: The stage details currently build the "url" by
concatenating config.BaseURL() + endpoint.Path(), which can produce malformed
URLs; replace that concatenation with the actual request target variable
endpointURL (the URL used to make the request) when populating the details map
(where details := map[string]any{ "statusCode": resp.StatusCode,
"responseTimeMs": durationMs, "url": ... }), so observability reflects the real
request target.
In `@pkg/safehttp/safehttp.go`:
- Around line 109-143: The duplicated sync.Once.Do block in Options() and
allowPrivateEnabled() should be extracted into a single helper (e.g.,
initAllowPrivate or ensureAllowPrivateLoaded) so both functions call that helper
instead of repeating the logic; move the closure that locks allowPrivateMu and
sets allowPrivate from os.Getenv("SSRF_ALLOW_PRIVATE") into that helper, have
the helper use allowPrivateOnceP.Do(...) and leave Options() and
allowPrivateEnabled() to only acquire allowPrivateMu where needed and return
their respective results, keeping the existing symbols (Options,
allowPrivateEnabled, allowPrivateOnceP, allowPrivateMu, allowPrivate) unchanged.
- Around line 396-420: The runner currently calls safehttp.Validate(reqCtx,
finalURL, nil) and CheckRedirectGuard has no tenant allowlist, so tenant-scoped
allowedHosts are never enforced; fix by retrieving the provider's allowed hosts
from ProviderConfiguration and pass them instead of nil into safehttp.Validate
in the HTTP executor (replace safehttp.Validate(reqCtx, finalURL, nil) with
safehttp.Validate(reqCtx, finalURL, allowedHosts)), ensure the same allowedHosts
are used for pre-flight URL validation, and when creating the http.Client use a
CheckRedirectGuard closure that captures and uses that allowedHosts (or
equivalent options) so redirects are validated against the tenant allowlist as
well (update places where CheckRedirectGuard is constructed to accept/capture
allowedHosts and call libSSRF.ValidateURL with those options).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9ed895f6-721d-45ff-9893-859df7e93ecb
📒 Files selected for processing (22)
internal/adapters/http/in/provider_configuration/handler.gointernal/services/command/create_executor_config.gointernal/services/command/create_executor_config_test.gointernal/services/command/create_provider_config.gointernal/services/command/create_provider_config_test.gointernal/services/command/test_executor_connectivity.gointernal/services/command/test_executor_connectivity_test.gointernal/services/command/test_provider_config_connectivity.gointernal/services/command/test_provider_config_connectivity_test.gointernal/services/command/update_executor_config.gointernal/services/command/update_executor_config_test.gointernal/services/command/update_provider_config.gointernal/services/command/update_provider_config_test.gopkg/constant/errors.gopkg/errors.gopkg/executors/http/auth/token_fetcher.gopkg/executors/http/runner.gopkg/executors/http/runner_test.gopkg/model/provider_configuration.gopkg/model/provider_configuration_input.gopkg/safehttp/safehttp.gopkg/safehttp/safehttp_test.go
Final round of fixes from PR review. No behaviour changes; documentation, DTOs, and error-code allocation cleanup. 1. Rewrite docs/PROJECT_RULES.md SSRF section. The Wave 5 version instructed callers to use ValidateAndPin + PinResult.PinnedURL + req.Host = Authority — the exact pattern Wave 6 removed because it broke TLS SNI. Following the doc as written would recreate the bug. New version describes the dial-time enforcement model: issue request against original hostname URL, let SafeTransport.DialContext re-resolve and validate at dial time, use safehttp.Validate for write-time pre-flight. Also documents the proxy gap, the SSRF_ALLOW_PRIVATE metadata carve-out, and the redirect-allowlist gap as accepted. 2. Reallocate FLK-0318 / FLK-0319 (Executor SSRF errors) into the Executor Connectivity Testing range as FLK-0284 / FLK-0285. The old codes lived in the Provider Configuration range (FLK-0290 to 0349), violating the domain-per-range invariant. Domain alignment matters because clients translate codes to user-facing categories. Safe to change now (PR not merged, codes not yet in production). 3. Expose AllowedHosts in ProviderConfigurationOutput so GET / LIST responses return what tenants wrote via PATCH. Field uses omitempty to preserve payload shape for legacy configurations. 4. Document the redirect-allowlist limitation in safehttp.CheckRedirectGuard godoc and PROJECT_RULES.md "Known gaps". The static SSRF policy (IP ranges via libSSRF.ValidateURL + dial-time IsBlockedAddr) still applies to redirect targets; only the tenant hostname allowlist is not re-evaluated. Defense in depth via dial-time IP policy makes the residual risk acceptable; full enforcement would require context plumbing or per-tenant client instances — deferred as overengineering. New regression test (TestSafeTransport_DialBlocksMetadataEvenOnRedirect) pins the dial-layer guarantee. 5. Nitpicks: extract ensureAllowPrivateCached() so Options() and allowPrivateEnabled() share one initialization path; fix test_executor_connectivity to surface the actual endpointURL in stage details (instead of naive BaseURL+Path concat); update stale comments still referencing ValidateAndPin in command tests. Verification: 1507 unit tests across 52 packages all pass; make lint shows only the pre-existing gocognit warning in bootstrap/config.go.
Integrate develop's ConnectivityProber pattern (Midaz provider-aware probes) with the SSRF defense-in-depth design on this branch: - Prober short-circuit runs before extractBaseURL because providers may use non-base_url config keys (e.g. Midaz uses onboarding_base_url). Dial-time SSRF protection still applies because c.httpClient is wrapped via safehttp.NewClient at constructor time. - Adopt HEAD's extractBaseURL(ctx, config, allowedHosts) signature; drop the now-unused pinnedURL/authority variables. - Update develop's ReconstructProviderConfiguration call sites (6 in tests) to pass the new allowedHosts arg as nil. - Merge provider-config command test files (AA conflicts): keep both the AllowedHosts propagation tests (HEAD) and the schema-validation HTTP 422 regression tests (develop).
Summary
pkg/safehttp— a shared helper with dial-time IP validation viaTransport.DialContext,CheckRedirectGuardfor redirect chains, and an optional per-tenant hostname allowlist onProviderConfiguration.AllowedHosts. New error sentinels (FLK-0317/0318/0319) map to HTTP 422.develop: substituting an IP literal inreq.URL.Host(the previous "pinned URL" approach) breaks cert verification (x509: cannot validate certificate ... no IP SANs). The new design keeps the hostname in the URL and enforces SSRF at dial time, so HTTPS to providers with hostname certs (Midaz, Tracer, customer endpoints) actually works. A regression test with a DNS-only cert is included.Architecture (3 in-process layers)
safehttp.Validate(ctx, url, allowedHosts)in create/update commands rejects bad URLs before persisting. Tenant operator gets fast 422 instead of failures at execution.pkg/safehttp(98%+ coverage) centralizes policy. Provider connectivity test was refactored to consume it (DRY); was the only previously-protected surface.safehttp.NewClientinstallsSafeTransport()whoseDialContextre-resolves DNS and runslibSSRF.IsBlockedAddrper request. TOCTOU/DNS-rebinding closed by atomic resolve+dial.Known gaps (tracked, out of scope)
ExecutorConfigurationhas noproviderConfigurationIDFK today, so the tenant allowlist onProviderConfigurationcan't yet be enforced transitively through executor URLs. Static policy (RFC1918, loopback, link-local, IMDS) still applies. The sentinelErrExecutorConfigHostNotAllowedis wired so the change is localized when the FK lands.Config["url"]at runtime with no link to anyExecutorConfigurationrecord — same gap, same mitigation.NetworkPolicy, IMDSv2 hop-limit, egress proxy like Smokescreen) are explicitly SRE responsibility, documented indocs/PROJECT_RULES.md.Test plan
make formatclean.make lint— zero new findings; one pre-existinggocognitwarning ininternal/bootstrap/config.go:116untouched.make testfull suite (unit + integration + e2e) — 1472+ tests, exit 0.pkg/safehttpcoverage ≥ 98%.TestSafeTransport_HTTPS_HostnameCert_Succeedsreproduces the TLS SNI bug against the old design (verified with throwaway demo test) and passes against the new design.make rebuild-up+ create a ProviderConfig withallowedHosts=["httpbin.org"], then attempt to update an ExecutorConfig withbaseURL=https://169.254.169.254→ expect 422FLK-0318 ErrExecutorConfigSSRFBlocked.