feat(service): implement ConnectivityProber for Tracer with X-API-Key auth#25
Conversation
Wrap base.Provider in a tracerProvider struct so the Tracer provider can
satisfy executor.ConnectivityProber alongside executor.Provider. The probe
issues a single read-only GET against /v1/validations?limit=1 with the
X-API-Key header derived from the provider config.
Failure-mode mapping mirrors the Midaz prober pattern:
- Missing base_url -> Reached=false (Connectivity stage)
- SSRF block -> Reached=false (Connectivity stage)
- Missing api_key -> Reached=true, AuthApplied=false (Auth stage)
- Transport error post-SSRF -> Reached=false (Connectivity stage)
- HTTP response received -> Reached=true, AuthApplied=true (status carries
the upstream verdict)
A local copy of ssrfOptions() is kept inline; this is the third occurrence
of the helper (Midaz, command, Tracer). The canonical definition lives in
internal/services/command/test_provider_config_connectivity.go and a
rule-of-three extraction to a shared package is deliberately deferred to a
focused follow-up to keep this PR narrowly scoped.
Tests cover happy path, invalid api_key (401), missing api_key, missing
base_url, SSRF block, transport error, 5xx upstream, and nil http.Client.
Cover the four flows the new Tracer prober is responsible for:
- happy_path -> 200 from /v1/validations, overall=passed, probed
URL terminates in /v1/validations?limit=1, mock
confirms X-API-Key was forwarded.
- invalid_api_key -> mock returns 401, overall=partial: connectivity
passed, auth failed, e2e skipped (mirrors midaz
invalid_credentials semantics).
- missing_api_key -> schema validation rejects the CREATE call before
any probe runs. The provider config never persists.
Pre-existing handler mapping returns 500 with
FLK-9999 instead of 422 because errors.Is doesn't
match against the ValidationError struct -- this
is a separate, latent bug noted in the test.
- tracer_5xx -> mock returns 503, overall=partial: connectivity
and auth passed, e2e failed with the upstream
status code in the error message.
setupTracerMock records the last X-API-Key header so assertions can verify
the prober actually forwarded credentials. e2eTestStage now exposes the
Details map carried in the JSON response so tests can inspect the probed
URL recorded by the End-to-End stage.
The legacy KYC/AML connectivity tests are intentionally untouched -- they
validate the generic three-call path that providers without a registered
ConnectivityProber still fall back to.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
WalkthroughImplements a Tracer connectivity probe: SSRF-validated, IP-pinned GET to /v1/validations?limit=1 with timeouts and Host preservation, optional X-API-Key application, clear failure attribution, plus unit and e2e tests and a provider wrapper exposing the probe. ChangesTracer Connectivity Probe Feature
Comment |
📊 Unit Test Coverage Report:
|
| Metric | Value |
|---|---|
| Overall Coverage | 45.5% |
| Threshold | 85% |
Coverage by Package
| Package | Coverage |
|---|---|
github.com/LerianStudio/flowker/api |
100.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/audit |
0.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/catalog |
60.9% |
github.com/LerianStudio/flowker/internal/adapters/http/in/dashboard |
91.5% |
github.com/LerianStudio/flowker/internal/adapters/http/in/execution |
81.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/executor_configuration |
70.8% |
github.com/LerianStudio/flowker/internal/adapters/http/in/health |
100.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware/testutil |
0.0% |
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware |
44.4% |
github.com/LerianStudio/flowker/internal/adapters/http/in/provider_configuration |
33.1% |
github.com/LerianStudio/flowker/internal/adapters/http/in/readyz |
81.3% |
github.com/LerianStudio/flowker/internal/adapters/http/in/webhook |
87.5% |
github.com/LerianStudio/flowker/internal/adapters/http/in/workflow |
69.7% |
github.com/LerianStudio/flowker/internal/adapters/http/in |
33.3% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/dashboard |
43.0% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/execution |
64.8% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/executor_configuration |
36.6% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/provider_configuration |
63.0% |
github.com/LerianStudio/flowker/internal/adapters/mongodb/workflow |
69.2% |
github.com/LerianStudio/flowker/internal/adapters/postgresql/audit |
42.2% |
github.com/LerianStudio/flowker/internal/bootstrap |
40.5% |
github.com/LerianStudio/flowker/internal/services/command |
37.2% |
github.com/LerianStudio/flowker/internal/services/query |
11.2% |
github.com/LerianStudio/flowker/internal/services |
3.2% |
github.com/LerianStudio/flowker/internal/testutil |
63.6% |
github.com/LerianStudio/flowker/pkg/circuitbreaker |
97.6% |
github.com/LerianStudio/flowker/pkg/clock |
0.0% |
github.com/LerianStudio/flowker/pkg/condition |
87.9% |
github.com/LerianStudio/flowker/pkg/contextutil |
0.0% |
github.com/LerianStudio/flowker/pkg/executor/base |
20.0% |
github.com/LerianStudio/flowker/pkg/executor |
39.5% |
github.com/LerianStudio/flowker/pkg/executors/http/auth |
58.2% |
github.com/LerianStudio/flowker/pkg/executors/http |
49.5% |
github.com/LerianStudio/flowker/pkg/executors/midaz |
83.2% |
github.com/LerianStudio/flowker/pkg/executors/s3 |
0.0% |
github.com/LerianStudio/flowker/pkg/executors/tracer |
90.6% |
github.com/LerianStudio/flowker/pkg/executors |
57.1% |
github.com/LerianStudio/flowker/pkg/model |
58.7% |
github.com/LerianStudio/flowker/pkg/net/http |
91.3% |
github.com/LerianStudio/flowker/pkg/pagination |
0.0% |
github.com/LerianStudio/flowker/pkg/safehttp |
90.3% |
github.com/LerianStudio/flowker/pkg/templates/tracer_midaz |
81.5% |
github.com/LerianStudio/flowker/pkg/templates |
0.0% |
github.com/LerianStudio/flowker/pkg/transformation |
67.2% |
github.com/LerianStudio/flowker/pkg/triggers/webhook |
0.0% |
github.com/LerianStudio/flowker/pkg/triggers |
0.0% |
github.com/LerianStudio/flowker/pkg/webhook |
100.0% |
github.com/LerianStudio/flowker/pkg |
90.3% |
Generated by Go PR Analysis workflow
🔒 Security Scan Results —
|
| Policy | Status |
|---|---|
| Default non-root user | ✅ Passed |
| No fixable critical/high CVEs | ✅ Passed |
| No high-profile vulnerabilities | ✅ Passed |
| No AGPL v3 licenses | ✅ Passed |
Pre-release Version Check
✅ No unstable version pins found.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/executors/tracer/connectivity_prober.go`:
- Around line 54-59: The Probe method on tracerProvider currently creates a new
http.Client with Timeout probeTimeout when httpClient is nil but also applies a
context timeout, causing ambiguity; fix this by ensuring the passed httpClient
respects the probe timeout: in tracerProvider.Probe, if httpClient is nil set
httpClient = &http.Client{Timeout: probeTimeout}, else if httpClient != nil and
httpClient.Timeout == 0 set httpClient.Timeout = probeTimeout (leave non-zero
timeouts untouched), and add a short comment near probeTimeout and the Probe
function documenting that the effective timeout is enforced via the
client.Timeout (or the context deadline) so callers know the probe enforces a
probeTimeout fallback.
In `@tests/e2e/tracer_mock_test.go`:
- Around line 49-62: The mock handler registered with
mux.HandleFunc("/v1/validations", func(w http.ResponseWriter, r *http.Request) {
... }) currently accepts any HTTP method and query; update it to enforce the
probe contract by validating r.Method == http.MethodGet and
r.URL.Query().Get("limit") == "1" early in the handler (before using
opts.forceStatus or writing success response), and return a 400/405 JSON error
if the checks fail; keep the existing behavior for storing the API key via
m.lastAPIKey.Store and for honoring opts.forceStatus when the request is valid.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 62485cc1-0caa-4155-8dfe-1ea250938072
📒 Files selected for processing (6)
pkg/executors/tracer/connectivity_prober.gopkg/executors/tracer/connectivity_prober_test.gopkg/executors/tracer/provider.gopkg/executors/tracer/ssrf_test_helpers.gotests/e2e/full_e2e_test.gotests/e2e/tracer_mock_test.go
- Annotate `apiKeyHeader` const with `#nosec G101`: the value is the
HTTP header NAME ("X-API-Key"), not a credential. GitHub Advanced
Security's native gosec scan flagged this as a hardcoded credential
and blocked the PR; the directive unblocks it without weakening any
real check.
- Probe timeout: when the caller passes a non-nil http.Client with
Timeout=0, upgrade it to probeTimeout. Defense-in-depth alongside
the existing context.WithTimeout cap so the probe enforces a wall-
time bound via either mechanism. Documented in the probeTimeout
comment.
- E2E Tracer mock: enforce probe contract on /v1/validations — reject
non-GET with 405 and missing/wrong `limit=1` query with 400 before
storing the API key or honoring forceStatus. The mock now asserts
the exact wire shape the prober uses, not just the path.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/executors/tracer/connectivity_prober.go`:
- Around line 31-35: libSSRF.ResolveAndValidate is currently called with the
original ctx before the code creates context.WithTimeout(ctx, probeTimeout), so
DNS/SSRF resolution can exceed probeTimeout; fix by creating a timed context
earlier and using that same context for both libSSRF.ResolveAndValidate and the
HTTP request. Specifically, call ctx, cancel := context.WithTimeout(origCtx,
probeTimeout) (and defer cancel()) before invoking libSSRF.ResolveAndValidate
and use that ctx for creating the request and for the HTTP client call so the
probeTimeout bounds resolution and request phases consistently.
- Around line 70-75: Probe currently mutates the caller-supplied *http.Client by
setting httpClient.Timeout which is unsafe; instead, when httpClient is nil
create a new client as before, but if httpClient.Timeout == 0 make a shallow
copy (e.g. copy := *httpClient), set copy.Timeout = probeTimeout, and use the
copy for the Do call; ensure all references to httpClient in Probe use the
cloned client so the original passed-in client is not modified.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 19246e02-df81-435e-b7f1-38691ae2febc
📒 Files selected for processing (2)
pkg/executors/tracer/connectivity_prober.gotests/e2e/tracer_mock_test.go
Brings in 10 commits from develop, including: - fix(security): #nosec G704 suppression on token_fetcher.go (a465f62/bb749e1) - fix(security): SSRF dial-time enforcement - fix(infrastructure): 8 correctness findings, audit-retry backoff, schema validation - docs(infrastructure): SSRF docs alignment Resolves the G704 lint finding inherited from pre-rebase develop.
InitServers is a linear bootstrap/wire-up routine. The complexity
scores (gocognit 41, gocyclo 39, both > 25) come from the sequential
`if err != nil { return nil, err }` chain across config load, logger,
telemetry, TLS validation, MongoDB pool, executor catalog,
repositories, and HTTP/gRPC server startup.
Extracting helpers would add indirection without simplifying the
control flow — each step consumes outputs from the previous one and
the readable shape *is* the linear sequence. A genuine refactor is a
separate, larger concern tracked outside this PR.
This is the same fix landing on develop via #34; when #34 merges
first, a subsequent rebase here will skip this commit because the
diff is identical.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/bootstrap/config.go`:
- Line 121: The file-level comment in InitServers incorrectly mentions
"HTTP/gRPC servers" while only the HTTP server is created via NewHTTPServer;
update the comment in InitServers to accurately reflect that only the HTTP
server is initialized (or, if you intend to initialize a gRPC server here, add
the gRPC initialization logic and corresponding helper similar to
NewHTTPServer). Locate InitServers and change the phrase "HTTP/gRPC servers" to
"HTTP server" (or implement a gRPC initializer and reference it if adding gRPC
support).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5dc4fb4d-3a32-4f1a-ba29-6df9f757f2b9
📒 Files selected for processing (1)
internal/bootstrap/config.go
The previous implementation derived `context.WithTimeout` only around the HTTP request, leaving `libSSRF.ResolveAndValidate` running under the caller's unbounded ctx. The probeTimeout doc claimed a complete wall-time cap, but DNS/SSRF could exceed it — contradicting the contract. Fix: create a single `probeCtx` at the start of Probe and pass it to BOTH phases. probeTimeout now bounds the end-to-end probe as documented. The client-side `httpClient.Timeout` cap is retained as a defense-in-depth fallback for any transport-internal code path that may not honor the request context. Resolves CodeRabbit finding flagged on probeTimeout (PR #25, line 35): #25 (comment)
The Timeout==0 branch previously mutated the caller-supplied *http.Client by writing httpClient.Timeout = probeTimeout. http.Client is documented safe for concurrent USE but NOT concurrent MUTATION, so any caller that shares a single client across goroutines would race. Fix: shallow-copy the caller's client and mutate the copy. The underlying Transport is shared (preserving connection pooling) but the Timeout change is local to this probe call. Adds TestTracerConnectivityProber_DoesNotMutateCallerClient as a regression guard. Resolves CodeRabbit finding on http.Client mutation safety: #25 (comment)
InitServers only initializes an HTTP server (NewHTTPServer at the end of the function); no gRPC server is wired here. The nolint justification incorrectly listed "HTTP/gRPC servers" — corrected to "the HTTP server". Resolves CodeRabbit finding on the docs inaccuracy: #25 (comment)
…e fix The previous commit (cb142d2) made `make sec` honor gosec's exit code, surfacing 3 pre-existing findings that golangci-lint silently excludes via .golangci.yml (G101 and G115 are blanket-suppressed there). Standalone gosec doesn't read .golangci.yml, so `make sec` was now stricter than CI — breaking the local CI parity contract pre-push / pre-merge promise. Suppressing per-site (with justifications) restores parity AND keeps G101/G115 ACTIVE so any future legitimate hit gets caught (unlike the blanket exclusion in .golangci.yml). Suppressions: internal/adapters/http/in/middleware/apikey.go:17 HeaderAPIKey = "X-API-Key" G101 false positive — HTTP header NAME, not a credential value. (Identical pattern to apiKeyHeader fixed via #nosec in PR #25.) pkg/executors/http/auth/types.go:30 TypeOIDCClientCredentials Type = "oidc_client_credentials" G101 false positive — Type discriminator (auth flow identifier), not a credential value. internal/adapters/postgresql/audit/repository.go:275 builder.Limit(uint64(limit + 1)) G115 with documented justification — filter.Limit is guaranteed 1..100 by the query layer (search_audit_logs.go:52-56 clamps <=0 to default 20 and >100 to 100), so `limit + 1` cannot overflow int and the uint64 cast is safe. Reference added to the comment so future readers can verify the bound. Tradeoff vs blanket -exclude=G101,G115 in the gosec CLI: 3 file edits vs 1 Makefile line. Chose per-site for precision — the suppression lives next to the code it justifies and survives future refactors.
Documents the optional executor.ConnectivityProber interface introduced in #23 and adopted by #25: when to implement it, the ProbeOutcome to stages mapping, the recommended pattern, reusable utilities (libSSRF, auth factory, ssrfopt), and anti-patterns. Also adds a one-line pointer from CLAUDE.md. Companion to the Midaz and Tracer prober PRs.
Summary
executor.ConnectivityProberfor the Tracer provider, following the pattern established for Midaz in feat(service): provider-aware connectivity test with Midaz OIDC probe #23.GET {base_url}/v1/validations?limit=1withX-API-Key: {api_key}— the read-only endpoint documented inltracer/docs/design-doc-v2.mdand already exposed by thetracer.list-validationsexecutor.*base.Providerin atracerProviderstruct so the interface can be attached (same wrapper pattern Midaz uses).Why
Tracer was falling back to the legacy generic 3x
GET base_urlconnectivity check. While Tracer's root might happen to respond, the test gave no real signal about: (a) is the API key valid? (b) can the service actually serve a read query? A single GET on/v1/validations?limit=1with the configured key answers both with one round-trip and maps cleanly onto the 3 stages.Differences from Midaz probe
onboarding_base_url+transaction_base_url)base_urlX-API-Keyheader/v1/organizations/{org}/ledgers/{ledger}/accounts?limit=1/v1/validations?limit=1Simpler shape, identical framework.
Test plan
go test -tags=unit ./pkg/executors/tracer/...): 14 tests pass — 5 pre-existing + 9 new (TestTracerConnectivityProber_*: happy_path, invalid_key, missing_key, missing_url, ssrf_block, transport_error, 5xx, nil_client, auth_required).go test ./...): 235 tests pass across 51 packages — no regressions.make test-e2e): 69 tests pass — includes 4 new Tracer scenarios + the 4 Midaz scenarios from feat(service): provider-aware connectivity test with Midaz OIDC probe #23 + the legacytest_kyc_connectivity/test_aml_connectivity(still green).go vet: clean.make lint: zero new warnings (the 2 inherited from develop'sInitServerscognitive complexity andtoken_fetcher.goG704 are pre-existing).base_url,api_key),POST /v1/provider-configurations/{id}/test, verifystages[2].details.urlends with/v1/validations?limit=1.Notable findings during implementation
ssrfOptions()helper now has 3 copies (command, midaz, tracer). This crosses the rule-of-three threshold. Deliberately not extracted in this PR to keep scope tight — a separate refactor PR can move it into a shared package without coupling provider implementations to each other.validateConfigAgainstSchemafails,create_provider_config.go:82-94wraps it inpkg.ValidationError{Code: ErrProviderConfigInvalidSchema}, but the handler atprovider_configuration/handler.go:364checks viaerrors.Is(err, ErrProviderConfigInvalidSchema)— which fails against aValidationErrorstruct holding the sentinel only in itsCodefield. Result: schema rejections return HTTP 500 instead of 422. Not fixed here (out of scope); the E2E test asserts the user-visible contract (creation rejected, no persistence) and is robust to either status code. Worth a separate ticket.required: [base_url, api_key]blocks persistence upfront. The runtime auth-fail branch (probe with no api_key) stays covered by unit tests (TestTracerConnectivityProber_MissingAPIKey).Provider coverage after this PR
S3 is the obvious next candidate (
HEAD bucketwith SigV4 would be the canonical probe).