Skip to content

fix(service): record real authenticated actor identity (Taura finding)#35

Merged
alexgarzao merged 14 commits into
developfrom
fix/audit-actor-identity
May 21, 2026
Merged

fix(service): record real authenticated actor identity (Taura finding)#35
alexgarzao merged 14 commits into
developfrom
fix/audit-actor-identity

Conversation

@alexgarzao
Copy link
Copy Markdown
Collaborator

Pull Request Checklist

Pull Request Type

  • Core
  • Pipeline
  • Tests
  • Documentation
  • Helm

Checklist

  • I have tested these changes locally.
  • I have updated the documentation accordingly.
  • I have added necessary comments to the code, especially in complex areas.
  • I have ensured that my changes adhere to the project's coding standards.
  • I have checked for any potential security issues.
  • I have ensured that all tests pass.
  • I have updated the version appropriately (if applicable).
  • I have confirmed this code is ready for review.

Additional Notes

Context — Taura security finding

The security firm Taura identified a structural flaw shared by Flowker (this PR) and Tracer (out of scope): every audit event was recorded with a hardcoded generic actor (type="system", id="flowker", ipAddress="0.0.0.0"), regardless of which authenticated user or application triggered the action. Per the finding: "ações realizadas por usuários privilegiados tornam-se indistinguíveis de atividades legítimas esperadas da plataforma, inviabilizando a responsabilização individual e reduzindo significativamente a efetividade de processos de investigação e auditoria pós-incidente."

Root cause: internal/services/command/record_audit_event.go:85 (pre-fix), where NewAuditActor(System, "flowker", "") was unconditional.

What changed (per commit)

  1. build(gitignore): ignore .worktrees/ for parallel branches — tooling.
  2. feature(contextutil): add Principal for authenticated identity — new Principal{Type, ID, Name} carried via context.Context. Type kept as string to avoid an import cycle into pkg/model.
  3. fix(audit): record real authenticated actor identity in events — extends AuditActor with name, new error FLK-0613, replaces hardcoded system actor with resolveActor(ctx) capturing the Principal before the fire-and-forget goroutine. Includes migration 000002_audit_actor_name that adds the column and updates both calculate_audit_event_hash() and verify_audit_hash_chain() to include actor_name in the canonical hash input (see migration header for the re-baseline trade-off — applies cleanly to environments without pre-existing audit rows).
  4. feature(auth): propagate JWT and API key identity to request context — new jwt_claims.go parses Bearer via jwt.ParseUnverified (same primitive lib-auth uses internally; trust still gated by lib-auth's Authorize); API key middleware stamps Principal{api_key, <API_KEY_LABEL>}. New env var API_KEY_LABEL (default flowker-default).
  5. test(audit): integration test for actor identity end-to-end — system-fallback integration test.
  6. test(integration): expand audit actor coverage — table-driven coverage of every workflow event type (CREATE/UPDATE/ACTIVATE/DEACTIVATE/DRAFT/DELETE), SQL round-trip for actor_name, and dual-server harness exercising the real api_key path with X-API-Key.
  7. test(e2e): audit assertion in webhook execution flow — full pipeline: webhook → execution → audit row with the expected actor.
  8. fix(security): suppress gosec G101 false positives on audit actor label — narrow // #nosec G101 on the flowker-default constant (audit identifier, not a credential), matching the existing pattern used in this repo.

Resulting behavior

Endpoint type New audit actor
Management (OIDC/JWT) type=user, id=<sub>, name=<preferred_username or email>, ipAddress=<real client IP>
Webhook execution (API Key) type=api_key, id=<API_KEY_LABEL>, ipAddress=<real client IP>
Background / unauthenticated type=system, id=flowker (preserved fallback for genuine system events)

Test results (local)

  • go build ./... — clean
  • go vet ./... — clean
  • make lint — 0 issues
  • make sec — 0 issues
  • Unit: 1589 passed (was 240 before; +1349 from new model/contextutil/middleware/audit-writer coverage and develop-side test additions)
  • Integration (audit subset): 13 passed
  • E2E (audit subset): 14 passed
  • make pre-merge — green

Conscious deviations / follow-ups

  • docs/PROJECT_RULES.md not updated with the new actor semantics — acceptance criterion in the plan, deferred to a follow-up PR to keep this one focused on code + tests.
  • Hash chain re-baseline: the new hash formula includes actor_name. Pre-migration rows (hashed under the old formula) will fail verify_audit_hash_chain after the migration. The migration's .up.sql documents this explicitly; recommended in environments without production audit data (Flowker's current posture).
  • lib-auth/v2 claims: lib-auth v2.7.0 validates JWTs but does not expose claims to the consumer. The middleware uses ParseUnverified to extract identity after lib-auth's validation gate runs — same primitive lib-auth uses internally for its own authorization round-trip, so the security envelope is unchanged.

Obs: Please, always remember to target your PR to develop branch instead of main.

@alexgarzao alexgarzao requested a review from a team as a code owner May 21, 2026 22:14
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 437d6a4d-466a-4bde-bdd1-1a6dccf7797a

📥 Commits

Reviewing files that changed from the base of the PR and between e986ef1 and 2382d13.

📒 Files selected for processing (1)
  • tests/integration/audit_actor_jwt_test.go

Walkthrough

Captures principals (API-key or JWT) into request context, resolves actor (type/id/name/ip) when recording audit events, persists actor_name in the DB (with migrations/hash updates), and adds unit/integration/e2e tests plus a .gitignore entry.

Changes

Audit Actor Attribution

Layer / File(s) Summary
Principal context model and audit actor domain
pkg/contextutil/context.go, pkg/contextutil/context_test.go, pkg/model/audit_entry.go, pkg/model/audit_entry_output.go, pkg/model/audit_entry_test.go, pkg/constant/errors.go
Principal struct stores authenticated Type/ID/Name. WithPrincipal/GetPrincipal manage request-scoped principal storage. AuditActor accepts and validates a name (max 255 chars), adds Name() accessor, and exposes validation error constants. Unit tests validate behavior.
HTTP middleware principal stamping
internal/adapters/http/in/middleware/jwt_claims.go, internal/adapters/http/in/middleware/apikey.go, internal/adapters/http/in/middleware/apikey_test.go, internal/adapters/http/in/middleware/auth_guard.go, internal/adapters/http/in/middleware/auth_guard_test.go
Adds JWT claim helpers (bearerToken, parseUnverifiedClaims, stringClaim). API-key middleware resolves a label (configurable, default flowker-default) and stamps a Principal{Type:"api_key", ID:label} into Fiber user context. Auth guard extracts bearer sub and name (preferred_username or email) and stamps a Principal before authorization. Tests cover stamping and failure cases.
Audit event recording and database persistence
internal/services/command/record_audit_event.go, internal/services/command/record_audit_event_test.go, internal/bootstrap/config.go, migrations/000002_audit_actor_name.up.sql, migrations/000002_audit_actor_name.down.sql, internal/adapters/postgresql/audit/repository.go, internal/adapters/postgresql/audit/repository_test.go
recordEvent now resolves actor via context principal and client IP, with a systemActorID fallback. Migration adds nullable actor_name and updates hash trigger/verification to include it. Repository persists and reconstructs actor_name; tests and bootstrap wire API key label config for the auth guard.
Test coverage
tests/integration/audit_actor_test.go, tests/integration/audit_repository_sql_test.go, tests/integration/main_test.go, tests/e2e/audit_actor_e2e_test.go, various unit tests under pkg and internal
Adds unit tests for context and model, middleware unit tests, command-level unit tests for audit recording resolution, integration harness that boots dual servers (primary no-auth, secondary API-key), integration tests asserting system and API-key actor attribution, SQL round-trip tests for actor_name, and an e2e webhook execution audit test.

Git Worktree Configuration

Layer / File(s) Summary
.gitignore
.gitignore
Adds .worktrees/ ignore pattern for Git worktree support.

Comment @coderabbitai help to get the list of available commands and usage tips.

@lerian-studio
Copy link
Copy Markdown
Contributor

lerian-studio commented May 21, 2026

🔒 Security Scan Results — flowker

Trivy

Filesystem Scan

✅ No vulnerabilities or secrets found.

Docker Image Scan

✅ No vulnerabilities found.


Docker Hub Health Score Compliance

✅ Policies — 4/4 met

Policy Status
Default non-root user ✅ Passed
No fixable critical/high CVEs ✅ Passed
No high-profile vulnerabilities ✅ Passed
No AGPL v3 licenses ✅ Passed

Pre-release Version Check

✅ No unstable version pins found.


🔍 View full scan logs

@lerian-studio
Copy link
Copy Markdown
Contributor

lerian-studio commented May 21, 2026

📊 Unit Test Coverage Report: flowker

Metric Value
Overall Coverage 46.3% ⚠️ BELOW THRESHOLD
Threshold 85%

Coverage by Package

Package Coverage
github.com/LerianStudio/flowker/api 100.0%
github.com/LerianStudio/flowker/internal/adapters/http/in/audit 0.0%
github.com/LerianStudio/flowker/internal/adapters/http/in/catalog 60.9%
github.com/LerianStudio/flowker/internal/adapters/http/in/dashboard 91.5%
github.com/LerianStudio/flowker/internal/adapters/http/in/execution 81.0%
github.com/LerianStudio/flowker/internal/adapters/http/in/executor_configuration 70.8%
github.com/LerianStudio/flowker/internal/adapters/http/in/health 100.0%
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware/testutil 0.0%
github.com/LerianStudio/flowker/internal/adapters/http/in/middleware 57.9%
github.com/LerianStudio/flowker/internal/adapters/http/in/provider_configuration 33.1%
github.com/LerianStudio/flowker/internal/adapters/http/in/readyz 81.3%
github.com/LerianStudio/flowker/internal/adapters/http/in/webhook 87.5%
github.com/LerianStudio/flowker/internal/adapters/http/in/workflow 69.7%
github.com/LerianStudio/flowker/internal/adapters/http/in 33.3%
github.com/LerianStudio/flowker/internal/adapters/mongodb/dashboard 43.0%
github.com/LerianStudio/flowker/internal/adapters/mongodb/execution 64.8%
github.com/LerianStudio/flowker/internal/adapters/mongodb/executor_configuration 36.6%
github.com/LerianStudio/flowker/internal/adapters/mongodb/provider_configuration 63.0%
github.com/LerianStudio/flowker/internal/adapters/mongodb/workflow 69.2%
github.com/LerianStudio/flowker/internal/adapters/postgresql/audit 41.9%
github.com/LerianStudio/flowker/internal/bootstrap 40.5%
github.com/LerianStudio/flowker/internal/services/command 42.1%
github.com/LerianStudio/flowker/internal/services/query 11.2%
github.com/LerianStudio/flowker/internal/services 3.2%
github.com/LerianStudio/flowker/internal/testutil 63.6%
github.com/LerianStudio/flowker/pkg/circuitbreaker 97.6%
github.com/LerianStudio/flowker/pkg/clock 0.0%
github.com/LerianStudio/flowker/pkg/condition 87.9%
github.com/LerianStudio/flowker/pkg/contextutil 100.0%
github.com/LerianStudio/flowker/pkg/executor/base 20.0%
github.com/LerianStudio/flowker/pkg/executor 39.5%
github.com/LerianStudio/flowker/pkg/executors/http/auth 58.2%
github.com/LerianStudio/flowker/pkg/executors/http 49.5%
github.com/LerianStudio/flowker/pkg/executors/midaz 80.8%
github.com/LerianStudio/flowker/pkg/executors/s3 0.0%
github.com/LerianStudio/flowker/pkg/executors/tracer 88.0%
github.com/LerianStudio/flowker/pkg/executors 57.1%
github.com/LerianStudio/flowker/pkg/model 60.7%
github.com/LerianStudio/flowker/pkg/net/http 91.3%
github.com/LerianStudio/flowker/pkg/pagination 0.0%
github.com/LerianStudio/flowker/pkg/safehttp 90.3%
github.com/LerianStudio/flowker/pkg/ssrfopt 100.0%
github.com/LerianStudio/flowker/pkg/templates/tracer_midaz 81.5%
github.com/LerianStudio/flowker/pkg/templates 0.0%
github.com/LerianStudio/flowker/pkg/transformation 67.2%
github.com/LerianStudio/flowker/pkg/triggers/webhook 0.0%
github.com/LerianStudio/flowker/pkg/triggers 0.0%
github.com/LerianStudio/flowker/pkg/webhook 100.0%
github.com/LerianStudio/flowker/pkg 90.3%

Generated by Go PR Analysis workflow

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/adapters/http/in/middleware/apikey.go`:
- Around line 45-48: Trim cfg.Label before checking for empty so whitespace-only
labels aren't treated as valid; replace the current label assignment (label :=
cfg.Label ... if label == "" { label = defaultAPIKeyLabel }) with logic that
does label = strings.TrimSpace(cfg.Label) and then fall back to
defaultAPIKeyLabel when the trimmed label is empty, updating any import/use in
the same file (apikey.go) to include strings where needed and keeping the
variable name label unchanged for downstream audit attribution.

In `@internal/services/command/record_audit_event.go`:
- Around line 170-176: The current logic in the contextutil.GetPrincipal
handling keeps p.ID/p.Name when Principal.Type is unrecognized, causing
misattribution; change the branch in the record-audit-actor logic so that if
model.AuditActorType(p.Type) is not valid you do not reuse p.ID or p.Name but
instead return the canonical system actor (use model.AuditActorTypeSystem) —
i.e., detect invalid actorType and directly return the canonical system/flowker
actor via model.NewAuditActor(...) rather than rewriting only the type while
keeping p.ID/p.Name; update the code around contextutil.GetPrincipal(...) and
model.NewAuditActor(...) accordingly.

In `@migrations/000002_audit_actor_name.down.sql`:
- Around line 44-119: The DOWN migration drops actor_name but the restored
verify_audit_hash_chain() function recomputes hashes without actor_name, which
will invalidate rows that were created when actor_name was included; before
dropping actor_name in the ALTER TABLE, either (A) detect any audit_events rows
that were created/hashed with actor_name (e.g. non-null actor_name or created_at
>= the 000002 migration activation time) and abort the downgrade by RAISE
EXCEPTION with a clear message, or (B) recompute and rewrite the chain hashes
and previous_hash values for the affected range so hashes match the legacy
algorithm (use the same hash_input used in the restored verify_audit_hash_chain
to generate expected_hash and update hash and previous_hash in audit_events,
walking the chain in order). Ensure changes reference verify_audit_hash_chain,
audit_events, actor_name, hash and previous_hash so the migration either blocks
when affected rows exist or remaps stored hashes before dropping actor_name.

In `@migrations/000002_audit_actor_name.up.sql`:
- Around line 35-57: The verifier currently recomputes every row with the new
actor_name term which will invalidate pre-migration rows; either (A) preserve
the legacy formula for historical rows by changing the verifier
(verify_audit_hash_chain / the internal compute hash function used by it) to
branch: if actor_name IS NULL (or id <= N if you capture the migration cutoff)
compute the legacy hash, else compute the new hash that includes actor_name, or
(B) perform an explicit re-baseline in this migration: add a data-migration step
in the .up.sql that recomputes and updates stored hashes in audit_events for all
existing rows using the new formula (and mark them as rebaselined), and ensure
.down.sql restores the legacy verifier consistently; pick one of these fixes and
implement it in the verifier function or the migration data-update so historical
rows remain verifiable.

In `@tests/e2e/audit_actor_e2e_test.go`:
- Around line 235-251: The polling loop in the test currently breaks only when
hasEventType(entries, "EXECUTION_COMPLETED") is true, so executions that
terminate with "EXECUTION_FAILED" wait until timeout; update the loop condition
inside the for loop that checks entries (the block that calls
findExecutionAuditEntries(t, client, execID)) to break when either
hasEventType(entries, "EXECUTION_COMPLETED") OR hasEventType(entries,
"EXECUTION_FAILED") is true, leaving the subsequent assertions (require.NotEmpty
and assert.True checks) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c236ea50-03b4-4e84-8c46-ed6d1a8f3d93

📥 Commits

Reviewing files that changed from the base of the PR and between 668588d and ff14cca.

📒 Files selected for processing (23)
  • .gitignore
  • internal/adapters/http/in/middleware/apikey.go
  • internal/adapters/http/in/middleware/apikey_test.go
  • internal/adapters/http/in/middleware/auth_guard.go
  • internal/adapters/http/in/middleware/auth_guard_test.go
  • internal/adapters/http/in/middleware/jwt_claims.go
  • internal/adapters/postgresql/audit/repository.go
  • internal/adapters/postgresql/audit/repository_test.go
  • internal/bootstrap/config.go
  • internal/services/command/record_audit_event.go
  • internal/services/command/record_audit_event_test.go
  • migrations/000002_audit_actor_name.down.sql
  • migrations/000002_audit_actor_name.up.sql
  • pkg/constant/errors.go
  • pkg/contextutil/context.go
  • pkg/contextutil/context_test.go
  • pkg/model/audit_entry.go
  • pkg/model/audit_entry_output.go
  • pkg/model/audit_entry_test.go
  • tests/e2e/audit_actor_e2e_test.go
  • tests/integration/audit_actor_test.go
  • tests/integration/audit_repository_sql_test.go
  • tests/integration/main_test.go

Comment thread internal/adapters/http/in/middleware/apikey.go Outdated
Comment thread internal/services/command/record_audit_event.go Outdated
Comment thread migrations/000002_audit_actor_name.down.sql
Comment thread migrations/000002_audit_actor_name.up.sql
Comment thread tests/e2e/audit_actor_e2e_test.go
@alexgarzao alexgarzao changed the title fix(audit): record real authenticated actor identity (Taura finding) fix(service): record real authenticated actor identity (Taura finding) May 21, 2026
@alexgarzao
Copy link
Copy Markdown
Collaborator Author

🔍 Bug-Fix Evidence — Audit actor identity (Taura finding)

Evidence captured from a real end-to-end execution against Postgres + MongoDB testcontainers booted by this PR's integration harness. Reproduces the exact scenario described in the Taura security finding (GET /v1/audit-events) for two paths:

Path A — Unauthenticated request (system fallback, preserved)

Request:

POST /v1/workflows HTTP/1.1
Host: 127.0.0.1:59103
Content-Type: application/json

Resulting audit row:

{
  "eventType": "WORKFLOW_CREATED",
  "action": "CREATE",
  "result": "SUCCESS",
  "resourceId": "a7d432d7-ddf8-4866-9b27-67bf1585f90e",
  "resourceType": "workflow",
  "actor": {
    "type": "system",
    "id": "flowker",
    "name": "",
    "ipAddress": "127.0.0.1"
  }
}

Note: actor stays system/flowker (legitimate fallback for background events without an authenticated principal), but ipAddress is now the real client IP (127.0.0.1) instead of the hardcoded 0.0.0.0 from the Taura screenshot.

Path B — X-API-Key authenticated request (the bug fix in action)

Request:

POST /v1/workflows HTTP/1.1
Host: 127.0.0.1:59104
X-API-Key: test-api-key-fixed-value
Content-Type: application/json

Resulting audit row:

{
  "eventType": "WORKFLOW_CREATED",
  "action": "CREATE",
  "result": "SUCCESS",
  "resourceId": "f8dcddab-ba2c-4cde-af3f-d652774ea037",
  "resourceType": "workflow",
  "actor": {
    "type": "api_key",
    "id": "test-label",
    "name": "",
    "ipAddress": "127.0.0.1"
  }
}

Side-by-side comparison

Field BEFORE (Taura finding) AFTER — Path A (no auth) AFTER — Path B (X-API-Key)
actor.type system (always) system (legitimate fallback) api_key
actor.id flowker (always) flowker (canonical fallback) test-label (from API_KEY_LABEL) ✅
actor.ipAddress 0.0.0.0 (always) real client IP real client IP
actor.name (field did not exist) "" "" (would be preferred_username/email for JWT path)

Coverage of the third path (JWT user)

The JWT path (actor.type=user, actor.id=<sub>, actor.name=<preferred_username\|email>) is not exercised against a live IdP here because the integration harness runs with PLUGIN_AUTH_ENABLED=false. It is covered by the middleware unit tests:

  • internal/adapters/http/in/middleware/auth_guard_test.go::TestPrincipal_FromJWT_PreferredUsername
  • internal/adapters/http/in/middleware/auth_guard_test.go::TestPrincipal_FromJWT_EmailFallback
  • internal/adapters/http/in/middleware/auth_guard_test.go::TestPrincipal_FromJWT_NoSub_NoPrincipal
  • internal/adapters/http/in/middleware/auth_guard_test.go::TestPrincipal_FromJWT_MalformedToken_NoPrincipal

Persistent regression coverage

The same scenarios above are run automatically on every CI execution via:

  • tests/integration/audit_actor_test.go::TestAuditActor_APIKey_StampsAPIKeyActor — Path B
  • tests/integration/audit_actor_test.go::TestAuditActor_AllEventTypes_SystemFallback — Path A across CREATE/UPDATE/ACTIVATE/DEACTIVATE/DRAFT/DELETE
  • tests/integration/audit_actor_test.go::TestAuditActor_SystemFallback_WhenNoAuth — Path A end-to-end
  • tests/integration/audit_repository_sql_test.go::TestAuditActor_SQLRoundTrip_PreservesActorNameactor_name survives SQL Insert/FindByID
  • tests/e2e/audit_actor_e2e_test.go::TestAuditActor_WebhookExecution_SystemFallback — full webhook → execution → audit pipeline

Reproducing locally

make up   # boots Postgres + Mongo + Flowker via docker-compose
curl -X POST http://localhost:3000/v1/workflows \
     -H "X-API-Key: ${API_KEY}" \
     -H "Content-Type: application/json" \
     -d '{...}'
curl http://localhost:3000/v1/audit-events?limit=1 | jq '.items[0].actor'
# Expected: { "type": "api_key", "id": "<API_KEY_LABEL>", "ipAddress": "<real IP>", ... }

@alexgarzao
Copy link
Copy Markdown
Collaborator Author

🔍 Bug-Fix Evidence — JWT user actions (CREATE + UPDATE workflow)

Complementing the previous evidence comment which covered the system-fallback and X-API-Key paths, this captures real user actions authenticated via a Bearer JWT — the path Taura's finding most directly targets ("ações realizadas por usuários privilegiados").

The integration harness was extended in-test with a third Flowker server (PLUGIN_AUTH_ENABLED=true) pointed at a mock Access Manager (httptest.Server returning {"authorized": true} for POST /v1/authorize). The JWT is HMAC-signed with realistic OIDC claims (sub, preferred_username, email); lib-auth/v2 uses jwt.ParseUnverified internally to extract claims, so a real IdP token reaches the audit pipeline through the exact same code path.

Action 1 — Workflow CREATE

Request:

POST /v1/workflows HTTP/1.1
Host: 127.0.0.1:62041
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6Imdh.....Zf8VffHIGtkj8ZNKja9m...
Content-Type: application/json

# JWT claims (decoded):
#   sub:                f0a8c1d2-3e4b-4f5a-9c6d-7e8f9a0b1c2d
#   preferred_username: alex.garzao
#   email:              garzao@lerian.studio

Resulting audit row (GET /v1/audit-events):

{
  "eventType":    "WORKFLOW_CREATED",
  "action":       "CREATE",
  "result":       "SUCCESS",
  "resourceId":   "7065e468-18d0-415c-a490-229718c3526d",
  "resourceType": "workflow",
  "actor": {
    "type":      "user",
    "id":        "f0a8c1d2-3e4b-4f5a-9c6d-7e8f9a0b1c2d",
    "name":      "alex.garzao",
    "ipAddress": "127.0.0.1"
  }
}

Action 2 — Workflow UPDATE (same Bearer token)

Request:

PUT /v1/workflows/7065e468-18d0-415c-a490-229718c3526d HTTP/1.1
Host: 127.0.0.1:62041
Authorization: Bearer <same JWT as above>
Content-Type: application/json

Resulting audit row:

{
  "eventType":    "WORKFLOW_UPDATED",
  "action":       "UPDATE",
  "result":       "SUCCESS",
  "resourceId":   "7065e468-18d0-415c-a490-229718c3526d",
  "resourceType": "workflow",
  "actor": {
    "type":      "user",
    "id":        "f0a8c1d2-3e4b-4f5a-9c6d-7e8f9a0b1c2d",
    "name":      "alex.garzao",
    "ipAddress": "127.0.0.1"
  }
}

Verdict

Field Value in audit row Source
actor.type "user" Stamped by auth_guard.go::extractPrincipalFromBearer after lib-auth/v2 validates the token
actor.id "f0a8c1d2-3e4b-4f5a-9c6d-7e8f9a0b1c2d" JWT sub claim — stable, unique, OIDC-standard
actor.name "alex.garzao" JWT preferred_username claim (falls back to email when absent)
actor.ipAddress "127.0.0.1" Real client IP from ClientIPMiddleware (no longer 0.0.0.0)

Scope of this evidence

  • WORKFLOW_CREATED — direct evidence above
  • WORKFLOW_UPDATED — direct evidence above
  • ✅ All other workflow lifecycle events (ACTIVATE, DEACTIVATE, DRAFT, DELETE) — go through the same recordEvent → resolveActor code path; covered structurally by TestAuditActor_AllEventTypes_SystemFallback and the unit tests on record_audit_event.go. Same JWT-stamped Principal in context produces the same actor.type=user audit row regardless of which command fires.
  • ✅ Execution events (EXECUTION_STARTED, EXECUTION_COMPLETED, EXECUTION_FAILED) — emitted from the same RecordAuditEventCommand via RecordExecutionEvent; identical attribution.
  • ✅ Provider config events (PROVIDER_CONFIG_*) — same pipeline via RecordProviderConfigEvent.

Closing the Taura finding

Before this PR, every audit row showed actor = { type:"system", id:"flowker", ipAddress:"0.0.0.0" } — what Taura flagged as the structural failure. After this PR:

Auth path actor.type actor.id Source
OIDC/JWT (management endpoints) user JWT sub This evidence ✅
API Key (webhook endpoints) api_key API_KEY_LABEL env var Previous evidence
Genuine background events (no HTTP) system flowker (canonical) Preserved fallback ✅

All three paths capture the real client IP via ClientIPMiddleware0.0.0.0 is no longer possible for any HTTP-originated audit row.

Per OIDC Core 1.0 sec.2 and RFC 9068 sec.2.2, 'sub' is REQUIRED on ID Tokens
and JWT access tokens (including client_credentials). A token that parses but
lacks 'sub' cannot be attributed to a principal, so accepting it would silently
record the audit row with the generic system/flowker actor — the structural
failure mode the Taura audit originally flagged.

extractPrincipalFromBearer now returns (rejected, err) so the outer guard can
short-circuit after the 401 response is written, instead of falling through to
lib-auth.Authorize. Behavior preserved for the 'no Bearer header' and
'malformed token' paths — lib-auth still handles those.
@alexgarzao
Copy link
Copy Markdown
Collaborator Author

🛡️ Closing scenario 2 (JWT sem `sub`) — strict reject

Após investigar a recomendação da spec OIDC/RFC 9068 e a postura de IdPs maduros (Keycloak, Auth0, Okta, Entra), o comportamento permissivo anterior (aceita token + log warning + fallback para `system/flowker`) recriava o exato anti-pattern que a Taura reportou. Endereçado neste mesmo PR antes do merge.

Commit: 8220710 — `fix(auth): reject JWT without 'sub' claim with 401`

Mudança de comportamento

Cenário Antes Depois
Bearer ausente lib-auth retorna 401 "Missing Token" Inalterado
Token malformado / corrompido lib-auth rejeita downstream Inalterado
Token válido sem claim `sub` ⚠️ Aceita, warning silencioso, audit cai em `system/flowker` 401 `UNAUTHORIZED_MISSING_SUB` antes mesmo de chegar no lib-auth
Token válido com `sub` Stampa Principal correto, lib-auth valida Inalterado

Justificativa (fontes autoritativas)

  • OIDC Core 1.0 §2: `sub` é REQUIRED em ID Tokens
  • RFC 9068 §2.2 (JWT Profile for OAuth 2.0 Access Tokens): `sub` é REQUIRED inclusive em fluxo `client_credentials` (M2M), onde SHOULD equivaler ao `client_id`
  • PCI-DSS v4.0 Req. 10.2.1.1: audit logs DEVEM associar acesso a usuários individuais — fallback genérico viola o requisito
  • OWASP Logging Cheat Sheet: falhas de atribuição de identidade DEVEM ser logadas como eventos de segurança, nunca silenciosamente degradadas
  • RFC 9068 §5: explicitamente desencoraja fallback para `client_id`/`azp` no resource server — colisões podem permitir client malicioso reivindicar `sub` de principal de alto privilégio

Cobertura de teste

Substituiu `TestPrincipal_FromJWT_NoSub_NoPrincipal` (que asseverava o comportamento antigo permissivo) por `TestPrincipal_FromJWT_NoSub_Rejects401`:

  • Asserta HTTP 401 ao invés de propagar
  • Asserta `code == "UNAUTHORIZED_MISSING_SUB"` no body para investigação operacional
  • Mock Access Manager retorna autorizado para isolar que a rejeição é nossa middleware, não do lib-auth

`make pre-merge` continua verde: 1589 unit tests, lint 0, sec 0.

Risco de availability (mitigado pelo contexto)

A pesquisa apontou risco de availability: se o IdP em produção parar de emitir `sub` em alguma classe de token (precedente real: Keycloak issue #31082), a aplicação rejeitaria em massa. Como o Flowker não tem dados em produção ainda, esse risco está mitigado por design — qualquer regressão futura do IdP seria caught antes do deploy.

Follow-up (escopo separado)

O padrão "service principal nomeado" via allow-list de `client_id` (alinhado ao que o Tracer já usa: `svc_tracer` / "Tracer Limit Manager") fica como melhoria futura quando/se houver M2M legítimo emitindo token sem `sub`. Não é necessário para fechar a finding da Taura — a rejeição estrita já elimina a ambiguidade no audit log.

Adds two persistent integration tests against a third Flowker server
booted in-test with PLUGIN_AUTH_ENABLED=true + mock Access Manager:

- TestAuditActor_JWT_WithSub_StampsUserActor — positive path proving
  a Bearer token with sub/preferred_username produces actor.type=user
  with the JWT identity in the audit row.

- TestAuditActor_JWT_WithoutSub_Returns401_NoAuditRow — strict reject
  path proving a Bearer without sub returns 401 UNAUTHORIZED_MISSING_SUB
  AND writes no audit row (no silent fallback to system/flowker).

Server is bootstrapped via sync.Once so multiple tests share the same
mock Access Manager + Flowker listener. Mongo + Postgres are reused
from the primary harness.

E2E is intentionally not extended: the JWT auth boundary is exercised
at the same layer (Fiber listener + lib-auth round-trip + audit DB)
that the new integration tests already cover. Adding E2E would
duplicate coverage without exercising a new layer.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/audit_actor_jwt_test.go`:
- Around line 73-111: Snapshotting of environment into the originals map is done
but restoration only occurs on the success path; to ensure env is always
restored even if bootstrap.InitServers() or waitForServer(...) fail, immediately
call a deferred restore right after creating the originals map (use defer to
iterate over originals and call os.Setenv for each key/value) so that regardless
of returns or errors in setEnvForApp, bootstrap.InitServers, or waitForServer
the environment is reverted; ensure the defer runs before any early returns and
keep am.Close() calls as needed in error paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5fb3d62d-c10c-4c17-8fbc-88233fd93b26

📥 Commits

Reviewing files that changed from the base of the PR and between 8220710 and e986ef1.

📒 Files selected for processing (1)
  • tests/integration/audit_actor_jwt_test.go

Comment thread tests/integration/audit_actor_jwt_test.go Outdated
…return paths

The JWT-server bootstrap helper snapshotted PLUGIN_AUTH_* and friends
before mutating them, but restored only on the success path. If
bootstrap.InitServers() or waitForServer() failed, the env stayed
mutated and subsequent integration tests would read a PLUGIN_AUTH_ADDRESS
pointing at a closed httptest.Server — silent test-suite poisoning.

Move the restore into a defer right after the snapshot so it runs on
every exit path, including the early returns on bootstrap/readiness
failure.
@alexgarzao alexgarzao merged commit 45e8c4a into develop May 21, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants