feat(roadmap-planner): W1 — member allowlist + GitLab instance-wide sweep#189
Conversation
🤖 AI Code Review
SummaryThis PR implements the W1 milestone for roadmap-planner: member allowlist filtering and GitLab instance-wide MR sweep. The implementation adds configuration options for Review Statistics
Warnings
Suggestions
Positive Feedback
Review completed by alaudabot ℹ️ About this reviewThis review was automatically generated using the
|
| @@ -371,6 +375,15 @@ func startGitLabSync(ctx context.Context, cfg *config.Config, store storage.Stor | |||
| syncer := glclient.NewSyncer(client, store, specs, glclient.DefaultLinker(projectKey), backfill) | |||
There was a problem hiding this comment.
Bug (bug/missing-feature): The GitLab syncer receives AllowedMemberIDs from the allowlist, but the GitHub syncer path in startGitHubSync has no equivalent allowlist filter. GitHub PRs authored by non-allowlisted members will still be inserted into pull_requests and rolled up — inconsistent dashboard counts when both sources are active.
| syncer := glclient.NewSyncer(client, store, specs, glclient.DefaultLinker(projectKey), backfill) | |
| // TODO(W2): wire GitHub syncer AuthorAllowlist to filter GitHub PRs the same way | |
| // Pass B filters GitLab MRs, so GitHub counts are consistent across sources. |
| @@ -335,6 +377,177 @@ func (s *Syncer) Sync(ctx context.Context) error { | |||
| return firstErr | |||
There was a problem hiding this comment.
Warning (performance/rate-limit): sweepInstanceByMember loops over every allowlisted member issuing up to 20 pages of ListInstanceMergeRequests each. With 20 members that's ≤400 API calls per sync cycle. A 429 rate-limit response sets firstErr and the sweep continues silently, meaning data loss is invisible to operators. Consider adding retry-with-backoff or at minimum a passBRateLimited log counter.
| // operator typo'd the denylist/prefill key. Either way Pass | ||
| // B has nothing to fetch for them; skip silently. | ||
| continue | ||
| } |
There was a problem hiding this comment.
Suggestion (performance/optimization): The note-fetching loop calls ListMRNotes per MR for every non-skipped MR. Members with hundreds of MRs trigger many sequential HTTP calls. Consider adding a MaxNotesRequests guard or batching the requests.
| continue | ||
| } | ||
| mrs, err := s.client.ListInstanceMergeRequests(ctx, ListInstanceMergeRequestsOptions{ | ||
| AuthorUsername: m.GitLabUsername, |
There was a problem hiding this comment.
Suggestion (refactor/observable): MRs with unreadable projectPath are silently skipped (Warn log only). The totalMRs counter only increments on successful upsert, so silent skips are invisible in the totals. Consider adding a passBSkipped counter surfaced in the run struct.
| placeholders := make([]string, len(ids)) | ||
| args := make([]any, len(ids)) | ||
| for i, id := range ids { | ||
| placeholders[i] = "?" |
There was a problem hiding this comment.
Warning (performance/scalability): The cleanup DELETE generates a query with len(ids) bound parameters via strings.Join(placeholders, ","). SQLite's SQLITE_MAX_VARIABLE_NUMBER defaults to 999 — a team with >999 allowlisted members will cause db.ExecContext to return SQLITE_TOOMUCH and the rebuild will fail. Consider batching the cleanup into multiple DELETE statements of ≤999 rows each.
| // Preserve the YAML-side pillar names (case-sensitive), | ||
| // but merge back the viper-loaded denylist + prefills | ||
| // so they aren't lost when the YAML file omits them. | ||
| if len(preserve.TeamAnalytics.MemberDenylist) == 0 { |
There was a problem hiding this comment.
Suggestion (refactor/robustness): Using len(...) == 0 for nil-slice safety. If struct fields can ever be nil (not just empty slices), consider if field == nil || len(field) == 0 for belt-and-suspenders robustness.
| on upgrade until the operator opts in): | ||
|
|
||
| ```yaml | ||
| gitlab: |
There was a problem hiding this comment.
Warning (docs/inconsistent): The member_denylist example uses GitHub-style names (gxjiao, lmhe) but the field holds Jira member ids (slugified emails — the same keys as the prefill maps). Operators copying the example will get no matches. Suggest Jira-id-shaped examples (e.g. gxjiao@alauda.cn) and a note that entries must match prefill-map keys exactly.
…weep
Implements the first workstream of the 2026-05-19 metrics audit follow-up
(PLAN.md → W1). Three behavioural changes, all gated behind config so
existing installs upgrade with no observable change until they opt in:
- GitLab `ListGroupProjects` now sends `with_shared=false`. The default
`true` was what surfaced `container-platform/*` and `alauda/artifacts`
MRs under the configured `devops/**` ingest in prod — root cause of
the audit's B1 finding (59% of merged MRs were out-of-scope noise).
- New `Syncer.MemberInstanceSweep` Pass B: for every member in the
derived allowlist, fetch `/api/v4/merge_requests?scope=all&author_username=<u>`
and ingest MRs that Pass A missed (members file in projects outside
`gitlab.groups`). De-dupes against Pass A via the existing
`<group/proj>!<iid>` storage PK.
- New `contributions.BuildAllowlist(cfg)`: union of the prefill maps
(`team_analytics.github_login_prefills` ∪ `gitlab_username_prefills`)
minus `team_analytics.member_denylist`. The synthetic `bot` member
is always included so W2's consolidation row stays visible. The
aggregator and the `/api/contributions/members` + `/team` handlers
both filter on the allowlist when it's enabled.
Activation rules:
- empty prefills + empty denylist → allowlist disabled, pre-W1 SQL
- any prefill configured → allowlist on, denylist subtracts
- denylist only → stays disabled (refuse to collapse to {bot})
Tests:
- `BuildAllowlist` table-driven coverage (empty, denylist subtraction,
case-folding, denylist-only safety).
- Aggregator integration test: 3 PRs / 3 authors, only allowlisted
author survives; toggle filter off → all three return.
- GitLab Pass B integration test (httptest): two members, one
allowlisted with 2 MRs across different namespaces; verifies
ingest + dedup on re-sync + no-op when sweep disabled.
Config additions (all default off):
gitlab.member_instance_sweep: false
team_analytics.member_denylist: []
Doc: `docs/team-analytics/CHANGES.md` documents the activation rules,
rollback paths, and audit cross-reference.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds an "allowlist cleanup" INFO log after the post-rebuild `DELETE FROM member_week_metrics WHERE member_id NOT IN (allowlist)` sweep so we can confirm whether the cleanup actually executes in prod and how many stale rows it removes per rebuild. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cbad2e3 to
627a4c0
Compare
First workstream of the 2026-05-19 metrics audit follow-up — see #188 for the audit report and full plan.
Summary
with_shared=falseonListGroupProjects(root cause of audit finding B1 —container-platform/*,alauda/artifactsetc. shouldn't have been ingested underdevops/**).gitlab.member_instance_sweepso allowlisted members' work in projects outside the configuredgroupsfinally lands in the rollup.team_analytics.{github,gitlab}_username_prefills ∪ {bot}minus the newteam_analytics.member_denylist. The aggregator and/api/contributions/members+/teamboth filter on it when enabled. Activation is opt-in — empty config means pre-W1 behaviour.What's in this diff
internal/config/config.goTeamAnalytics.MemberDenylist []string,GitLab.MemberInstanceSweep boolinternal/contributions/allowlist.go(new)BuildAllowlist,Allowlist.Contains/IDs/Enabledinternal/contributions/aggregator.goSetAllowlist; per-INSERTAND <member> IN (?, …)fragment; post-rebuild cleanupinternal/api/handlers/contributions.go,internal/api/routes.goSetAllowlist+filterAllowlistinListMembers/TeamOverviewinternal/gitlab/client.gowith_shared=falseon group projects;ListInstanceMergeRequests;MergeRequest.ProjectPath()helperinternal/gitlab/sync.goMemberInstanceSweep,AllowedMemberIDs; newsweepInstanceByMemberPass Bcmd/server/main.goaggregator_test.go,allowlist_test.go,gitlab/sync_test.goconfig.example.yaml,docs/team-analytics/CHANGES.md(new)Activation rules
{bot}— operator must declare who counts)Visible effects when enabled in prod
zhwang,chaozhou,gxjiao,lmhe(denylist seeds from the audit) disappear from the team overview + every per-member chart.gitlab.member_instance_sweep: true, MRs filed by allowlisted members incontainer-platform/*,alauda/artifacts, etc. start landing under their properrepo_id.Rollback
gitlab.member_instance_sweep: falseto disable Pass B without touching the allowlist.Test plan
go test ./...frombackend/— all greengo vet ./...— cleanTestBuildAllowlistcovers empty / prefills-only / denylist-subtraction / case-fold / denylist-aloneTestAggregatorAllowlistproves the filter excludes non-allowlisted authors and the toggle-off sentinel restores pre-W1 shapeTestSyncPassBMemberSweepproves Pass B ingests two MRs from two namespaces, dedupes on re-sync, and no-ops whenMemberInstanceSweepis falseedge-int/devops-tooling/roadmap-planner-dev— to validate against the live prod-shape databasemember_instance_sweep: true, verify Pass B picks up allowlisted members' MRs from outsidedevops/**Sequencing
W1 is the foundation for W2 (bot consolidation) and W3 (already folded into this PR via the denylist). Per
docs/team-analytics/audit-2026-05-19/PLAN.md.🤖 Generated with Claude Code