release: develop → production (comment spam prevention + quote-wall + campaign-discussion) #4848

Draft
mashbean wants to merge 58 commits into master from develop

Conversation

@mashbean

Contributor

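Official production deployment. Merging this PR triggers the Deploy workflow (environment=production).

Contents of this release (develop is 32 commits ahead of master)

Comment spam prevention (main focus of this batch)

Other team features (shipping to prod together)

Post-deployment to-dos (ops)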

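  1. Set MATTERS_COMMENT_SPAM_AUTO_COLLAPSE=true (only needed to enable automatic comment collapsing; when off, comments are scored but not collapsed).
  2. Confirm the db-migration succeeded (report_reason constraint).
  3. The Telegram report/cleanup notification worker is built as part of the prod deployment (SSM already holds the report-alert queue + bot token + chat id) → notifications go live.
  4. Acceptance: the "spam comment probability" in the OSS comment list is now produced by the comment model; L2 starts writing to the S3 training bucket; the coastguard bot keeps cleaning up as usual (the bot is independent and unaffected by this deployment).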

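⚠️ Note

  • This PR ships all of develop to prod, including the other team features above + the DB migration; please confirm each feature is ready and the release timing is OK before merging.
  • The coastguard bot (comment cleanup) is a standalone GitHub Actions workflow already running in prod; it does not depend on this deployment.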

🤖 Generated with Claude Code

mashbean and others added 30 commits June 4, 2026 11:17
Comments currently share the short-content (moment) spam model. Add a dedicated
commentSpamDetectionApiUrl (MATTERS_COMMENT_SPAM_DETECTION_API_URL) so comments
score against the comment-specific model (e5-small + logreg, recall 0.88 vs 0.68
on unseen templates). Falls back to the short-content URL when unset, so
behaviour is unchanged until ops points it at the new endpoint.
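
A minimal sketch of the fallback described above, not the repository's actual code. The `SpamEndpoints` shape and the `resolveCommentSpamUrl` helper are hypothetical names, and treating a blank value as "unset" is an assumption.

```typescript
// Hypothetical config shape; commentUrl would be populated from
// MATTERS_COMMENT_SPAM_DETECTION_API_URL in a real deployment.
interface SpamEndpoints {
  shortContentUrl: string
  commentUrl?: string
}

// Prefer the dedicated comment endpoint; fall back to the shared
// short-content endpoint when the dedicated one is unset or blank.
const resolveCommentSpamUrl = (cfg: SpamEndpoints): string => {
  const dedicated = cfg.commentUrl?.trim()
  return dedicated ? dedicated : cfg.shortContentUrl
}
```

Because the fallback is resolved per call, deploying the code changes nothing until the new variable is set.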

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
feat(comment): dedicated comment spam model endpoint
…ssion board

- new COMMENT_TYPE campaignDiscussion bound to campaign via targetId/targetTypeId
- putComment: accept campaignId; only succeeded participants (campaign_user.state)
  or campaign organizers/managers may comment; cap content at 240 chars
- WritingChallenge.discussion / discussionCount: public read resolvers
- Comment.node resolves campaign comments to WritingChallenge
- campaignService.isParticipant helper

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- new quote table (content, article_id, campaign_id, user_id, state) +
  entity_type row; soft-delete states active/archived/banned
- putQuote mutation with full rule set:
  * quote must be an excerpt of the article (primary anti-abuse: no free
    typing; whitespace-normalized substring check against article content)
  * 80-char cap; license gate (ARR -> author only)
  * campaign-scoped: only campaign articles can be quoted onto the wall
  * caps: 5/day per user, 2 per article per user, exact-duplicate rejection
- deleteQuote (retraction): poster, source article author, or admin;
  soft delete, daily quota not refunded
- WritingChallenge.quotes (random sampling for shuffle) + quoteCount
- Quote type resolvers (article / poster via dataloaders)
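
The excerpt rule above could be sketched roughly as follows. This is an illustration under assumptions: `isValidExcerpt` and `normalizeWhitespace` are hypothetical names, the article content is assumed to already be plain text, and the 80-char cap is counted in UTF-16 code units, which may differ from how the real resolver counts.

```typescript
const MAX_QUOTE_LENGTH = 80

// Collapse every whitespace run to a single space and trim the ends.
const normalizeWhitespace = (text: string): string =>
  text.replace(/\s+/g, ' ').trim()

// A quote is accepted only if it is non-empty, within the cap, and
// appears verbatim (modulo whitespace) in the article text.
const isValidExcerpt = (quote: string, articleText: string): boolean => {
  const normalizedQuote = normalizeWhitespace(quote)
  if (normalizedQuote.length === 0) return false
  if (normalizedQuote.length > MAX_QUOTE_LENGTH) return false
  return normalizeWhitespace(articleText).includes(normalizedQuote)
}
```

Normalizing both sides means a quote copied across a line break in the article still matches.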

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- run codegen so schema.graphql / schema.d.ts include the new
  campaignDiscussion comment type, campaign discussion field and
  CommentInput.campaignId (CI relies on committed generated types)
- putComment: guard article/moment notification blocks with targetAuthor
  (now string|undefined since campaign discussions have no single author)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- codegen: add Quote mapper so resolver parent is the DB model
- regenerate schema.graphql / schema.d.ts (quote, campaignDiscussion)
- putComment: guard article/moment notification blocks with targetAuthor
- lint:fix import order in resolver index files

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…nd id validation

- add partial unique index on quote (user_id, article_id, content) for
  active quotes as a concurrency backstop, and surface PG unique
  violation as UserInputError in putQuote
- order campaign_article lookup by created_at desc so attribution is
  deterministic when an article belongs to multiple campaigns
- validate fromGlobalId type in putQuote and deleteQuote

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…archived campaign

- commentService.upvote & unvoteComment: add explicit campaignDiscussion
  branch with participant/organizer permission, skip blocked check (no
  single target author) instead of falling through to circle
- deleteComment: invalidate the Campaign node (not Circle) and allow
  campaign creator/organizers/managers to delete discussion comments
- togglePinComment: throw ForbiddenError for campaignDiscussion instead of
  loading it as a circle
- putComment: reject commenting on archived campaigns
- discussionCount: count via commentService.count (active + collapsed) to
  match the public discussion list

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…uote

- add `after` cursor to QuotesInput and wire the quotes resolver to the
  shared offset pagination util; `after` is ignored under `random`
- make deleteQuote idempotent: an already-retracted quote returns success,
  with the permission check kept ahead of the idempotent path so it cannot
  be used to probe quote existence
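
One way to picture the ordering of the permission check and the idempotent path is the sketch below. All names (`QuoteRow`, `Viewer`, `retractQuote`) are hypothetical, the `banned` state is left out for brevity, and a plain `Error` stands in for whatever error type the resolver really throws.

```typescript
interface QuoteRow {
  id: string
  userId: string // poster
  articleAuthorId: string // author of the quoted article
  state: 'active' | 'archived'
}

interface Viewer {
  id: string
  isAdmin: boolean
}

const retractQuote = (quote: QuoteRow, viewer: Viewer): QuoteRow => {
  const allowed =
    viewer.isAdmin ||
    viewer.id === quote.userId ||
    viewer.id === quote.articleAuthorId
  // Permission first: an unauthorized caller sees the same error whether
  // or not the quote was already retracted, so nothing can be probed.
  if (!allowed) throw new Error('FORBIDDEN')
  // Idempotent path: retracting an already-retracted quote is a no-op success.
  if (quote.state === 'archived') return quote
  return { ...quote, state: 'archived' }
}
```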

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Comments are scored by the dedicated comment spam model (#4838) but nothing
acts on the score yet. Articles already auto-demote via excludeSpam; this adds
the comment equivalent, using the softer 'collapsed' state (folded but still
expandable in-thread) per the 'not deleted, just no longer seen' governance principle.

When MATTERS_COMMENT_SPAM_AUTO_COLLAPSE=true, detectSpam collapses an active
comment whose spamScore reaches the tunable system spam threshold, skipping
authors on the bypassSpamDetection whitelist (same carve-out as articles).
Default off → scoring stays observe-only until ops opts in (zero-downtime,
same rollout pattern as #4838). Collapse is reversible; no deletion.
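
The decision described above reduces to a small predicate. The sketch below is illustrative only: `CollapseInput` and `shouldAutoCollapse` are hypothetical names, and the real `detectSpam` may read these values from different places.

```typescript
interface CollapseInput {
  autoCollapseEnabled: boolean // MATTERS_COMMENT_SPAM_AUTO_COLLAPSE=true
  commentState: 'active' | 'collapsed' | 'archived' | 'banned'
  spamScore: number | null // null when the comment has not been scored
  threshold: number // the tunable system spam threshold
  authorBypassesSpamDetection: boolean // bypassSpamDetection whitelist
}

// Collapse only active, scored comments from non-whitelisted authors
// whose score reaches the threshold, and only when the flag is on.
const shouldAutoCollapse = (input: CollapseInput): boolean =>
  input.autoCollapseEnabled &&
  input.commentState === 'active' &&
  !input.authorBypassesSpamDetection &&
  input.spamScore !== null &&
  input.spamScore >= input.threshold
```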

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add integration tests for the campaign discussion comment feature
covering putComment (participant/organizer/manager permission matrix,
archived-campaign guard), vote/unvote (including the circle fallthrough
fix), deleteComment author/organizer permissions, togglePinComment
forbidden case, and discussion list/discussionCount consistency.

Also add the missing DB migration that extends the comment_type_check
constraint to allow 'campaign_discussion'. Without it, every attempt to
create a campaignDiscussion comment fails at the database layer with a
check-constraint violation, breaking the feature end-to-end. The feat
branch added the code, GraphQL schema, and resolvers for the new type
but never added this migration.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cover the quote-wall feature's highest-regression-risk paths: dedupe,
per-article and daily caps, deleteQuote permission matrix (without
leaking existence of archived quotes), idempotent retraction, and the
campaign quotes query (active-only filtering, after pagination, random).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add error-path coverage for putComment campaignDiscussion: missing
campaignId, non-existing campaign, and over-length content; plus
organizer upvote and non-participant unvote-forbidden cases.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
feat(comment): add campaignDiscussion comment type for campaign discussion board
feat(quote): quote wall data layer for campaigns
The GraphQL ReportReason enum declares community_watch_porn_ad and
community_watch_spam_ad, but report_reason_check (from 20231221154057) was never
updated to permit them. submitReport inserts the raw reason, so any report with
those values failed in production with a report_reason_check violation
(INTERNAL_SERVER_ERROR) — surfaced by the coastguard bot's Tier-1 reports.

Realign the DB constraint with the schema. communityWatchRemoveComment is
unaffected (it syncs illegal_advertising, not these values).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Trigger develop deploy with base-repo secrets (the original fork-PR
merges could not access secrets). Deploys current develop HEAD which
includes #4841 (campaign discussion) and #4842 (quote wall) plus their
migrations.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
fix(report): allow community_watch_* reasons in DB check constraint
chore: redeploy develop to staging (seven-day-book)
L2 of the spam-data-retention roadmap: emit de-identified labeled samples to
SQS at the moderation boundary so the spam-model training signal survives later
content deletion that L1's passive DB extraction can't recover —
clearCommunityWatchOriginalContent nulls the snapshot, and account purge erases
content.

- common/notifications/spamSample.ts: enqueueSpamSample, mirrors
  enqueueReportAlert (best-effort SQS, never throws, no-op when unconfigured).
  Ids are HMAC-SHA256(salt) at emit so no raw user/content ids enter the queue;
  only the text the model trains on is carried verbatim.
- wired: communityWatchRemoveComment (confirmed spam at removal),
  clearCommunityWatchOriginalContent (capture before the snapshot is nulled;
  reversed action -> hard-negative ham).
- env: MATTERS_AWS_SPAM_SAMPLE_QUEUE_URL, MATTERS_SPAM_SAMPLE_HASH_SALT.
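
A rough sketch of the de-identification and best-effort send described above, using Node's built-in `crypto` module. The payload shape and the names `hashId`, `buildSpamSample`, and `enqueueBestEffort` are assumptions for illustration; the `send` callback stands in for the SQS call and is not the project's real helper.

```typescript
import { createHmac } from 'node:crypto'

// Keyed hash: the same id and salt always give the same digest,
// but the raw id never enters the queue.
const hashId = (rawId: string, salt: string): string =>
  createHmac('sha256', salt).update(rawId).digest('hex')

// Hypothetical payload shape.
interface SpamSample {
  contentHash: string
  authorHash: string
  text: string // carried verbatim: this is what the model trains on
  label: 'spam' | 'ham'
  score: number | null
}

interface SampleInput {
  contentId: string
  authorId: string
  text: string
  label: 'spam' | 'ham'
  score: number | null
}

// Returns null (no-op) when the salt is unset or the text is blank.
const buildSpamSample = (
  input: SampleInput,
  salt: string | undefined
): SpamSample | null => {
  if (!salt || input.text.trim() === '') return null
  return {
    contentHash: hashId(input.contentId, salt),
    authorHash: hashId(input.authorId, salt),
    text: input.text,
    label: input.label,
    score: input.score,
  }
}

// Best-effort: a failed send is swallowed so moderation never fails on it.
const enqueueBestEffort = async (
  send: (body: string) => Promise<unknown>,
  sample: SpamSample | null
): Promise<boolean> => {
  if (sample === null) return false
  try {
    await send(JSON.stringify(sample))
    return true
  } catch {
    return false
  }
}
```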

A separate Lambda worker consumes the queue and appends de-identified rows to
the S3 training bucket (see spam-detection-scaffold). Off until ops provisions
the queue + salt.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
CI lint failed: #-subpath/external/node: imports are one alphabetized group
with no blank lines. Reorder spamSample.ts imports accordingly.
…shot)

The clear mutation now snapshots the action before nulling (axis-2 L2), so its
test context must provide findCommunityWatchActionByUUID. enqueueSpamSample
no-ops without queue/salt env, so nothing is sent in tests.
Mirror reportAlert.test.ts: payload shape, HMAC de-identification (ids hashed,
never raw + deterministic), null score for ham, and no-op guards (queue unset /
salt unset / blank text) + AWS-error swallowing. Brings spamSample.ts diff
coverage to green.
Integration tests for the spam auto-collapse path: collapses an active comment
at/above the system threshold, leaves it active below threshold, and skips
bypassSpamDetection-whitelisted authors. Sets the spam_detection feature flag
and toggles commentSpamAutoCollapse around each case. Raises diff coverage.
CI test scripts only run build/{connectors,common/utils,routes,types}; the
common/notifications dir has no script, so the standalone spamSample.test.ts
never ran and spamSample.ts stayed at 38%. Remove that dead test and instead
exercise enqueueSpamSample's full body from communityWatchRemoveComment.test
(common/utils, which IS run): set the queue URL + hash salt, stub
aws.sqsSendMessage, and assert a de-identified sample (hashed ids) is enqueued
on removal.
feat(comment): auto-collapse spam comments behind env flag
…-sample-capture

# Conflicts:
#	.env.example
… project)

spamSample.ts was at 76.9% (lines 66, 84-85 uncovered). Add two removal cases:
aws throws -> removal still succeeds (covers the swallow/catch); removed comment
has blank content -> sample skipped (covers the blank-text guard). Brings the
file to ~full coverage so codecov/project no longer dips.
@mashbean

Contributor Author

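⏸️ Merge on hold / DO NOT MERGE: the 七日書 (seven-day-book) features (#4841 campaign-discussion + #4842 quote-wall) are not ready yet. As decided, we wait until 七日書 is ready and then ship the whole develop→master batch to prod together. Comment spam prevention (#4838/#4843/#4844/#4846) deploys with this batch. Converted to draft to prevent accidental merges. Once ready, switch back to Ready for review and merge.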

@mashbean

Contributor Author

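Switching to plan B (2026-06-15): spam prevention is decoupled from 七日書. Comment spam now goes to prod via the dedicated release #4849 (spam-only, excluding 七日書). This PR (develop→master, including 七日書) stays a draft as the fallback for "ship the whole batch once 七日書 is also ready"; only one of the two paths will be used.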

zeckli and others added 25 commits June 16, 2026 02:14
…velop

chore: back-merge master → develop (dedupe spam release commits)
…ify-only)

A high spam score alone can't separate true spam from false positives: on
matters_prod (7-day, >=0.94 band) precision is only ~60% — escort ads (0.996)
score the same as Chinese-language creative writing (0.992) and short genuine replies.
Account age doesn't separate either (an escort account was 818d old / 883
articles). What cleanly partitions them (ZERO false positives on the real
high-score set) is a compound gate:

  Tier A (auto):   score>=threshold AND contact-channel AND solicitation-keyword
                   → escort / paid-services / account-selling / betting promo.
  Tier B (ring):   author repeats near-identical content across comments.
  Tier C (review): high score but neither → surface to humans, never auto-act
                   (creative writing / opinions / replies land here).

This wires the gate into detectSpam and surfaces all three tiers to the admin
Telegram chat by reusing the existing report-alert SQS → reportTelegramAlert
pipeline (new source 'spam_detection'). NOTIFY-ONLY: it never hides a comment —
auto-action stays behind the separate, still-off commentSpamAutoCollapse flag —
so we validate the gate's precision in production before enabling enforcement.
Gated by MATTERS_COMMENT_SPAM_ALERT (default off).

Signal logic lives in a pure, fully unit-tested module (commentSpamSignals.ts);
the ring check is one bounded read of the author's recent comments, run only for
the rare high-score comments.
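
The compound gate above can be read as a small classifier. This is a sketch under assumptions: `Signals` and `classifyTier` are hypothetical names, the boolean signals are taken as already computed, and checking Tier A before Tier B is an assumed precedence that the message does not spell out.

```typescript
type Tier = 'A' | 'B' | 'C' | null

interface Signals {
  score: number
  threshold: number
  hasContactChannel: boolean
  hasSolicitationKeyword: boolean
  repeatsNearIdenticalContent: boolean // result of the ring check
}

const classifyTier = (s: Signals): Tier => {
  // Below the threshold nothing is surfaced at all.
  if (s.score < s.threshold) return null
  // Tier A: high score AND contact channel AND solicitation keyword.
  if (s.hasContactChannel && s.hasSolicitationKeyword) return 'A'
  // Tier B: the author repeats near-identical content across comments.
  if (s.repeatsNearIdenticalContent) return 'B'
  // Tier C: high score alone; surfaced for human review only.
  return 'C'
}
```

A high-scoring piece of creative writing with no contact channel and no repeated template lands in Tier C, matching the notify-only intent.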

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The ring near-duplicate check only stripped bare digits, so a rotated contact
token (sk3826, abc123) left a letter remnant (sk, abc) and otherwise-identical
spam templates failed to match. Drop whole alphanumeric tokens containing a
digit instead — the IDs/phone numbers spammers rotate — while keeping pure-letter
words so English templates still ring-match. Fixes the two failing ring tests.
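
The normalization described above might look roughly like this. `normalizeForRing` is a hypothetical name, lowercasing is an added assumption, and the real ring check compares for near-identical content rather than the strict equality used in the usage note.

```typescript
// Drop every ASCII alphanumeric run that contains at least one digit
// (rotated IDs and phone numbers), keep pure-letter words, then
// collapse whitespace so templates compare equal.
const normalizeForRing = (text: string): string =>
  text
    .toLowerCase()
    .replace(/[a-z0-9]*\d[a-z0-9]*/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
```

With this, `normalizeForRing('Add me on LINE sk3826')` and `normalizeForRing('Add me on LINE abc123')` produce the same string, whereas stripping bare digits alone would leave `sk` and `abc` behind.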

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…iering

feat(comment-spam): 3-tier moderation alerting to admin Telegram (notify-only)
feat(moments): change the decay factor of the moments feed
…ter-after-4852

# Conflicts:
#	src/connectors/commentService.ts
OSS (oss.matters.*) reuses the existing Google OAuth client to sign admins
in, but the server previously exchanged the auth code with a single hardcoded
redirect_uri (matters-web's), so an OSS-originated login could never complete
(Google requires the token-exchange redirect_uri to match the authorization
request). It also auto-created a Matters account for any Google login, which
is unacceptable for an internal admin tool.

- SocialLoginInput.redirectUri: optional OIDC redirect_uri for OSS SSO.
- exchangeGoogleToken / fetchGoogleUserInfo accept a redirect_uri override.
- socialLogin: when redirectUri is set it must be in the allowlist
  (MATTERS_OSS_GOOGLE_REDIRECT_URIS); the login is then treated as an OSS
  admin login — restricted to an existing admin account (matched by the
  Google-verified email) and never auto-creates a user.
- environment.ossGoogleRedirectUris: comma-separated allowlist.
- test: reject a non-allowlisted redirectUri.
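
A minimal sketch of the allowlist handling described above. `parseAllowlist` and `resolveLoginMode` are hypothetical names, the exact-string match is an assumption, and a plain `Error` stands in for the project's own error type.

```typescript
// Parse a comma-separated allowlist such as the value of
// MATTERS_OSS_GOOGLE_REDIRECT_URIS, ignoring blanks and stray spaces.
const parseAllowlist = (raw: string | undefined): string[] =>
  (raw ?? '')
    .split(',')
    .map((uri) => uri.trim())
    .filter((uri) => uri.length > 0)

type LoginMode = 'regular' | 'ossAdmin'

// No redirectUri: regular login. A redirectUri must be allowlisted, and
// then the login is treated as an OSS admin login (no account creation).
const resolveLoginMode = (
  redirectUri: string | undefined,
  allowlist: string[]
): LoginMode => {
  if (redirectUri === undefined) return 'regular'
  if (!allowlist.includes(redirectUri)) {
    throw new Error('redirectUri is not allowlisted')
  }
  return 'ossAdmin'
}
```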

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
OSS comment/moment lists could only be ordered by id desc and the article
list had no spam-score sort, so admins had no way to surface the most likely
spam across articles/moments/comments.

- ArticlesSort.mostSpam: order scored articles by spam_score desc.
- OSSCommentsInput / OSSMomentsInput: add sort (OSSContentSpamSort) and a
  datetimeRange filter (for the 'last 7 days' window).
- comments/moments resolvers: build sortable/filterable queries via
  connectionFromQuery; commentService.findComments() query builder added.
- mostSpam excludes rows without a spam_score so unscored items don't pollute
  the ranking (and avoids NULLS-first ordering).
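
The ranking rule can be illustrated in memory, although the real change builds a database query. `ScoredRow` and `rankByMostSpam` are hypothetical names used only for this sketch.

```typescript
interface ScoredRow {
  id: string
  spamScore: number | null // null means the item was never scored
}

// Exclude unscored rows first, then order by score descending, so
// nulls can neither pollute the ranking nor sort to the top.
const rankByMostSpam = (rows: ScoredRow[]): ScoredRow[] =>
  rows
    .filter((row) => row.spamScore !== null)
    .sort((a, b) => (b.spamScore ?? 0) - (a.spamScore ?? 0))
```

Filtering before sorting matters because a descending sort in PostgreSQL places NULLs first by default.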

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Raise patch coverage for the OSS social-login branch:
- non-allowlisted redirectUri is rejected
- non-admin / unknown account is rejected (no auto-create)
- existing admin account logs in successfully

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The oss.moments field input changed from ConnectionArgs to OSSMomentsInput;
update the system test query variable accordingly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mock the Google token exchange to cover exchangeGoogleToken using the supplied
OSS redirect_uri (the non-e2e path), raising patch coverage for the SSO change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Exercise the mostSpam ranking and datetimeRange filter paths in the OSS
comment/moment list resolvers (and commentService.findComments) to raise
patch coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(campaign): add enableQuoteWall flag to gate the quote wall

- add `enable_quote_wall` boolean column to campaign (default false),
  with a one-time backfill enabling it for existing 七日書 campaigns
  (name match used once at migration time; data-driven thereafter)
- expose `enableQuoteWall` on WritingChallenge + PutWritingChallengeInput,
  wired through the resolver, campaignService, and putWritingChallenge
- putQuote now rejects posting to a campaign whose quote wall is not
  enabled — server-side authoritative gate (the old check only required
  the article to belong to any campaign)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(quote): set enableQuoteWall on seed campaign; add gate-rejection test

The new putQuote gate rejects campaigns whose quote wall is disabled, so the
existing happy-path test (campaign defaulted to enable_quote_wall=false) broke.
Enable the wall on the seed campaign and add a test covering the rejection.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
feat(auth): support OSS Google SSO via allowlisted redirect_uri
feat(oss): rank content lists by spam score (last-N-days triage)
Relax the campaign-discussion comment permission from "only succeeded
participants (or organizers)" to any logged-in user. Basic user-state and
campaign-state guards are unchanged. Tests updated: non-participant / pending /
rejected applicants can now comment.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Interim: drop the 七日書-only restriction and open the quote wall for all
campaigns. Flip enable_quote_wall default to true and enable it on all existing
campaigns. The per-campaign flag is kept as a future hook (OSS toggle /
七日書-only restriction can be wired to it later without a schema change).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…short-content)

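The shared short-content model wrongly flags ~90% of Chinese-language moments; after retraining, the
dedicated moment model was accepted at @0.7 with 0.8% false positives / 93% recall
(vs. the old 93.9% / 67%). Following #4838, which split out a dedicated comment model:
momentService.detectSpam now uses momentSpamDetectionApiUrl and falls back to the
short-content model when unset → zero-downtime opt-in.
Ops enables it by pointing MATTERS_MOMENT_SPAM_DETECTION_API_URL at the new endpoint.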

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-4852

chore: back-merge master → develop (dedup #4852 prod cherry-picks)
…ndpoint

feat(moment): score moments with a dedicated model (falls back to short-content, zero downtime)
The spam_detection (A/B/C) alerts were the only report-alert source without a
"進 OSS 處理" ("handle in OSS") link, so moderators had to find the comment manually. Add an ossUrl
pointing at the OSS comments triage page with the comment's global id
(`${ossSiteDomain}/comments?id=<globalId>`), matching the link affordance the
direct/community_watch alerts already have. Uses the shared ossSiteDomain (now
correctly `oss.matters.town` in prod / `oss.matters.icu` in dev).
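
A small sketch of how such a deep link could be assembled. `buildOssCommentUrl` is a hypothetical name, the site origin is assumed to already include the scheme, and URL-encoding the global id is an assumption (base64-style ids may contain `=`), not something the message confirms.

```typescript
// Build the OSS triage link for a comment from the site origin
// and the comment's global id.
const buildOssCommentUrl = (ossSiteOrigin: string, globalId: string): string =>
  `${ossSiteOrigin}/comments?id=${encodeURIComponent(globalId)}`
```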

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(agent): add the basic rules for different models
…link

feat(comment-spam): add OSS deep-link to the 3-tier Telegram alerts
3 participants