feat(comment-spam): 3-tier moderation alerting to admin Telegram (notify-only) by mashbean · Pull Request #4851 · thematters/matters-server

mashbean · 2026-06-16T10:15:37Z

What

Wires a 3-tier spam classification into detectSpam and surfaces all three tiers to the admin Telegram chat, reusing the existing report-alert SQS → reportTelegramAlert worker (new alert source spam_detection).

Notify-only. This never hides a comment — auto-action stays behind the separate, still-off commentSpamAutoCollapse. Gated by MATTERS_COMMENT_SPAM_ALERT (default off), so it's a no-op until ops opts in.
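The notify-only contract can be illustrated with a minimal sketch. Only the flag name MATTERS_COMMENT_SPAM_ALERT and the alert source spam_detection come from this PR; the `SpamAlert` shape, the `alertIfEnabled` helper, the injected `enqueue` callback, and the `'true'` string comparison are assumptions made for illustration, not the actual implementation.

```typescript
// Hypothetical alert shape; only the 'spam_detection' source name is from the PR.
type SpamAlert = { source: 'spam_detection'; commentId: string; score: number }

type Env = Record<string, string | undefined>

// Assumption: the flag is enabled by the literal string 'true'.
// The real flag parsing in matters-server may differ.
const isSpamAlertEnabled = (env: Env): boolean =>
  env.MATTERS_COMMENT_SPAM_ALERT === 'true'

// Enqueue an alert only when the flag is on. Nothing here reads or writes
// comment state, which is what "notify-only" means in this sketch.
const alertIfEnabled = (
  env: Env,
  enqueue: (alert: SpamAlert) => void,
  commentId: string,
  score: number
): boolean => {
  if (!isSpamAlertEnabled(env)) {
    return false // default off: a no-op until ops opts in
  }
  enqueue({ source: 'spam_detection', commentId, score })
  return true
}
```

Passing the environment and the enqueue function as parameters keeps the gate testable without touching a real queue.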

Why

A high spam score alone can't separate true spam from false positives. On matters_prod (7-day window, spamScore ≥ 0.94 band, the live system threshold) precision is only ~60%:

Escort ads (0.996) score nearly the same as Chinese-language creative writing (0.992), short genuine replies, and opinion comments.
Account age doesn't separate them either: one escort account was 818 days old with 883 articles.

What cleanly partitions them (zero false positives on the real high-score set) is a compound gate:

Tier	Rule	Catches
A — auto	score≥threshold ∧ contact-channel ∧ solicitation-keyword	escort / paid-services / account-selling / betting promo
B — ring	author repeats near-identical content (≥3 within 30d)	templated link-builder spam
C — review	high score, neither A nor B	creative writing / opinions / replies → human confirms, never auto-acted
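The table can be read as a small decision function. The sketch below is an illustration under stated assumptions: the `Signals` interface, the `classifyTier` name, and the A-before-B precedence are hypothetical, and the boolean signals stand in for the regex and ring checks that live in commentSpamSignals.ts. The 0.94 default is the live threshold quoted above.

```typescript
// Tier result; null means the score is below threshold and nothing is alerted.
type Tier = 'A' | 'B' | 'C' | null

// Hypothetical bundle of precomputed signals for one comment.
interface Signals {
  score: number // spam score from the classifier
  hasContactChannel: boolean // e.g. a messaging handle or phone number
  hasSolicitationKeyword: boolean // e.g. paid-services wording
  isAuthorRepeating: boolean // ring check: near-identical recent comments
}

const classifyTier = (s: Signals, threshold = 0.94): Tier => {
  // All tiers apply only to high-score comments.
  if (s.score < threshold) return null
  // Tier A: compound gate of contact channel AND solicitation keyword.
  if (s.hasContactChannel && s.hasSolicitationKeyword) return 'A'
  // Tier B: templated repeats by the same author.
  // Assumption: A is checked before B when both would match.
  if (s.isAuthorRepeating) return 'B'
  // Tier C: high score but neither A nor B; goes to human review.
  return 'C'
}
```

A high score with only one of the two Tier A signals falls through to B or C, which is the point of the compound gate.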

How

commentSpamSignals.ts — pure, fully unit-tested gate (regexes + char-3gram Jaccard ring helpers). Validated against the real prod high-score set.
commentService.detectSpam → _alertSpamIfHighScore (tier + enqueue) and _isAuthorRepeating (one bounded read of the author's recent comments, only for the rare high-score comments).
reportTelegramAlert worker: new spam_detection source + per-tier reason labels.
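For readers unfamiliar with the ring helpers, here is a minimal sketch of character-3gram Jaccard similarity and a repeat check built on it. The function names, the 0.8 similarity cutoff, and the choice to count the current comment toward the minimum of 3 are assumptions for illustration; the 30-day window is assumed to be applied by the caller when fetching the author's recent comments.

```typescript
// Build the set of character 3-grams of a string.
const charTrigrams = (text: string): Set<string> => {
  // Array.from iterates by code point, so CJK text and emoji are not split.
  const chars = Array.from(text)
  const grams = new Set<string>()
  for (let i = 0; i + 3 <= chars.length; i++) {
    grams.add(chars.slice(i, i + 3).join(''))
  }
  return grams
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
const jaccard = (a: Set<string>, b: Set<string>): number => {
  // Assumption: two empty sets are treated as non-matching.
  if (a.size === 0 && b.size === 0) return 0
  let intersection = 0
  a.forEach((g) => {
    if (b.has(g)) intersection++
  })
  return intersection / (a.size + b.size - intersection)
}

// True when the current comment plus its near-duplicates among the
// author's recent comments reach minCount.
const isAuthorRepeating = (
  current: string,
  recent: string[],
  similarity = 0.8, // hypothetical cutoff
  minCount = 3
): boolean => {
  const cur = charTrigrams(current)
  const matches = recent.filter(
    (r) => jaccard(cur, charTrigrams(r)) >= similarity
  ).length
  return matches + 1 >= minCount
}
```

Character n-grams need no word segmentation, which suits mixed Chinese and English comments.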

Rollout

Ship behind the flag → turn on MATTERS_COMMENT_SPAM_ALERT in prod → watch the Telegram feed to confirm Tier A stays clean → only then consider enabling auto-action.

🤖 Generated with Claude Code

…ify-only)

A high spam score alone can't separate true spam from false positives: on matters_prod (7-day, >=0.94 band) precision is only ~60%. Escort ads (0.996) score nearly the same as Chinese-language creative writing (0.992) and short genuine replies. Account age doesn't separate them either (an escort account was 818d old / 883 articles).

What cleanly partitions them (ZERO false positives on the real high-score set) is a compound gate:

Tier A (auto): score>=threshold AND contact-channel AND solicitation-keyword → escort / paid-services / account-selling / betting promo.
Tier B (ring): author repeats near-identical content across comments.
Tier C (review): high score but neither → surface to humans, never auto-act (creative writing / opinions / replies land here).

This wires the gate into detectSpam and surfaces all three tiers to the admin Telegram chat by reusing the existing report-alert SQS → reportTelegramAlert pipeline (new source 'spam_detection').

NOTIFY-ONLY: it never hides a comment. Auto-action stays behind the separate, still-off commentSpamAutoCollapse flag, so we validate the gate's precision in production before enabling enforcement. Gated by MATTERS_COMMENT_SPAM_ALERT (default off).

Signal logic lives in a pure, fully unit-tested module (commentSpamSignals.ts); the ring check is one bounded read of the author's recent comments, run only for the rare high-score comments.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The ring near-duplicate check only stripped bare digits, so a rotated contact token (sk3826, abc123) left a letter remnant (sk, abc) and otherwise-identical spam templates failed to match.

Drop whole alphanumeric tokens containing a digit instead (the IDs and phone numbers spammers rotate), while keeping pure-letter words so English templates still ring-match. Fixes the two failing ring tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
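The difference between the two normalizations can be sketched as follows. Both function names and both regexes are assumptions that reconstruct the behavior described in this commit message; they are not copied from commentSpamSignals.ts.

```typescript
// Old behavior as described: strip only the digits, leaving letter remnants.
const stripDigitsOnly = (text: string): string =>
  text.replace(/\d+/g, '').replace(/\s+/g, ' ').trim()

// New behavior as described: drop every run of ASCII letters/digits that
// contains at least one digit, and keep pure-letter words untouched.
const normalizeForRing = (text: string): string =>
  text
    .replace(/[a-z0-9]*\d[a-z0-9]*/gi, '')
    .replace(/\s+/g, ' ') // collapse the gaps left by removed tokens
    .trim()

const a = 'contact sk3826 for service'
const b = 'contact abc123 for service'

// Digits-only stripping leaves 'sk' vs 'abc', so the templates differ.
console.log(stripDigitsOnly(a) === stripDigitsOnly(b))
// Whole-token dropping makes the two templates identical.
console.log(normalizeForRing(a) === normalizeForRing(b))
```

Because the pattern matches runs rather than whitespace-delimited words, a rotated ID embedded directly in unspaced Chinese text is removed as well.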

codecov · 2026-06-16T10:33:44Z

Codecov Report

❌ Patch coverage is 94.20290% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.68%. Comparing base (571c047) to head (b12c8b7).
⚠️ Report is 19 commits behind head on develop.

Files with missing lines	Patch %	Lines
src/connectors/commentService.ts	86.66%	3 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4851      +/-   ##
===========================================
+ Coverage    72.59%   72.68%   +0.08%     
===========================================
  Files         1067     1068       +1     
  Lines        21178    21246      +68     
  Branches      4623     4641      +18     
===========================================
+ Hits         15374    15442      +68     
+ Misses        5326     5325       -1     
- Partials       478      479       +1

mashbean requested a review from a team as a code owner June 16, 2026 10:15

mashbean merged commit ff85c5a into develop Jun 16, 2026
5 checks passed

This was referenced Jun 16, 2026

deploy(comment-spam): 3-tier telegram alerting to prod (notify-only, curated) #4852

Merged

chore: back-merge master → develop (dedup #4852 prod cherry-picks) #4854

Merged

mashbean merged 2 commits into develop from feat/comment-spam-telegram-tiering
