Skip to content

Reduce false positives: RN JSX, Platform.OS branches, NSPrivacyTracking correlation#4

Closed
kcck wants to merge 1 commit into
RevylAI:mainfrom
kcck:reduce-false-positives-rn-jsx-platform
Closed

Reduce false positives: RN JSX, Platform.OS branches, NSPrivacyTracking correlation#4
kcck wants to merge 1 commit into
RevylAI:mainfrom
kcck:reduce-false-positives-rn-jsx-platform

Conversation

@kcck
Copy link
Copy Markdown

@kcck kcck commented May 27, 2026

Continuing the false-positive reduction work from #1 (Amplitude, placeholder, HTTP namespace URIs). Five targeted fixes for over-broad patterns that fire on idiomatic React Native / Expo code.

What this changes

File Change
internal/privacy/scanner.go Tighten 6 SDK-detection regexes with word boundaries; fix NSPrivacyTracking false-correlation bug
internal/codescan/rules.go Add platform-reference ignorePatterns for RN/Expo idioms; add broader placeholder-content ignorePattern for JSX props; add JSX comment syntax ({/* */}) to comment-line skip

52 insertions, 8 deletions across the 2 files.

1. Tighten SDK-detection regexes (internal/privacy/scanner.go:101-111)

Patterns like (?i)(unity.*ads|UnityAds) are unanchored greedy substring matches. They fire on any single-line content that has unity somewhere followed by ads somewhere later. Real-world trigger from a production Expo app:

// Notify the community screen so its vote-board reloads.

matches as commUNITY ... rel-OADS and surfaces as a CRITICAL Unity Ads tracking SDK detected finding even though the project has zero Unity SDKs (verified: package.json, pnpm-lock.yaml, Podfile.lock all clean).

Fix: anchor with \b...\b and allow a small set of separator chars between the words:

{regexp.MustCompile(`(?i)(\bunity[\s._-]?ads\b|UnityAds)`), "Unity Ads"},
{regexp.MustCompile(`(?i)(\bgoogle[\s._-]?ads\b|GADMobileAds|admob)`), "Google Ads/AdMob"},
{regexp.MustCompile(`(?i)\bfirebase[\s._-]?analytics\b`), "Firebase Analytics"},
{regexp.MustCompile(`(?i)\bgoogle[\s._-]?analytics\b`), "Google Analytics"},
{regexp.MustCompile(`(?i)(fbsdk|\bfacebook[\s._-]?sdk\b)`), "Facebook SDK"},
{regexp.MustCompile(`(?i)\badjust[\s._-]?sdk\b`), "Adjust SDK"},

Word boundaries keep the SDK detection working for the real cases (UnityAds.framework, unity-ads, unity_ads, unity.ads, unity ads) while stopping the substring-match false positives.

2. Add platform-reference ignorePatterns (internal/codescan/rules.go:158-172)

The Reference to competing platform rule has a bare-word pattern \b(android|google\s*play|...)\b intentionally broad to catch JSX text nodes. But the same pattern fires on:

  • Platform.OS === 'android' (React Native runtime check)
  • Platform.select({ android: ..., ios: ... }) (RN platform helper)
  • { android: { ... } } (Expo app.json / app.config.js config object key)
  • 'android': ... (object literal key in quotes)
  • .android (property access: config.android, expo.android, os.android)
  • '.android.' (platform-extension file references)
  • os: Platform.OS (telemetry tag assignment)

None of these are user-visible references to a competing platform — they're internal platform branching that React Native / Expo apps cannot avoid. Added ignorePatterns covering each.

3. Add JSX placeholder-prop ignorePattern (internal/codescan/rules.go:186-188)

React Native (and any JSX UI library) uses placeholder="..." / placeholder={hint} as the standard TextInput hint prop. The existing Swift-targeted ignore placeholder\s*[:=]\s*[A-Z] expects an uppercase letter immediately after =, but JSX wraps values in { braces, and RN prop values are normally camelCase locals — both break the ignore.

Added a broader placeholder\s*[:=] ignore covering every JSX/JS placeholder-prop call site. True content matches (<Text>This is a placeholder.</Text>, string values that literally contain "placeholder", template literals containing the word, etc.) are still caught by the content-pattern regexes.

4. Fix NSPrivacyTracking false correlation (internal/privacy/scanner.go:258)

The current check:

strings.Contains(privacyContent, "NSPrivacyTracking") &&
strings.Contains(privacyContent, "<true/>")

does two independent substring lookups, so any manifest that legitimately has NSPrivacyTracking <false/> alongside NSPrivacyCollectedDataTypeLinked <true/> entries (the standard pattern when declaring data collection per ASC's "App Privacy" section — every Apple-compliant app that declares any user-linked data type does this) fires the INFO finding incorrectly.

Replaced with a regex that binds the key to its value:

var nsPrivacyTrackingTrueRe = regexp.MustCompile(`<key>\s*NSPrivacyTracking\s*</key>\s*<true/>`)

5. Skip JSX comment syntax {/* ... */}

The line-skip in PatternRule.Check trims and checks for //, /*, * prefixes. JSX comments use {/* ... */} — the canonical React syntax for inline comments inside JSX markup. Added that prefix to the skip list.

Verification

Tested against a React Native / Expo SDK 55 production codebase (~30k symbols, ~6900 communities by graphify count).

Before patches:

NOT READY — 1 critical issue(s) must be fixed
4 findings: 1 critical  3 warn

After patches:

GREENLIT — no critical issues found
2 findings: 2 warn

The 2 remaining warns are real source-string matches (a literal Google Play inside a string template, and the literal word placeholder inside a logger warning template) — those are correct Greenlight findings that need source-level fixes in the consumer codebase, not regex tweaks here.

False-negative check

No new false negatives introduced:

  • SDK detection still fires on properly-formed SDK references (UnityAds, unity-ads, unity_ads, unity.ads, unity ads)
  • The platform-reference rule still catches user-visible JSX text and string-literal mentions of competing platforms
  • The placeholder-content rule still catches strings containing placeholder / lorem ipsum / todo / coming soon / under construction / tbd
  • The privacy manifest tracking-vs-SDK check still fires correctly when NSPrivacyTracking is genuinely <true/>
  • JSX comment skip only triggers on lines that start with {/* after trimming — code lines with embedded comments still get scanned normally

🤖 Generated with Claude Code

…vacyTracking correlation

This continues the false-positive reduction work from PR #1 (Amplitude,
placeholder, HTTP namespace URIs). Five targeted fixes for over-broad
patterns that fire on idiomatic React Native / Expo code.

## 1. Tighten SDK-detection regexes with word boundaries

Patterns like `(?i)(unity.*ads|UnityAds)` are unanchored greedy substring
matches. They fire on any single-line content that has `unity` somewhere
followed by `ads` somewhere later. Example: a code comment like

  // Notify the community screen so its vote-board reloads.

matches as `commUNITY ... rel-OADS` and surfaces as a CRITICAL "Unity Ads
tracking SDK detected" finding even though the project has zero Unity SDKs.

Fix: anchor with `\b...\b` and allow a small set of separator chars in
between:

  (?i)(\bunity[\s._-]?ads\b|UnityAds)
  (?i)(\bgoogle[\s._-]?ads\b|GADMobileAds|admob)
  (?i)\bfirebase[\s._-]?analytics\b
  (?i)\bgoogle[\s._-]?analytics\b
  (?i)(fbsdk|\bfacebook[\s._-]?sdk\b)
  (?i)\badjust[\s._-]?sdk\b

Word boundaries keep the SDK detection working for the real cases
(`UnityAds.framework`, `unity-ads`, `unity_ads`, `unity.ads`, `unity ads`)
while stopping the substring-match false positives.

## 2. Add platform-reference ignorePatterns for RN/Expo idioms

The `Reference to competing platform` rule has a bare-word pattern
`\b(android|google\s*play|...)\b` that's intentionally broad to catch
JSX text nodes. But the same pattern fires on:

  - Platform.OS === 'android' (React Native runtime check)
  - Platform.select({ android: ..., ios: ... }) (RN platform helper)
  - { android: { ... } } (Expo config object key)
  - 'android': ... (object literal key)
  - .android (property access: config.android, expo.android)
  - '.android.' (platform-extension file references)
  - os: Platform.OS (telemetry tag assignment)

None of these are user-visible references to a competing platform — they're
internal platform branching that React Native / Expo apps cannot avoid.
Added ignorePatterns covering each of these idioms.

## 3. Add placeholder-content ignorePattern for JSX placeholder= props

React Native (and any JSX UI library) uses `placeholder="..."` /
`placeholder={hint}` as the standard TextInput hint prop. The existing
Swift-targeted ignore `placeholder\s*[:=]\s*[A-Z]` expects an uppercase
letter immediately after `=`, but JSX wraps values in `{` braces, and RN
prop values are normally camelCase locals — both break the ignore.

Added a broader `placeholder\s*[:=]` ignore covering every JSX/JS
placeholder-prop call site. True content matches (`<Text>This is a
placeholder.</Text>`, string values that literally contain "placeholder",
template literals containing the word, etc.) are still caught by the
content-pattern regexes.

## 4. Fix NSPrivacyTracking false correlation

The current check:

    strings.Contains(privacyContent, "NSPrivacyTracking") &&
    strings.Contains(privacyContent, "<true/>")

does two independent substring lookups, so any manifest that legitimately
has NSPrivacyTracking <false/> alongside NSPrivacyCollectedDataTypeLinked
<true/> entries (the standard pattern when declaring data collection per
ASC's "App Privacy" section — every Apple-compliant app does this) fires
the INFO finding incorrectly.

Replaced with a regex that binds the key to its value:

    <key>\s*NSPrivacyTracking\s*</key>\s*<true/>

## 5. Skip JSX comment syntax `{/* … */}`

The line-skip in PatternRule.Check trims and checks for `//`, `/*`, `*`
prefixes. JSX comments use `{/* … */}` — the canonical React syntax for
inline comments inside JSX markup. Added that prefix to the skip list so
bare-word matches inside JSX comments don't trip rules that otherwise
already skip line comments.

## Verification

Tested against a React Native / Expo SDK 55 production codebase
(~30k symbols, ~6900 communities by graphify count). Before patches:

  NOT READY — 1 critical issue(s) must be fixed
  4 findings: 1 critical  3 warn

After patches:

  GREENLIT — no critical issues found
  2 findings: 2 warn

The 2 remaining warns are real source-string matches (a literal "Google
Play" inside a string template, and the literal word "placeholder" inside
a logger warning template) — those are correct Greenlight findings that
need source-level fixes in the consumer codebase, not regex tweaks here.

No new false negatives introduced: the SDK detection still fires on
properly-formed SDK references (`UnityAds`, `unity-ads`, `unity_ads`,
`unity.ads`), the platform-reference rule still catches user-visible JSX
text and string-literal mentions, the placeholder-content rule still
catches strings containing "placeholder"/"lorem ipsum"/"todo"/etc., and
the privacy manifest tracking-vs-SDK check still fires correctly when
NSPrivacyTracking is genuinely `<true/>`.
@kcck kcck closed this by deleting the head repository May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant