BingeWatcher Security

The honest answer to two questions:

  1. What does this tool actually protect me from? (Part 1 — threat model)
  2. What did we deliberately NOT implement, and why? (Part 2 — intentional gaps)

The privacy stack itself (every preference, every cloak, every shutdown wipe) is described layer-by-layer in privacy.md. This document is the level above that: how the stack maps to specific adversaries, and where the gaps are.

For responsible-disclosure / vulnerability reports, see the repo-root ../SECURITY.md.

Security-item IDs (S<n>) below are stable across the codebase — in-code comments like # T5/S22 in bw/hardened_profile.py point straight here.


Part 1 — Threat Model

What this tool protects against, per adversary class.

Status legend

Symbol Meaning
✅ covered Working defense, zero user action recommended
🟡 partial Some leakage remains; we mitigate but don't fully solve
⚠ opt-in Defense exists but is off by default (UX cost / niche use)
❌ not covered Out of scope; user must accept or mitigate elsewhere

Adversary classes (12)

1. ISP / network observer (passive) — between user and Tor guard

Status Detail
✅ covered (traffic content) Every byte to streaming sites + hosters goes through Tor. The ISP sees an encrypted Tor connection, not the URL or content.
🟡 partial (Tor usage itself) "User is talking to Tor" is visible. Hide via obfs4 / Snowflake bridges — opt-in via bridges.txt (T2).
🟡 partial (timing fingerprint) An ISP that records packet sizes + timings can sometimes correlate a Tor session to a known streaming-site signature (Schuster et al. 2017). We set ConnectionPadding 1 in torrc (cover traffic), which raises the bar but does not remove the signal.

Implementation: Always-on Tor (Phase 0), Snowflake/obfs4 bridges (T2/S5), padding (S4 trade-off documented in code).
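
As an illustration of how the padding and bridge settings named above could be combined, here is a minimal sketch that renders a torrc fragment. The function name `render_torrc` and the one-bridge-per-line format assumed for bridges.txt are hypothetical; ConnectionPadding, UseBridges, and Bridge are standard torrc options. A pluggable transport such as obfs4 or Snowflake also needs a ClientTransportPlugin line whose binary path depends on the installation, so that line is left out here.

```python
def render_torrc(bridge_lines=()):
    """Return a torrc fragment with padding on and optional bridges.

    bridge_lines: iterable of bridge descriptors, one per entry
    (assumed format of bridges.txt; blank lines and '#' comments are skipped).
    """
    lines = ["ConnectionPadding 1"]
    bridges = [
        b.strip()
        for b in bridge_lines
        if b.strip() and not b.strip().startswith("#")
    ]
    if bridges:
        # Bridges are only enabled when at least one descriptor is present.
        lines.append("UseBridges 1")
        lines.extend(f"Bridge {b}" for b in bridges)
    return "\n".join(lines) + "\n"
```

Called without arguments, the function emits only the padding line, which matches the default-off status of bridges described in the table.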


2. Streaming-Provider's own analytics — aniworld.to, s.to, filmpalast.to

Status Detail
✅ covered (IP) Provider sees a Tor exit IP, never the user's WAN IP. The exception is s.to, which bypasses Tor by default (see the s.to row below).
✅ covered (cookie tracking) First-party-isolated cookies (dFPI, FPI) + storage wiped between sessions. No cross-site cookies.
✅ covered (canvas / WebGL fingerprint) Firefox RFP + our Canvas/WebGL/Audio cloaks (Phase 2).
✅ covered (navigator.webdriver) Hidden via Phase 3 navigator cloak.
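⚠ opt-in (s.to exception) s.to bypasses Tor by default because Cloudflare Turnstile rejects every Tor exit IP, so s.to traffic is not Tor-protected unless the user opts in. The user sees a one-time modal explaining this. Set BW_STO_USE_TOR=1 to turn the bypass off and route s.to through Tor.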

Caveats: aniworld and s.to share enough back-end infrastructure that an active adversary could deanonymise via timing if they really wanted to. Tor + RFP raise the bar to "they'd have to try". Not a defense against a determined provider-side adversary.
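
The s.to exception described in the table can be pictured as a small routing decision. The sketch below is an assumption about the shape of that check, not the project's code: `routes_through_tor` is a hypothetical name, and only the BW_STO_USE_TOR variable and its value 1 come from this document.

```python
import os


def routes_through_tor(hostname, env=None):
    """Return True if traffic to hostname should use the Tor SOCKS proxy.

    Every host uses Tor except s.to (and its subdomains), which only
    uses Tor when BW_STO_USE_TOR is set to "1".
    """
    env = os.environ if env is None else env
    host = hostname.lower().rstrip(".")
    if host == "s.to" or host.endswith(".s.to"):
        return env.get("BW_STO_USE_TOR") == "1"
    return True
```

Matching on the exact host or a `.s.to` suffix avoids accidentally exempting unrelated domains that merely end in the same letters.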


3. Hoster / CDN side — VOE, Doodstream, VEEV, VIDARA, etc.

Status Detail
✅ covered (IP) Same Tor coverage. Hoster sees a Tor exit IP.
✅ covered (AdBlock-detect) Our adblock cloak lies about ad-bait elements + short-circuits ad-network fetches.
🟡 partial (geo-block) Some hosters geo-block by IP. If the Tor exit happens to be in a blocked country, the user can rotate manually via the Tor pill. We don't bias exit selection.
🟡 partial (HLS-chunk timing) Streaming session is fingerprintable by burst pattern (S15 — research-grade attack, not implemented; see Part 2).
❌ not covered (Cloudflare Turnstile) Turnstile rejects most Tor exits. On hosters protected by Turnstile, the user must solve a challenge manually. We don't bypass it (would defeat the point).

4. Cloudflare — fronts many streaming sites

Status Detail
🟡 partial Cloudflare sees the Tor exit IP + TLS-fingerprint. Our TLS 1.3-only stack + RFP normalise some signals; per-Tor-exit IP-reputation still differs widely. Best mitigation is exit rotation (Tor pill click).
❌ not covered (Turnstile challenge) Same as above.

5. Tor exit operator — sees decrypted HTTPS metadata

Status Detail
✅ covered (HTTPS content) Streaming hosters all use HTTPS; the exit sees only the encrypted tunnel and the destination hostname (from SNI).
✅ covered (OCSP leakage) OCSP stapling-only mode (S12); no per-connection cert-hash leak.
✅ covered (CRL fetching) security.use_external_pki=false (S24); no plaintext CRL pulls.
✅ covered (TLS-session-tickets) Disabled (S22) so the exit can't cross-link sessions via resumed-ticket IDs.
🟡 partial (per-domain Tor circuit) One Tor circuit per session, not per domain. An exit sees aniworld + its hoster iframe on the same circuit. dFPI (S23) is the cookie-level mitigation; pure circuit isolation per origin requires per-origin SOCKS auth (not implemented; cost > benefit for typical streaming).

6. Microsoft — Windows telemetry, NCSI, time-sync

Status Detail
✅ covered (with apply) The OS-Hardening wizard (Apply-OsHardening.ps1) disables DiagTrack, WER, Cortana-cloud, Defender-cloud-MAPS, NCSI active-probing (S25), Activity History, Inking telemetry.
✅ covered (hosts-file block) Block-TelemetryHosts.ps1 (S13) blackholes 20 MS telemetry / time-sync hosts via the hosts file.
⚠ opt-in The wizard needs Administrator. Default fresh-clone user has telemetry ON; the bot displays a banner pointing at the wizard. Most users don't run it.
🟡 partial (Defender on-access scanning) Stays on — that's anti-malware, not telemetry, and disabling it is a worse trade-off.

7. Local malware / co-process on the same Windows account

Status Detail
✅ covered (env-var leak) BW_* env vars (especially BW_FORENSIK_STASH absolute path) are popped from os.environ before any child process spawns (S31). Snapshot stays in-process for our own reads.
✅ covered (cmdline-arg leak) Stash path moved from env-var to settings-file route (S32). tasklist /v shows no BingeWatcher state paths.
🟡 partial (Marionette hijack) Marionette listens on 127.0.0.1:. Firefox/geckodriver authenticate via a per-session token, so a random local process can connect TCP but can't issue commands. Our firewall rule (S16) is best-effort — Windows Filtering Platform doesn't filter loopback by default.
🟡 partial (RAM scraping) Our state files in RAM-mode live in %TEMP%. A debugger attached to python.exe can read them. Mitigations: pagefile-disable (S17) prevents disk-paging of secrets; cold-boot RAM attacks (S49) are out of scope.
❌ not covered (Anti-Debug) We don't try to detect being debugged (S41 deliberately not implemented; see Part 2).
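
The env-var row above (S31) describes popping BW_* variables before any child process is spawned while keeping a snapshot for the bot's own reads. A minimal sketch of that pattern follows; `pop_bw_env` is a hypothetical name and the actual implementation may differ.

```python
import os


def pop_bw_env(environ=os.environ):
    """Remove every BW_* variable from environ and return the removed values.

    The returned dict is the in-process snapshot; child processes started
    afterwards inherit the scrubbed environment.
    """
    snapshot = {k: v for k, v in environ.items() if k.startswith("BW_")}
    for key in snapshot:
        environ.pop(key, None)
    return snapshot
```

The call has to happen before the first `subprocess` launch (Firefox, geckodriver, tor.exe, ffmpeg.exe), because a child's environment is copied at spawn time and cannot be scrubbed afterwards.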

8. Forensic analyst with a disk image (offline)

Status Detail
✅ covered (state files) RAM mode (BW_FORENSIK_RAM=1) keeps state in tempdir, wiped on exit. Stash mode XORs state to a user-chosen dir.
✅ covered (Firefox profile) purge_firefox_profile_telemetry + wipe_tls_resumption_state shred cookies, formhistory, favicons, places.sqlite, sessionstore-backups, SiteSecurityServiceState.bin, AlternateServices.bin at session start AND shutdown.
✅ covered (Tor cache + log) AvoidDiskWrites 1 (S26) keeps the bulk of Tor's cache out of files; Log notice stdout (S27) sends logs to the launcher console instead of disk; on-shutdown wipe of cached-* + state files (S26).
✅ covered (Prefetch) firefox.exe-*.pf + python.exe-*.pf shredded at shutdown (S20).
✅ covered (JumpList / RecentDocs / Thumbnails) Wiped at shutdown via Cleanup-UserTraces.ps1 (S29 + S30).
✅ covered (crash dumps) MOZ_CRASHREPORTER_DISABLE=1 + WER registry tweaks + SetErrorMode (S6); existing dumps shredded at start.
✅ covered (geckodriver log) Silenced via Service(log_output=devnull) (S7); existing logs shredded at start.
⚠ opt-in (VSS snapshots) Cleanup-VolumeShadowCopies.ps1 (S18) deletes shadow copies <24h old. Needs Admin + explicit consent. Default-user case: snapshots may resurrect deleted state.
⚠ opt-in (Search Indexer) Exclude-FromSearchIndex.ps1 (S19) keeps our paths out of Windows Search indexing. Needs Admin.
⚠ opt-in (Pagefile / hibernation) Configure-PagefileAndHibernation.ps1 (S17) clears pagefile-on-shutdown + disables hibernation. Needs Admin + invasive.
❌ not covered (encrypted disk required) If the system disk isn't encrypted (BitLocker / VeraCrypt / FileVault), all of the above still raises the cost of recovery, but wiped data remains recoverable with enough effort. We recommend full-disk encryption as the FIRST defense.
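
Several rows above say that files are "shredded". One common way to do that is to overwrite the file in place and then delete it; the sketch below shows that approach under the assumption that this is roughly what the shred helpers do. `shred_file` is a hypothetical name.

```python
import os


def shred_file(path, passes=1):
    """Overwrite a file with random bytes, force it to disk, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:
        for _ in range(passes):
            fh.seek(0)
            remaining = size
            while remaining > 0:
                chunk = min(remaining, 1 << 20)  # write at most 1 MiB at a time
                fh.write(os.urandom(chunk))
                remaining -= chunk
            fh.flush()
            os.fsync(fh.fileno())  # push the overwrite past OS caches
    os.remove(path)
```

On SSDs and on copy-on-write or journaling filesystems an in-place overwrite does not guarantee that the old blocks are gone, which is one more reason the table recommends full-disk encryption as the first defense.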

9. Forensic analyst with a running system — laptop seized while alive

Status Detail
🟡 partial (process memory) Python + Firefox + tor.exe + ffmpeg.exe are all visible in Task Manager. tasklist /v shows command-lines (clean per S32). Process Explorer can dump heap — visible cookies / decrypted HLS chunks.
🟡 partial (network connections) netstat -ano shows the Tor SOCKS port (9050) + the Marionette debug port. The Tor circuit info is in tor/data/state — wiped at clean shutdown, present while alive.
❌ not covered (RAM forensics) Cold-boot / FireWire-DMA attacks on a running system can read RAM directly. App-layer can't prevent this; BitLocker + Power-on-Password is the OS-level fix (S49 / out of scope; see Part 2).
❌ not covered (live process inspection) A debugger attached to python.exe can read everything we have. Anti-debug (S41) deliberately not implemented.

Conclusion: A seized-running-laptop scenario is mostly beyond what this tool can defend against. If this is your threat model, hibernate to an encrypted disk and power off.


10. Co-user on the same Windows account

Status Detail
✅ covered (file content) Stash + RAM modes hide the state from the filesystem. Default mode leaves files in SerienJunkie/, readable by any co-user.
✅ covered (history / cookies) Firefox profile is wiped between sessions; co-user can't open Firefox to see our cookies.
✅ covered (Start-menu search) Repo paths excluded from Windows Search Indexer (S19) — co-user typing "aniworld" in Start Menu finds nothing.
❌ not covered (filesystem ACL) All BingeWatcher files have the user's default ACL — any process running as the same Windows user can read them. The fix is multi-user Windows accounts (S48 / out of scope for a single-user app).

11. Public Wi-Fi adversary — coffee shop, hotel, conference

Status Detail
✅ covered (traffic content) Tor encrypts everything; WiFi adversary sees only Tor handshakes.
✅ covered (WiFi probe-request leak) Cleanup-WifiNetworks.ps1 (D.2) prunes saved SSIDs so the laptop doesn't broadcast every previous network when entering range.
✅ covered (NetBIOS / LLMNR / SSDP announcements) Configure-LanBroadcasts.ps1 (S14) disables NetBIOS over TCP/IP, LLMNR, and stops SSDP. WiFi neighbours don't see "DESKTOP-XXXX is online" or your machine's hostname.
✅ covered (TCP fingerprint) Configure-TcpStack.ps1 (S10) normalises DefaultTTL to 64 + window-scaling to Linux-typical so the WiFi sniffer sees a Tor-relay-shaped fingerprint.
🟡 partial (mDNS) mDNS / Bonjour from Apple devices that pair via Bluetooth isn't covered by us. Disable Bluetooth (D.3) to close that surface.

12. Global passive adversary — nation-state-scale traffic correlation

Status Detail
❌ not covered A nation-state observing both ends of a Tor circuit (entry guard + exit) can correlate timing. This is the well-known Tor weakness and there's no app-layer fix.
🟡 partial (vanguards) Vanguards-Tool (S28 / not implemented; see Part 2) protects against guard-discovery attacks. Cost: 500 LOC + complexity; benefit: only against the global-passive class.
✅ covered (multiple Tor exits) Per-session SOCKS-auth (bw_<16hex>) gives each bot session its own circuit. Episode-boundary NEWNYM rotates circuit. Reduces correlation surface vs. naive Tor use.
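
The per-session SOCKS-auth row relies on Tor's stream isolation: connections that present different SOCKS usernames are placed on different circuits. The sketch below generates a username in the bw_<16hex> shape mentioned above; the function name, the random password, and the socks5h URL form are assumptions for illustration, and port 9050 is the SOCKS port named in class 9.

```python
import secrets


def new_session_proxy(host="127.0.0.1", port=9050):
    """Return (username, proxy_url) with fresh per-session SOCKS credentials."""
    user = "bw_" + secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    password = secrets.token_hex(8)      # assumed: any value works for isolation
    return user, f"socks5h://{user}:{password}@{host}:{port}"
```

Each call yields new credentials, so two bot sessions started this way should not share a circuit.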

Realistic statement: If your adversary is a nation-state with ISP-level taps, this tool helps a little but is not sufficient. Use Tails or Whonix on top of an air-gapped boot environment instead.


Cross-reference: which S-items implement which defense

Adversary Items
ISP timing S4 cover traffic, T2 bridges, S5 Snowflake
Provider analytics S2 forensik default, S6 dumps, S7 geckolog, S8 storage, S12 OCSP, S21 0-RTT, S22 tickets, S23 FPI site-mode, S24 CRL, S35 CSP report, S36–S40 cloaks, RFP + Canvas/WebGL/Audio cloaks
Hoster geo-block / adblock-detect AdBlock-cloak, MapAddress in torrc for ad networks
Tor exit metadata S12, S22, S24
Microsoft S13 hosts-block, S25 NCSI off, Apply-OsHardening (D.1, D.3)
Local malware S16 Marionette firewall (best-effort), S31 env pop, S32 cmdline clean
Disk forensics S6, S7, S8, S17–S20, S22 wipe, S26 Tor cache, S29 JumpList, S30 thumbnails
Running-system forensics S31, S32 (limited; mostly out-of-scope)
Co-user same account S19 search exclude, forensik RAM mode
Public WiFi S10 TCP fingerprint, S14 LAN broadcasts, D.2 WiFi cleanup
Global passive Tor itself + NEWNYM rotation; nothing more

Part 2 — What we deliberately did NOT implement

These items were considered during the audit and deliberately left unimplemented. They live here so future contributors don't re-litigate the same questions, and so users with stronger threat models can identify a specific gap and patch it themselves.

Excluded by design

S3 — (audit-internal placeholder)

Originally part of the P0 batch; merged into an adjacent item during the audit consolidation. Number kept reserved so the S-numbering stays stable across other docs.


Deferred — blocked by upstream

S9 — TLS-Fingerprint (JA3 / JA4) normalisation

The gap. Firefox 128 has a different cipher-suite order than Tor Browser 13. Cloudflare clusters our traffic trivially as "Firefox via SOCKS proxy to Tor" — a distinct behavioural class.

Why we're not fixing it. Deep research (2026-05-19) by a dedicated exploration agent concluded that JA3/JA4 parity between Firefox and Tor Browser is practically unreachable without an NSS-build-time patch:

  1. Cipher-suite order is hard-coded in NSS. No Firefox preference can re-order them; only on/off via security.ssl3.* prefs.
  2. NSS 3.84+ randomises TLS-extension order automatically (Bugzilla 1789436). Both Firefox 138 and Tor Browser 13 use it, which makes JA3 a less reliable differentiator over time.
  3. Stunnel / mitmproxy detour is high-friction with diminishing returns. Tor itself detects non-Tor ClientHellos at entry nodes; 2-3 week implementation buys <5 % signal reduction.

What we ship instead. TLS 1.2 minimum, 0-RTT off (S21), session-tickets + session-identifiers off (S22), OCSP stapling-only (S12), security.ssl.require_safe_negotiation = true, RFP + Canvas/WebGL/Audio cloaks for the JS layer. Plus a default Cloudflare-bypass path for s.to (which is the actual blocker that prompted the JA3 question).
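
To make the list above concrete, the following sketch writes the named settings as user.js lines. The pref names are standard Firefox preferences, but the exact set and values the hardened profile uses are an assumption here (in particular the two OCSP prefs chosen to express "stapling-only"); `render_user_js` is a hypothetical helper.

```python
import json

# Assumed mapping of the settings listed above onto Firefox prefs.
ASSUMED_PREFS = {
    "security.tls.version.min": 3,                     # 3 = TLS 1.2 minimum
    "security.tls.enable_0rtt_data": False,            # 0-RTT off (S21)
    "security.ssl.disable_session_identifiers": True,  # tickets/IDs off (S22)
    "security.ssl.require_safe_negotiation": True,
    "security.ssl.enable_ocsp_stapling": True,         # stapling stays on (S12)
    "security.OCSP.enabled": 0,                        # no live OCSP fetches
    "privacy.resistFingerprinting": True,              # RFP
}


def render_user_js(prefs):
    """Render a dict of prefs as user.js lines (JSON encodes true/false/ints)."""
    return "".join(
        f"user_pref({json.dumps(name)}, {json.dumps(value)});\n"
        for name, value in prefs.items()
    )
```

None of these prefs can reorder cipher suites, which is the limitation point 1 above describes.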

Re-open when. Firefox / Mozilla add a security.tls.cipher_order pref, OR we allocate a maintainer for custom NSS builds, OR Cloudflare rolls out a JA3-specific detection layer that can't be worked around via IP-reputation.


Deliberately not implemented — out-of-scope or net-negative

S15 — HLS-Chunk-Timing fingerprint

The gap. Even through Tor, HLS chunk sizes (4–6 s segments) are a strong per-episode fingerprint. Schuster et al. 2017 ("Beauty and the Burst") showed 99 % identification rates for streaming sessions despite Tor.

Why not. Mitigation would require a JS hook in every hoster iframe that randomly delays HLS chunks (50–300 ms before appendBuffer) plus decoy traffic with matching burst profiles. Complete defense is impossible; realistic improvement is maybe 20–30 % at the cost of streaming reliability. Research-grade adversary; typical users are not at risk.

S28 — Vanguards (guard-discovery protection)

The gap. The Tor Project's Vanguards tool protects against guard-discovery attacks (an adversary triangulates your entry guard via onion-service crawling). Our circuit_build_pattern is still attackable in theory.

Why not. Vanguards is ~500 LOC of Python that modifies torrc defaults for Layer-2 and Layer-3 guards. We don't use .onion services directly, so the attack surface is much smaller than Vanguards is designed for. The cost (added complexity + a long-lived background script) outweighs the benefit for our use case.

S41 — Anti-Debug / Process-Protection

The gap. Bot process can be attached to with WinDbg / x64dbg by user-context malware.

Why not. Anti-debug tricks (IsDebuggerPresent check, PEB flag, TLS-callback tricks) are a perpetual cat-and-mouse game. Crucially, they also signal "this is a bot" to consumer AV / EDR suites, which often flag anti-debug as ransomware-like behaviour. Net negative for typical users.

S42 — Memory-locking sensitive data

The gap. ALTCHA tokens, Cloudflare cookies, decrypted HLS chunks sit in a normal heap → could end up on the pagefile.

Why not. Pagefile-disable (S17 — Configure-PagefileAndHibernation) covers ~90 % of the threat. A VirtualLock / mlock equivalent for specific buffers would require deep changes to dependencies (Selenium, PyAudioWPatch) and the residual 10 % isn't worth that disruption.

S43 — Code-signing our binaries

The gap. python.exe (embedded runtime), our .bat launcher, the .pyc cache files are unsigned. Microsoft SmartScreen warns on first run; some AV produces false positives.

Why not. An Authenticode signing cert costs ~$200/year. This is a reputation concern, not a privacy one. The user's threat model doesn't change with signed binaries — only their UX (no SmartScreen warning). Out of scope for an open-source single-user tool.

S44 — Repo signed-commits verification

The gap. git pull from the F7 auto-update flow doesn't verify GPG signatures on commits. After a repo compromise (maintainer's key stolen), an attacker could push malicious commits without detection.

Why not. Requires maintainer-side workflow (every commit GPG-signed, key published) plus client-side git verify-commit integration in F7. User-facing benefit is minimal: the actual mitigation against "malicious upstream" is not pulling automatically, and F7 already makes pulling a deliberate user action.

S45 — Reproducible builds

The gap. Embedded Python 3.12 + geckodriver in runtime/ are shipped from upstream binaries, not from source. A user can't independently verify they match the published source.

Why not. High implementation effort (fixed CFLAGS, deterministic Python build chain, hash-pinned wheel cache). Practical impact is low because (a) we ship SHA-256 manifests + bw.integrity verifies them on every launch, (b) the user can replace runtime/python/ with their own build if they don't trust ours via BW_PYTHON_EXE.
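
The launch-time check mentioned in (a) amounts to comparing each shipped file against a pinned SHA-256. The sketch below shows the idea; the manifest layout (a JSON object mapping relative paths to hex digests) and the function names are assumptions, not the actual bw.integrity format.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Hash a file in 64 KiB blocks so large binaries are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(1 << 16), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_manifest(manifest_path, root):
    """Return the relative paths that are missing or whose hash differs."""
    expected = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    failures = []
    for rel, want in expected.items():
        target = Path(root) / rel
        if not target.is_file() or sha256_of(target) != want.lower():
            failures.append(rel)
    return failures
```

An empty return value means every pinned file matched; a launcher would typically refuse to start otherwise. The check only proves the files match the manifest, not that the manifest matches the published source, which is the gap reproducible builds would close.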

S46 — NoScript-extension auto-update

The gap. Our bundled NoScript XPI has its SHA-256 pinned in .integrity.json. No auto-update check from AMO.

Why not. This is deliberately the opposite of a gap: if AMO ever pushed a backdoor NoScript build, we'd be unaffected (the integrity check would reject it). Auto-update would re-open that attack vector in exchange for slightly faster security-patch delivery. The trade-off favours staying pinned + manual maintainer bumps.

S48 — Multi-user Windows-account isolation

The gap. If multiple Windows users share the same machine, our bot has no sandbox; one user's state is readable by anyone in the same Windows account or with admin.

Why not. Out of scope for a single-user app. The mitigation ("install per-user under %LOCALAPPDATA%, use Windows ACLs") is a documentation point, not code. Users with multi-user concerns should run separate Windows user accounts entirely — the OS-level fix is the right level for that defense.

S49 — Cold-boot RAM-attack defense

The gap. RAM forensics (cold-boot attack, FireWire DMA) can read our RAM-mode state files directly from DRAM before they're cleared.

Why not. App-layer can't reach this. The mitigation is OS-level: BitLocker full-disk encryption + power-on password so the attacker can't get into a running session, plus power-off-not-hibernate so RAM is cleared. Documented in Part 1 under "Forensic analyst with a running system".

S50 — WebRTC disable-lock

The gap. Our hardened profile sets WebRTC = off, but the env var BW_TEST_ENABLE_WEBRTC=1 can re-enable it globally for diagnostic runs. A user might leave the env var set by mistake.

Why not. This is a UX footgun, not a technical gap: the user explicitly chose to set the env var. Better mitigation: clear docs + the env var name has TEST_ENABLE in it (an obvious sign). Adding a "production-build flag that strips diagnostic toggles" would just push the problem down a layer.


How to read this document

  • If you care about ad-network tracking, your ISP knowing what streaming sites you visit, or local co-users finding your watch history: this tool helps a lot.
  • If you care about Microsoft / your ISP knowing you use Tor at all: run the OS-Hardening wizard + use Snowflake bridges.
  • If you care about a forensic analyst with your seized disk: enable forensik RAM/stash mode + Configure-PagefileAndHibernation + full-disk encryption.
  • If your adversary is a nation-state with passive global observation: use Tails or Whonix on an air-gapped boot — this tool is not enough.

How to extend this document

When a new security audit identifies an item:

  • Implementing it: add a row to Part 1 under the relevant adversary class. Reference the code with the S<n> ID + module path.
  • NOT implementing it: add an entry to Part 2 with the threat in one paragraph + the reason in one paragraph + a "Re-open when" criterion if applicable.

Keep # T5/S<n> comments in the source code in sync — they're the canonical cross-reference into this document.
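
Keeping the comments in sync can be checked mechanically. The sketch below collects S-numbers from `# T<n>/S<n>` comments in source text and reports any that this document never mentions; the function names are hypothetical and the regexes are an assumption based on the `# T5/S22` example above. Ranges written like S36–S40 are expanded.

```python
import re

CODE_REF = re.compile(r"#\s*T\d+/S(\d+)\b")
DOC_REF = re.compile(r"\bS(\d+)\b")
DOC_RANGE = re.compile(r"\bS(\d+)\s*[–-]\s*S(\d+)\b")


def documented_ids(doc_text):
    """Collect every S-number the document mentions, expanding ranges."""
    ids = {int(n) for n in DOC_REF.findall(doc_text)}
    for low, high in DOC_RANGE.findall(doc_text):
        ids.update(range(int(low), int(high) + 1))
    return ids


def undocumented_ids(source_text, doc_text):
    """Return sorted S-numbers referenced in code comments but absent here."""
    in_code = {int(n) for n in CODE_REF.findall(source_text)}
    return sorted(in_code - documented_ids(doc_text))
```

Run over the concatenated contents of the source tree and this file, a non-empty result points at comments whose target entry still needs to be written.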