Premium TTS subscription (AWS Polly, Azure, Cartesia) by Saqoosha · Pull Request #73 · Saqoosha/HDZap

Saqoosha · 2026-05-20T14:48:06Z

Summary

Adds HDZap Premium subscription — StoreKit 2 + Cloudflare Worker proxy that streams natural-sounding lap announcer voices from AWS Polly, Microsoft Azure, and Cartesia, with iOS local disk cache and Worker-side R2 cross-user cache.

- iOS: PremiumSpeechSynthesizer (raw PCM streaming, AVAudioConverter resampling for Polly 16 kHz), TTSCache (SHA-256 keyed v2 disk cache), SubscriptionManager (StoreKit 2 with monthly-first product order, JWS for Worker auth), PaywallView/PremiumVoicePickerView (locale-aware sample voices + EN/JA copy)
- Worker hdzap-premium: two-tier auth (Apple JWS for subscribers + baked-in DEV_BEARER as the fallback), per-IP daily KV rate limit (moved to waitUntil to keep the hot path at ~200 ms), R2 cross-user cache, three provider paths (Polly SigV4 raw PCM, Azure raw-24khz PCM, Cartesia SSE PCM)
- Settings UI split into Announcement + Voice sections; Language picker lives under Voice and resets the Premium voice on change
- Free tier (AVSpeechSynthesizer) unchanged: Premium adds a route, it doesn't replace the existing one
- Privacy policy rewritten in EN + JA to disclose the Premium data flow (announcement text + JWS only, no PII, no race history, 30-day cache)
- StoreKit IAP products created in ASC via API: monthly ¥450, yearly ¥4,000, worldwide availability, review screenshot + reviewNote attached

Test plan

- Sandbox tester (created via web UI) signs in on iPhone Air, taps "Try Premium voices", picks a voice, hears it
- Worker /tts accepts real Apple JWS (no ALLOW_XCODE_LOCAL_JWS bypass; secret deleted)
- Cold TTFA ~200 ms Polly, ~300 ms Azure, ~400 ms Cartesia; warm requests are served from the local cache (~0 ms)
- Language switch on Voice settings resets premiumLapVoiceId to default
- Cancel via Settings → Apple ID → Subscriptions falls back to System voice
- Privacy policy renders correctly at /privacy/ (EN) and /privacy/ja/ (JA)

🤖 Generated with Claude Code

Summary by CodeRabbit

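Release notes

New features
- Added Premium speech synthesis, with high-quality voices available from multiple cloud providers.
- Implemented in-app subscriptions (monthly and yearly) on StoreKit 2.
- Extended the Premium voice picker UI and the settings screen; voices can be filtered by language and provider.
- Added an audio cache, which improves playback speed.
Improvements
- Reorganized the audio settings screen and added an engine switch (System/Premium).
Documentation
- Updated the privacy policy to cover Premium and added a new Japanese version.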

- Python harness for side-by-side TTS provider audition (11 providers wired up)
- HDZap-specific test phrases derived from LapAnnouncer.swift utterances
- Streaming TTFA measurement script supporting direct + Worker proxy modes
- Number-format A/B test (discovered: `、` before numbers fixes digit-by-digit reading)
- 1Password-backed .env.op so secrets never hit disk

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Hono-based Worker on workers.dev (hdzap-premium.saqoosha.workers.dev)
- POST /tts forwards to Cartesia /tts/sse with X-API-Key from Worker secret
- Stub Bearer auth (DEV_BEARER) until StoreKit 2 JWS verification lands
- 300-char transcript cap, allowlisted models/langs
- Streams SSE through untouched; ~30ms median overhead vs direct (measured)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- /tts now accepts a `provider` field (cartesia | polly | azure) and dispatches to the matching upstream while passing the response body through untouched. Default stays "cartesia" so older iOS builds keep working unchanged.
- Polly: signs with aws4fetch using temp credentials minted via the same public Cognito Identity Pool YourLaps uses. Worker caches the creds at module scope until ~60s before expiry so successive calls don't pay the 2-RTT Cognito handshake. Output is chunked mp3.
- Azure: uses Ocp-Apim-Subscription-Key from the new AZURE_SPEECH_KEY Worker secret, japaneast region. Output is chunked mp3.
- SSML envelope built per-provider; xml:gender derived from the Azure voice name regex so the SSML validator stops 400-ing for male voices.

On-device TTFA measurements (iPhone Air, Japan East / Tokyo):
  Cartesia Sonic 3.5 (PCM/SSE): ~339 ms
  Polly Neural Takumi (mp3): ~56 ms
  Azure Neural Daichi (mp3): ~93 ms

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- New PremiumSpeechSynthesizer streams Cartesia SSE through the hdzap-premium Worker, decodes base64 PCM s16le 24kHz chunks, and schedules them on a dedicated AVAudioEngine + AVAudioPlayerNode.
- PremiumVoiceCatalog enumerates all 22 native Japanese Cartesia voices plus 3 English picks (Scott Sportscaster + Tyler + Madison).
- DEBUG-only test panel in AudioSettingsView: worker URL, bearer field, voice picker, TTFA readout. Production code path will replace this with StoreKit 2 entitlement gating.
- Bugs fixed during real-device bring-up on iPhone Air (iOS 26.5):
  * AVAudioPCMBuffer.int16ChannelData returns nil for interleaved Int16; switched the format to non-interleaved Float32 and convert s16 → float32 on the way in (otherwise the scheduled buffer is silent).
  * URLSession.AsyncBytes.lines silently drops blank SSE event terminators on iOS; parse each `data:` line standalone instead of accumulating until a blank line, otherwise multiple JSON objects concatenate into one malformed blob and JSONSerialization fails for every chunk.
  * Added engine.prepare() to keep the first scheduleBuffer from racing engine.start().

Measured TTFA on iPhone Air: ~235-490ms (median 245ms after warm), well within the 500ms race-time budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
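
The per-line SSE parsing and the s16 → float32 conversion described above can be sketched in a few lines. This is a Python illustration of the logic only, not the Swift code from the PR; the JSON field name `data` and the helper name `decode_sse_line` are assumptions made for the example.

```python
import base64
import json
import struct


def decode_sse_line(line):
    """Decode one SSE line on its own.

    Returns a list of float samples in [-1.0, 1.0) for a `data:` line that
    carries audio, or None for any other line (blank terminators, comments,
    event names). Treating each line standalone means dropped blank
    terminators cannot glue two JSON objects together.
    """
    if not line.startswith("data:"):
        return None
    payload = json.loads(line[len("data:"):].strip())
    b64 = payload.get("data")  # assumption: base64 PCM lives under "data"
    if not b64:
        return None
    raw = base64.b64decode(b64)
    count = len(raw) // 2  # 2 bytes per s16le sample
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    # Scale signed 16-bit integers into the float range an audio engine expects.
    return [s / 32768.0 for s in ints]
```

Each decoded list would then be copied into a Float32 buffer and scheduled; a line that yields None is simply skipped.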

- Voice catalog grows 25 → 35: all 22 Cartesia JA voices, 3 Polly Neural JA voices (Takumi/Kazuha/Tomoko), 7 Azure JA Neural voices (Daichi/Keita/Naoki/Aoi/Mayu/Nanami/Shiori), plus 3 Cartesia EN.
- PremiumVoiceProvider enum drives both Worker dispatch and on-device decode path: Cartesia stays on the SSE → PCM → AVAudioEngine route, Polly + Azure go through a new chunked-mp3 → AVAudioPlayer path.
- speak() now takes a full PremiumVoiceOption instead of a bare ID so the synth reads provider + lang together; the Worker body includes the `provider` field and only sends `model` for Cartesia.
- cancel() teardown extended to stop the mp3 player too.

Real-device TTFA on iPhone Air (Japan East):
  Cartesia Sonic 3.5 (PCM/SSE): ~339 ms
  Polly Neural Takumi (mp3): ~56 ms
  Azure Neural Daichi (mp3): ~93 ms

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Add a `TTS Engine` picker (System / Premium) and route LapAnnouncer's speak() through PremiumSpeechSynthesizer when Premium is selected and a voice has been picked. If either condition fails the call falls through to AVSpeechSynthesizer so the operator never gets silence.
- SPACE → 、 in the Japanese final-race summary so cloud voices read "12.34" as "じゅうにてんさんよん" instead of digit-by-digit. System TTS handles both forms identically, so the change is safe both ways.
- New PremiumVoicePickerView (NavigationLink) with provider sections in AWS → Azure → Cartesia order, plus an inline ▶ preview button per row that swaps to a stop icon while the sample plays. The synth's `isPlaying` now reflects actual playback completion (AVAudioPlayer delegate for mp3, scheduleBuffer .dataPlayedBack for PCM) so the icon flips back at the right moment.
- Rate + pitch sliders surface per provider capability:
  * Polly Neural: rate only (Neural rejects pitch: "Unsupported Neural feature")
  * Azure Neural: rate + pitch
  * Cartesia Sonic 3.5: neither (preview disabled prosody)
  Worker accepts the values in the request body and builds the right SSML per provider; iOS only sends the fields the voice supports.
- HDZapApp registers the four new defaults so first-run state is sane.

Verified on iPhone Air iOS 26.5: System → Premium → Polly Takumi / Azure Daichi / Cartesia Takeshi all play back at race time and the rate slider scales the cadence as expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- SubscriptionManager wraps StoreKit 2: loads the monthly + yearly products, listens to Transaction.updates, resolves the current entitlement off Transaction.currentEntitlements (including grace period for the billing-retry window), and exposes purchase / restore / isEntitled to the rest of the app via @Environment.
- PaywallView ticks every box in App Store Review Guideline 3.1.2: prominent per-period price, auto-renew disclosure, Restore Purchases, Manage Subscription deep-link, Terms + Privacy links, free-trial period called out per product.
- The Engine picker in AudioSettingsView shows "Premium — Subscribe ›" for non-entitled users; tapping it pops the paywall sheet and snaps the picker back to System until the purchase completes. The subscription state is also watched so a lapsed entitlement automatically rolls the engine back to System.
- HDZapPremium.storekit ships a local sandbox configuration with the real product IDs (sh.saqoo.HDZap.premium.monthly + .yearly), JA/EN localizations, ¥450 / ¥4000 prices, and a 7-day free trial. The scheme in xcshareddata wires it up so Debug builds run against this config without an App Store Connect setup.
- HDZapApp starts the manager via .task on the root WindowGroup, so the Transaction.updates listener is alive before any UI tries to read entitlement.

Sandbox-tested in Xcode: purchase + restore + cancel paths flip isEntitled correctly and the Premium engine row gates on / off.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Paywall: show 3 sample voice rows (Cartesia/Azure/Polly) with inline play/stop buttons, modeled after the picker's audition flow. Lets the operator hear what they'd be paying for before tapping subscribe.
- Paywall: 5-second product-load timeout + .empty state with a "Try again" button + explanation, so a paywall opened via devicectl (no StoreKit sandbox) doesn't spin forever.
- Voice picker: surfaces a "Subscribers only" banner at the top for non-entitled operators, with a Subscribe CTA that opens the paywall sheet. Preview buttons stay free; row taps stash the voice ID and pop the paywall, then auto-commit on entitlement change.
- Audio settings: replaces the locked engine tap's modal paywall with a push into the voice picker (the conversion surface lives there now). Adds a non-subscriber "Listen to Premium voices" entry below the Engine picker so the path to preview is one tap deep.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- SubscriptionManager exposes currentJWS (the active entitlement's jwsRepresentation). Captured during refreshEntitlement alongside status so the synth can read it on every call without re-walking Transaction.currentEntitlements.
- PremiumSpeechSynthesizer reads auth via a closure-injected jwsProvider, wired in HDZapApp.task once both objects exist. Auth precedence: JWS (subscriber) → dev-panel bearer (runtime override) → baked-in BuildSecrets.workerBearer (preview default). Worker's verifyAppleJws now does the heavy lifting on the JWS path; the baked bearer path stays as the non-subscriber preview channel.
- BuildSecrets.swift.tpl: 1Password-templated bearer (op://Personal/HDZap Worker Dev Bearer/credential). The rendered .swift is gitignored; run `op inject -i …tpl -o …swift` before xcodegen. Stops fresh checkouts from shipping an empty bearer.
- Add 11 Polly + 9 Azure English Neural voices alongside the existing 3 Cartesia EN picks. Operator now has US/UK/AU accents across all three providers for race calls + summaries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
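
The auth precedence above amounts to "first non-empty credential wins". A minimal Python sketch of that ordering follows; the function and parameter names are hypothetical, and the real logic lives in the Swift synthesizer.

```python
def pick_auth_token(jws, dev_panel_bearer, baked_bearer):
    """Return the first usable credential in precedence order.

    Order follows the commit message: subscriber JWS, then the dev-panel
    runtime override, then the baked-in preview bearer. Empty strings and
    None are both treated as "not available".
    """
    for candidate in (jws, dev_panel_bearer, baked_bearer):
        if candidate:
            return candidate
    return None
```

With this ordering a subscriber never accidentally falls onto the preview channel just because a dev bearer happens to be set.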

Stacks below the Worker R2 cache: identical request → 10 ms file-disk read instead of 150 ms Worker+R2 round trip. Race-time fixed phrases ("Start", countdown, "Last lap") and voice-picker samples now play without any network on every replay across the same install.

- TTSCache.swift: FileManager-backed disk cache under `<Caches>/HDZapTTS/`, hex SHA-256 key over the same canonical parameter tuple as the Worker (so the two layers point at the same logical entity). 50 MB cap with LRU eviction to half-cap on overflow.
- PremiumSpeechSynthesizer.speak(): cache lookup at the top; a hit routes straight to AVAudioPlayer (mp3) or AVAudioPlayerNode (PCM) with no streaming setup, no auth header construction, no URLSession.
- Cache miss path saves the bytes that AVAudioPlayer accepted (mp3) or the concatenated decoded PCM accumulated across the SSE stream (Cartesia). Save only fires on clean stream completion so a cancelled-mid-flight call can't write a truncated entry.

Verified on iPhone Air: 3 consecutive taps of the same voice produced 1 Worker request and 2 silent (local-cache=hit) calls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
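
The "50 MB cap with LRU eviction to half-cap on overflow" policy can be sketched as a pure function over file metadata. This is an illustrative Python version under stated assumptions: entries are `(name, size_bytes, last_access)` tuples, and the helper name `entries_to_evict` is invented here; the actual TTSCache.swift implementation is not shown in the PR text.

```python
CAP_BYTES = 50 * 1024 * 1024  # 50 MB cap from the commit message


def entries_to_evict(entries, cap=CAP_BYTES):
    """Pick which cache files to delete.

    entries: iterable of (name, size_bytes, last_access) tuples.
    Nothing is evicted while the total stays at or under the cap. Once the
    cap is exceeded, the least recently used entries are dropped until the
    total is at or below half the cap, so eviction does not run again on
    the very next write.
    """
    entries = list(entries)
    total = sum(size for _, size, _ in entries)
    if total <= cap:
        return []
    target = cap // 2
    evicted = []
    for name, size, _ in sorted(entries, key=lambda e: e[2]):  # oldest first
        if total <= target:
            break
        evicted.append(name)
        total -= size
    return evicted
```

Evicting down to half the cap rather than just under it trades some cache contents for far fewer eviction passes.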

- Add appleJws.ts with full Apple StoreKit2 JWS verification: ES256 signature check via x5c[0] leaf, cert chain walked back to embedded Apple Root CA G3 (fingerprint pinned), payload checks for bundleId, productId, and expiry honouring Apple's 16-day billing-retry grace period.
- Two-tier auth in /tts: shape-detect JWS (3 base64url segments) → Apple verification path with 300-char text cap; opaque bearer → shared-dev preview path with 60-char cap. A leaked bearer can't be turned into a free TTS-as-a-service when every preview is sub-60-char.
- Detect Xcode-local .storekit JWS via kid="Apple_Xcode_Key" and accept the self-signed single-cert chain ONLY when ALLOW_XCODE_LOCAL_JWS=true (env-gated). Real Apple Sandbox/TestFlight/Production JWS keeps going through the full chain check. Production deploys leave the var unset.
- Diagnostic console.log/console.error around auth + JWS verify so the observability dashboard shows authMode + failure code per request.

Deps: jose@6 for JWS parsing + signature verify; @types/node for the node:crypto X509Certificate chain walker (Workers nodejs_compat).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
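
The shape detection and per-tier text caps described above can be illustrated as follows. This Python sketch only covers the routing decision; it does not verify anything, and a token that looks like a JWS must still pass the full Apple chain check. The function names and the choice to count characters with `len()` are assumptions for the example, since the Worker's TypeScript is not shown.

```python
import re

# Three non-empty base64url segments separated by dots.
JWS_SHAPE = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")

SUBSCRIBER_TEXT_CAP = 300  # JWS path, per the commit message
PREVIEW_TEXT_CAP = 60      # opaque-bearer preview path


def classify_token(token):
    """Route by shape only: 'jws' for three base64url segments, else 'bearer'."""
    return "jws" if JWS_SHAPE.fullmatch(token) else "bearer"


def text_allowed(token, text):
    """Apply the text cap of whichever tier the token shape selects."""
    cap = SUBSCRIBER_TEXT_CAP if classify_token(token) == "jws" else PREVIEW_TEXT_CAP
    return len(text) <= cap
```

Because the preview tier is capped at 60 characters, a leaked shared bearer is only good for short samples, which matches the rationale in the commit.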

The Cognito identity-pool dance made sense for YourLaps because that app hits Polly directly from the iOS client and needs short-lived public credentials. For HDZap the Worker is itself a trusted server, so a purpose-scoped IAM user (`hdzap-premium-polly`, policy: SynthesizeSpeech only) is the simpler and cheaper path: no 2-RTT GetId/GetCredentials handshake on the hot path, no cross-project credential sharing.

- Drop COGNITO_POOL_ID + getCognitoCreds + creds cache.
- Take POLLY_ACCESS_KEY_ID and POLLY_SECRET_ACCESS_KEY from wrangler secrets (injected via stdin from 1Password "HDZap Polly IAM User").
- proxyPolly signs SigV4 directly with the long-lived keys, no session token. Rotation = `wrangler secret put` + 1Password update.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

KV-backed counter at `rl:<ip>:<YYYY-MM-DD>`, 48h TTL so yesterday's entries self-evict and the namespace stays small. Cap is 1000/day for both auth tiers — IP rate limit is a coarse guardrail given residential NAT (carriers, Apple Private Relay, family WiFi all share IPs across many users), so a tighter cap would punish legit first-time-audition sessions (~150 previews across 55 voices). Stops single-host curl loops without hurting real users; determined attackers rotating residential proxies aren't blocked by this layer — the TTS-provider spending caps and the Apple-signed JWS are the real ceilings. 429 response carries Retry-After (seconds-to-UTC-midnight) so an iOS client could surface "try again tomorrow" without polling. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
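
The key layout and the Retry-After value described above can be sketched like this. It is a Python illustration with a plain dict standing in for the KV namespace; the function names and the exact boundary (blocking once the stored count has reached the cap) are assumptions, not the Worker's code.

```python
from datetime import datetime, timedelta, timezone

DAILY_CAP = 1000             # same cap for both auth tiers
KEY_TTL_SECONDS = 48 * 3600  # TTL a real KV put would carry; unused by the dict stand-in


def rate_limit_key(ip, now):
    """Build the `rl:<ip>:<YYYY-MM-DD>` key using the UTC calendar day."""
    return f"rl:{ip}:{now.astimezone(timezone.utc):%Y-%m-%d}"


def seconds_to_utc_midnight(now):
    """Whole seconds until the next UTC midnight, used for Retry-After."""
    now = now.astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int((next_midnight - now).total_seconds())


def check(counters, ip, now):
    """Return (status, headers) and bump the per-IP daily counter."""
    key = rate_limit_key(ip, now)
    count = counters.get(key, 0)
    if count >= DAILY_CAP:
        return 429, {"Retry-After": str(seconds_to_utc_midnight(now))}
    counters[key] = count + 1
    return 200, {}
```

Since the date is part of the key, the counter resets naturally at UTC midnight and old keys only need the TTL to disappear.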

Same params from any two callers now reuse one generation. First caller pays the provider; everyone after them streams from R2 with the same content type and X-HDZap-Cache=hit header.

- Bind R2 bucket `hdzap-tts-cache` (binding name TTS_CACHE).
- buildCacheKey: hex SHA-256 of provider|voice|lang|rate|pitch|model|text so rate/pitch are part of the key (a user who customised those gets their own cache lane shared with everyone else on the same custom settings, not collisions with default-settings users).
- Cache hit path: stream R2 body straight back with provider's content-type (httpMetadata).
- Cache miss path: tee the upstream body: one branch streams to client immediately so audio playback isn't held up, the other branch drains into ArrayBuffer (R2 rejects unknown-length ReadableStreams) and waitUntil-puts to R2 in the background.
- Error path: provider 4xx/5xx is passed through unchanged, never cached (otherwise a transient outage would poison the entry).

Verified end-to-end: identical request returns 0.85s cache hit on 2nd call vs 1.22s cold miss.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
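
As a rough illustration of the buildCacheKey idea, the following Python sketch hashes the pipe-joined parameter tuple. How the Worker serialises numbers and missing fields is not stated in the PR, so rendering `None` as an empty string and the optional `prefix` parameter are assumptions made for the example.

```python
import hashlib


def build_cache_key(provider, voice, lang, rate, pitch, model, text, prefix=""):
    """Hex SHA-256 over provider|voice|lang|rate|pitch|model|text.

    Missing fields (None) are rendered as empty strings so the field count
    stays fixed. `prefix` is a hypothetical hook for a version tag; when
    given, it is prepended as one more pipe-separated field.
    """
    fields = [provider, voice, lang, rate, pitch, model, text]
    if prefix:
        fields.insert(0, prefix)
    canonical = "|".join("" if f is None else str(f) for f in fields)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two callers with identical settings land on the same key, while a caller with a custom rate gets a different one, which is the "cache lane" behaviour the commit describes. Any client-side cache that wants to share logical entries with the Worker has to serialise the fields identically.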

…cret The flag was previously declared in [vars] with value "true", which broadcast to anyone reading the repo that the Worker was accepting self-signed Xcode-local JWS (kid: "Apple_Xcode_Key"). Combined with the Worker URL being baked into the iOS app, that was enough for an attacker to skip the subscription gate by crafting their own .storekit-signed JWS. Moving to a Worker secret (set via `wrangler secret put ALLOW_XCODE_LOCAL_JWS`) removes the dev-acceptance signal from the public surface. The production attack surface is unchanged either way — real Apple-signed JWS still goes through the full chain check to Apple Root CA G3 — but an attacker now has to actively probe the live Worker instead of reading the config to know whether Xcode-local JWS is accepted. Delete the secret before App Store release (`wrangler secret delete ALLOW_XCODE_LOCAL_JWS`) to fully close the path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Polly Neural and Azure Neural both stream chunked output natively, but the Worker was asking for mp3, so iOS buffered the entire mp3 before handing it to AVAudioPlayer. Result: race-time TTFA ~1 s instead of the ~60-90 ms each provider's API actually delivers. Switching to raw s16le PCM lets iOS schedule each chunk on AVAudioPlayerNode the moment it arrives, same path Cartesia's SSE PCM has used since day one.

- proxyPolly: OutputFormat="pcm" SampleRate="16000" (Neural's PCM cap; iOS upsamples 16 kHz → 24 kHz via AVAudioConverter).
- proxyAzure: outputFormat="raw-24khz-16bit-mono-pcm" (matches the engine's native rate, no resample needed).
- responseHeadersFor: Content-Type "audio/pcm", X-HDZap-Format "pcm-raw", X-HDZap-SampleRate. Cartesia keeps "text/event-stream" + "pcm-sse".
- buildCacheKey: prefix "v2" so pre-PCM-migration entries (Polly/Azure mp3 cached as audio/pcm after the format switch) become orphans instead of being replayed as malformed PCM and producing noise.

KV rate limit moved to c.executionCtx.waitUntil(...) so a 600-800 ms read+write doesn't sit on the request's critical path. The actual cap on abuse is the per-provider monthly budget; the per-IP counter is just a guardrail against a single-host curl loop, where a few bonus requests past the threshold don't materially change the bill.

Per-phase timing instrumentation (`mark()` + `X-HDZap-Timings` header) plus `?nocache=1` query param to skip R2 entirely. Measured Polly provider TTFA at 37 ms (in line with AWS's 60 ms doc), confirming the remaining e2e latency is network RTT + provider, not Worker overhead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Matches the Worker swap from mp3 to raw s16le PCM. Every provider now flows through the same `schedulePCM → AVAudioPlayerNode` path Cartesia has used since day one, so first audio plays the moment the first chunk lands instead of waiting for the full body. Race-time TTFA on iPhone Air drops from ~1 s to ~300 ms.

- playPCMFromStream: drains the chunked octet-stream, schedules each chunk as a PCMBuffer. First chunk threshold dropped to 1 KB (~30 ms audio) so audio plays as fast as the network can deliver bytes; subsequent chunks ramp to 4 KB to amortise scheduling overhead.
- buildBufferResampled: Polly's 16 kHz s16le → 24 kHz Float32 via a per-utterance AVAudioConverter. Converter state survives across the utterance's chunks so the polyphase resampler's filter taps don't produce audible seams at chunk boundaries.
- buildBuffer24kHz: kept as the native-rate fast path (Cartesia + Azure) to avoid converter overhead when no resample is needed.
- TTSCache: file extension `.pcm` for every provider; all three layers now store raw s16le PCM at the provider's native sample rate, so the on-disk format matches what's on the wire.
- buildCacheKey: "v2" prefix matching the Worker's bump so iOS local and R2 layers stay aligned (orphan pre-PCM `.mp3` entries evict via LRU).
- mp3Player + AVAudioPlayerDelegate plumbing removed since no provider emits compressed audio anymore.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
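
The chunk-size figures above follow from simple arithmetic on mono s16le audio (2 bytes per sample). The Python sketch below works through it; the helper names are invented for the example, and the frame-count function only shows the 16 kHz → 24 kHz ratio, not the filtering a real resampler such as AVAudioConverter performs.

```python
BYTES_PER_SAMPLE = 2  # mono s16le


def chunk_duration_ms(num_bytes, sample_rate_hz):
    """Milliseconds of audio carried by `num_bytes` of mono s16le PCM."""
    return num_bytes * 1000 / (BYTES_PER_SAMPLE * sample_rate_hz)


def resampled_frame_count(frames_in, rate_in=16_000, rate_out=24_000):
    """Output frames a rate conversion yields for `frames_in` input frames."""
    return frames_in * rate_out // rate_in
```

A 1 KB first chunk is 32 ms of audio at Polly's 16 kHz and about 21 ms at 24 kHz, which brackets the "~30 ms" quoted in the commit; a 4 KB follow-up chunk is 128 ms at 16 kHz. A 512-frame Polly chunk becomes 768 frames after upsampling to 24 kHz.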

The `onChange(of: ttsLanguageRaw)` handler only cleared the System voice identifier; the Premium voice ID survived the language switch. A user who picked Polly Takumi (ja-JP) and then switched the language picker to English kept Takumi as their Premium voice — and LapAnnouncer faithfully sent English-text "Lap 3, best lap" to a Japanese voice, producing Japanese-accented mangled English. Clear `premiumLapVoiceId` alongside `voiceIdentifier`. Empty Premium ID makes LapAnnouncer fall back to the System engine for the new language until the operator drills into the Premium voice picker again, which is the same UX the System voice has had since day one. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The picker preview was sending a hardcoded JA phrase ("ラップ3、12.34、ベストラップ") to every voice regardless of its `lang`, so tapping ▶ on an English voice like Polly Matthew produced phonetic-approximation gibberish ("rappu-san...") — English-Neural synths trying to render Japanese script with no language model for it. Switch to a per-language helper that picks the right script: "Lap 3, 12.34, best lap" for `en` voices, the original JA phrase for `ja` voices. Now the preview audibly demonstrates each voice doing the language it was actually trained for. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
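
A per-language helper of the kind described above could look like the following Python sketch. The two phrases are taken from the commit message; the function name, the handling of region suffixes such as `ja-JP`, and the English fallback for unknown languages are assumptions.

```python
SAMPLE_TEXT = {
    "en": "Lap 3, 12.34, best lap",
    "ja": "ラップ3、12.34、ベストラップ",
}


def sample_text(lang):
    """Pick the preview phrase that matches a voice's language code."""
    base = lang.split("-")[0].lower()  # "ja-JP" -> "ja"
    # Assumption: fall back to English for languages without a phrase.
    return SAMPLE_TEXT.get(base, SAMPLE_TEXT["en"])
```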

The single Section was a long scroll of unrelated decisions: the master toggle, language, best-lap, countdown, engine, voice picker, two sets of rate/pitch sliders, and the test/reset buttons all stacked together. Grouping reads more like a checklist now: "Announcement" controls when and what the synth says; "Voice" controls which synth says it.

- Section 1 "Announcement": master toggle, language, best-lap, countdown. All of these change race-time behaviour and are language/voice-agnostic.
- Section 2 "Voice" (only when announce is on): engine picker, voice selection (System voice or Premium drill-in), per-engine rate/pitch sliders, test/reset buttons. Footer about audio session + iOS voice install moves here since it's voice-specific.

No behaviour changes; pure layout refactor.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Language belongs with voice configuration: changing it filters the voice catalog AND resets both the System and Premium voice IDs to defaults (otherwise an old-language voice would render new-language text). Having the Picker live next to the things it directly cascades into is clearer than splitting "what to speak" from "what voice to use" — the operator's mental model is "pick a language, then pick a voice in that language". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The marketing strings still referenced placeholder numbers and a smaller voice catalogue from when we had Cartesia-only Japanese. Several of the captions also leaned on "TTFA", a TTS-engineering term most operators won't recognise.

- Paywall: catalogue size 35+ → 50+ (32 JA + 23 EN actually ship); "Sub-100 ms first-audio" → "Streaming playback" + the real measured Polly ~40 ms / Azure ~90 ms / Cartesia ~400 ms numbers (Cartesia never was sub-100 ms, so calling the section that was misleading).
- Picker footer hints drop "TTFA" in favour of plain "first-audio delay" / "response time", and Cartesia's "Native Japanese voices" claim is corrected to "JA + EN" since we ship three Cartesia English voices.
- DEBUG smoke-test footer no longer says it'll "be replaced by the StoreKit-gated SpeechRouter in the next phase"; that path has been in production for a while now.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The English paywall was speaking Japanese at users: three JA-voice samples ("Listen first" → 「ラップ3、12.34、ベストラップ」), a value prop about Japanese pronunciation with Japanese characters embedded, and a header subtitle leaning on "TTS" as a marketing term.

- sampleVoices is now a computed property that picks 3 voices in the user's UI language: Matthew / Davis / Scott for English locales, Takumi / Daichi / Takeshi for Japanese. Ordering matches the picker (Polly → Azure → Cartesia, fastest-response first).
- sampleText() switches script per voice's `lang`, mirroring the picker helper, so a row's audio always matches the row's label.
- Listen-first caption is language-neutral ("a sample race call-out") rather than embedding the JA phrase.
- Value prop renamed "Natural Japanese pronunciation" → "Natural number reading", with an example that doesn't require the reader to parse Japanese characters to understand the benefit.
- Header subtitle drops "cloud TTS" in favour of "AI voice synthesis".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Both strings now switch on the UI locale so English readers see "English and Japanese" (their own language first), Japanese readers see "日本語と英語" with the example phrase in Japanese characters. The previous single-string version always led with "Japanese and English" and embedded Japanese script in the example, which felt off-brand on an English iPhone. Reaffirmed Cartesia's ~400 ms response time with five back-to-back warm-up calls: 356-424 ms range, no warmup drop-off. That's the real Tokyo Worker → US Cartesia round trip on top of Cartesia's documented ~340 ms inference latency — not something we can shave on our side. Leaving the paywall number as ~400 ms. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The marketing strings were anchored on TTFA numbers (Polly ~40 ms vs Cartesia ~400 ms) that (a) most operators don't have a frame of reference for and (b) make Cartesia look bad next to Polly even though they're picking each provider for character, not speed. All three providers respond fast enough for race-time use; the differentiator worth surfacing is what the voices sound like.

- Paywall ValueRow: "Streaming playback" + ms-list → "Broadcast-grade audio" + a description of the sound character, no timing numbers.
- Picker footerHint per provider: drop "~40 ms" / "~90 ms" / "~400 ms" in favour of pure quality descriptors (most expressive / clear newscaster / broadcast-style).
- providerHint pill on the paywall sample rows: "AWS Polly · lowest latency" → "AWS Polly · clear newscaster" so the row label is consistent with the picker footer below.

DEBUG-only smoke-test panel still shows "First audio: %.0f ms" because that's a development diagnostic, not a marketing string; `#if DEBUG` keeps it out of release builds.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Most of the gaps were on the paywall + premium voice picker strings we added during this branch, plus a handful of older app-wide labels (race result, share, status, etc.) that never had JA entries. A Japanese iPhone now sees Japanese throughout the Premium flow instead of mixed-language UI. Skipped: pure symbols (·, •, ★, #, 0), bare format placeholders (%@, .%@), and proper-noun-only strings (M5StickS3, hdzap) — those don't need translation. The stale "35+ voices" header subtitle stays untranslated since the live code now reads from the locale-aware `headerSubtitle` helper, not the xcstring. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three issues caught on a Japanese iPhone walkthrough:

- "実走" reads as awkward jargon in Japanese; the operator just says "レースで" for the same idea. Rewrites two strings ("Free preview" subtitle + "Subscribers only" body) to use レースで.
- The Kyoko / Otoya / O-ren install tip is obsolete with Premium voices available, and clutters the System-engine path with a paragraph of "go to iOS Settings → …" detail most operators don't act on. Drop the Text entirely; the existing "No voices installed" orange banner still covers the genuinely-empty case.
- PremiumVoicePickerView's provider footer hints + PaywallSampleRow's providerHint pill were hardcoded English. Localised inline so the JA build reads the entire picker + paywall in Japanese, including the one-liner under each provider section header and the sub-label below each sample voice's name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Five paywall surfaces still rendered in English on Japanese iPhones: the four ValueRow title/subtitle pairs, the "Tap a voice…" prompt under "Listen first", the Terms/Privacy links, the auto-renew disclosure, and the per-product price denominator + free-trial label. Routed all of them through a `Self.isJa` locale gate so the JA build reads end-to-end Japanese now. Also added a fifth ValueRow flagging the internet requirement: cloud TTS needs a network round-trip, so weak-signal areas may stutter or drop. The app falls back to the System voice automatically when the Worker call fails, but operators headed to remote race sites deserve to know up front before subscribing. Small JA copy nit: "じゅうにてんさんよん" → "じゅうにいてんさんよん" in the "natural number reading" example, matching how the cloud voices actually pronounce 12 (longer "ni" sound). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The new "Announcement" section header had no JA entry, so JP iPhones saw it in English next to the already-translated "音声" (Voice) section. Adds 「実況設定」 and tightens "Lap announcer" navigation title from the older 「ラップ読み上げ」 to 「ラップ実況」 so the noun matches the rest of the FPV race vocabulary used in the app. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous wording promised clarity "over Bluetooth to goggles", but HDZero goggles don't accept Bluetooth audio — the iPhone speaker (or the operator's paired Bluetooth headphones) is the actual playback path. The JA "ゴーグル接続の Bluetooth" was an even rougher rendering of the same wrong claim. Now matches reality: "Clear over the iPhone speaker or Bluetooth headphones" / 「iPhone のスピーカーでも Bluetooth ヘッドホンでもクリアに聞こえます」. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Surface the monthly card first now. ¥450/month is a lower-friction entry than asking the operator to commit ¥4000 up front; the yearly card sits underneath so the savings story is still visible.
- Drop the 7-day free trial from both products in the local .storekit config and stop mentioning it in the auto-renew disclosure. We can reintroduce a trial later if conversion data suggests we need one; shipping without buys us the cleanest TestFlight → App Store path.
- "Restore Purchases" button reads 「購入の復元」 on JA builds. Apple Guideline 3.1.2 requires the affordance to be visible; the button was visible on JA already, just labelled in English.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The old Privacy Policy claimed "HDZap does not collect, transmit, or share any personal data", which is true for free-tier users but actively misleading for Premium subscribers, whose lap-call text is shipped to our Cloudflare backend and forwarded to AWS Polly / Azure / Cartesia for synthesis. App Store Review Guideline 3.1.2 + the App Privacy questionnaire require us to spell out exactly what leaves the device and who processes it.

- Rewrote the Privacy Policy with a "Free tier — no data leaves your device" section that still describes the unchanged free-tier behaviour, and a "HDZap Premium — what we send and to whom" section that names the three TTS providers, lists exactly what is and isn't in the payload (lap text, JWS subscription token; not name, email, Apple ID, or location), states the 30-day backend audio cache policy, and links each provider's own privacy page.
- Created the Japanese translation under /privacy/ja/ with the same structure and a language switcher at the top of both pages.
- CI deploy stages both EN and JA privacy under _site/privacy/ + _site/privacy/ja/ for both develop and main.
- PaywallView's Privacy link now picks the correct localised URL (https://hdzap.saqoo.sh/privacy/ja/ for JA, /privacy/ for EN) and fixes the stale `/privacy.html` typo that pointed at a 404.

Terms of Use stays as Apple's standard EULA per Guideline 3.1.2; linking the stdeula is the supported way for subscription apps that don't have a separate Terms doc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai · 2026-05-20T14:48:21Z

Warning

Rate limit exceeded

@Saqoosha has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 39 minutes and 6 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f0fa828a-4466-4080-ac53-0ea85e3e6006

📥 Commits

Reviewing files that changed from the base of the PR and between cf5cdd0 and 4b1d944.

📒 Files selected for processing (2)

.github/workflows/flasher.yml
app/project.yml

Walkthrough

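This PR adds a complete Premium subscription-based cloud text-to-speech feature to HDZap. Users manage their subscription with StoreKit 2 through SubscriptionManager; PremiumSpeechSynthesizer fetches audio from Cartesia/Polly/Azure and caches it locally; the Cloudflare Worker verifies the Apple JWS and proxies the TTS providers while caching responses in R2. The PR also includes a comprehensive Python test suite that compares ten cloud providers, plus privacy policy updates.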

Changes

Premium TTS Subscription and Cloud Speech Synthesis

Layer / File(s)	Summary
Subscription State Management and App Integration `app/HDZap/Models/IAP/SubscriptionManager.swift`, `app/HDZap/HDZapApp.swift`	StoreKit 2 subscription product definitions (monthly/yearly), SubscriptionStatus enum (unknown/none/active/inGracePeriod), MainActor Observable SubscriptionManager with product loading, purchase flow, entitlement refresh, JWS token management. HDZapApp integrates subscription into environment and wires JWS provider to PremiumSpeechSynthesizer.
Premium Speech Synthesizer and Local Cache `app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`, `app/HDZap/Models/Speech/TTSCache.swift`	PremiumVoiceProvider enum with rate/pitch support flags, PremiumVoiceOption and PremiumVoiceCatalog (Cartesia/Polly/Azure hand-curated voices in ja/en), full synthesizer with AVAudioEngine + AVAudioPlayerNode, SSE/PCM streaming support, resampling to 24kHz, pending buffer tracking, isPlaying state. TTSCache singleton manages ~Library/Caches/HDZapTTS disk storage, SHA-256 cache keys, LRU eviction up to 50MB cap.
LapAnnouncer Premium Integration and Routing `app/HDZap/Models/LapAnnouncer.swift`	LapAnnouncerDefaults extended with Premium engine key, voice ID, rate, pitch settings. speak() routes to PremiumSpeechSynthesizer.speakAsync when currentPremiumVoiceIfActive() returns a voice; cancel() cancels both system and premium paths. Japanese lap summary separator changed from space to punctuation mark.
Paywall and Product Management UI `app/HDZap/Views/Settings/PaywallView.swift`	PaywallView manages product loading (5s timeout), displays ProductCard per product, handles purchase with SubscriptionManager, auto-dismisses on isEntitled change. Voice preview samples (3 providers) playable inline. Locale-aware pricing, trial descriptions, restore button, legal links, auto-renew disclosure in ja/en. Helper views: ValueRow, ProductCard period/trial formatting, PremiumVoiceOption providerHint.
Premium Voice Picker and Selection UI `app/HDZap/Views/Settings/PremiumVoicePickerView.swift`	Organized by provider (Polly/Azure/Cartesia), separate select and preview buttons. Non-subscribers see banner with lock icon and CTA; preview allowed but selection deferred until paywall purchase. Playback state synchronized with isPlaying. Provider display names and hints localized ja/en.
Audio Settings Premium Engine and Paywall Integration `app/HDZap/Views/Settings/AudioSettingsView.swift`	Engine selector (System/Premium), AppStorage for ttsEngine, premiumLapVoiceId, premiumRate, premiumPitch. Language change clears both System and Premium defaults. Premium unavailable without subscription (displays "Premium—Subscribe ›"). Rate/Pitch sliders only shown in System mode. navigationDestination to voice picker, paywall sheet, onChange to rollback Premium if subscription expires.
Xcode Project and Build Configuration `app/HDZap.xcodeproj/project.pbxproj`, `app/HDZap.xcodeproj/xcshareddata/xcschemes/HDZap.xcscheme`, `app/project.yml`	PBXBuildFile/PBXFileReference entries for SubscriptionManager, PremiumSpeechSynthesizer, TTSCache, PaywallView, PremiumVoicePickerView, AudioSettingsView, BuildSecrets, HDZapPremium.storekit. PBXGroup tree with new IAP, Speech folders under Models, Resources/StoreKit under HDZap. PBXResourcesBuildPhase and PBXSourcesBuildPhase updated. Xcode scheme points to local StoreKit config. project.yml build integration.
Cloudflare Worker TTS Proxy and Caching `workers/hdzap-premium/src/index.ts`, `workers/hdzap-premium/src/appleJws.ts`	Hono app with POST /tts endpoint: extracts JWS or bearer token, verifies with verifyAppleJws, applies character limits (JWS 500, dev bearer 1000), validates/clamps provider/voice/lang/rate/pitch, SHA-256 cache key. R2 cache hit returns with Content-Type/sample-rate headers; miss proxies upstream (Cartesia SSE, Polly SigV4 SSML, Azure SSML) and tees to R2. Daily IP-based ratelimit in KV with waitUntil background execution. verifyAppleJws checks alg=ES256, x5c chain, App Root CA G3 fingerprint, payload bundleId/productId/revocationDate/expiresDate with gracePeriodMs tolerance.
TTS Provider Library and Individual Implementations `tts-quality-test/providers/base.py`, `tts-quality-test/providers/*.py` (10 modules)	Voice, Result, Provider Protocol, ProviderBase with env_key availability checks and voices_for() filtering. `@timed` decorator measures latency, retries HTTPStatusError 429/500-504 max 5x (Retry-After / exponential backoff+jitter). ElevenLabs (Flash/Multilingual), OpenAI (gpt-4o-mini-tts), AWS Polly (Neural), Google Cloud (Chirp3 HD), Gemini (2.5 Flash, base64 PCM→WAV), Cartesia (Sonic 3.5), fish.audio (S2Pro), Inworld (1.5 mini), Murf (Falcon), xAI Grok (grok-tts-1). Each implements voice list and synthesize().
TTS Quality Test Suite and Comparison Tools `tts-quality-test/tts_test.py`, `tts-quality-test/compare.html`, `tts-quality-test/scripts/*`	tts_test.py loads phrases.json, filters by --lang/--category/--provider/--voice-limit, skips unavailable providers, runs synthesize() via ThreadPoolExecutor, saves audio, aggregates to results.json. compare.html fetches results.json, filters language/category, displays phrase/candidate grid, 1–5 rating buttons stored to localStorage, CSV export. Utility scripts: measure_streaming_latency.py (direct/proxy mode, TTFA/chunk arrival gap stats, WAV output), test_number_formats.py (Japanese number format variants, MP3 synthesis, HTML comparison).
Test Phrases and Configuration `tts-quality-test/phrases.json`, `tts-quality-test/.env.example`, `tts-quality-test/.env.op`, `tts-quality-test/.gitignore`, `tts-quality-test/pyproject.toml`	phrases.json defines lap_short/lap_best/fixed_cues/final_summary categories with ja/en phrases. .env.example template for 10 provider API keys. .env.op references 1Password credentials. .gitignore excludes .env, outputs/, cache, virtual env, generated reports. pyproject.toml declares project metadata, Python >=3.11, dependencies (httpx, python-dotenv, boto3, google-genai, google-cloud-texttospeech).
TTS Quality Test Documentation `tts-quality-test/README.md`	Setup (uv sync), .env configuration (override warnings, empty-string skip), per-provider requirements table, execution examples (all/filtered runs), browser HTTPServer, rating UI (1–5 buttons, localStorage, CSV export), evaluation axes (number reading, intonation, naturalness), troubleshooting per provider, result interpretation (score average → production adoption).
Privacy Policy and GitHub Actions Updates `docs/manual/_pages/privacy/index.html`, `docs/manual/_pages/privacy/ja/index.html`, `.github/workflows/flasher.yml`, `.gitignore`	English privacy policy updated with free/Premium tier distinction, cloud provider list (Polly/Azure/Cartesia), data flow (Premium sends phrase + voice ID + JWS to backend), non-sent data (history), cache retention (30d), App Store billing (developer doesn't see payment info), Bluetooth (BLE direct, no third-party), children <13 excluded, policy update cadence. New Japanese privacy policy page (ja-JP) with same content translated. flasher.yml expanded to `mkdir -p dest/privacy/ja` and copy ja/index.html. .gitignore ignores BuildSecrets.swift (op injected).

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Saqoosha/HDZap#12: Both PRs modify the lap-time TTS infrastructure in LapAnnouncer.swift; #12 adds the system AVSpeechSynthesizer baseline, and this PR extends it with Premium synthesizer routing and cancellation paths.
Saqoosha/HDZap#67: Both PRs modify speech/audio control in LapAnnouncer.swift; this PR adds Premium synthesizer switching and cancellation, while #67 refactors speak-session handling for the countdown and final lap.

Poem

🐰 Premium voices now sing through clouds so high,
Subscriptions blooming, no subscription sigh,
From Cartesia, Polly, Azure they fly,
Cached locally so latency won't lie,
A quality test suite—ten providers unite! 🎵

coderabbitai

Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

app/HDZap/Views/Settings/AudioSettingsView.swift (1)
349-364: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

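Reset does not restore the Premium settings.

Only the system-side keys are reset here; the newly added ttsEngine / premiumLapVoiceId / premiumRate / premiumPitch are left as they are. If the button is meant to restore the initial state, the Premium @AppStorage values must be reset here as well, otherwise half of the settings remain.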
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/HDZap/Views/Settings/AudioSettingsView.swift` around lines 349 - 364, The
Reset button currently restores only the system Audio `@AppStorage` keys; also
reset the Premium AppStorage keys inside the same Button("Reset") closure:
assign ttsEngine, premiumLapVoiceId, premiumRate, and premiumPitch back to their
respective defaults (use the corresponding LapAnnouncerDefaults entries or the
canonical nil/default values you registered) so Premium settings are cleared
alongside lapTTSEnabled, ttsRate, ttsPitch, ttsLanguageRaw, voiceIdentifier,
announceBest, countdownEnabled and countdownStartSeconds.

🧹 Nitpick comments (1)

tts-quality-test/providers/base.py (1)
45-46: ⚡ Quick win

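Mutable class attributes on ProviderBase can cause shared-state bugs.

voices / env_keys can be unintentionally shared across instances and subclasses, so change them to an immutable container (tuple) or define them immutably in each subclass.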
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tts-quality-test/providers/base.py` around lines 45 - 46, ProviderBase
currently defines mutable class attributes voices: list[Voice] = [] and
env_keys: list[str] = [] which can be accidentally shared between
instances/subclasses; change these to immutable defaults (e.g. voices:
tuple[Voice, ...] = () and env_keys: tuple[str, ...] = ()) or move them to
instance attributes in ProviderBase.__init__ (self.voices = tuple(...) /
self.env_keys = tuple(...)) and update any subclasses to override with immutable
tuples; ensure type annotations match (tuple[...] with ellipsis) and remove
in-place mutations of these attributes elsewhere.
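
To illustrate the shared-state hazard this nitpick describes, here is a minimal sketch. `Voice`, `LeakyBase`, `SafeBase`, and their subclasses are hypothetical stand-ins, not the classes in `tts-quality-test/providers/base.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical stand-in for the real Voice type."""
    id: str


class LeakyBase:
    # One list object, created once and shared by every subclass and instance.
    voices: list[Voice] = []


class LeakyA(LeakyBase):
    pass


class LeakyB(LeakyBase):
    pass


class SafeBase:
    # A tuple cannot be mutated in place, so subclasses must rebind the name.
    voices: tuple[Voice, ...] = ()


class SafeA(SafeBase):
    voices = (Voice("a-1"),)


class SafeB(SafeBase):
    pass


# In-place mutation through one subclass leaks into its sibling.
LeakyA.voices.append(Voice("a-1"))
leaked = [v.id for v in LeakyB.voices]

# Rebinding a tuple on one subclass leaves the sibling untouched.
safe = [v.id for v in SafeB.voices]
```

With the list default, `LeakyB` sees the voice that was appended through `LeakyA`; with the tuple default, `SafeB` stays empty because `SafeA` had to rebind the attribute rather than mutate it.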

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/HDZap.xcodeproj/project.pbxproj`:
- Line 293: BuildSecrets.swift.tpl is a template and must not be bundled in the
app; remove its PBXBuildFile reference from the Resources build phase so it is
not included in the app bundle. Locate the PBXBuildFile / PBXResourcesBuildPhase
entry that references "BuildSecrets.swift.tpl" (the identifier shown as
"BuildSecrets.swift.tpl in Resources") in the project.pbxproj and remove that
build-file reference or clear its target membership, and ensure the file remains
in the repository (or a templates folder) but is not listed under any Resources
build phase for the app target.

In `@app/HDZap/Models/IAP/SubscriptionManager.swift`:
- Around line 87-102: Clear the lingering error by resetting lastError to nil
when operations succeed (or at start) in SubscriptionManager: set lastError =
nil either at the top of loadProducts() before the do block or immediately after
successfully assigning products (the products property) so previous errors are
not shown; apply the same fix to the corresponding restore/load methods
mentioned around lines 131-141 (the restore/restorePurchases-like method) to
ensure lastError is cleared on successful completion.

In `@app/HDZap/Models/Speech/BuildSecrets.swift.tpl`:
- Around line 8-20: Add a compile-time guard at the end of the
BuildSecrets.swift.tpl template to fail the build if the template was not
rendered: inside the same file (reference enum BuildSecrets and static let
workerBearer) add an `#if` !BUILD_SECRETS_RENDERED / `#error` (or Swift's
equivalent) block that emits a clear compile error message instructing to run
the renderer and not to ship the placeholder bearer; ensure the renderer defines
BUILD_SECRETS_RENDERED so the guard is bypassed once properly rendered.

In `@app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`:
- Around line 309-323: The current cancel call in speakAsync only invokes
currentTask?.cancel(), which leaves the previous playerNode's audio buffers
playing and allows a cancelled task to later overwrite lastError/isPlaying;
after currentTask?.cancel() explicitly stop and reset the audio player state
(call the playerNode’s stop/reset method and set isPlaying = false and clear
lastError on the MainActor) before starting the new Task, and inside the new
Task’s catch treat CancellationError as a no-op (if error is CancellationError
return) while only updating lastError/isPlaying for non-cancellation errors;
reference speakAsync, currentTask, playerNode, lastError, isPlaying, and
speak(text:lang:voice:) when making these changes.

In `@app/HDZap/Resources/StoreKit/HDZapPremium.storekit`:
- Around line 68-74: Update the Japanese "description" string in
HDZapPremium.storekit so the discount percentage matches the actual price math:
change "月額より 27% お得。" to "月額より 約26% お得。" (or equivalent wording) for the entry
whose "displayName" is "HDZap Premium 年額" to reflect ¥4,000 vs ¥5,400
accurately; ensure the English entry remains unchanged.
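
The discount figure can be checked with a few lines of arithmetic. This is only a sketch using the two prices stated in this PR (¥450 monthly, ¥4,000 yearly); the variable names are illustrative.

```python
MONTHLY_YEN = 450
YEARLY_YEN = 4_000

# Cost of twelve monthly renewals versus one yearly renewal.
twelve_months = MONTHLY_YEN * 12
saving = twelve_months - YEARLY_YEN

# Saving expressed as a percentage of the twelve-month cost.
discount_pct = saving / twelve_months * 100
```

Twelve months at the monthly price come to ¥5,400, so the yearly plan saves ¥1,400, which rounds to 26% rather than 27%.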

In `@app/HDZap/Views/Settings/PaywallView.swift`:
- Around line 233-244: Replace hard-coded English strings in PaywallView (e.g.,
the Button label Text("Manage Subscription") and the "free trial included."
string referenced around the later block) with localized keys following the
app's EN/JA localization pattern; update Text usages to use LocalizedStringKey
or NSLocalizedString (e.g., use Text(LocalizedStringKey("manage_subscription"))
or NSLocalizedString("manage_subscription", comment: "")) and add the
corresponding keys/values to the Localizable.strings for both English and
Japanese; apply the same change to the other occurrence mentioned (around lines
432-435) so all UI text follows the same localization approach.
- Around line 79-87: loadProductsWithTimeout currently awaits both the load task
and the timeout, so it doesn't race them; change it to race the loader and a
timeout so the spinner can stop after 5s. Implement a winner pattern inside
loadProductsWithTimeout: spawn two Tasks (one that runs
subscription.loadProducts() + subscription.refreshEntitlement() and one that
does Task.sleep(for: .seconds(5))), await the first task to complete (e.g.,
using a TaskGroup and calling group.next() to get the first result), cancel the
loser task, and then set productLoadState = subscription.products.isEmpty ?
.empty : .loaded; reference the function loadProductsWithTimeout and the methods
subscription.loadProducts() and subscription.refreshEntitlement() when making
the changes.

In `@docs/manual/_pages/privacy/index.html`:
- Around line 86-91: Update the "What is NOT sent" section to clarify IP
handling: change the list item "Location, IP geolocation, or device identifiers"
to state that IP addresses may be processed temporarily on the worker for
operational purposes such as IP-based rate limiting but are not used for
geolocation, long‑term storage, or to identify individuals; keep the assurance
that no persistent location or PII is sent or retained. Locate and edit the
heading "What is NOT sent" and the specific list item text to reflect this
distinction.

In `@docs/manual/_pages/privacy/ja/index.html`:
- Around line 86-91: Update the wording under the "送信されない内容" heading: replace
the absolute statement "位置情報、IP ジオロケーション、デバイス ID" with language that reflects
current implementation (e.g. note that IP may be transmitted via Premium
routes/Cloudflare Worker and is used for rate limiting, security, or delivery)
and add a brief clause about purpose and retention policy for IPs; edit the list
item text so it no longer asserts IPs are never sent and instead indicates
when/why IPs may be collected and how long they are retained.

In `@tts-quality-test/compare.html`:
- Around line 134-135: The code sets sel.innerHTML using cats.map(...) which
injects unescaped values into HTML and creates an XSS risk; change to building
option elements via DOM APIs instead: for each category in cats create an Option
or document.createElement("option"), set its value with setAttribute or
option.value and set its label/text with textContent (not innerHTML), then
appendChild to sel; apply the same fix to the other similar spots referenced
(the block around lines 193-199 that constructs provider/audio_path options).

In `@tts-quality-test/providers/aws_polly.py`:
- Around line 39-47: The synthesize_speech response handling in aws_polly.py
currently reads resp["AudioStream"] without closing it and relies on dict lookup
for LanguageCode that can raise KeyError; update the code around the
synthesize_speech call so you validate lang against the allowed map first and
raise a clear ValueError if invalid, then open and read the StreamingBody using
a context manager (e.g., with resp["AudioStream"] as stream: audio =
stream.read()) or ensure stream.close() in a finally block to prevent HTTP
connection leaks, and return Result(audio=audio, audio_format="mp3",
latency_ms=0) after the stream is closed.
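
One possible shape for the Polly fix, as a sketch only: `LANGUAGE_CODES`, `language_code_for`, and `read_audio` are hypothetical names, and the fake response below stands in for a real `synthesize_speech` result. `contextlib.closing` is used because it works with any object that has `close()`.

```python
import io
from contextlib import closing

# Hypothetical language map; the real one lives in providers/aws_polly.py.
LANGUAGE_CODES = {"ja": "ja-JP", "en": "en-US"}


def language_code_for(lang: str) -> str:
    """Validate lang up front so a bad value raises ValueError, not KeyError."""
    try:
        return LANGUAGE_CODES[lang]
    except KeyError:
        supported = sorted(LANGUAGE_CODES)
        raise ValueError(f"unsupported lang {lang!r}; expected one of {supported}") from None


def read_audio(resp: dict) -> bytes:
    """Read the whole AudioStream and close it even if read() raises."""
    with closing(resp["AudioStream"]) as stream:
        return stream.read()


# Stand-in for a synthesize_speech response: io.BytesIO has read() and close().
fake_stream = io.BytesIO(b"mp3-bytes")
audio = read_audio({"AudioStream": fake_stream})
```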

In `@tts-quality-test/providers/gemini.py`:
- Around line 59-67: The code currently places GEMINI_API_KEY in the request URL
query string; change it to send the key in the header instead by removing the
?key=... from the url construction (the variable url) and adding
"x-goog-api-key": os.environ["GEMINI_API_KEY"] to the headers dict used in the
client.post call; ensure the headers dict (in the post call in
providers/gemini.py) still includes "Content-Type": "application/json" and that
os.environ access is preserved or validated before the request.

In `@tts-quality-test/providers/murf.py`:
- Around line 47-53: Validate and handle audio response more robustly in the
block that reads payload["audioFile"]/["encodedAudio"]: ensure you check
presence of either field (audio_url = payload.get("audioFile") or
payload.get("encodedAudio")) and raise a clear exception if neither is present;
when fetching via client.get(audio_url) call raise_for_status() before using
.content to avoid returning an error page as MP3; when audio_url is not an http
URL assume it's base64 but validate decode errors and surface a clear exception
rather than returning broken data; update references in this logic (audio_url,
payload, client.get, Result) to implement these checks and error messages.
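
The Murf checks could look roughly like the following sketch. `extract_audio` and the injected `fetch` callable are hypothetical; in the real provider, `fetch` would wrap `client.get(...)` and call `raise_for_status()` before returning `.content`.

```python
import base64
import binascii
from typing import Callable


def extract_audio(payload: dict, fetch: Callable[[str], bytes]) -> bytes:
    """Return audio bytes from a response payload, failing loudly on bad data."""
    audio_ref = payload.get("audioFile") or payload.get("encodedAudio")
    if not audio_ref:
        raise ValueError("response has neither 'audioFile' nor 'encodedAudio'")

    if audio_ref.startswith(("http://", "https://")):
        # fetch is expected to raise on a non-2xx status instead of
        # returning an error page as if it were audio.
        return fetch(audio_ref)

    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(audio_ref, validate=True)
    except binascii.Error as exc:
        raise ValueError("audio payload is not valid base64") from exc


def fake_fetch(url: str) -> bytes:
    """Stand-in for an HTTP download that succeeded."""
    return b"downloaded:" + url.encode()


from_url = extract_audio({"audioFile": "https://example.com/a.mp3"}, fake_fetch)
from_b64 = extract_audio({"encodedAudio": base64.b64encode(b"abc").decode()}, fake_fetch)
```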

In `@tts-quality-test/README.md`:
- Around line 34-36: The fenced code block showing the environment variable
example currently uses plain backticks; change the opening fence to include a
language specifier (e.g., replace ``` with ```bash) so the block becomes a
bash-marked code fence, and scan README.md for any other fenced blocks to add
appropriate language tags to satisfy MD040 and improve rendering and lint
stability.

In `@tts-quality-test/scripts/test_number_formats.py`:
- Around line 5-6: Update the explanatory text to match the actual number of
variants (VARIANTS = 11): locate the VARIANTS constant and any user-facing
strings or docstrings that currently say "4 candidate" or "8 spellings" (and the
similar "150-150" occurrence) and change them to reflect "11" (e.g., "11
candidates" or "11 spellings") so the description matches the real data; ensure
all occurrences in test_number_formats.py referencing VARIANTS or those literal
phrases are updated consistently.

In `@tts-quality-test/tts_test.py`:
- Around line 173-176: The as_completed loop calls f.result() without handling
exceptions so one failing API call in synth_one aborts the whole run; modify the
as_completed loop (the futures list and for f in as_completed(futures) block) to
catch exceptions per-future and append an error row to rows (including
provider/voice/phrase or other identifying info) instead of raising, or
alternately wrap synth_one with try/except to return an error-indicating result;
ensure rows.append is used for both success and error cases so the worker
continues processing remaining futures.
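
A minimal sketch of per-future error handling, assuming each job is a small dict. `synth_one` here is a hypothetical stand-in that fails for one provider, and the row fields (`ok`, `error`) are illustrative rather than the real results.json schema.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def synth_one(job: dict) -> dict:
    """Hypothetical stand-in for the real synth_one; fails for one provider."""
    if job["provider"] == "broken":
        raise RuntimeError("upstream returned 500")
    return {**job, "ok": True}


def run_all(jobs: list[dict]) -> list[dict]:
    rows = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Map each future back to its job so a failure can be attributed.
        futures = {pool.submit(synth_one, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                rows.append(future.result())
            except Exception as exc:
                # Record the failure and keep draining the remaining futures.
                rows.append({**job, "ok": False, "error": str(exc)})
    return rows


rows = run_all([
    {"provider": "polly", "voice": "v1"},
    {"provider": "broken", "voice": "v2"},
    {"provider": "azure", "voice": "v3"},
])
```

Keeping the future-to-job mapping means the error row still carries the provider and voice, so a failed call shows up in the results instead of aborting the run.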

In `@workers/hdzap-premium/package.json`:
- Around line 10-13: Update the "hono" dependency in package.json from "^4.6.0"
to at least ">=4.12.18" to address known vulnerabilities; change the version
string for the "hono" entry, run your package manager (npm install or yarn) to
regenerate the lockfile, and verify the app by running the test suite and any
integration checks to ensure no breaking changes; keep "aws4fetch" and "jose"
unchanged.

In `@workers/hdzap-premium/src/index.ts`:
- Around line 449-463: The upstream fetch to "https://api.cartesia.ai/tts/sse"
(the const resp = await fetch(...) call) has no timeout and can hang; wrap this
fetch (and the other similar fetches at the indicated locations) with an
AbortController and a short configurable timeout (e.g., 3–10s), pass
controller.signal into fetch, clear the timer on success, and catch the
abort/timeout error to convert it into an explicit 502 response from your
handler (rather than letting the request hang). Locate the fetch calls by the
resp variable assignment (and the same pattern around lines 521-533 and 589-600)
and implement consistent timeout handling + mapping of timeout/AbortError to a
502 error response.
- Around line 157-164: The current code only logs a non-ok result from
consumeRateLimitToken inside c.executionCtx.waitUntil, so requests are never
blocked; change to call and await consumeRateLimitToken (the same function) in
the main request flow before proceeding and, if result.ok is false, immediately
return a 429 response (or otherwise terminate the request) instead of
continuing; keep any background waitUntil usage only for non-critical async
bookkeeping, but enforce the rate-limit check synchronously using
consumeRateLimitToken result to reject the request when needed.
- Around line 557-562: The azureGenderFor function incorrectly infers gender
(treating everything not matching /Daichi|Keita|Naoki/ as "Female") causing many
male voices from PremiumVoiceCatalog (e.g., en-US-DavisNeural, en-US-TonyNeural,
en-US-GuyNeural, en-US-JasonNeural, en-GB-RyanNeural) to be mis-tagged; replace
this heuristic by either (A) removing the xml:gender attribute entirely when
building SSML in the code paths that call azureGenderFor (since xml:gender is
optional) or (B) deriving gender from a concrete source such as the
PremiumVoiceCatalog mapping (lookup by voiceId) and returning the correct
"Male"/"Female" for azureGenderFor; update any SSML construction points to use
the catalog lookup or omit xml:gender, and remove the misleading comment that
states xml:gender is required.

---

Outside diff comments:
In `@app/HDZap/Views/Settings/AudioSettingsView.swift`:
- Around line 349-364: The Reset button currently restores only the system Audio
`@AppStorage` keys; also reset the Premium AppStorage keys inside the same
Button("Reset") closure: assign ttsEngine, premiumLapVoiceId, premiumRate, and
premiumPitch back to their respective defaults (use the corresponding
LapAnnouncerDefaults entries or the canonical nil/default values you registered)
so Premium settings are cleared alongside lapTTSEnabled, ttsRate, ttsPitch,
ttsLanguageRaw, voiceIdentifier, announceBest, countdownEnabled and
countdownStartSeconds.

---

Nitpick comments:
In `@tts-quality-test/providers/base.py`:
- Around line 45-46: ProviderBase currently defines mutable class attributes
voices: list[Voice] = [] and env_keys: list[str] = [] which can be accidentally
shared between instances/subclasses; change these to immutable defaults (e.g.
voices: tuple[Voice, ...] = () and env_keys: tuple[str, ...] = ()) or move them
to instance attributes in ProviderBase.__init__ (self.voices = tuple(...) /
self.env_keys = tuple(...)) and update any subclasses to override with immutable
tuples; ensure type annotations match (tuple[...] with ellipsis) and remove
in-place mutations of these attributes elsewhere.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b77a1089-4553-4f6e-8d2f-334382637155

📥 Commits

Reviewing files that changed from the base of the PR and between c1883db and cf5cdd0.

⛔ Files ignored due to path filters (2)

tts-quality-test/uv.lock is excluded by !**/*.lock
workers/hdzap-premium/bun.lock is excluded by !**/*.lock

📒 Files selected for processing (46)

.github/workflows/flasher.yml
.gitignore
app/HDZap.xcodeproj/project.pbxproj
app/HDZap.xcodeproj/xcshareddata/xcschemes/HDZap.xcscheme
app/HDZap/HDZapApp.swift
app/HDZap/Localizable.xcstrings
app/HDZap/Models/IAP/SubscriptionManager.swift
app/HDZap/Models/LapAnnouncer.swift
app/HDZap/Models/Speech/BuildSecrets.swift.tpl
app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift
app/HDZap/Models/Speech/TTSCache.swift
app/HDZap/Resources/StoreKit/HDZapPremium.storekit
app/HDZap/Views/Settings/AudioSettingsView.swift
app/HDZap/Views/Settings/PaywallView.swift
app/HDZap/Views/Settings/PremiumVoicePickerView.swift
app/project.yml
docs/manual/_pages/privacy/index.html
docs/manual/_pages/privacy/ja/index.html
tts-quality-test/.env.example
tts-quality-test/.env.op
tts-quality-test/.gitignore
tts-quality-test/README.md
tts-quality-test/compare.html
tts-quality-test/phrases.json
tts-quality-test/providers/__init__.py
tts-quality-test/providers/aws_polly.py
tts-quality-test/providers/base.py
tts-quality-test/providers/cartesia.py
tts-quality-test/providers/elevenlabs.py
tts-quality-test/providers/fish_audio.py
tts-quality-test/providers/gemini.py
tts-quality-test/providers/google_cloud.py
tts-quality-test/providers/inworld.py
tts-quality-test/providers/murf.py
tts-quality-test/providers/openai_tts.py
tts-quality-test/providers/xai_grok.py
tts-quality-test/pyproject.toml
tts-quality-test/scripts/measure_streaming_latency.py
tts-quality-test/scripts/test_number_formats.py
tts-quality-test/tts_test.py
workers/hdzap-premium/.gitignore
workers/hdzap-premium/package.json
workers/hdzap-premium/src/appleJws.ts
workers/hdzap-premium/src/index.ts
workers/hdzap-premium/tsconfig.json
workers/hdzap-premium/wrangler.toml

coderabbitai · 2026-05-20T14:59:08Z

 			buildActionMask = 2147483647;
 			files = (
 				29915B1AC56F75DC768200C1 /* Assets.xcassets in Resources */,
+				6A349E284EC8FC622DBD89B6 /* BuildSecrets.swift.tpl in Resources */,


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

BuildSecrets.swift.tpl should not be included in Resources.

The template is not needed at runtime, so exclude it from the app bundle (this reduces bundle size and limits information exposure).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap.xcodeproj/project.pbxproj` at line 293, BuildSecrets.swift.tpl is a template and must not be bundled in the app; remove its PBXBuildFile reference from the Resources build phase so it is not included in the app bundle. Locate the PBXBuildFile / PBXResourcesBuildPhase entry that references "BuildSecrets.swift.tpl" (the identifier shown as "BuildSecrets.swift.tpl in Resources") in the project.pbxproj and remove that build-file reference or clear its target membership, and ensure the file remains in the repository (or a templates folder) but is not listed under any Resources build phase for the app target.

coderabbitai · 2026-05-20T14:59:08Z

+    func loadProducts() async {
+        do {
+            let fetched = try await Product.products(for: SubscriptionProductID.all)
+            // Surface monthly first — it's the lower-friction entry point (¥450 vs ¥4000)
+            // and the yearly card sits underneath it where the discount story is still
+            // visible without forcing the bigger commitment to be the first thing the
+            // operator reads.
+            products = fetched.sorted { lhs, rhs in
+                lhs.id == SubscriptionProductID.monthly && rhs.id == SubscriptionProductID.yearly
+            }
+            log.notice("loaded \(self.products.count, privacy: .public) products: \(self.products.map { $0.id }.joined(separator: ", "), privacy: .public)")
+        } catch {
+            log.error("loadProducts failed: \(error.localizedDescription, privacy: .public)")
+            lastError = "Couldn't load subscription products: \(error.localizedDescription)"
+        }
+    }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

lastError is not cleared on success, so a stale error message stays on screen.

Even when a load or restore succeeds, the message from the previous failure remains, so the UI keeps showing a failure state by mistake. Set lastError = nil when the operation starts or once success is confirmed.

Suggested fix (example)

 func loadProducts() async {
+    lastError = nil
     do {
         let fetched = try await Product.products(for: SubscriptionProductID.all)
         products = fetched.sorted { lhs, rhs in
             lhs.id == SubscriptionProductID.monthly && rhs.id == SubscriptionProductID.yearly
         }
+        lastError = nil
         log.notice("loaded \(self.products.count, privacy: .public) products: \(self.products.map { $0.id }.joined(separator: ", "), privacy: .public)")
     } catch {
         log.error("loadProducts failed: \(error.localizedDescription, privacy: .public)")
         lastError = "Couldn't load subscription products: \(error.localizedDescription)"
     }
 }

 func restore() async {
+    lastError = nil
     do {
         try await AppStore.sync()
         await refreshEntitlement()
         if !isEntitled {
             lastError = "No active subscription found on this Apple ID."
+        } else {
+            lastError = nil
         }
     } catch {
         log.error("restore failed: \(error.localizedDescription, privacy: .public)")
         lastError = "Restore failed: \(error.localizedDescription)"
     }
 }

Also applies to: 131-141

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap/Models/IAP/SubscriptionManager.swift` around lines 87 - 102, Clear the lingering error by resetting lastError to nil when operations succeed (or at start) in SubscriptionManager: set lastError = nil either at the top of loadProducts() before the do block or immediately after successfully assigning products (the products property) so previous errors are not shown; apply the same fix to the corresponding restore/load methods mentioned around lines 131-141 (the restore/restorePurchases-like method) to ensure lastError is cleared on successful completion.

coderabbitai · 2026-05-20T14:59:08Z

+/// stub at the bottom (`#if !BUILD_SECRETS_RENDERED`) is a compile guard so a fresh checkout
+/// fails fast with a clear message instead of silently shipping an empty bearer.
+///
+/// Long-term plan: replace this baked-in bearer with per-user JWS verification in the
+/// Worker (see [`workers/hdzap-premium/src/index.ts`](../../../../workers/hdzap-premium/src/index.ts)
+/// for the planned `verifyAppleJws` hook). Until that lands, every subscriber ships with
+/// the same dev bearer — fine for TestFlight, not fine for App Store release.
+enum BuildSecrets {
+    /// Bearer the Worker validates via `Authorization: Bearer <value>`. Currently the Worker
+    /// gates `/tts` on a single shared dev bearer; once JWS verification is in place this
+    /// becomes irrelevant (the StoreKit JWS replaces it on the wire).
+    static let workerBearer = "{{ op://Personal/HDZap Worker Dev Bearer/credential }}"
+}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

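The "unrendered template" guard described in the comment is not implemented.

Lines 8-9 assume a compile guard exists, but there is none at the end of the actual template. Implement a guard that reliably stops the build when the rendering step has been skipped.

Suggested fix (example)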

 import Foundation
@@
 enum BuildSecrets {
@@
     static let workerBearer = "{{ op://Personal/HDZap Worker Dev Bearer/credential }}"
 }
+
+#if !BUILD_SECRETS_RENDERED
+#error("BuildSecrets.swift is not rendered. Run: op inject -i HDZap/Models/Speech/BuildSecrets.swift.tpl -o HDZap/Models/Speech/BuildSecrets.swift")
+#endif

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap/Models/Speech/BuildSecrets.swift.tpl` around lines 8 - 20, Add a compile-time guard at the end of the BuildSecrets.swift.tpl template to fail the build if the template was not rendered: inside the same file (reference enum BuildSecrets and static let workerBearer) add an `#if` !BUILD_SECRETS_RENDERED / `#error` (or Swift's equivalent) block that emits a clear compile error message instructing to run the renderer and not to ship the placeholder bearer; ensure the renderer defines BUILD_SECRETS_RENDERED so the guard is bypassed once properly rendered.

coderabbitai · 2026-05-20T14:59:08Z

+    /// Fire-and-forget version of `speak(text:lang:voice:)` for `Button` action callbacks.
+    func speakAsync(text: String, lang: String, voice: PremiumVoiceOption) {
+        log.notice("speakAsync invoked: text=\"\(text, privacy: .public)\" provider=\(voice.provider.rawValue, privacy: .public) voice=\(voice.id, privacy: .public) lang=\(voice.lang, privacy: .public)")
+        currentTask?.cancel()
+        currentTask = Task { [weak self] in
+            do {
+                try await self?.speak(text: text, lang: voice.lang, voice: voice)
+            } catch {
+                await MainActor.run { [weak self] in
+                    self?.lastError = error.localizedDescription
+                    self?.isPlaying = false
+                    log.error("speak failed: \(error.localizedDescription, privacy: .public)")
+                }
+            }
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

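The previous playback is not fully stopped before a new one starts.

Only currentTask?.cancel() is called here, so buffers already queued on the previous playerNode keep playing. Rapid taps make the audio overlap, and the old task's CancellationError can later overwrite lastError / isPlaying. It is safer to reset the whole playback state with cancel() before starting, and to ignore CancellationError as part of the normal flow.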

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift` around lines 309 - 323, The current cancel call in speakAsync only invokes currentTask?.cancel(), which leaves the previous playerNode's audio buffers playing and allows a cancelled task to later overwrite lastError/isPlaying; after currentTask?.cancel() explicitly stop and reset the audio player state (call the playerNode’s stop/reset method and set isPlaying = false and clear lastError on the MainActor) before starting the new Task, and inside the new Task’s catch treat CancellationError as a no-op (if error is CancellationError return) while only updating lastError/isPlaying for non-cancellation errors; reference speakAsync, currentTask, playerNode, lastError, isPlaying, and speak(text:lang:voice:) when making these changes.

coderabbitai · 2026-05-20T14:59:08Z

+              "description" : "Cloud-powered realistic AI announcer voices for lap times. Save 27% versus monthly billing.",
+              "displayName" : "HDZap Premium Yearly",
+              "locale" : "en_US"
+            },
+            {
+              "description" : "クラウド駆動のリアルな AI 実況音声でラップタイム読み上げ。月額より 27% お得。",
+              "displayName" : "HDZap Premium 年額",


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

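The yearly discount figure of "27%" does not match the prices.

Monthly ¥450 × 12 = ¥5,400, and the yearly price of ¥4,000 is about 25.9% off, so wording such as "about 26%" would be accurate.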

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap/Resources/StoreKit/HDZapPremium.storekit` around lines 68 - 74, Update the Japanese "description" string in HDZapPremium.storekit so the discount percentage matches the actual price math: change "月額より 27% お得。" to "月額より約26% お得。" (or equivalent wording) for the entry whose "displayName" is "HDZap Premium 年額" to reflect ¥4,000 vs ¥5,400 accurately; ensure the English entry remains unchanged.
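
The arithmetic behind this finding can be checked with a minimal sketch. The helper name `yearlyDiscountPercent` is hypothetical and not part of the PR; the prices are the ones listed in the PR summary (monthly ¥450, yearly ¥4,000).

```typescript
// Hypothetical helper (not in the PR): percentage saved by paying once per year
// instead of making twelve monthly payments.
function yearlyDiscountPercent(monthlyPrice: number, yearlyPrice: number): number {
  const twelveMonths = monthlyPrice * 12;
  return ((twelveMonths - yearlyPrice) / twelveMonths) * 100;
}

// Prices taken from the PR summary: monthly ¥450, yearly ¥4,000.
const discount = yearlyDiscountPercent(450, 4000);
console.log(discount.toFixed(1));  // prints "25.9"
console.log(Math.round(discount)); // prints 26
```

Deriving the advertised figure from the two prices, rather than hard-coding it in the copy, would keep the description in step with any later price change.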

coderabbitai · 2026-05-20T14:59:09Z

+        futures = [ex.submit(synth_one, prov, voice, phrase) for prov, voice, phrase in jobs]
+        for f in as_completed(futures):
+            rows.append(f.result())
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

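A single API failure aborts all jobs.

Because exceptions from f.result() are not handled, one failing cell brings down the whole run. Record an error row for that cell and continue (use try/except inside synth_one, or catch the exception on the as_completed side).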

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tts-quality-test/tts_test.py` around lines 173 - 176, The as_completed loop calls f.result() without handling exceptions so one failing API call in synth_one aborts the whole run; modify the as_completed loop (the futures list and for f in as_completed(futures) block) to catch exceptions per-future and append an error row to rows (including provider/voice/phrase or other identifying info) instead of raising, or alternately wrap synth_one with try/except to return an error-indicating result; ensure rows.append is used for both success and error cases so the worker continues processing remaining futures.

coderabbitai · 2026-05-20T14:59:09Z

+  "dependencies": {
+    "aws4fetch": "^1.0.20",
+    "hono": "^4.6.0",
+    "jose": "^6.2.3"


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: check npm dependencies for known vulnerabilities
cd workers/hdzap-premium

# Confirm that package.json exists
if [ ! -f "package.json" ]; then
  echo "package.json not found"
  exit 1
fi

# Check each package against the GitHub Advisory DB
for pkg in aws4fetch hono jose; do
  echo "Checking $pkg..."
  gh api graphql -f query="
  {
    securityVulnerabilities(first: 5, ecosystem: NPM, package: \"$pkg\") {
      nodes {
        advisory { summary severity publishedAt }
        vulnerableVersionRange
        firstPatchedVersion { identifier }
      }
    }
  }"
done

Repository: Saqoosha/HDZap

Length of output: 2771

Upgrade Hono to 4.12.18 or later

The following known vulnerabilities were detected:

hono 4.6.0 has 5 vulnerabilities

CSS Declaration Injection (MODERATE)

Insufficient validation of NumericDate claims (LOW)

Cross-user cache leak via Cache Middleware (MODERATE)

bodyLimit() bypass (MODERATE)

HTML injection via JSX tag names (MODERATE)

aws4fetch and jose have no known vulnerabilities at their current versions

Update Hono to 4.12.18 or later.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/hdzap-premium/package.json` around lines 10 - 13, Update the "hono" dependency in package.json from "^4.6.0" to at least ">=4.12.18" to address known vulnerabilities; change the version string for the "hono" entry, run your package manager (npm install or yarn) to regenerate the lockfile, and verify the app by running the test suite and any integration checks to ensure no breaking changes; keep "aws4fetch" and "jose" unchanged.

coderabbitai · 2026-05-20T14:59:09Z

+  c.executionCtx.waitUntil((async () => {
+    const result = await consumeRateLimitToken(c.env.RATELIMIT, ip, cap);
+    if (!result.ok) {
+      console.warn("rate-limit-exceeded (background, not enforced this request)", {
+        ip, authMode, count: result.count, cap,
+      });
+    }
+  })());


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

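The rate limit never actually rejects a request.

The result of consumeRateLimitToken() is only used for a warning log inside waitUntil, so the request always goes through even when ok == false. The current behavior is not "allow a small overage" but "only observe overages", so it cannot stop a leaked bearer or a burst of requests from a single IP.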

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/hdzap-premium/src/index.ts` around lines 157 - 164, The current code only logs a non-ok result from consumeRateLimitToken inside c.executionCtx.waitUntil, so requests are never blocked; change to call and await consumeRateLimitToken (the same function) in the main request flow before proceeding and, if result.ok is false, immediately return a 429 response (or otherwise terminate the request) instead of continuing; keep any background waitUntil usage only for non-critical async bookkeeping, but enforce the rate-limit check synchronously using consumeRateLimitToken result to reject the request when needed.
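
A minimal sketch of the enforcement this finding asks for, under stated assumptions: `CounterStore` and `MemoryStore` are hypothetical stand-ins for the Worker's KV binding, the body of `consumeRateLimitToken` is a guess (the PR does not show it), and `handleTts` returns a plain object instead of a real `Response` so the example runs outside the Workers runtime.

```typescript
// Hypothetical stand-in for the KV namespace; the real Worker uses a Cloudflare KV binding.
interface CounterStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

class MemoryStore implements CounterStore {
  private data = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }
  async put(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

interface RateLimitResult {
  ok: boolean;
  count: number;
}

// Assumed implementation: increments a per-IP counter and compares it to the cap.
// A daily limit would also put the date in the key; omitted here for brevity.
async function consumeRateLimitToken(
  store: CounterStore,
  ip: string,
  cap: number,
): Promise<RateLimitResult> {
  const key = `rl:${ip}`;
  const count = Number((await store.get(key)) ?? "0") + 1;
  await store.put(key, String(count));
  return { ok: count <= cap, count };
}

// The check is awaited on the request path, so an over-cap request is rejected
// with 429 before any upstream (billable) provider call is made.
async function handleTts(
  store: CounterStore,
  ip: string,
  cap: number,
): Promise<{ status: number; body: string }> {
  const result = await consumeRateLimitToken(store, ip, cap);
  if (!result.ok) {
    return { status: 429, body: "rate-limit-exceeded" };
  }
  return { status: 200, body: "audio" };
}
```

Awaiting the counter adds a KV round trip to the hot path, which is the latency the PR moved into waitUntil to avoid. The read-then-write here is also not atomic, so a burst of concurrent requests can still slip slightly past the cap.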

coderabbitai · 2026-05-20T14:59:09Z

+  const resp = await fetch("https://api.cartesia.ai/tts/sse", {
+    method: "POST",
+    headers: {
+      "X-API-Key": apiKey,
+      "Cartesia-Version": "2024-11-13",
+      "Content-Type": "application/json",
+    },
+    body: JSON.stringify({
+      model_id: model,
+      transcript: text,
+      voice: { mode: "id", id: voiceId },
+      output_format: { container: "raw", encoding: "pcm_s16le", sample_rate: 24000 },
+      language: lang,
+    }),
+  });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

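Upstream calls have no timeout.

All three providers call fetch / aws.fetch with an effectively unbounded wait, so when an upstream hangs, /tts itself hangs for a long time. For an announcement path used during a race, the UX will be quite unstable without a short timeout and an explicit conversion to 502.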

Also applies to: 521-533, 589-600

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/hdzap-premium/src/index.ts` around lines 449 - 463, The upstream fetch to "https://api.cartesia.ai/tts/sse" (the const resp = await fetch(...) call) has no timeout and can hang; wrap this fetch (and the other similar fetches at the indicated locations) with an AbortController and a short configurable timeout (e.g., 3–10s), pass controller.signal into fetch, clear the timer on success, and catch the abort/timeout error to convert it into an explicit 502 response from your handler (rather than letting the request hang). Locate the fetch calls by the resp variable assignment (and the same pattern around lines 521-533 and 589-600) and implement consistent timeout handling + mapping of timeout/AbortError to a 502 error response.
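
The timeout handling described above can be sketched as follows. This is an illustration, not the PR's code: `proxyWithTimeout` and `fakeUpstream` are hypothetical names, the result is a plain object rather than a `Response`, and the fake upstream stands in for a real `fetch(url, { signal })` call.

```typescript
type ProxyResult = { status: number; body: string };

// Runs an abortable upstream call under a deadline and maps any failure to 502.
// In the Worker, callUpstream would wrap fetch(url, { ...init, signal }).
async function proxyWithTimeout(
  callUpstream: (signal: AbortSignal) => Promise<string>,
  timeoutMs: number,
): Promise<ProxyResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const body = await callUpstream(controller.signal);
    return { status: 200, body };
  } catch {
    // Distinguish our own deadline from other upstream failures for logging.
    const reason = controller.signal.aborted ? "upstream-timeout" : "upstream-error";
    return { status: 502, body: reason };
  } finally {
    clearTimeout(timer); // never leave the timer pending after the call settles
  }
}

// Hypothetical upstream used only for demonstration: resolves after delayMs
// unless the signal aborts first, which mirrors how fetch reacts to an abort.
function fakeUpstream(delayMs: number): (signal: AbortSignal) => Promise<string> {
  return (signal) =>
    new Promise<string>((resolve, reject) => {
      const t = setTimeout(() => resolve("pcm"), delayMs);
      signal.addEventListener("abort", () => {
        clearTimeout(t);
        reject(new Error("aborted"));
      });
    });
}
```

Sharing one wrapper across the Polly, Azure, and Cartesia paths would keep the timeout value and the 502 mapping consistent between providers.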

coderabbitai · 2026-05-20T14:59:09Z

+/** Azure JA-Neural voices are gender-tagged by name suffix — Daichi/Keita/Naoki are male, the
+ *  rest of the catalog is female. The SSML `xml:gender` attribute is required for some voices,
+ *  otherwise Azure returns 400. */
+function azureGenderFor(voiceId: string): "Male" | "Female" {
+  return /Daichi|Keita|Naoki/.test(voiceId) ? "Male" : "Female";
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

cd workers/hdzap-premium && find . -name "index.ts" -o -name "*catalog*" | head -20

Repository: Saqoosha/HDZap

Length of output: 73

🏁 Script executed:

cd workers/hdzap-premium/src && wc -l index.ts

Repository: Saqoosha/HDZap

Length of output: 71

🏁 Script executed:

cd workers/hdzap-premium/src && sed -n '550,590p' index.ts

Repository: Saqoosha/HDZap

Length of output: 1773

🏁 Script executed:

cd workers/hdzap-premium/src && rg -i "PremiumVoiceCatalog" --max-count 5

Repository: Saqoosha/HDZap

Length of output: 40

🏁 Script executed:

cd workers/hdzap-premium/src && sed -n '575,590p' index.ts

Repository: Saqoosha/HDZap

Length of output: 1034

🏁 Script executed:

cd workers/hdzap-premium/src && sed -n '575,595p' index.ts

Repository: Saqoosha/HDZap

Length of output: 1181

🏁 Script executed:

cd workers/hdzap-premium/src && rg -i "davis|voice" -A 2 -B 2 | head -50

Repository: Saqoosha/HDZap

Length of output: 2783

🏁 Script executed:

cd workers/hdzap-premium/src && rg -i "catalog|voices|voice.*catalog" --max-count 20

Repository: Saqoosha/HDZap

Length of output: 567

🏁 Script executed:

cd workers/hdzap-premium/src && sed -n '580,585p' index.ts

Repository: Saqoosha/HDZap

Length of output: 314

🏁 Script executed:

cd workers/hdzap-premium && find . -type f \( -name "*.ts" -o -name "*.js" -o -name "*.json" \) | xargs rg -l "DavisNeural|catalog|en-US.*voice" 2>/dev/null | head -10

Repository: Saqoosha/HDZap

Length of output: 73

🏁 Script executed:

cd workers/hdzap-premium && rg -i "gender|male|female" --max-count 30

Repository: Saqoosha/HDZap

Length of output: 575

🏁 Script executed:

cd workers/hdzap-premium && find . -type f \( -name "*.ts" -o -name "*.js" -o -name "*.json" \) | xargs rg -l "voice" 2>/dev/null

Repository: Saqoosha/HDZap

Length of output: 73

🏁 Script executed:

cd workers/hdzap-premium && rg "DavisNeural|Norton|en-US.*Neural"

Repository: Saqoosha/HDZap

Length of output: 40

🏁 Script executed:

find . -path ./node_modules -prune -o -type f \( -name "*.ts" -o -name "*.js" \) -print | xargs rg -l "PremiumVoiceCatalog" 2>/dev/null

Repository: Saqoosha/HDZap

Length of output: 42

🏁 Script executed:

cd / && find . -path ./node_modules -prune -o -type f \( -name "*.ts" -o -name "*.js" \) -print 2>/dev/null | xargs rg -l "PremiumVoiceCatalog|DavisNeural" 2>/dev/null | head -20

Repository: Saqoosha/HDZap

Length of output: 40

🏁 Script executed:

rg "DavisNeural|Norton|Male.*voice|voice.*Male" --type ts --type js | head -30

Repository: Saqoosha/HDZap

Length of output: 255

🏁 Script executed:

rg -i "en-us.*neural|azure.*voice" --max-count 20

Repository: Saqoosha/HDZap

Length of output: 2159

🏁 Script executed:

cd workers/hdzap-premium/src && rg -B5 -A5 "gender" | head -60

Repository: Saqoosha/HDZap

Length of output: 1735

🏁 Script executed:

cat app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift | grep -A 1 -B 1 "en-US.*Neural"

Repository: Saqoosha/HDZap

Length of output: 992

🌐 Web query:

Azure SSML voice element xml:gender attribute required

💡 Result:

The xml:gender attribute of the Azure SSML <voice> element is not required. Microsoft’s Azure Speech SSML documentation lists attributes (name, xml:lang, role, effect, etc.) and shows xml:gender in examples, but it is not marked as a required attribute; the only required SSML root attributes for the document are version, xmlns, and xml:lang, and the <speak> element must contain at least one <voice> element [1]. In addition, Microsoft examples explicitly use while still treating other selection attributes like role as optional, indicating gender is used as an optional qualifier rather than a mandatory field [2]. So: provide xml:gender only if you need to request a preferred voice gender; otherwise you can omit it (select the voice via name, and set xml:lang for language) [2][1].

Citations:

1: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-structure

2: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice

🏁 Script executed:

cd workers/hdzap-premium/src && sed -n '557,625p' index.ts | head -80

Repository: Saqoosha/HDZap

Length of output: 2612

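The Azure gender inference disagrees with the catalog. Also, xml:gender is not actually required.

PremiumVoiceCatalog defines several English male Azure voices, including en-US-DavisNeural, en-US-TonyNeural, en-US-GuyNeural, en-US-JasonNeural, and en-GB-RyanNeural, but azureGenderFor() treats everything other than Daichi|Keita|Naoki as Female, so these male voices are sent as female.

The comment also says that xml:gender is required, but in Microsoft's SSML documentation xml:gender is an optional attribute. For the <voice> element it is enough to select the voice with name and set the language with xml:lang. The gender can either be omitted or looked up correctly from the catalog, and both are safer than the current guess-based approach.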

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/hdzap-premium/src/index.ts` around lines 557 - 562, The azureGenderFor function incorrectly infers gender (treating everything not matching /Daichi|Keita|Naoki/ as "Female") causing many male voices from PremiumVoiceCatalog (e.g., en-US-DavisNeural, en-US-TonyNeural, en-US-GuyNeural, en-US-JasonNeural, en-GB-RyanNeural) to be mis-tagged; replace this heuristic by either (A) removing the xml:gender attribute entirely when building SSML in the code paths that call azureGenderFor (since xml:gender is optional) or (B) deriving gender from a concrete source such as the PremiumVoiceCatalog mapping (lookup by voiceId) and returning the correct "Male"/"Female" for azureGenderFor; update any SSML construction points to use the catalog lookup or omit xml:gender, and remove the misleading comment that states xml:gender is required.
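
Both options from this finding can be sketched briefly. Everything below is illustrative: `AZURE_VOICE_GENDER` is a hypothetical excerpt that lists only the voices named in the review (the real data would come from PremiumVoiceCatalog), and `buildAzureSsml` and `escapeXml` are assumed helper names, not functions from the PR.

```typescript
// Option B: look the gender up instead of guessing from the voice name.
// Hypothetical excerpt; the review names these English voices as male.
const AZURE_VOICE_GENDER: Record<string, "Male" | "Female"> = {
  "en-US-DavisNeural": "Male",
  "en-US-TonyNeural": "Male",
  "en-US-GuyNeural": "Male",
  "en-US-JasonNeural": "Male",
  "en-GB-RyanNeural": "Male",
};

// Returns undefined for voices missing from the table rather than defaulting to "Female".
function azureGenderFor(voiceId: string): "Male" | "Female" | undefined {
  return AZURE_VOICE_GENDER[voiceId];
}

// Escape characters that are significant in XML so the transcript cannot break the SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Option A: omit xml:gender entirely and select the voice by name and xml:lang only.
function buildAzureSsml(voiceId: string, lang: string, text: string): string {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${lang}">` +
    `<voice name="${voiceId}">${escapeXml(text)}</voice>` +
    `</speak>`
  );
}

console.log(buildAzureSsml("en-US-DavisNeural", "en-US", "Lap 3"));
```

Option A removes the heuristic altogether, so a newly added voice cannot be mis-tagged; Option B only makes sense if some other part of the Worker needs the gender.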

The Web Flasher workflow checks out both main and develop on every push, then composes a single Pages artefact. The JA privacy page was added on this PR (targeting develop) but doesn't exist on main yet, so `cp` failed when staging the main side. Guard the JA-privacy copy with a file check and only mkdir the destination when the source is present — main's slot omits /privacy/ja/ until the next release rolls develop into main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

MARKETING_VERSION 1.0.1 → 1.1.0 (new Premium subscription feature) CURRENT_PROJECT_VERSION 11 → 12 (unique build per Apple) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Critical regressions from the previous fix commit:

- Premium speakAsync re-entry races: the catch handler called `self.cancel()` unconditionally, which would tear down the NEW currentTask if a fresh speakAsync replaced the cancelled one before its catch ran. Catch now captures the generation token at speakAsync entry and bails when the generation has been bumped (i.e., we've been superseded).
- Inflight counter leak: `notifyEnd`'s `guard isPlaying` swallowed the end notification when `speak()` threw BEFORE setting `isPlaying = true` (`bearer.isEmpty`, bad URL, `configureSession()` failure). The companion fix moves `isPlaying = true` to `speakAsync` BEFORE spawning the Task so any early error still produces a clean end notification, and notifyEnd no longer guards on isPlaying; it fires pendingOnEnd at most once per speakAsync invocation regardless of whether playback actually started.
- Picker / paywall preview counter underflow: PaywallView and PremiumVoicePickerView call `announcer.premiumSynth.speakAsync(...)` directly, bypassing `LapAnnouncer.speak`'s inflight increment. The global `onUtteranceEnd` callback that fix-PR-#73 wired into LapAnnouncer would have fired `utteranceDidEnd()` on every preview end, driving the counter negative (debug crash via underflow assertion; release lifecycle corruption). Replaced the global callback with a per-call `onEnd:` parameter on speakAsync: LapAnnouncer.speak passes a decrement closure, preview surfaces don't, so previews neither increment nor decrement.
- Premium end now also calls `deactivateSession()` symmetric with the System path's `didFinish` / `didCancel` delegate; PR-#73's wiring only decremented the counter, leaving the audio session active until the next System utterance happened to deactivate it.

Worker:

- `provider` defaulting to `"cartesia"` on missing field forwarded malformed requests to a real (billable) upstream call. Reject with `400 missing-provider` instead.
- Content-type guard moved INSIDE each proxy function so it checks the actual provider response, not Worker-wrapped headers. The previous placement read our own normalized Content-Type and was a no-op against the original concern (Azure maintenance HTML, Cartesia plaintext quota notice). Polly accepts `audio/*` or `application/*; ...pcm...` because AWS has tweaked the type string in the past.
- Cert-chain DER comparison uses `Buffer.from(...).equals(...)` so it works whether `X509Certificate.raw` is a Buffer or an ArrayBuffer (Workers `nodejs_compat` has varied between releases). Indexing a raw ArrayBuffer with `[i]` returns undefined and would have made every Premium JWS reject.

Misc:

- TTSCache capacity-estimate comment matched the new raw-PCM layout (32-48 KB/sec depending on sample rate) instead of the old mp3 sizing.
- Comment polish per the comment-analyzer (drop dead `_ = error`, trim "previous behavior" historical reference in refreshEntitlement, tighten verifyChain doc parenthetical, remove `void userId` dead binding; userId is genuinely used in the auth log line above).
- `cancel()` doc calls out the `&+=` overflow-wrap choice explicitly so a future reader doesn't read it as a hand-rolled bug.

Verification: `xcodebuild build -configuration Debug` succeeds locally.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Fix multi-agent review findings from PR #73

Aggregated Critical/High findings from /total-review (7 agents) on the Premium TTS subscription PR. Bundled into one atomic fix commit.

Worker (workers/hdzap-premium):

- appleJws.ts: anchor the x5c chain to the embedded Apple Root CA G3 by DER equality (not just fingerprint match), and check every cert's validity window so expired intermediates can't pass. Closes the bypass where an attacker-controlled chain with the public Apple Root tacked on top would satisfy the previous fingerprint-only check.
- index.ts: drop the wrong "X-HDZap-SampleRate: 22050" header on the Polly miss path (Polly Neural PCM is always 16 kHz; the 22050 leftover would pitch-shift cache hits 37% if iOS ever started honouring the header).
- index.ts: reject upstream 200 responses whose Content-Type isn't audio/* (Polly/Azure) or text/event-stream (Cartesia). Without this an Azure maintenance page or Cartesia plaintext quota notice would be served to iOS as PCM and play as static.
- Drop stale "backwards compat with iOS build that doesn't yet send provider" comment + clarify the deferred `userId` binding.

iOS Speech (app/HDZap/Models/Speech, app/HDZap/Models):

- PremiumSpeechSynthesizer: speakAsync() now tears down audio via cancel() before spawning a new task. Previously only Task.cancel() ran, which left the player node + AVAudioEngine running and the prior utterance's tail bled into the new voice's head.
- PremiumSpeechSynthesizer: cancel() now stops the engine and bumps a per-utterance generation counter; scheduled-buffer completion callbacks ignore decrements whose captured generation no longer matches, so a stale callback from a superseded utterance can't flip isPlaying false on a fresh one. cancel() also resets accumulatedPCM so a previous partial stream can't bleed into the next entry.
- PremiumSpeechSynthesizer: new onUtteranceEnd callback fires exactly once per utterance end (drain / cancel / error). LapAnnouncer wires it to utteranceDidEnd() so Premium playback now participates in the same inflight-utterance bookkeeping the System path always had; the warm-keeper engine and session-hold lifecycle were silently broken on the Premium branch (inflightUtteranceCount stayed at 0 forever).
- PremiumSpeechSynthesizer: trim odd trailing byte before writing PCM to TTSCache (the cached file would otherwise have a 0.5-sample garbage tail that played back as a click on every cache hit).
- PremiumSpeechSynthesizer: throw streamFailure when Cartesia SSE drains with zero audio chunks despite non-zero lines; provider schema drift would otherwise silently fall back to System voice with no signal.
- LapAnnouncer: increment inflightUtteranceCount on the Premium branch and configure the shared audio session up front; Premium now matches the System path's lifecycle.

iOS IAP (app/HDZap/Models/IAP):

- SubscriptionManager: pick the strongest entitlement (active > grace, then latest expiry) instead of whichever transaction the loop visited last. Apple can hand back multiple entitlements on monthly→yearly upgrades.
- SubscriptionManager: clear currentJWS once a grace-period transaction is more than 16 days past its expiry so we stop pinging the Worker with a token it'll reject. SubscriptionStatus.inGracePeriod now carries a non-optional Date (grace by definition implies a known expiry).
- SubscriptionManager: purchase() populates lastError before re-throwing so the paywall can surface the error without callers plumbing it.
- SubscriptionManager: .unverified branch sets lastError so a tampered transaction shows a banner instead of leaving the operator wondering why Subscribe did nothing.

iOS Views (app/HDZap/Views/Settings):

- PaywallView: drop the misleading comment that claimed SubscriptionManager.lastError was set inside purchase() (it now is).
- AudioSettingsView: drop stale "before that wiring lands" reference.

Privacy + CI (docs, .github):

- Fix the broken Cartesia privacy link in both EN and JA pages (https://cartesia.ai/privacy → /legal/privacy.html; the old URL returns 404, which is an App Privacy review smell).
- Guard each manual-page cp in flasher.yml so a page added on one branch but not the other doesn't break the composite build.

Misc:

- Sync app/HDZap.xcodeproj/project.pbxproj to the 1.1.0/12 versions in project.yml (xcodegen drift after PR #73 merged).
- Drop iOS 26 typo + 2026-05-19 datestamp + Phase 2 SpeechRouter references that all rotted after PR #73 landed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Address iteration-1 review (7 agents + 5 CodeRabbit PR threads)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Resolves conflicts against develop's StoreKit2 paywall (PR #73 + #74) on top of the maintainer's review-feedback follow-ups for PR #60.

Subscription gate:

- TimerView.publishWatchSnapshot sends hapticsEnabled = (watchHapticsEnabled && subscription.isEntitled), so the watch only acts on what the iPhone says. Single gate: no parallel check on the wrist.
- PublishWatchSnapshotModifier picks up subscription.isEntitled as a tracked input, and the .onChange handler clears watchHapticsEnabled when entitlement lapses (mirrors the TTS-engine rollback in AudioSettingsView).
- WatchSettingsView replaces the toggle (and the Try-haptics audition section) with a "Subscribe to enable" paywall card when not entitled. PaywallView presented via .sheet.
- SettingsView's row summary shows "Premium" for non-subscribers.

Conflict resolution notes:

- project.pbxproj: took ours (xcodegen will regenerate from project.yml after merge).
- Localizable.xcstrings: took theirs to keep develop's full Premium TTS translation catalog. Watch-specific keys regenerate on next build.

Saqoosha and others added 30 commits May 19, 2026 18:35

Saqoosha and others added 2 commits May 20, 2026 17:47

coderabbitai Bot reviewed May 20, 2026


Saqoosha and others added 2 commits May 21, 2026 00:05

Bump to 1.1.0 (build 12) for first TestFlight beta with Premium TTS

4b1d944


Saqoosha merged commit 1b42073 into develop May 20, 2026
3 checks passed

Saqoosha mentioned this pull request May 20, 2026

Fix multi-agent review findings from PR #73 #74

Merged

4 tasks

coderabbitai Bot mentioned this pull request May 22, 2026

Remove Cartesia provider — keep AWS Polly + Microsoft Azure only #80

Merged

6 tasks
