Remove Cartesia provider — keep AWS Polly + Microsoft Azure only#80
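Walkthrough
Removes Cartesia support from the Premium TTS providers, standardizing on Polly (AWS) and Azure (Microsoft) only. The TTS cache-key contract is bumped from v5 to v6 (v2 to v3 on the Worker) and the model parameter is dropped. The iOS and Worker streaming paths are simplified to assume raw PCM, and LapAnnouncer gains prewarm concurrency control plus cancel-and-resume logic.
Changes
Cartesia removal and Premium TTS consolidation on Polly/Azure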
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 5 passed
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/HDZap/Models/LapAnnouncer.swift (1)
1025-1034: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Stop any existing prewarm when Premium is turned off, too.
Because the `guard` on line 1026 returns first, the previous `currentPrewarmTask` keeps running even right after a settings change switches back to system or leaves the voice unselected. Leftover prefetches add upstream load and raise the 429 risk again, so the cancel should always run before the `guard`.
💡 Suggested fix
```diff
 func prewarmFixedPhrases() {
+    currentPrewarmTask?.cancel()
+    currentPrewarmTask = nil
     guard let voice = currentPremiumVoiceIfActive() else { return }
     let phrases = fixedPrewarmPhrases(for: currentLanguage())
     let synth = premiumSynth
     let lang = voice.lang
-    currentPrewarmTask?.cancel()
     currentPrewarmTask = Task.detached { @MainActor in
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/HDZap/Models/LapAnnouncer.swift` around lines 1025 - 1034, In prewarmFixedPhrases(), always cancel any existing currentPrewarmTask before the early exit: move the currentPrewarmTask?.cancel() call to the top of the function (before the guard let voice = currentPremiumVoiceIfActive() else { return }) so switching off premium or deselecting voice stops any running prewarm; keep the rest of the logic (computing phrases, synth, lang, and creating Task.detached) unchanged and ensure currentPrewarmTask is then set to the new Task only when a new prewarm starts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/HDZap/Models/LapAnnouncer.swift`:
- Around line 355-367: premiumUtteranceEnded currently always calls
prewarmFixedPhrases after utteranceDidEnd/deactivateSession, which can restart
prewarm while a replacement utterance is still playing; modify
premiumUtteranceEnded to only call prewarmFixedPhrases when there are no active
utterances (check inflightUtteranceCount == 0) after invoking utteranceDidEnd(),
keeping deactivateSession() as-is; reference premiumUtteranceEnded,
utteranceDidEnd, prewarmFixedPhrases, deactivateSession, and
inflightUtteranceCount when making the change.
In `@app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`:
- Around line 578-590: The retry path swallows Task cancellation because `try?
await Task.sleep(...)` ignores CancellationError and causes a retry even after
prewarm was cancelled; in PremiumSpeechSynthesizer's prewarm code replace the
`try? await Task.sleep(nanoseconds: 1_000_000_000)` with a
cancellation-respecting approach (e.g. use `try await Task.sleep(...)` and let
CancellationError propagate, or explicitly check `Task.isCancelled` and abort
before re-sending the request) so the subsequent `(data, response) = try await
URLSession.shared.data(for: req)` is not executed when the task has been
cancelled.
---
Outside diff comments:
In `@app/HDZap/Models/LapAnnouncer.swift`:
- Around line 1025-1034: In prewarmFixedPhrases(), always cancel any existing
currentPrewarmTask before the early exit: move the currentPrewarmTask?.cancel()
call to the top of the function (before the guard let voice =
currentPremiumVoiceIfActive() else { return }) so switching off premium or
deselecting voice stops any running prewarm; keep the rest of the logic
(computing phrases, synth, lang, and creating Task.detached) unchanged and
ensure currentPrewarmTask is then set to the new Task only when a new prewarm
starts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5245d31b-7132-4dfb-b85b-56f7e20b7927
⛔ Files ignored due to path filters (1)
- workers/hdzap-premium/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (10)
- app/HDZap.xcodeproj/project.pbxproj
- app/HDZap/Models/LapAnnouncer.swift
- app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift
- app/HDZap/Models/Speech/TTSCache.swift
- app/HDZap/Resources/StoreKit/HDZapPremium.storekit
- app/HDZap/Views/Settings/AudioSettingsView.swift
- app/HDZap/Views/Settings/PaywallView.swift
- app/HDZap/Views/Settings/PremiumVoicePickerView.swift
- app/project.yml
- workers/hdzap-premium/src/index.ts
78ff8aa to f0cf7ba
- PremiumVoiceProvider enum loses the .cartesia case. PremiumVoiceCatalog drops 25 Cartesia voices (22 JA + 3 EN); Polly (14) + Azure (16) = 30 voices total, plenty for race-announcer / friendly-narrator personas across US/UK/AU accents
- Worker drops the proxyCartesia function, the CARTESIA_API_KEY env binding, the ALLOWED_CARTESIA_MODELS allow-list, the cartesia branch in the /tts handler, and the cartesia / SSE special-cases in contentTypeFor / sampleRateFor / responseHeadersFor. The Cartesia model field is removed from buildCacheKey; R2 key prefix bumps v2 -> v3 to invalidate the old entries
- iOS drops parseSSE + handleEventJSON (SSE was Cartesia-only — Polly + Azure stream raw PCM directly). prefetch + sendAndStream collapse to a single Polly/Azure raw-PCM path. accumulatedPCM, currentSampleRate, and the resampler logic stay (Polly is still 16 kHz, needs upsample). TTSCache key drops the trailing model segment and bumps v5 -> v6
- All Cartesia comments / labels / hint strings removed from PaywallView, PremiumVoicePickerView, AudioSettingsView, LapAnnouncer, and the HDZapPremium.storekit product descriptions ('30+ voices across AWS Polly and Microsoft Azure'). Sample voice teaser on paywall is now 2-row (Polly + Azure) instead of 3-row
- Motivation: Cartesia's per-IP rate limit on parallel SSE bursts kept dropping prewarm prefetches (observed 8 of 14 phrases 429'd at countdownStartSeconds=15 with the 3-concurrent cap from PR #79). Polly + Azure are both more permissive on TPS, so removing Cartesia eliminates the rate-limit class of bugs entirely while retaining a strong race-announcer voice catalog
- Code simplification: ~450 lines removed across Worker + iOS, no SSE parsing, no model-field special-cases, no Cartesia-specific UI disclaimers. Lower-surface-area Premium TTS for the remaining roadmap work
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
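The Worker-side changes described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from `workers/hdzap-premium/src/index.ts`: the `TtsRequest` shape, the helper names `isAllowedProvider` and `validateProvider`, and the key layout after the `v3` prefix are all assumptions. Only the two remaining provider names, the `bad-provider` (400) response, the removed `model` field, and the `v2 -> v3` prefix bump come from the PR.

```typescript
// Hypothetical request shape; the real Worker's fields may differ.
type Provider = "polly" | "azure";

interface TtsRequest {
  provider: string;
  voice: string;
  lang: string;
  text: string;
}

const ALLOWED_PROVIDERS: readonly string[] = ["polly", "azure"];

// Narrow an untrusted provider string; "cartesia" no longer passes.
function isAllowedProvider(p: string): p is Provider {
  return ALLOWED_PROVIDERS.indexOf(p) !== -1;
}

// Returns the status (and error code, if any) a /tts handler could send back.
function validateProvider(req: TtsRequest): { status: number; error?: string } {
  if (!isAllowedProvider(req.provider)) {
    return { status: 400, error: "bad-provider" };
  }
  return { status: 200 };
}

// Cache key with no model segment. Because the prefix is now "v3",
// entries stored under "v2" keys are simply never looked up again.
// The segment order and URL-encoding here are assumptions for illustration.
function buildCacheKey(req: TtsRequest): string {
  const parts = [req.provider, req.voice, req.lang, req.text].map((s) =>
    encodeURIComponent(s)
  );
  return ["v3"].concat(parts).join("/");
}
```

Bumping the prefix rather than deleting old objects keeps the invalidation cheap: stale entries become unreachable without any migration step, at the cost of leaving them in storage until they are cleaned up separately.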
f0cf7ba to 25234da
…lyphase tail
Polly 16 kHz countdown utterances were truncating ~10-20 ms at the end because the one-shot resampler input block returned .noDataNow instead of .endOfStream. The polyphase upsampler's FIR-filter tail stayed buffered inside AVAudioConverter and never reached the output buffer. Azure (24 kHz native) bypasses the converter so was unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- PremiumVoicePickerView: selecting a premium voice now also flips `ttsEngine` to "premium" (both on direct tap when entitled and on post-purchase auto-commit). Previously the selectedId was written but the router kept routing through System voice, so the picker looked like it did nothing.
- AudioSettingsView: the Voice section's Reset button no longer touches Announcement-section keys (master toggle, announce-best, countdown enable/start). It now also resets ttsEngine and the premium voice / rate / pitch so a single tap returns the entire Voice section to defaults, while leaving the Announcement section untouched.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
25234da to 01ba91c
…rsions
Two defensive hardenings in `buildOverlapBuffer` that protect against the exact silent-failure mode the prior `.endOfStream` fix targeted, in case AVAudioConverter behaves at the edges of its documented contract.
- Raise `outputCapacity` headroom from +64 to +1024 frames. The FIR group delay for 16 kHz → 24 kHz upsample is ~300-600 output samples; +64 (~2.7 ms) leaves no margin if Apple's converter ever needs slightly more room after `.endOfStream` than the prior frame ratio suggested. +1024 (~43 ms) costs ~2 KB per utterance and gives a 2× safety factor.
- Throw on `outputBuffer.frameLength == 0`. `convert()` can return a non-error status while still producing zero frames; scheduling such a buffer on AVAudioPlayerNode is a silent no-op (no completion callback path that signals failure). The throw lets `speakOverlap`'s catch set `lastError` so the missing audio is visible in the Settings banner / dev panel instead of an unexplained silent countdown beat.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
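The headroom figures quoted in this commit message can be checked with a little arithmetic. The sketch below is illustrative only: `framesToMs` and `headroomBytes` are hypothetical helpers, and the 2-bytes-per-frame default assumes 16-bit mono PCM, which reproduces the ~2 KB figure but is not confirmed by the PR.

```typescript
// Convert a frame count to milliseconds at a given sample rate.
function framesToMs(frames: number, sampleRateHz: number): number {
  return (frames / sampleRateHz) * 1000;
}

// Memory cost of extra frames. bytesPerFrame = 2 is an assumption
// (16-bit mono PCM); a Float32 buffer would cost twice as much.
function headroomBytes(frames: number, bytesPerFrame: number = 2): number {
  return frames * bytesPerFrame;
}

// 24 kHz is the upsample target named in the commit message.
const OUTPUT_RATE_HZ = 24000;

console.log(framesToMs(64, OUTPUT_RATE_HZ).toFixed(1));   // prints 2.7
console.log(framesToMs(1024, OUTPUT_RATE_HZ).toFixed(1)); // prints 42.7
console.log(headroomBytes(1024));                         // prints 2048
```

At 24 kHz the old +64 headroom falls well short of the quoted 300-600 sample group delay, while +1024 exceeds the upper end of that range.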
…ning
Build-number-only bump: MARKETING_VERSION stays at 1.1.0 so the build ships to existing 1.1.0 beta testers via Apple's fast-track build review, without re-running beta-review approval.
Includes PR #80 in develop: Cartesia provider removed, Polly countdown tail truncation fixed (.endOfStream flush), voice picker engine flip on selection, Voice section Reset scope narrowed, AVAudioConverter defensive zero-frame guard + wider FIR tail headroom.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
- `PremiumVoiceProvider` loses the `.cartesia` case. `PremiumVoiceCatalog` drops 25 Cartesia voices (22 JA + 3 EN). Polly (14) + Azure (16) = 30 voices still cover US/UK/AU + JA/EN with race-announcer and friendly-narrator personas.
- `proxyCartesia` deleted, `CARTESIA_API_KEY` binding removed, `ALLOWED_CARTESIA_MODELS` allow-list gone, cartesia branches removed from `contentTypeFor` / `sampleRateFor` / `responseHeadersFor`. `buildCacheKey` drops the `model` field; R2 key prefix bumps `v2 → v3`.
- `parseSSE` + `handleEventJSON` deleted (SSE was Cartesia-only; Polly + Azure stream raw PCM directly). `prefetch` + `sendAndStream` collapse to a single Polly/Azure raw-PCM path. `TTSCache.key` drops the `model` field; local cache prefix bumps `v5 → v6`.
- Cartesia comments / labels / hint strings removed from `PaywallView`, `PremiumVoicePickerView`, `AudioSettingsView`, `LapAnnouncer`, and the `HDZapPremium.storekit` product descriptions ("30+ voices across AWS Polly and Microsoft Azure"). Paywall sample teaser is now 2-row (Polly + Azure) instead of 3-row.

Why
TestFlight Build 17 tester report: Cartesia's per-IP rate limit on parallel SSE bursts kept dropping prewarm prefetches. Even with the 3-concurrent cap added in PR #79, 8 of 14 phrases returned 429 at `countdownStartSeconds=15`, and Start audio failed to play because the user-visible request hit the rate-limit ceiling. Polly + Azure are both materially more permissive on TPS, so removing Cartesia eliminates the entire rate-limit class of bugs while retaining a strong race-announcer voice catalogue.

Bonus: ~450 lines removed across Worker + iOS, no more SSE parsing, no model-field special-cases, no Cartesia-specific UI disclaimers. Lower-surface-area Premium TTS for the remaining roadmap work.

Test plan
- … `.cartesia` references): already done locally, `** BUILD SUCCEEDED **`.
- … `wrangler deploy`) and confirm a `/tts` POST with `provider=polly` and one with `provider=azure` each return audio. A POST with `provider=cartesia` should now return `bad-provider` (400).
- … `premiumLapVoiceId` UUID), the voice falls back to the System path because `currentPremiumVoiceIfActive()` returns nil (voice ID no longer in catalog). Re-pick a Polly or Azure voice.

Summary by CodeRabbit
Chores
Improvements
UI Updates