Remove Cartesia provider — keep AWS Polly + Microsoft Azure only by Saqoosha · Pull Request #80 · Saqoosha/HDZap

Saqoosha · 2026-05-22T03:47:07Z

Summary

Drop the Cartesia provider entirely from the Premium TTS path on iOS + Worker. PremiumVoiceProvider loses the .cartesia case. PremiumVoiceCatalog drops 25 Cartesia voices (22 JA + 3 EN). Polly (14) + Azure (16) = 30 voices still cover US/UK/AU + JA/EN with race-announcer and friendly-narrator personas.
Worker: proxyCartesia deleted, CARTESIA_API_KEY binding removed, ALLOWED_CARTESIA_MODELS allow-list gone, cartesia branches removed from contentTypeFor / sampleRateFor / responseHeadersFor. buildCacheKey drops the model field; R2 key prefix bumps v2 → v3.
iOS: parseSSE + handleEventJSON deleted (SSE was Cartesia-only — Polly + Azure stream raw PCM directly). prefetch + sendAndStream collapse to a single Polly/Azure raw-PCM path. TTSCache.key drops the model field; local cache prefix bumps v5 → v6.
All Cartesia labels / hint strings / comments removed from PaywallView, PremiumVoicePickerView, AudioSettingsView, LapAnnouncer, and the HDZapPremium.storekit product descriptions ("30+ voices across AWS Polly and Microsoft Azure"). Paywall sample teaser is now 2-row (Polly + Azure) instead of 3-row.
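
A minimal sketch of how a versioned cache key without a model segment could look. The key layout is not shown in this PR description, so the field names (`provider`, `voiceId`, `rate`, `pitch`, `text`), the `|` separator, and the function name `buildCacheKeySketch` are assumptions for illustration only; the source confirms just that the model field is dropped and the Worker prefix moves from v2 to v3.

```typescript
// Hypothetical request shape; the real Worker's fields may differ.
interface TtsRequest {
  provider: "polly" | "azure";
  voiceId: string;
  rate: number;
  pitch: number;
  text: string;
}

// Bumping the prefix makes every previously written key unreachable,
// so old entries are invalidated without having to delete them.
const CACHE_PREFIX = "v3";

function buildCacheKeySketch(req: TtsRequest): string {
  // No `model` segment: only fields shared by Polly and Azure remain.
  const parts = [
    req.provider,
    req.voiceId,
    req.rate.toFixed(2),
    req.pitch.toFixed(1),
    req.text,
  ];
  // Percent-encode each part so the separator cannot appear inside a field.
  const canonical = parts.map((part) => encodeURIComponent(part)).join("|");
  return CACHE_PREFIX + "/" + canonical;
}
```

Because the prefix is part of every key, a lookup under v3 can never hit an object stored under v2, which is presumably why the bump is enough to retire Cartesia-era entries.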

Why

TestFlight Build 17 tester report: Cartesia's per-IP rate limit on parallel SSE bursts kept dropping prewarm prefetches — even with the 3-concurrent cap added in PR #79, 8 of 14 phrases returned 429 at countdownStartSeconds=15 and Start audio failed to play because the user-visible request hit the rate-limit ceiling. Polly + Azure are both materially more permissive on TPS, so removing Cartesia eliminates the entire rate-limit class of bugs while retaining a strong race-announcer voice catalogue.

Bonus: ~450 lines removed across Worker + iOS, no more SSE parsing, no model-field special-cases, no Cartesia-specific UI disclaimers. Lower-surface-area Premium TTS for the remaining roadmap work.

Test plan

Build the iOS app and confirm it compiles (no leftover .cartesia references) — already done locally, ** BUILD SUCCEEDED **.
Deploy the Worker to staging / dev (wrangler deploy) and confirm a /tts POST with provider=polly and one with provider=azure each return audio. A POST with provider=cartesia should now return bad-provider (400).
Open Settings → Audio → Premium voice picker, confirm only Polly + Azure sections render and the catalogue lists 30 voices.
If the operator previously had a Cartesia voice selected (premiumLapVoiceId UUID), the voice falls back to the System path because currentPremiumVoiceIfActive() returns nil (voice ID no longer in catalog). Re-pick a Polly or Azure voice.
Race with Azure Daichi at 1.45× / 0.0st pitch — countdown should now play every number without the rate-limit drops the user saw with Cartesia Ayumi.
Confirm the v6 local cache prefix invalidates earlier entries: first race after this deploy fetches fresh from Polly / Azure, subsequent races hit local cache.
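
The provider check in the second test step could be sketched as follows. This is an illustration under assumptions, not the Worker's actual handler: `validateProvider` and its return shape are hypothetical, and only the `polly` / `azure` allow-list and the `bad-provider` (400) response come from the PR text.

```typescript
// Providers still accepted after this PR, per the description.
const ALLOWED_PROVIDERS: readonly string[] = ["polly", "azure"];

// Hypothetical result shape used only for this sketch.
interface ValidationResult {
  status: number;
  error?: string;
}

function validateProvider(body: { provider?: unknown }): ValidationResult {
  const provider = body.provider;
  // Anything that is not a known provider string, including "cartesia",
  // is rejected before any upstream call is made.
  if (typeof provider !== "string" || ALLOWED_PROVIDERS.indexOf(provider) === -1) {
    return { status: 400, error: "bad-provider" };
  }
  return { status: 200 };
}
```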

Summary by CodeRabbit

Chores
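- Updated the build version from 16 to 17.
Improvements
- Consolidated Premium voices onto AWS Polly and Microsoft Azure (Cartesia removed).
- Updated the audio cache specification; existing cache keys are invalidated.
- Strengthened Premium voice prewarm handling and concurrency control.
- On launch, a saved premium voice ID that no longer exists is now reset.
UI Updates
- Updated provider labels and wording in the Settings screen and paywall to assume Polly/Azure.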

coderabbitai · 2026-05-22T03:47:18Z

Warning

Rate limit exceeded

@Saqoosha has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 10 seconds before requesting another review.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 89c85f52-976f-4a48-8480-5ec7e73d3ef1

📥 Commits

Reviewing files that changed from the base of the PR and between 25234da and 3f3328c.

📒 Files selected for processing (3)

app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift
app/HDZap/Views/Settings/AudioSettingsView.swift
app/HDZap/Views/Settings/PremiumVoicePickerView.swift

Walkthrough

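Removes Cartesia support from the Premium TTS providers and standardizes on Polly (AWS) and Azure (Microsoft) only. Updates the TTS cache key contract from v5 to v6 (v2 to v3 on the Worker) and retires the model parameter. Reorganizes the iOS and Worker streaming paths around raw PCM, and adds prewarm concurrency control and cancel/restart logic to LapAnnouncer.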

Changes

Cartesia removal and Premium TTS consolidation on Polly/Azure

Layer / File(s)	Summary
TTS cache key contract update (v5→v6, model parameter removed) `app/HDZap/Models/Speech/TTSCache.swift`, `app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`, `workers/hdzap-premium/src/index.ts`	Updates the TTSCache canonical string and removes `model` from the key. Aligns the cache calls and descriptions on iOS and the Worker, breaking compatibility with existing caches.
Worker-side Cartesia removal and environment/routing consolidation `workers/hdzap-premium/src/index.ts`	Removes the CARTESIA_API_KEY binding, the `model` field, and proxyCartesia. Restricts allowed providers to `polly`/`azure` and bumps the cache prefix from v2 to v3. Aligns the response headers and the comments preceding the R2 write with Polly/Azure.
iOS PremiumSpeechSynthesizer consolidation on Polly/Azure `app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`	Removes `.cartesia` from `PremiumVoiceProvider` and reorganizes the catalog. Streaming always takes the raw PCM path, SSE parsing and the related debug counters are removed, prefetch gains a retry on 429, and the AVAudioConverter end-of-input behavior is changed to `.endOfStream`. Cache key calls no longer pass `model`.
LapAnnouncer prewarm concurrency control and cancellation management `app/HDZap/Models/LapAnnouncer.swift`	Introduces `currentPrewarmTask`. Immediately before a Premium utterance, any in-progress prewarm is cancelled with cancel() and set to nil; when the utterance ends, `prewarmFixedPhrases()` is restarted only if `inflightUtteranceCount == 0`. `prewarmFixedPhrases()` now uses `withTaskGroup` (maxConcurrent=3).
Build number bump and Cartesia removal from UI/StoreKit copy `app/HDZap.xcodeproj/project.pbxproj`, `app/project.yml`, `app/HDZap/Resources/StoreKit/HDZapPremium.storekit`, `app/HDZap/Views/Settings/AudioSettingsView.swift`, `app/HDZap/Views/Settings/PaywallView.swift`, `app/HDZap/Views/Settings/PremiumVoicePickerView.swift`	Bumps `CURRENT_PROJECT_VERSION` from 16 to 17 in Xcode and project.yml. Removes Cartesia from the StoreKit and UI copy (35+→30+) and aligns the Premium voice list, hints, and rate/pitch labels with Polly/Azure.
App initialization: consistency check for the saved premium voice ID `app/HDZap/HDZapApp.swift`	Adds a guard at launch that resets the saved premium voice ID in UserDefaults to the default ID when it is not present in the current `PremiumVoiceCatalog.voices`.
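
The `withTaskGroup` (maxConcurrent=3) cap in the LapAnnouncer row is a bounded-concurrency pattern. The sketch below shows the same idea in TypeScript (the Worker's language) rather than Swift. `runWithLimit` and its parameters are hypothetical names, not code from this PR, and the iOS implementation may differ in detail.

```typescript
// Runs the given async jobs with at most `maxConcurrent` in flight at once.
async function runWithLimit<T>(
  jobs: Array<() => Promise<T>>,
  maxConcurrent: number
): Promise<T[]> {
  const results: T[] = new Array<T>(jobs.length);
  let nextIndex = 0;

  // Each worker claims the next unstarted job until none remain,
  // so the number of workers is the concurrency ceiling.
  async function worker(): Promise<void> {
    while (nextIndex < jobs.length) {
      const index = nextIndex;
      nextIndex += 1;
      const job = jobs[index];
      if (job) {
        results[index] = await job();
      }
    }
  }

  const workerCount = Math.max(1, Math.min(maxConcurrent, jobs.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i += 1) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With 14 prewarm phrases and a limit of 3, no more than three requests are outstanding at any moment, which is the property the cap relies on to stay under an upstream rate limit.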

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Saqoosha/HDZap#12: Changes the LapAnnouncer area (TTS utterance/integration), so some code may overlap.
Saqoosha/HDZap#78: Directly related in implementation to the change in when LapAnnouncer.prewarmFixedPhrases is called.
Saqoosha/HDZap#74: Overlaps with changes to inflight utterance management in PremiumSpeechSynthesizer / LapAnnouncer.

Poem

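🐰 Farewell to Cartesia; with Polly and Azure,
the cache hops up to v6.
Prewarming three at a time, singing without hurry,
the PCM stream flows neat and tidy.
Now, with 30+ voices, on to a new morning.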

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Remove Cartesia provider — keep AWS Polly + Microsoft Azure only' directly and clearly summarizes the main objective of the pull request, which is to remove the Cartesia TTS provider while retaining AWS Polly and Microsoft Azure.
Docstring Coverage	✅ Passed	Docstring coverage is 86.96% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

app/HDZap/Models/LapAnnouncer.swift (1)

1025-1034: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

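Stop the existing prewarm when Premium is turned off as well.

Because the guard on line 1026 returns first, the previous currentPrewarmTask keeps running even right after a settings change switches back to system or leaves no voice selected. Leftover prefetches add upstream load and raise the 429 risk again, so the cancel should always run before the guard.

💡 Suggested fix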

     func prewarmFixedPhrases() {
+        currentPrewarmTask?.cancel()
+        currentPrewarmTask = nil
         guard let voice = currentPremiumVoiceIfActive() else { return }
         let phrases = fixedPrewarmPhrases(for: currentLanguage())
         let synth = premiumSynth
         let lang = voice.lang
-        currentPrewarmTask?.cancel()
-        currentPrewarmTask = Task.detached { @MainActor in
+        currentPrewarmTask = Task.detached { @MainActor in

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/HDZap/Models/LapAnnouncer.swift` around lines 1025 - 1034, In
prewarmFixedPhrases(), always cancel any existing currentPrewarmTask before the
early exit: move the currentPrewarmTask?.cancel() call to the top of the
function (before the guard let voice = currentPremiumVoiceIfActive() else {
return }) so switching off premium or deselecting voice stops any running
prewarm; keep the rest of the logic (computing phrases, synth, lang, and
creating Task.detached) unchanged and ensure currentPrewarmTask is then set to
the new Task only when a new prewarm starts.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/HDZap/Models/LapAnnouncer.swift`:
- Around line 355-367: premiumUtteranceEnded currently always calls
prewarmFixedPhrases after utteranceDidEnd/deactivateSession, which can restart
prewarm while a replacement utterance is still playing; modify
premiumUtteranceEnded to only call prewarmFixedPhrases when there are no active
utterances (check inflightUtteranceCount == 0) after invoking utteranceDidEnd(),
keeping deactivateSession() as-is; reference premiumUtteranceEnded,
utteranceDidEnd, prewarmFixedPhrases, deactivateSession, and
inflightUtteranceCount when making the change.

In `@app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift`:
- Around line 578-590: The retry path swallows Task cancellation because `try?
await Task.sleep(...)` ignores CancellationError and causes a retry even after
prewarm was cancelled; in PremiumSpeechSynthesizer's prewarm code replace the
`try? await Task.sleep(nanoseconds: 1_000_000_000)` with a
cancellation-respecting approach (e.g. use `try await Task.sleep(...)` and let
CancellationError propagate, or explicitly check `Task.isCancelled` and abort
before re-sending the request) so the subsequent `(data, response) = try await
URLSession.shared.data(for: req)` is not executed when the task has been
cancelled.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5245d31b-7132-4dfb-b85b-56f7e20b7927

📥 Commits

Reviewing files that changed from the base of the PR and between 62b9a96 and 3763fe1.

⛔ Files ignored due to path filters (1)

workers/hdzap-premium/package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (10)

app/HDZap.xcodeproj/project.pbxproj
app/HDZap/Models/LapAnnouncer.swift
app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift
app/HDZap/Models/Speech/TTSCache.swift
app/HDZap/Resources/StoreKit/HDZapPremium.storekit
app/HDZap/Views/Settings/AudioSettingsView.swift
app/HDZap/Views/Settings/PaywallView.swift
app/HDZap/Views/Settings/PremiumVoicePickerView.swift
app/project.yml
workers/hdzap-premium/src/index.ts

- PremiumVoiceProvider enum loses the .cartesia case. PremiumVoiceCatalog drops 25 Cartesia voices (22 JA + 3 EN); Polly (14) + Azure (16) = 30 voices total, plenty for race-announcer / friendly-narrator personas across US/UK/AU accents
- Worker drops the proxyCartesia function, the CARTESIA_API_KEY env binding, the ALLOWED_CARTESIA_MODELS allow-list, the cartesia branch in the /tts handler, and the cartesia / SSE special-cases in contentTypeFor / sampleRateFor / responseHeadersFor. The Cartesia model field is removed from buildCacheKey; R2 key prefix bumps v2 -> v3 to invalidate the old entries
- iOS drops parseSSE + handleEventJSON (SSE was Cartesia-only; Polly + Azure stream raw PCM directly). prefetch + sendAndStream collapse to a single Polly/Azure raw-PCM path. accumulatedPCM, currentSampleRate, and the resampler logic stay (Polly is still 16 kHz, needs upsample). TTSCache key drops the trailing model segment and bumps v5 -> v6
- All Cartesia comments / labels / hint strings removed from PaywallView, PremiumVoicePickerView, AudioSettingsView, LapAnnouncer, and the HDZapPremium.storekit product descriptions ('30+ voices across AWS Polly and Microsoft Azure'). Sample voice teaser on paywall is now 2-row (Polly + Azure) instead of 3-row
- Motivation: Cartesia's per-IP rate limit on parallel SSE bursts kept dropping prewarm prefetches (observed 8 of 14 phrases 429'd at countdownStartSeconds=15 with the 3-concurrent cap from PR #79). Polly + Azure are both more permissive on TPS, so removing Cartesia eliminates the rate-limit class of bugs entirely while retaining a strong race-announcer voice catalog
- Code simplification: ~450 lines removed across Worker + iOS, no SSE parsing, no model-field special-cases, no Cartesia-specific UI disclaimers. Lower-surface-area Premium TTS for the remaining roadmap work

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…lyphase tail

Polly 16 kHz countdown utterances were truncating ~10-20 ms at the end because the one-shot resampler input block returned .noDataNow instead of .endOfStream. The polyphase upsampler's FIR-filter tail stayed buffered inside AVAudioConverter and never reached the output buffer. Azure (24 kHz native) bypasses the converter so was unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- PremiumVoicePickerView: selecting a premium voice now also flips `ttsEngine` to "premium" (both on direct tap when entitled and on post-purchase auto-commit). Previously the selectedId was written but the router kept routing through System voice, so the picker looked like it did nothing.
- AudioSettingsView: the Voice section's Reset button no longer touches Announcement-section keys (master toggle, announce-best, countdown enable/start). It now also resets ttsEngine and the premium voice / rate / pitch so a single tap returns the entire Voice section to defaults, while leaving the Announcement section untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…rsions

Two defensive hardenings in `buildOverlapBuffer` that protect against the exact silent-failure mode the prior `.endOfStream` fix targeted, in case AVAudioConverter behaves at the edges of its documented contract.

- Raise `outputCapacity` headroom from +64 to +1024 frames. The FIR group delay for 16 kHz → 24 kHz upsample is ~300-600 output samples; +64 (~2.7 ms) leaves no margin if Apple's converter ever needs slightly more room after `.endOfStream` than the prior frame ratio suggested. +1024 (~43 ms) costs ~2 KB per utterance and gives a 2× safety factor.
- Throw on `outputBuffer.frameLength == 0`. `convert()` can return a non-error status while still producing zero frames; scheduling such a buffer on AVAudioPlayerNode is a silent no-op (no completion callback path that signals failure). The throw lets `speakOverlap`'s catch set `lastError` so the missing audio is visible in the Settings banner / dev panel instead of an unexplained silent countdown beat.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
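
The headroom figures in this commit message can be checked with a little arithmetic. The sketch below is a worked calculation, not the app's code: the helper names are hypothetical, and the 2 bytes per frame figure assumes 16-bit mono PCM, which is consistent with the "~2 KB" quoted above but is not stated in the source.

```typescript
const OUTPUT_RATE_HZ = 24000; // converter output rate from the commit message
const INPUT_RATE_HZ = 16000; // Polly input rate from the commit message
const BYTES_PER_FRAME = 2; // assumption: 16-bit mono PCM

// Duration covered by a number of frames at a given sample rate.
function framesToMs(frames: number, sampleRateHz: number): number {
  return (frames / sampleRateHz) * 1000;
}

// Extra memory needed for a number of headroom frames.
function framesToBytes(frames: number, bytesPerFrame: number): number {
  return frames * bytesPerFrame;
}

// Hypothetical capacity formula: resampled length plus fixed headroom
// so the filter tail flushed at end of stream still fits.
function outputCapacityFrames(inputFrames: number, headroomFrames: number): number {
  return Math.ceil((inputFrames * OUTPUT_RATE_HZ) / INPUT_RATE_HZ) + headroomFrames;
}

const oldHeadroomMs = framesToMs(64, OUTPUT_RATE_HZ); // about 2.7 ms
const newHeadroomMs = framesToMs(1024, OUTPUT_RATE_HZ); // about 42.7 ms
const newHeadroomBytes = framesToBytes(1024, BYTES_PER_FRAME); // 2048 bytes
```

At 24 kHz, 64 frames is shorter than the quoted 300-600 sample filter delay, while 1024 frames exceeds even the upper end of that range.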

…ning

Build-number-only bump: MARKETING_VERSION stays at 1.1.0 so the build ships to existing 1.1.0 beta testers via Apple's fast-track build review, without re-running beta-review approval.

Includes PR #80 in develop: Cartesia provider removed, Polly countdown tail truncation fixed (.endOfStream flush), voice picker engine flip on selection, Voice section Reset scope narrowed, AVAudioConverter defensive zero-frame guard + wider FIR tail headroom.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai Bot reviewed May 22, 2026

Comment thread app/HDZap/Models/LapAnnouncer.swift

Comment thread app/HDZap/Models/Speech/PremiumSpeechSynthesizer.swift

Saqoosha force-pushed the remove-cartesia-provider branch 2 times, most recently from 78ff8aa to f0cf7ba Compare May 22, 2026 04:00

Saqoosha force-pushed the remove-cartesia-provider branch from f0cf7ba to 25234da Compare May 22, 2026 08:21

Saqoosha and others added 2 commits May 22, 2026 17:51

Saqoosha force-pushed the remove-cartesia-provider branch from 25234da to 01ba91c Compare May 22, 2026 08:52

Saqoosha merged commit eaef49a into develop May 22, 2026
1 check passed

Saqoosha deleted the remove-cartesia-provider branch May 22, 2026 09:09

Saqoosha merged 4 commits into develop from remove-cartesia-provider
