Skip to content

✨(feat) POC for audio prompting with Albert API#414

Open
natoromano wants to merge 4 commits into
mainfrom
nr/audio
Open

✨(feat) POC for audio prompting with Albert API#414
natoromano wants to merge 4 commits into
mainfrom
nr/audio

Conversation

@natoromano
Copy link
Copy Markdown
Collaborator

@natoromano natoromano commented Apr 15, 2026

Purpose

Wanted to build a small POC to see if audio prompting with Albert API (whisper large v3) worked well as-is, turns out it does. So I figured I'd make it a PR, but do whatever you want with it!

cc @eliott07

SCR-20260415-qjhj SCR-20260415-qjir

Summary by CodeRabbit

  • New Features

    • Audio recording and transcription in chat with start/cancel/confirm controls and live waveform.
    • Server-side audio transcription endpoint and configurable transcription backend; UI toggles via config flag.
    • Multilingual UI strings and transcription error messaging.
  • Tests

    • End-to-end and unit tests added for transcription backend and API endpoint.
  • Chores

    • Release notes/changelog entry and minor UI/CSS/formatting adjustments.

@natoromano natoromano requested a review from qbey April 15, 2026 16:57
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 15, 2026

Warning

Rate limit exceeded

@natoromano has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 51 minutes and 35 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 51 minutes and 35 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 197d1e4d-44e1-41a8-b156-ce0bd0564a57

📥 Commits

Reviewing files that changed from the base of the PR and between 74204d2 and 7f313bb.

📒 Files selected for processing (1)
  • src/frontend/apps/conversations/next-env.d.ts

Walkthrough

Adds end-to-end audio transcription: backend API endpoint and settings, abstract and Albert transcription backends, frontend recording hook and UI components, types/translations/styles, and tests for backend and frontend flows.

Changes

Cohort / File(s) Summary
Backend Exceptions & Settings
src/backend/chat/exceptions.py, src/backend/conversations/settings.py
Adds TranscriptionError and new settings: AUDIO_TRANSCRIPTION_BACKEND, AUDIO_MAX_SIZE, ALBERT_API_ASR_MODEL, ALBERT_API_ASR_LANGUAGE, plus REST throttle rate "transcribe"
Backend Audio Backend Impl
src/backend/chat/input/base_audio_backend.py, src/backend/chat/input/albert_audio_backend.py
Adds BaseAudioBackend interface and AlbertAudioBackend which POSTs multipart audio to Albert API with Bearer auth, trims text from JSON, and raises TranscriptionError on request failures
Backend View & Routing
src/backend/chat/views.py, src/backend/core/urls.py
Adds AudioTranscriptionView (multipart POST, size/auth checks, dynamic backend instantiation, maps transcription errors to 502) and registers .../transcribe/ route
Backend Tests
src/backend/chat/tests/input/test_albert_audio_backend.py, src/backend/chat/tests/views/test_audio_transcription.py, src/backend/core/tests/test_api_config.py
New tests for Albert backend request/response handling, endpoint auth/validation/error mapping, and config response field audio_transcription_enabled
Frontend Config & Types
src/frontend/apps/conversations/src/core/config/api/useConfig.tsx, src/frontend/apps/conversations/src/features/chat/types.tsx
Adds audio_transcription_enabled to config response and new TranscribeAudioResponse union type
Frontend Recording Hook
src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx
New useAudioRecording hook + RecordingState type: manages MediaRecorder, Web Audio visualization, uploads FormData to transcribe/, and invokes transcription callbacks
Frontend UI Components
src/frontend/apps/conversations/src/features/chat/components/AudioRecordButton.tsx, .../InputChatAction.tsx, .../InputChat.tsx
Adds AudioRecordButton waveform UI; integrates recording controls into InputChat/InputChatAction, keyboard handling during recording, and insertion of transcribed text into textarea
Frontend i18n, CSS & Config
src/frontend/apps/conversations/src/i18n/translations.json, src/frontend/apps/conversations/src/pages/globals.css, src/frontend/apps/conversations/tsconfig.json, CHANGELOG.md
Adds recording/transcription strings, updates button sizing styles, reformats tsconfig arrays, and updates CHANGELOG entry
Frontend Tests Removed / Minor Cleanups
src/frontend/apps/conversations/src/features/chat/components/__tests__/InputChatAction.test.tsx, src/frontend/apps/conversations/src/features/header/..., .../LanguagePicker.tsx
Removes InputChatAction test suite; minor import/spacing reorganizations in a few components

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Browser
    participant Frontend as InputChat / useAudioRecording
    participant API as AudioTranscriptionView
    participant Albert as Albert API

    User->>Frontend: Click record
    Frontend->>Browser: request microphone access
    Browser-->>Frontend: MediaStream + audio samples
    Frontend->>Frontend: start MediaRecorder & visualize volume
    User->>Frontend: Confirm recording
    Frontend->>Browser: stop MediaRecorder, build Blob
    Frontend->>API: POST /api/.../transcribe/ (FormData with audio)
    API->>API: validate auth & size
    API->>Albert: POST /v1/audio/transcriptions (multipart, Bearer)
    Albert-->>API: { "text": "..." }
    API-->>Frontend: { "text": "..." }
    Frontend->>Frontend: insert text into textarea
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

enhancement, frontend, backend

Suggested reviewers

  • qbey
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: implementing audio prompting (transcription) functionality with the Albert API. It directly references the core feature and includes context about the integration.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch nr/audio

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: natoromano <nathanaelpromano@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (2)
src/backend/chat/input/base_audio_backend.py (1)

14-19: Fix transcribe contract doc mismatch.

Line 18 documents a language argument that is not part of the method signature on Line 10. Please align docstring and signature to avoid backend implementation drift.

🛠️ Suggested doc fix
     def transcribe(self, file_name: str, file_content: bytes, content_type: str) -> str:
         """
         Transcribe the given audio file to text.
@@
         Args:
             file_name (str): The original file name of the audio file.
             file_content (bytes): The raw audio file content.
             content_type (str): The MIME type of the audio file.
-            language (str): The language to transcribe the audio content in.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/input/base_audio_backend.py` around lines 14 - 19, The
docstring for the transcribe method in BaseAudioBackend mismatches its
signature: it documents a "language" parameter that does not exist. Update the
transcribe docstring to reflect the actual signature of
BaseAudioBackend.transcribe by removing the "language" entry (or if you intend
to support language, add a language parameter to the method signature and update
implementations accordingly); ensure the Args section only lists file_name,
file_content, and content_type to match the method name transcribe and its
parameters.
src/backend/chat/tests/input/test_albert_audio_backend.py (1)

49-63: Assert the configured ASR fields too.

This test only proves the auth header and raw bytes survive the request. A regression that drops model or language from the multipart form would still pass, even though both are now part of the backend configuration added in this PR. Please assert those fields in the outbound request as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/tests/input/test_albert_audio_backend.py` around lines 49 -
63, Update the test_transcribe_sends_correct_request test to also assert that
the configured ASR form fields are sent: after calling backend.transcribe and
grabbing request = responses.calls[0].request, check the multipart request.body
contains the form field names and values for the ASR configuration (e.g. that
b'name="model"' and the expected model value, and b'name="language"' and the
expected language value, are present) in addition to the existing Authorization
and audio-content assertions so regressions that drop model or language will
fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/chat/input/albert_audio_backend.py`:
- Around line 57-59: The code currently treats a missing or empty "text" field
as success by returning ""—change this to treat missing/invalid transcription
payloads as failures: after response.raise_for_status(), parse json =
response.json(), then validate that "text" exists and is a non-empty string; if
not, raise an exception (e.g., requests.RequestException or RuntimeError) that
includes the full response content for debugging instead of returning an empty
string; update the except block handling requests.RequestException accordingly
so transcription failures propagate.
- Around line 24-27: In the __init__ of AlbertAudioBackend, validate required
config before building _transcriptions_endpoint and _headers: check
settings.ALBERT_API_KEY (and optionally settings.ALBERT_API_URL) are
set/non-empty and raise a clear configuration exception (e.g., ValueError or a
ConfigError) if missing so you don't create _headers = {"Authorization":
f"Bearer None"}; construct _transcriptions_endpoint and _headers only after
validation so failures fail fast during initialization.

In `@src/backend/chat/views.py`:
- Around line 758-763: Add a ScopedRateThrottle to the transcribe view: import
ScopedRateThrottle and set throttle_classes = [ScopedRateThrottle] and
throttle_scope = "transcribe" on the same view/class that currently defines
permission_classes and parser_classes (so the ASR-triggering endpoint is
rate-limited). Then add a matching DEFAULT_THROTTLE_RATES entry (e.g.
"transcribe": "X/min" or whatever quota you choose) in the REST_FRAMEWORK
settings so the scope is enforced. Ensure you choose an appropriate rate and
update any tests/configs that expect unlimited calls.
- Around line 772-782: The code reads the entire uploaded file into memory via
audio_file.read() before any size validation; add an early size check using the
uploaded file object (e.g., audio_file.size or audio_file.size attribute on the
InMemoryUploadedFile/TemporaryUploadedFile returned by request.FILES.get) and
return a 413/400 response if it exceeds a configured MAX_AUDIO_UPLOAD_BYTES
constant, then pass the file-like object (not materialized bytes) into the
transcription backend (update the backend.imported via
settings.AUDIO_TRANSCRIPTION_BACKEND and its transcribe method to accept a
stream/file-like argument instead of bytes) so large uploads are rejected before
buffering and the backend can stream the data.

In `@src/backend/conversations/settings.py`:
- Around line 753-757: The AUDIO_TRANSCRIPTION_BACKEND setting is not validated
at startup; add a post_setup validation in settings.py that resolves and imports
the backend referenced by AUDIO_TRANSCRIPTION_BACKEND, verifies any required
companion settings (e.g., model keys or API creds) are present, and raises a
clear configuration error (or logs and exits) if import or prerequisites fail so
the mic UI cannot enable with a bad path; implement this in the module's
post_setup hook (or add one if missing) and reference
AUDIO_TRANSCRIPTION_BACKEND when performing the import/check so
misconfigurations fail fast rather than at first upload.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 4: The import of the Node.js built-in 'events' module (import { on } from
'events';) is unused in useAudioRecording.tsx and should be removed; open the
useAudioRecording hook and delete that import line (and any unused-import
entries referencing `on`) so client-side bundling doesn't attempt to include the
Node built-in and lint/prettier passes cleanly.
- Around line 26-37: The hook currently doesn't release media and audio
resources on unmount; add a cleanup effect that runs on component unmount to
stop and null out active resources: call stopVisualization(), stop and nullify
mediaRecorderRef.current (and finalize audioChunksRef if needed), stop all
tracks on streamRef.current and set streamRef.current = null, cancel animation
frame via animFrameRef and close audioCtxRef (if stopVisualization isn't
sufficient ensure audioCtxRef.current?.close()), and clear refs so the
microphone and AudioContext are released when the component unmounts.
- Around line 67-84: startRecording currently calls startVisualization() and
setRecordingState('recording') before MediaRecorder is fully constructed, and
the catch block doesn't stop the acquired stream; move the MediaRecorder
construction (const mediaRecorder = new MediaRecorder(stream)),
mediaRecorder.ondataavailable setup, mediaRecorder.start(), and mediaRecorderRef
assignment to occur before calling startVisualization() and setRecordingState so
visualization/state only run on successful recorder init, add stopStream() to
the catch block to stop tracks and prevent mic/visualization leaks if any step
throws, and add stopStream to the useCallback dependency array; reference
startRecording, startVisualization, mediaRecorderRef, streamRef, stopStream,
setRecordingState, and onTranscriptionErrorRef when making the change.
- Around line 95-99: The filename extension is being hard-coded to 'webm' for
all non-mp4 MIME types in useAudioRecording.tsx (see mimeType, ext, blob,
form.append), causing mismatches; update the logic that sets ext so it derives
the extension from the actual mimeType (e.g., map 'audio/webm'->'webm',
'audio/ogg'->'ogg', 'audio/mp4' or 'audio/m4a'->'mp4', 'audio/x-m4a'->'m4a',
etc.) and fall back to a safe default like 'bin' only if unknown, then continue
to create the Blob and call form.append('audio', blob, `recording.${ext}`) with
the correct extension. Ensure the mapping covers common browser types
(Firefox/Safari/Chrome) so filename matches content-type.

In `@src/frontend/apps/conversations/src/i18n/translations.json`:
- Around line 30-32: The new localization keys ("Record audio", "Cancel
recording", "Confirm recording") were added only to the French translations,
causing other locales to fallback; add these same keys with appropriate
translations (or at minimum placeholder English strings) to all locale
translation files (e.g., NL, RU, UK) so the UI has consistent strings across
languages—update the corresponding translations.json for each locale to include
the three keys with proper translated values.
- Line 71: Update the French translation for the key "An error occurred during
transcription. Please try again." by replacing "ré-essayer" with the preferred
spelling "réessayer" so the value becomes "Il semble y avoir eu une erreur dans
la transcription. Merci de réessayer."; locate and edit the string value in
translations.json associated with that exact English key.

In `@src/frontend/apps/conversations/src/pages/index.tsx`:
- Around line 4-5: Remove the two leftover debug comments "// test reload" and
"// test reload 2" found in pages/index.tsx (they are dev/hot-reload artifacts);
simply delete those comment lines so the file contains only production code and
no stray debug comments.

---

Nitpick comments:
In `@src/backend/chat/input/base_audio_backend.py`:
- Around line 14-19: The docstring for the transcribe method in BaseAudioBackend
mismatches its signature: it documents a "language" parameter that does not
exist. Update the transcribe docstring to reflect the actual signature of
BaseAudioBackend.transcribe by removing the "language" entry (or if you intend
to support language, add a language parameter to the method signature and update
implementations accordingly); ensure the Args section only lists file_name,
file_content, and content_type to match the method name transcribe and its
parameters.

In `@src/backend/chat/tests/input/test_albert_audio_backend.py`:
- Around line 49-63: Update the test_transcribe_sends_correct_request test to
also assert that the configured ASR form fields are sent: after calling
backend.transcribe and grabbing request = responses.calls[0].request, check the
multipart request.body contains the form field names and values for the ASR
configuration (e.g. that b'name="model"' and the expected model value, and
b'name="language"' and the expected language value, are present) in addition to
the existing Authorization and audio-content assertions so regressions that drop
model or language will fail.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5de646e4-29c1-4ee3-be16-25be1006780c

📥 Commits

Reviewing files that changed from the base of the PR and between 26a5fa1 and a8beba8.

⛔ Files ignored due to path filters (1)
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (21)
  • package.json
  • src/backend/chat/exceptions.py
  • src/backend/chat/input/__init__.py
  • src/backend/chat/input/albert_audio_backend.py
  • src/backend/chat/input/base_audio_backend.py
  • src/backend/chat/tests/input/__init__.py
  • src/backend/chat/tests/input/test_albert_audio_backend.py
  • src/backend/chat/tests/views/test_audio_transcription.py
  • src/backend/chat/views.py
  • src/backend/conversations/settings.py
  • src/backend/core/api/viewsets.py
  • src/backend/core/urls.py
  • src/frontend/apps/conversations/src/core/config/api/useConfig.tsx
  • src/frontend/apps/conversations/src/features/chat/components/AudioRecordButton.tsx
  • src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx
  • src/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsx
  • src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx
  • src/frontend/apps/conversations/src/i18n/translations.json
  • src/frontend/apps/conversations/src/pages/globals.css
  • src/frontend/apps/conversations/src/pages/index.tsx
  • src/frontend/apps/conversations/tsconfig.json

Comment on lines +24 to +27
def __init__(self):
self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail fast when Albert credentials are not configured.

Line 26 will send Authorization: Bearer None if ALBERT_API_KEY is unset, which turns a config error into a runtime upstream failure. Validate required settings at initialization and raise a configuration error early.

🛠️ Suggested fix
 from django.conf import settings
+from django.core.exceptions import ImproperlyConfigured
@@
     def __init__(self):
+        if not settings.ALBERT_API_KEY:
+            raise ImproperlyConfigured("ALBERT_API_KEY must be set for AlbertAudioBackend")
         self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
         self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __init__(self):
self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}
from django.conf import settings
from django.core.exceptions import ImproperlyConfigured
def __init__(self):
if not settings.ALBERT_API_KEY:
raise ImproperlyConfigured("ALBERT_API_KEY must be set for AlbertAudioBackend")
self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/input/albert_audio_backend.py` around lines 24 - 27, In the
__init__ of AlbertAudioBackend, validate required config before building
_transcriptions_endpoint and _headers: check settings.ALBERT_API_KEY (and
optionally settings.ALBERT_API_URL) are set/non-empty and raise a clear
configuration exception (e.g., ValueError or a ConfigError) if missing so you
don't create _headers = {"Authorization": f"Bearer None"}; construct
_transcriptions_endpoint and _headers only after validation so failures fail
fast during initialization.

Comment on lines +57 to +59
response.raise_for_status()
return response.json().get("text", "").strip()
except requests.RequestException as e:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not silently accept invalid transcription payloads.

Line 58 falls back to "" when text is missing, which can silently drop user input and look like a valid success. Treat missing/invalid text as transcription failure.

🛠️ Suggested fix
             response.raise_for_status()
-            return response.json().get("text", "").strip()
+            payload = response.json()
+            text = payload.get("text")
+            if not isinstance(text, str) or not text.strip():
+                raise TranscriptionError("Transcription response missing non-empty 'text'")
+            return text.strip()
         except requests.RequestException as e:
             raise TranscriptionError(f"Error during transcription: {e}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
response.raise_for_status()
return response.json().get("text", "").strip()
except requests.RequestException as e:
response.raise_for_status()
payload = response.json()
text = payload.get("text")
if not isinstance(text, str) or not text.strip():
raise TranscriptionError("Transcription response missing non-empty 'text'")
return text.strip()
except requests.RequestException as e:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/input/albert_audio_backend.py` around lines 57 - 59, The
code currently treats a missing or empty "text" field as success by returning
""—change this to treat missing/invalid transcription payloads as failures:
after response.raise_for_status(), parse json = response.json(), then validate
that "text" exists and is a non-empty string; if not, raise an exception (e.g.,
requests.RequestException or RuntimeError) that includes the full response
content for debugging instead of returning an empty string; update the except
block handling requests.RequestException accordingly so transcription failures
propagate.

Comment thread src/backend/chat/views.py
Comment thread src/backend/chat/views.py
Comment on lines +753 to +757
AUDIO_TRANSCRIPTION_BACKEND = values.Value(
None,
environ_name="AUDIO_TRANSCRIPTION_BACKEND",
environ_prefix=None,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate the transcription backend before exposing it.

AUDIO_TRANSCRIPTION_BACKEND is now the switch that enables this feature, but unlike the existing Find/OCR settings it is never sanity-checked during startup. A typo in the import path or missing backend prerequisites will only fail on the first user upload, after the mic UI is already enabled. Please add a post_setup validation for this backend and its required settings so misconfiguration fails fast instead of becoming a runtime 500/502.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/conversations/settings.py` around lines 753 - 757, The
AUDIO_TRANSCRIPTION_BACKEND setting is not validated at startup; add a
post_setup validation in settings.py that resolves and imports the backend
referenced by AUDIO_TRANSCRIPTION_BACKEND, verifies any required companion
settings (e.g., model keys or API creds) are present, and raises a clear
configuration error (or logs and exits) if import or prerequisites fail so the
mic UI cannot enable with a bad path; implement this in the module's post_setup
hook (or add one if missing) and reference AUDIO_TRANSCRIPTION_BACKEND when
performing the import/check so misconfigurations fail fast rather than at first
upload.

Comment thread src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx Outdated
Comment thread src/frontend/apps/conversations/src/i18n/translations.json
Comment thread src/frontend/apps/conversations/src/i18n/translations.json Outdated
Comment thread src/frontend/apps/conversations/src/pages/index.tsx Outdated
Signed-off-by: natoromano <nathanaelpromano@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (2)
src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx (2)

1-1: ⚠️ Potential issue | 🟠 Major

Add unmount teardown for recorder/stream/audio resources.

This was previously flagged and is still missing: if the component unmounts mid-recording, mic/audio resources can remain active until tab close.

🧹 Proposed fix
-import { useCallback, useRef, useState } from 'react';
+import { useCallback, useEffect, useRef, useState } from 'react';
...
+  useEffect(() => {
+    return () => {
+      const recorder = mediaRecorderRef.current;
+      if (recorder && recorder.state !== 'inactive') {
+        recorder.onstop = null;
+        recorder.stop();
+      }
+      mediaRecorderRef.current = null;
+      audioChunksRef.current = [];
+      stopStream();
+    };
+  }, [stopStream]);

Also applies to: 25-30

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
at line 1, The hook useAudioRecording is missing an unmount teardown to release
mic/audio resources; add a useEffect with a cleanup that checks and stops any
active recorder (e.g., recorderRef or MediaRecorder), stops and nulls the media
stream tracks (for each track call stop()), disconnects and closes any
AudioContext and source nodes (e.g., audioContextRef, sourceNodeRef), removes
any event listeners, and clear related refs (mediaStreamRef, recorderRef,
audioChunksRef) so resources are freed if the component unmounts mid-recording.
Ensure the cleanup runs once (empty deps) and mirrors logic in
startRecording/stopRecording functions to avoid leaving tracks or audio contexts
open.

95-99: ⚠️ Potential issue | 🟠 Major

Derive filename extension from normalized MIME type.

This mismatch was already reported and is still present: non-mp4 formats are forced to .webm, which can diverge from actual recorder output on Firefox/Safari.

🎯 Proposed fix
         const mimeType = recorder.mimeType || 'audio/webm';
-        const ext = mimeType.includes('mp4') ? 'mp4' : 'webm';
+        const normalizedMimeType = mimeType.split(';', 1)[0].toLowerCase();
+        const extByMime: Record<string, string> = {
+          'audio/webm': 'webm',
+          'audio/ogg': 'ogg',
+          'audio/mp4': 'mp4',
+          'audio/m4a': 'm4a',
+          'audio/x-m4a': 'm4a',
+        };
+        const ext = extByMime[normalizedMimeType] ?? 'bin';
         const blob = new Blob(audioChunksRef.current, { type: mimeType });
         const form = new FormData();
         form.append('audio', blob, `recording.${ext}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 95 - 99, The code forces non-mp4 formats to .webm by checking
mimeType.includes('mp4'), which can mismatch actual recorder output (e.g.,
Firefox/Safari); normalize the MIME type (strip parameters like charset by using
mimeType.split(';')[0].trim()) and derive the file extension from the normalized
type's subtype (const subtype = normalized.split('/')[1]) with a small mapping
for common audio types (mp4 -> mp4, webm -> webm, ogg -> ogg, mpeg/mp3 -> mp3)
and a sensible fallback (e.g., webm); also use the normalized MIME when creating
the Blob (blob = new Blob(..., { type: normalized })) so the appended FormData
filename (recording.{ext}) and blob type match the actual MIME.
🧹 Nitpick comments (1)
src/backend/conversations/settings.py (1)

758-767: Consider aligning the environ_name with the setting name for consistency.

The setting AUDIO_MAX_SIZE uses environ_name="MAX_AUDIO_UPLOAD_BYTES", which differs from the Django setting name. This inconsistency could cause confusion when configuring the application. Consider using AUDIO_MAX_SIZE as the env var name for consistency with other settings.

Also note that the 512MB default is quite large for audio files—even a 1-hour uncompressed WAV at 44.1kHz stereo is ~635MB. For a POC this is fine, but you may want to tune this for production.

♻️ Suggested alignment
     AUDIO_MAX_SIZE = values.PositiveIntegerValue(
         512 * 1024 * 1024,  # 512 MB
-        environ_name="MAX_AUDIO_UPLOAD_BYTES",
+        environ_name="AUDIO_MAX_SIZE",
         environ_prefix=None,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/conversations/settings.py` around lines 758 - 767, The
environment variable name for the AUDIO_MAX_SIZE setting is inconsistent: change
the values.PositiveIntegerValue(...) call for AUDIO_MAX_SIZE so its environ_name
matches the Django setting name (use "AUDIO_MAX_SIZE" instead of
"MAX_AUDIO_UPLOAD_BYTES"), leaving environ_prefix=None; verify the default 512 *
1024 * 1024 remains or adjust the default value in AUDIO_MAX_SIZE if you want a
smaller production-safe cap, and ensure AUDIO_TRANSCRIPTION_BACKEND's
environ_name is left as-is or similarly aligned if desired.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/chat/input/albert_audio_backend.py`:
- Line 26: The current use of urljoin with an absolute path causes any path in
settings.ALBERT_API_URL to be dropped (e.g., "/api" lost); update the
construction of self._transcriptions_endpoint to concatenate safely by
stripping/trimming slashes instead of using urljoin with a leading slash—use
settings.ALBERT_API_URL.rstrip('/') + '/v1/audio/transcriptions' (or build the
path component without a leading slash before joining) so the base URL's path
prefix is preserved; apply the same pattern wherever urljoin is used with a
leading-path literal.

In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 421-431: The keydown handler calls confirmRecording() which
returns a Promise but is not handled; update the handleKeyDown in the useEffect
(the function referencing recordingState and confirmRecording) to handle the
returned Promise—either prefix the call with void (void confirmRecording()) or
explicitly call confirmRecording().catch(err => {/* handle or log error */}) so
any rejection is handled and no unhandled promise rejection occurs.
- Around line 554-561: The linter complains because startRecording and
confirmRecording return Promises while the prop types expect void; fix this by
passing arrow functions that discard the returned Promise using the void
operator instead of forwarding the async functions directly—update the props
onStartRecording and onConfirmRecording to call void startRecording(...) and
void confirmRecording(...) via arrow wrappers (keeping recordingState, volume,
cancelRecording, etc. unchanged) so TypeScript's strict Promise checks are
satisfied.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 86: The confirmRecording function is marked async but contains no direct
awaits (all async work happens inside recorder.onstop), which triggers
require-await; remove the async keyword from the useCallback declaration for
confirmRecording and keep the existing recorder.onstop handler and returned
value unchanged so behavior stays the same (locate the useCallback named
confirmRecording and delete the async modifier).
- Around line 31-35: The stopVisualization callback currently calls
audioCtxRef.current?.close() without handling the returned Promise; update the
stopVisualization implementation (the useCallback named stopVisualization in
useAudioRecording.tsx) to explicitly handle the Promise from
AudioContext.close()—either make stopVisualization async and await
audioCtxRef.current.close() inside a try/catch, or call
audioCtxRef.current.close().then(...).catch(...)—and only set
audioCtxRef.current = null and setVolume(0) after successfully closing (or in
finally) to avoid unhandled rejections and ensure proper cleanup.
- Around line 107-109: Create a typed interface for the transcription response
(e.g., interface TranscriptionResponse { text?: string; error?: any }) and cast
the parsed JSON to it (const data = await res.json() as TranscriptionResponse)
so member access is typed; keep the original error-checking logic by still using
if (data.error) onTranscriptionErrorRef.current?.(); and before calling
onTranscriptionRef.current ensure type-safety with if (typeof data.text ===
'string') onTranscriptionRef.current(data.text). Use the identifiers data, res,
onTranscriptionRef.current, and onTranscriptionErrorRef.current in
useAudioRecording.tsx.

---

Duplicate comments:
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 1: The hook useAudioRecording is missing an unmount teardown to release
mic/audio resources; add a useEffect with a cleanup that checks and stops any
active recorder (e.g., recorderRef or MediaRecorder), stops and nulls the media
stream tracks (for each track call stop()), disconnects and closes any
AudioContext and source nodes (e.g., audioContextRef, sourceNodeRef), removes
any event listeners, and clear related refs (mediaStreamRef, recorderRef,
audioChunksRef) so resources are freed if the component unmounts mid-recording.
Ensure the cleanup runs once (empty deps) and mirrors logic in
startRecording/stopRecording functions to avoid leaving tracks or audio contexts
open.
- Around line 95-99: The code forces non-mp4 formats to .webm by checking
mimeType.includes('mp4'), which can mismatch actual recorder output (e.g.,
Firefox/Safari); normalize the MIME type (strip parameters like charset by using
mimeType.split(';')[0].trim()) and derive the file extension from the normalized
type's subtype (const subtype = normalized.split('/')[1]) with a small mapping
for common audio types (mp4 -> mp4, webm -> webm, ogg -> ogg, mpeg/mp3 -> mp3)
and a sensible fallback (e.g., webm); also use the normalized MIME when creating
the Blob (blob = new Blob(..., { type: normalized })) so the appended FormData
filename (recording.{ext}) and blob type match the actual MIME.

---

Nitpick comments:
In `@src/backend/conversations/settings.py`:
- Around line 758-767: The environment variable name for the AUDIO_MAX_SIZE
setting is inconsistent: change the values.PositiveIntegerValue(...) call for
AUDIO_MAX_SIZE so its environ_name matches the Django setting name (use
"AUDIO_MAX_SIZE" instead of "MAX_AUDIO_UPLOAD_BYTES"), leaving
environ_prefix=None; verify the default 512 * 1024 * 1024 remains or adjust the
default value in AUDIO_MAX_SIZE if you want a smaller production-safe cap, and
ensure AUDIO_TRANSCRIPTION_BACKEND's environ_name is left as-is or similarly
aligned if desired.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2e76ae4d-6162-4f44-9be0-1d3ba2e6db55

📥 Commits

Reviewing files that changed from the base of the PR and between a8beba8 and f36d82f.

⛔ Files ignored due to path filters (1)
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (25)
  • CHANGELOG.md
  • src/backend/chat/exceptions.py
  • src/backend/chat/input/__init__.py
  • src/backend/chat/input/albert_audio_backend.py
  • src/backend/chat/input/base_audio_backend.py
  • src/backend/chat/tests/input/__init__.py
  • src/backend/chat/tests/input/test_albert_audio_backend.py
  • src/backend/chat/tests/views/test_audio_transcription.py
  • src/backend/chat/views.py
  • src/backend/conversations/settings.py
  • src/backend/core/api/viewsets.py
  • src/backend/core/tests/test_api_config.py
  • src/backend/core/urls.py
  • src/frontend/apps/conversations/src/core/config/api/useConfig.tsx
  • src/frontend/apps/conversations/src/features/chat/components/AudioRecordButton.tsx
  • src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx
  • src/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsx
  • src/frontend/apps/conversations/src/features/chat/components/__tests__/InputChatAction.test.tsx
  • src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx
  • src/frontend/apps/conversations/src/features/header/components/ButtonToggleLeftPanel.tsx
  • src/frontend/apps/conversations/src/features/header/components/Header.tsx
  • src/frontend/apps/conversations/src/features/language/components/LanguagePicker.tsx
  • src/frontend/apps/conversations/src/i18n/translations.json
  • src/frontend/apps/conversations/src/pages/globals.css
  • src/frontend/apps/conversations/tsconfig.json
💤 Files with no reviewable changes (2)
  • src/frontend/apps/conversations/src/features/header/components/Header.tsx
  • src/frontend/apps/conversations/src/features/chat/components/tests/InputChatAction.test.tsx
✅ Files skipped from review due to trivial changes (9)
  • src/frontend/apps/conversations/src/features/header/components/ButtonToggleLeftPanel.tsx
  • src/frontend/apps/conversations/src/features/language/components/LanguagePicker.tsx
  • CHANGELOG.md
  • src/backend/chat/exceptions.py
  • src/frontend/apps/conversations/tsconfig.json
  • src/frontend/apps/conversations/src/pages/globals.css
  • src/frontend/apps/conversations/src/core/config/api/useConfig.tsx
  • src/frontend/apps/conversations/src/i18n/translations.json
  • src/backend/chat/tests/views/test_audio_transcription.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/backend/chat/input/base_audio_backend.py
  • src/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsx
  • src/backend/chat/views.py

"""

def __init__(self):
self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

urljoin with an absolute path discards the base URL's path.

urljoin("https://albert.example.com/api", "/v1/audio/transcriptions") returns "https://albert.example.com/v1/audio/transcriptions" — the /api portion is lost. If ALBERT_API_URL ever includes a path prefix, this will break.

🛠️ Suggested fix using rstrip/lstrip
     def __init__(self):
-        self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
+        base = settings.ALBERT_API_URL.rstrip("/")
+        self._transcriptions_endpoint = f"{base}/v1/audio/transcriptions"
         self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/chat/input/albert_audio_backend.py` at line 26, The current use
of urljoin with an absolute path causes any path in settings.ALBERT_API_URL to
be dropped (e.g., "/api" lost); update the construction of
self._transcriptions_endpoint to concatenate safely by stripping/trimming
slashes instead of using urljoin with a leading slash—use
settings.ALBERT_API_URL.rstrip('/') + '/v1/audio/transcriptions' (or build the
path component without a leading slash before joining) so the base URL's path
prefix is preserved; apply the same pattern wherever urljoin is used with a
leading-path literal.

Comment on lines +31 to +35
const stopVisualization = useCallback(() => {
cancelAnimationFrame(animFrameRef.current);
audioCtxRef.current?.close();
audioCtxRef.current = null;
setVolume(0);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

head -50 src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | cat -n

Repository: suitenumerique/conversations

Length of output: 2123


🌐 Web query:

AudioContext.close() Promise Web API documentation

💡 Result:

The AudioContext.close method is part of the Web Audio API. It closes the AudioContext, releasing system audio resources it uses, suspending audio time progression, and stopping audio data processing. It does not automatically release all AudioContext-created objects (unless other references are gone). The returned Promise resolves with undefined once all AudioContext-creation-blocking resources are released. Calling close on an OfflineAudioContext throws INVALID_STATE_ERR. Syntax: close: Promise Example: audioCtx.close.then( => { // Context closed, resources released }); This is well-supported across browsers since April 2021. Official documentation: MDN Web Docs and Web Audio API specification (webaudio.github.io/web-audio-api/#dom-audiocontext-close, referenced in MDN).

Citations:


Handle AudioContext.close() Promise explicitly.

Line 33 creates a floating Promise—AudioContext.close() returns a Promise<void>, which should be handled to prevent potential unhandled rejections and lint failures.

✅ Proposed fix
   const stopVisualization = useCallback(() => {
     cancelAnimationFrame(animFrameRef.current);
-    audioCtxRef.current?.close();
-    audioCtxRef.current = null;
+    const ctx = audioCtxRef.current;
+    audioCtxRef.current = null;
+    if (ctx) void ctx.close().catch(() => undefined);
     setVolume(0);
   }, []);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const stopVisualization = useCallback(() => {
cancelAnimationFrame(animFrameRef.current);
audioCtxRef.current?.close();
audioCtxRef.current = null;
setVolume(0);
const stopVisualization = useCallback(() => {
cancelAnimationFrame(animFrameRef.current);
const ctx = audioCtxRef.current;
audioCtxRef.current = null;
if (ctx) void ctx.close().catch(() => undefined);
setVolume(0);
🧰 Tools
🪛 GitHub Check: lint-front

[failure] 33-33:
Promises must be awaited, end with a call to .catch, end with a call to .then with a rejection handler or be explicitly marked as ignored with the void operator

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 31 - 35, The stopVisualization callback currently calls
audioCtxRef.current?.close() without handling the returned Promise; update the
stopVisualization implementation (the useCallback named stopVisualization in
useAudioRecording.tsx) to explicitly handle the Promise from
AudioContext.close()—either make stopVisualization async and await
audioCtxRef.current.close() inside a try/catch, or call
audioCtxRef.current.close().then(...).catch(...)—and only set
audioCtxRef.current = null and setVolume(0) after successfully closing (or in
finally) to avoid unhandled rejections and ensure proper cleanup.

Comment thread src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx Outdated
Comment on lines +107 to +109
const data = await res.json();
if (data.error) onTranscriptionErrorRef.current?.();
if (data.text) onTranscriptionRef.current(data.text);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's read the file and see the actual code at lines 107-109
cat -n src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | sed -n '100,120p'

Repository: suitenumerique/conversations

Length of output: 751


🏁 Script executed:

# Let's also check if there are any type definitions or interfaces related to transcription
rg "transcription|Transcription" src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx -A 2 -B 2

Repository: suitenumerique/conversations

Length of output: 1170


🏁 Script executed:

# Check for tsconfig or eslint configuration to understand the lint rules
fd -e tsconfig.json -e .eslintrc -e eslint.config.js src/frontend/apps/conversations | head -20

Repository: suitenumerique/conversations

Length of output: 501


🏁 Script executed:

# Check the full context of the function containing lines 107-109
wc -l src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx

Repository: suitenumerique/conversations

Length of output: 153


🏁 Script executed:

# Find tsconfig.json files
fd --full-path tsconfig.json

Repository: suitenumerique/conversations

Length of output: 194


🏁 Script executed:

# Check the tsconfig at the project root or relevant directory
cat tsconfig.json 2>/dev/null || cat src/tsconfig.json 2>/dev/null || cat src/frontend/tsconfig.json 2>/dev/null || true

Repository: suitenumerique/conversations

Length of output: 54


🏁 Script executed:

# Look for any type definitions or response structures for the transcribe API
rg "transcribe" src/frontend/apps/conversations -A 5 -B 2 | head -60

Repository: suitenumerique/conversations

Length of output: 909


🏁 Script executed:

# Check if there are tests that might reveal the expected response structure
fd -e test.ts -e test.tsx -e spec.ts src/frontend/apps/conversations/src/features/chat | xargs grep -l "transcribe" 2>/dev/null || echo "No test files found"

Repository: suitenumerique/conversations

Length of output: 610


🏁 Script executed:

# Read the tsconfig.json for the conversations app
cat src/frontend/apps/conversations/tsconfig.json

Repository: suitenumerique/conversations

Length of output: 771


🏁 Script executed:

# Find and read fetchAPI definition
rg "fetchAPI" src/frontend/apps/conversations -A 10 -B 5 | head -80

Repository: suitenumerique/conversations

Length of output: 7629


🏁 Script executed:

# Search for any type definitions for API responses in the entire frontend
fd --full-path -e 'types.ts' -e 'types.d.ts' src/frontend | head -10

Repository: suitenumerique/conversations

Length of output: 54


🏁 Script executed:

# Look for API response interfaces or types
rg "interface.*Response|type.*Response" src/frontend/apps/conversations --type ts --type tsx | head -20

Repository: suitenumerique/conversations

Length of output: 99


🏁 Script executed:

# Find the api module definition
fd --full-path '*/api/index.ts' src/frontend/apps/conversations

Repository: suitenumerique/conversations

Length of output: 365


🏁 Script executed:

# Search for fetchAPI definition more broadly
rg "export.*fetchAPI|function fetchAPI|const fetchAPI" src/frontend -A 15 | head -100

Repository: suitenumerique/conversations

Length of output: 2882


🏁 Script executed:

# Look for the api directory structure
fd --full-path -type d 'api' src/frontend/apps/conversations

Repository: suitenumerique/conversations

Length of output: 242


🏁 Script executed:

# Read the fetchAPI implementation
cat -n src/frontend/apps/conversations/src/api/fetchApi.ts

Repository: suitenumerique/conversations

Length of output: 975


🏁 Script executed:

# Look for eslint configuration
fd --glob '.eslintrc*' src/frontend/apps/conversations

Repository: suitenumerique/conversations

Length of output: 279


🏁 Script executed:

# Check for eslint rules in package.json
cat src/frontend/apps/conversations/package.json 2>/dev/null | grep -A 20 '"eslintConfig"' || echo "No eslintConfig in package.json"

Repository: suitenumerique/conversations

Length of output: 103


🏁 Script executed:

# Look at the actual current code again with more context to understand if there are actual lint issues
cat -n src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | sed -n '95,115p'

Repository: suitenumerique/conversations

Length of output: 991


🏁 Script executed:

# Check if there's an eslint config file with -H flag to show hidden files
fd --hidden -glob '.eslintrc*' src/frontend/apps/conversations

Repository: suitenumerique/conversations

Length of output: 193


🏁 Script executed:

# Look at the entire useAudioRecording.tsx file to understand the full context
cat src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx

Repository: suitenumerique/conversations

Length of output: 4377


Type the transcription response before member access.

Lines 107-109 currently assign await res.json() to data with type any, causing unsafe member access and unsafe argument issues. Line 109 passes data.text (type any) to a function expecting string, which violates strict type checking.

The proposed fix has a semantic issue: changing if (data.error) to if (typeof data.error === 'string') alters the error-checking logic. A simpler fix preserves the original behavior while adding type safety:

Suggested fix
-          const data = await res.json();
+          const data = (await res.json()) as Record<string, unknown>;
           if (data.error) onTranscriptionErrorRef.current?.();
           if (data.text) onTranscriptionRef.current(data.text as string);

Alternatively, define a proper interface for the transcription response and cast to it instead of Record<string, unknown>.

🧰 Tools
🪛 GitHub Check: lint-front

[failure] 109-109:
Unsafe member access .text on an any value


[failure] 109-109:
Unsafe argument of type any assigned to a parameter of type string


[failure] 109-109:
Unsafe member access .text on an any value


[failure] 108-108:
Unsafe member access .error on an any value


[failure] 107-107:
Unsafe assignment of an any value

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 107 - 109, Create a typed interface for the transcription response
(e.g., interface TranscriptionResponse { text?: string; error?: any }) and cast
the parsed JSON to it (const data = await res.json() as TranscriptionResponse)
so member access is typed; keep the original error-checking logic by still using
if (data.error) onTranscriptionErrorRef.current?.(); and before calling
onTranscriptionRef.current ensure type-safety with if (typeof data.text ===
'string') onTranscriptionRef.current(data.text). Use the identifiers data, res,
onTranscriptionRef.current, and onTranscriptionErrorRef.current in
useAudioRecording.tsx.

Signed-off-by: natoromano <nathanaelpromano@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx (2)

21-25: Avoid mutating refs during render.

Assigning to onTranscriptionRef.current / onTranscriptionErrorRef.current in the component body runs on every render and mutates during the render phase. Under React 19's concurrent rendering, a render can be discarded, leaving refs pointing at a stale callback from an abandoned render. Move the sync into a layout effect so it only runs on committed renders.

♻️ Proposed refactor
-import { useCallback, useRef, useState } from 'react';
+import { useCallback, useLayoutEffect, useRef, useState } from 'react';
@@
-  const onTranscriptionRef = useRef(onTranscription);
-  onTranscriptionRef.current = onTranscription;
-
-  const onTranscriptionErrorRef = useRef(onTranscriptionError);
-  onTranscriptionErrorRef.current = onTranscriptionError;
+  const onTranscriptionRef = useRef(onTranscription);
+  const onTranscriptionErrorRef = useRef(onTranscriptionError);
+  useLayoutEffect(() => {
+    onTranscriptionRef.current = onTranscription;
+    onTranscriptionErrorRef.current = onTranscriptionError;
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 21 - 25, The refs onTranscriptionRef and onTranscriptionErrorRef
are being mutated during render; move the assignments into a committed-phase
effect by replacing the direct assigns with a useLayoutEffect that sets
onTranscriptionRef.current = onTranscription and onTranscriptionErrorRef.current
= onTranscriptionError so updates only occur after commit; ensure
useLayoutEffect is imported and used inside the same hook (useAudioRecording /
the component that declares these refs) to avoid render-phase mutation and stale
callbacks from discarded renders.

93-117: recorder.onstop is assigned an async function; its rejection is unhandled.

MediaRecorder.onstop has a void return type, so the returned Promise from this async handler is never awaited or caught by the browser. The try/catch inside covers fetchAPI and res.json(), but anything thrown synchronously before the try (e.g., new Blob under extreme memory pressure, or stopStream throwing) becomes an unhandled rejection. Not a blocker — the try covers the main failure modes — but worth a small scope tightening.

♻️ Optional hardening
-      recorder.onstop = async () => {
-        stopStream();
-        setRecordingState('transcribing');
-
-        const mimeType = recorder.mimeType || 'audio/webm';
-        ...
-        try {
-          ...
-        } catch {
-          onTranscriptionErrorRef.current?.();
-        } finally {
-          setRecordingState('idle');
-        }
-      };
+      recorder.onstop = () => {
+        void (async () => {
+          stopStream();
+          setRecordingState('transcribing');
+          try {
+            ...
+          } catch {
+            onTranscriptionErrorRef.current?.();
+          } finally {
+            setRecordingState('idle');
+          }
+        })();
+      };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 93 - 117, The onstop handler assigned to recorder.onstop is async
so any rejection from code outside the inner try/catch (e.g., stopStream(), new
Blob(...)) can become an unhandled rejection; fix by making recorder.onstop a
synchronous function that immediately wraps the async work in a .catch-safe
block (either a try/catch around the entire body or call an async IIFE and
attach .catch), ensuring all exceptions call onTranscriptionErrorRef.current?.()
and setRecordingState('idle'); update the handler that references
recorder.onstop, stopStream, setRecordingState, audioChunksRef, new Blob,
fetchAPI('transcribe/') and res.json() so every possible throw is caught.
src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx (1)

384-395: Directly mutating a controlled textarea's DOM value is fragile.

The textarea is a controlled input (value={input ?? ''} on Line 506), so setting textarea.value = text is immediately overwritten on the next render unless the synthesized handleTextareaChange call manages to flush through handleInputChange first. It happens to work because you build a fake event whose target.value carries the new text, but relying on React re-reading that value out of a mutated DOM node is brittle (it breaks if handleInputChange is ever memoized on a stale input, or if the parent batches state updates differently). Prefer routing transcribed text through the same state pipeline that the parent already owns.

♻️ Proposed refactor

If handleInputChange ultimately calls a setInput, lift that setter (or expose an appendInput(text)) up and call it here instead of faking an event. At minimum, keep the fake-event approach but skip the DOM mutation:

-  const handleAudioTranscription = useCallback(
-    (text: string) => {
-      const textarea = textareaRef.current;
-      if (!textarea) return;
-      textarea.value = text;
-      handleTextareaChange({
-        target: textarea,
-      } as React.ChangeEvent<HTMLTextAreaElement>);
-      textarea.focus();
-    },
-    [handleTextareaChange],
-  );
+  const handleAudioTranscription = useCallback(
+    (text: string) => {
+      const textarea = textareaRef.current;
+      if (!textarea) return;
+      handleTextareaChange({
+        target: { ...textarea, value: text, style: textarea.style, scrollHeight: textarea.scrollHeight },
+      } as unknown as React.ChangeEvent<HTMLTextAreaElement>);
+      textarea.focus();
+    },
+    [handleTextareaChange],
+  );

The cleaner long-term fix is to have the parent expose a dedicated setter for the input string and call it directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`
around lines 384 - 395, The current handleAudioTranscription mutates the DOM
node (textarea.value = text) which is fragile for a controlled input; instead
route the transcribed text through the component's state updater used by the
parent (do not write to textarea.value). Update handleAudioTranscription to call
the parent's input setter or an exposed helper (e.g., setInput or appendInput)
so the new text flows through the controlled value pipeline; if that setter
isn't available, at minimum stop mutating textareaRef.current and invoke
handleTextareaChange with a synthetic event whose target.value is the text
(keeping textareaRef only for focus), referencing handleAudioTranscription,
textareaRef, handleTextareaChange and the controlled value prop (value={input ??
''}) to locate the places to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 421-431: The current document-level Enter handler intercepts all
Enter presses while recordingState === 'recording' which can hijack other
inputs; change the listener to target only the chat input element instead of
document by attaching the keydown handler to the chat textarea/input element
(use a ref to the input or the component container) or by checking the
event.target inside handleKeyDown (ensure e.target is the chat textarea or
within the chat container via e.target instanceof HTMLElement &&
e.target.closest('.your-chat-input-class')). Keep the same key checks and
preventDefault only when the target matches, attach with
inputRef.current.addEventListener and remove on cleanup, and update references
to recordingState and confirmRecording accordingly.

---

Nitpick comments:
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 384-395: The current handleAudioTranscription mutates the DOM node
(textarea.value = text) which is fragile for a controlled input; instead route
the transcribed text through the component's state updater used by the parent
(do not write to textarea.value). Update handleAudioTranscription to call the
parent's input setter or an exposed helper (e.g., setInput or appendInput) so
the new text flows through the controlled value pipeline; if that setter isn't
available, at minimum stop mutating textareaRef.current and invoke
handleTextareaChange with a synthetic event whose target.value is the text
(keeping textareaRef only for focus), referencing handleAudioTranscription,
textareaRef, handleTextareaChange and the controlled value prop (value={input ??
''}) to locate the places to change.

In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Around line 21-25: The refs onTranscriptionRef and onTranscriptionErrorRef are
being mutated during render; move the assignments into a committed-phase effect
by replacing the direct assigns with a useLayoutEffect that sets
onTranscriptionRef.current = onTranscription and onTranscriptionErrorRef.current
= onTranscriptionError so updates only occur after commit; ensure
useLayoutEffect is imported and used inside the same hook (useAudioRecording /
the component that declares these refs) to avoid render-phase mutation and stale
callbacks from discarded renders.
- Around line 93-117: The onstop handler assigned to recorder.onstop is async so
any rejection from code outside the inner try/catch (e.g., stopStream(), new
Blob(...)) can become an unhandled rejection; fix by making recorder.onstop a
synchronous function that immediately wraps the async work in a .catch-safe
block (either a try/catch around the entire body or call an async IIFE and
attach .catch), ensuring all exceptions call onTranscriptionErrorRef.current?.()
and setRecordingState('idle'); update the handler that references
recorder.onstop, stopStream, setRecordingState, audioChunksRef, new Blob,
fetchAPI('transcribe/') and res.json() so every possible throw is caught.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 548e3f97-d7bf-47e9-b7cd-c02c28ebd38e

📥 Commits

Reviewing files that changed from the base of the PR and between f36d82f and 74204d2.

📒 Files selected for processing (4)
  • src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx
  • src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx
  • src/frontend/apps/conversations/src/features/chat/types.tsx
  • src/frontend/apps/conversations/src/i18n/translations.json
✅ Files skipped from review due to trivial changes (2)
  • src/frontend/apps/conversations/src/features/chat/types.tsx
  • src/frontend/apps/conversations/src/i18n/translations.json

Comment on lines +421 to +431
useEffect(() => {
if (recordingState !== 'recording') return;
const handleKeyDown = (e: KeyboardEvent) => {
if (e.key === 'Enter' && !e.ctrlKey && !e.shiftKey) {
e.preventDefault();
void confirmRecording();
}
};
document.addEventListener('keydown', handleKeyDown);
return () => document.removeEventListener('keydown', handleKeyDown);
}, [recordingState, confirmRecording]);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Document-level Enter handler is too broad.

While recordingState === 'recording', any Enter keypress on the page (e.g., inside another input, a modal, a button) will be intercepted, have preventDefault() called, and trigger confirmRecording(). Scope this to a more specific target (e.g., only when focus is on the textarea or on body) to avoid accidentally hijacking unrelated Enter presses.

🩹 Proposed fix
     const handleKeyDown = (e: KeyboardEvent) => {
       if (e.key === 'Enter' && !e.ctrlKey && !e.shiftKey) {
+        const target = e.target as HTMLElement | null;
+        const tag = target?.tagName;
+        const isInteractive =
+          tag === 'INPUT' || tag === 'TEXTAREA' || target?.isContentEditable;
+        if (isInteractive && target !== textareaRef.current) return;
         e.preventDefault();
         void confirmRecording();
       }
     };
🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis

[failure] 426-426: Remove this use of the "void" operator.

See more on https://sonarcloud.io/project/issues?id=suitenumerique_conversations&issues=AZ2aoPm1E8rVSZSZ0hok&open=AZ2aoPm1E8rVSZSZ0hok&pullRequest=414

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`
around lines 421 - 431, The current document-level Enter handler intercepts all
Enter presses while recordingState === 'recording' which can hijack other
inputs; change the listener to target only the chat input element instead of
document by attaching the keydown handler to the chat textarea/input element
(use a ref to the input or the component container) or by checking the
event.target inside handleKeyDown (ensure e.target is the chat textarea or
within the chat container via e.target instanceof HTMLElement &&
e.target.closest('.your-chat-input-class')). Keep the same key checks and
preventDefault only when the target matches, attach with
inputRef.current.addEventListener and remove on cleanup, and update references
to recordingState and confirmRecording accordingly.

Signed-off-by: natoromano <nathanaelpromano@gmail.com>
@sonarqubecloud
Copy link
Copy Markdown

@eliott07 eliott07 mentioned this pull request Apr 23, 2026
@Tibo-aplus
Copy link
Copy Markdown

par curiosité ça passe en prod quand ? J'ai très hate de tester et de pouvoir potentiellement enfin annuler mon abonnement ChatGPT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants