✨(feat) POC for audio prompting with Albert API#414
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 51 minutes and 35 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds end-to-end audio transcription: backend API endpoint and settings, abstract and Albert transcription backends, frontend recording hook and UI components, types/translations/styles, and tests for backend and frontend flows. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Browser
participant Frontend as InputChat / useAudioRecording
participant API as AudioTranscriptionView
participant Albert as Albert API
User->>Frontend: Click record
Frontend->>Browser: request microphone access
Browser-->>Frontend: MediaStream + audio samples
Frontend->>Frontend: start MediaRecorder & visualize volume
User->>Frontend: Confirm recording
Frontend->>Browser: stop MediaRecorder, build Blob
Frontend->>API: POST /api/.../transcribe/ (FormData with audio)
API->>API: validate auth & size
API->>Albert: POST /v1/audio/transcriptions (multipart, Bearer)
Albert-->>API: { "text": "..." }
API-->>Frontend: { "text": "..." }
Frontend->>Frontend: insert text into textarea
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: natoromano <nathanaelpromano@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (2)
src/backend/chat/input/base_audio_backend.py (1)
14-19: Fixtranscribecontract doc mismatch.Line 18 documents a
languageargument that is not part of the method signature on Line 10. Please align docstring and signature to avoid backend implementation drift.🛠️ Suggested doc fix
def transcribe(self, file_name: str, file_content: bytes, content_type: str) -> str: """ Transcribe the given audio file to text. @@ Args: file_name (str): The original file name of the audio file. file_content (bytes): The raw audio file content. content_type (str): The MIME type of the audio file. - language (str): The language to transcribe the audio content in.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/backend/chat/input/base_audio_backend.py` around lines 14 - 19, The docstring for the transcribe method in BaseAudioBackend mismatches its signature: it documents a "language" parameter that does not exist. Update the transcribe docstring to reflect the actual signature of BaseAudioBackend.transcribe by removing the "language" entry (or if you intend to support language, add a language parameter to the method signature and update implementations accordingly); ensure the Args section only lists file_name, file_content, and content_type to match the method name transcribe and its parameters.src/backend/chat/tests/input/test_albert_audio_backend.py (1)
49-63: Assert the configured ASR fields too.This test only proves the auth header and raw bytes survive the request. A regression that drops
modelorlanguagefrom the multipart form would still pass, even though both are now part of the backend configuration added in this PR. Please assert those fields in the outbound request as well.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/backend/chat/tests/input/test_albert_audio_backend.py` around lines 49 - 63, Update the test_transcribe_sends_correct_request test to also assert that the configured ASR form fields are sent: after calling backend.transcribe and grabbing request = responses.calls[0].request, check the multipart request.body contains the form field names and values for the ASR configuration (e.g. that b'name="model"' and the expected model value, and b'name="language"' and the expected language value, are present) in addition to the existing Authorization and audio-content assertions so regressions that drop model or language will fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/backend/chat/input/albert_audio_backend.py`:
- Around line 57-59: The code currently treats a missing or empty "text" field
as success by returning ""—change this to treat missing/invalid transcription
payloads as failures: after response.raise_for_status(), parse json =
response.json(), then validate that "text" exists and is a non-empty string; if
not, raise an exception (e.g., requests.RequestException or RuntimeError) that
includes the full response content for debugging instead of returning an empty
string; update the except block handling requests.RequestException accordingly
so transcription failures propagate.
- Around line 24-27: In the __init__ of AlbertAudioBackend, validate required
config before building _transcriptions_endpoint and _headers: check
settings.ALBERT_API_KEY (and optionally settings.ALBERT_API_URL) are
set/non-empty and raise a clear configuration exception (e.g., ValueError or a
ConfigError) if missing so you don't create _headers = {"Authorization":
f"Bearer None"}; construct _transcriptions_endpoint and _headers only after
validation so failures fail fast during initialization.
In `@src/backend/chat/views.py`:
- Around line 758-763: Add a ScopedRateThrottle to the transcribe view: import
ScopedRateThrottle and set throttle_classes = [ScopedRateThrottle] and
throttle_scope = "transcribe" on the same view/class that currently defines
permission_classes and parser_classes (so the ASR-triggering endpoint is
rate-limited). Then add a matching DEFAULT_THROTTLE_RATES entry (e.g.
"transcribe": "X/min" or whatever quota you choose) in the REST_FRAMEWORK
settings so the scope is enforced. Ensure you choose an appropriate rate and
update any tests/configs that expect unlimited calls.
- Around line 772-782: The code reads the entire uploaded file into memory via
audio_file.read() before any size validation; add an early size check using the
uploaded file object (e.g., audio_file.size or audio_file.size attribute on the
InMemoryUploadedFile/TemporaryUploadedFile returned by request.FILES.get) and
return a 413/400 response if it exceeds a configured MAX_AUDIO_UPLOAD_BYTES
constant, then pass the file-like object (not materialized bytes) into the
transcription backend (update the backend.imported via
settings.AUDIO_TRANSCRIPTION_BACKEND and its transcribe method to accept a
stream/file-like argument instead of bytes) so large uploads are rejected before
buffering and the backend can stream the data.
In `@src/backend/conversations/settings.py`:
- Around line 753-757: The AUDIO_TRANSCRIPTION_BACKEND setting is not validated
at startup; add a post_setup validation in settings.py that resolves and imports
the backend referenced by AUDIO_TRANSCRIPTION_BACKEND, verifies any required
companion settings (e.g., model keys or API creds) are present, and raises a
clear configuration error (or logs and exits) if import or prerequisites fail so
the mic UI cannot enable with a bad path; implement this in the module's
post_setup hook (or add one if missing) and reference
AUDIO_TRANSCRIPTION_BACKEND when performing the import/check so
misconfigurations fail fast rather than at first upload.
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 4: The import of the Node.js built-in 'events' module (import { on } from
'events';) is unused in useAudioRecording.tsx and should be removed; open the
useAudioRecording hook and delete that import line (and any unused-import
entries referencing `on`) so client-side bundling doesn't attempt to include the
Node built-in and lint/prettier passes cleanly.
- Around line 26-37: The hook currently doesn't release media and audio
resources on unmount; add a cleanup effect that runs on component unmount to
stop and null out active resources: call stopVisualization(), stop and nullify
mediaRecorderRef.current (and finalize audioChunksRef if needed), stop all
tracks on streamRef.current and set streamRef.current = null, cancel animation
frame via animFrameRef and close audioCtxRef (if stopVisualization isn't
sufficient ensure audioCtxRef.current?.close()), and clear refs so the
microphone and AudioContext are released when the component unmounts.
- Around line 67-84: startRecording currently calls startVisualization() and
setRecordingState('recording') before MediaRecorder is fully constructed, and
the catch block doesn't stop the acquired stream; move the MediaRecorder
construction (const mediaRecorder = new MediaRecorder(stream)),
mediaRecorder.ondataavailable setup, mediaRecorder.start(), and mediaRecorderRef
assignment to occur before calling startVisualization() and setRecordingState so
visualization/state only run on successful recorder init, add stopStream() to
the catch block to stop tracks and prevent mic/visualization leaks if any step
throws, and add stopStream to the useCallback dependency array; reference
startRecording, startVisualization, mediaRecorderRef, streamRef, stopStream,
setRecordingState, and onTranscriptionErrorRef when making the change.
- Around line 95-99: The filename extension is being hard-coded to 'webm' for
all non-mp4 MIME types in useAudioRecording.tsx (see mimeType, ext, blob,
form.append), causing mismatches; update the logic that sets ext so it derives
the extension from the actual mimeType (e.g., map 'audio/webm'->'webm',
'audio/ogg'->'ogg', 'audio/mp4' or 'audio/m4a'->'mp4', 'audio/x-m4a'->'m4a',
etc.) and fall back to a safe default like 'bin' only if unknown, then continue
to create the Blob and call form.append('audio', blob, `recording.${ext}`) with
the correct extension. Ensure the mapping covers common browser types
(Firefox/Safari/Chrome) so filename matches content-type.
In `@src/frontend/apps/conversations/src/i18n/translations.json`:
- Around line 30-32: The new localization keys ("Record audio", "Cancel
recording", "Confirm recording") were added only to the French translations,
causing other locales to fallback; add these same keys with appropriate
translations (or at minimum placeholder English strings) to all locale
translation files (e.g., NL, RU, UK) so the UI has consistent strings across
languages—update the corresponding translations.json for each locale to include
the three keys with proper translated values.
- Line 71: Update the French translation for the key "An error occurred during
transcription. Please try again." by replacing "ré-essayer" with the preferred
spelling "réessayer" so the value becomes "Il semble y avoir eu une erreur dans
la transcription. Merci de réessayer."; locate and edit the string value in
translations.json associated with that exact English key.
In `@src/frontend/apps/conversations/src/pages/index.tsx`:
- Around line 4-5: Remove the two leftover debug comments "// test reload" and
"// test reload 2" found in pages/index.tsx (they are dev/hot-reload artifacts);
simply delete those comment lines so the file contains only production code and
no stray debug comments.
---
Nitpick comments:
In `@src/backend/chat/input/base_audio_backend.py`:
- Around line 14-19: The docstring for the transcribe method in BaseAudioBackend
mismatches its signature: it documents a "language" parameter that does not
exist. Update the transcribe docstring to reflect the actual signature of
BaseAudioBackend.transcribe by removing the "language" entry (or if you intend
to support language, add a language parameter to the method signature and update
implementations accordingly); ensure the Args section only lists file_name,
file_content, and content_type to match the method name transcribe and its
parameters.
In `@src/backend/chat/tests/input/test_albert_audio_backend.py`:
- Around line 49-63: Update the test_transcribe_sends_correct_request test to
also assert that the configured ASR form fields are sent: after calling
backend.transcribe and grabbing request = responses.calls[0].request, check the
multipart request.body contains the form field names and values for the ASR
configuration (e.g. that b'name="model"' and the expected model value, and
b'name="language"' and the expected language value, are present) in addition to
the existing Authorization and audio-content assertions so regressions that drop
model or language will fail.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5de646e4-29c1-4ee3-be16-25be1006780c
⛔ Files ignored due to path filters (1)
yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (21)
package.jsonsrc/backend/chat/exceptions.pysrc/backend/chat/input/__init__.pysrc/backend/chat/input/albert_audio_backend.pysrc/backend/chat/input/base_audio_backend.pysrc/backend/chat/tests/input/__init__.pysrc/backend/chat/tests/input/test_albert_audio_backend.pysrc/backend/chat/tests/views/test_audio_transcription.pysrc/backend/chat/views.pysrc/backend/conversations/settings.pysrc/backend/core/api/viewsets.pysrc/backend/core/urls.pysrc/frontend/apps/conversations/src/core/config/api/useConfig.tsxsrc/frontend/apps/conversations/src/features/chat/components/AudioRecordButton.tsxsrc/frontend/apps/conversations/src/features/chat/components/InputChat.tsxsrc/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsxsrc/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsxsrc/frontend/apps/conversations/src/i18n/translations.jsonsrc/frontend/apps/conversations/src/pages/globals.csssrc/frontend/apps/conversations/src/pages/index.tsxsrc/frontend/apps/conversations/tsconfig.json
| def __init__(self): | ||
| self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions") | ||
| self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"} | ||
|
|
There was a problem hiding this comment.
Fail fast when Albert credentials are not configured.
Line 26 will send Authorization: Bearer None if ALBERT_API_KEY is unset, which turns a config error into a runtime upstream failure. Validate required settings at initialization and raise a configuration error early.
🛠️ Suggested fix
from django.conf import settings
+from django.core.exceptions import ImproperlyConfigured
@@
def __init__(self):
+ if not settings.ALBERT_API_KEY:
+ raise ImproperlyConfigured("ALBERT_API_KEY must be set for AlbertAudioBackend")
self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def __init__(self): | |
| self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions") | |
| self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"} | |
| from django.conf import settings | |
| from django.core.exceptions import ImproperlyConfigured | |
| def __init__(self): | |
| if not settings.ALBERT_API_KEY: | |
| raise ImproperlyConfigured("ALBERT_API_KEY must be set for AlbertAudioBackend") | |
| self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions") | |
| self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"} |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/backend/chat/input/albert_audio_backend.py` around lines 24 - 27, In the
__init__ of AlbertAudioBackend, validate required config before building
_transcriptions_endpoint and _headers: check settings.ALBERT_API_KEY (and
optionally settings.ALBERT_API_URL) are set/non-empty and raise a clear
configuration exception (e.g., ValueError or a ConfigError) if missing so you
don't create _headers = {"Authorization": f"Bearer None"}; construct
_transcriptions_endpoint and _headers only after validation so failures fail
fast during initialization.
| response.raise_for_status() | ||
| return response.json().get("text", "").strip() | ||
| except requests.RequestException as e: |
There was a problem hiding this comment.
Do not silently accept invalid transcription payloads.
Line 58 falls back to "" when text is missing, which can silently drop user input and look like a valid success. Treat missing/invalid text as transcription failure.
🛠️ Suggested fix
response.raise_for_status()
- return response.json().get("text", "").strip()
+ payload = response.json()
+ text = payload.get("text")
+ if not isinstance(text, str) or not text.strip():
+ raise TranscriptionError("Transcription response missing non-empty 'text'")
+ return text.strip()
except requests.RequestException as e:
raise TranscriptionError(f"Error during transcription: {e}") from e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| response.raise_for_status() | |
| return response.json().get("text", "").strip() | |
| except requests.RequestException as e: | |
| response.raise_for_status() | |
| payload = response.json() | |
| text = payload.get("text") | |
| if not isinstance(text, str) or not text.strip(): | |
| raise TranscriptionError("Transcription response missing non-empty 'text'") | |
| return text.strip() | |
| except requests.RequestException as e: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/backend/chat/input/albert_audio_backend.py` around lines 57 - 59, The
code currently treats a missing or empty "text" field as success by returning
""—change this to treat missing/invalid transcription payloads as failures:
after response.raise_for_status(), parse json = response.json(), then validate
that "text" exists and is a non-empty string; if not, raise an exception (e.g.,
requests.RequestException or RuntimeError) that includes the full response
content for debugging instead of returning an empty string; update the except
block handling requests.RequestException accordingly so transcription failures
propagate.
| AUDIO_TRANSCRIPTION_BACKEND = values.Value( | ||
| None, | ||
| environ_name="AUDIO_TRANSCRIPTION_BACKEND", | ||
| environ_prefix=None, | ||
| ) |
There was a problem hiding this comment.
Validate the transcription backend before exposing it.
AUDIO_TRANSCRIPTION_BACKEND is now the switch that enables this feature, but unlike the existing Find/OCR settings it is never sanity-checked during startup. A typo in the import path or missing backend prerequisites will only fail on the first user upload, after the mic UI is already enabled. Please add a post_setup validation for this backend and its required settings so misconfiguration fails fast instead of becoming a runtime 500/502.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/backend/conversations/settings.py` around lines 753 - 757, The
AUDIO_TRANSCRIPTION_BACKEND setting is not validated at startup; add a
post_setup validation in settings.py that resolves and imports the backend
referenced by AUDIO_TRANSCRIPTION_BACKEND, verifies any required companion
settings (e.g., model keys or API creds) are present, and raises a clear
configuration error (or logs and exits) if import or prerequisites fail so the
mic UI cannot enable with a bad path; implement this in the module's post_setup
hook (or add one if missing) and reference AUDIO_TRANSCRIPTION_BACKEND when
performing the import/check so misconfigurations fail fast rather than at first
upload.
Signed-off-by: natoromano <nathanaelpromano@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (2)
src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx (2)
1-1:⚠️ Potential issue | 🟠 MajorAdd unmount teardown for recorder/stream/audio resources.
This was previously flagged and is still missing: if the component unmounts mid-recording, mic/audio resources can remain active until tab close.
🧹 Proposed fix
-import { useCallback, useRef, useState } from 'react'; +import { useCallback, useEffect, useRef, useState } from 'react'; ... + useEffect(() => { + return () => { + const recorder = mediaRecorderRef.current; + if (recorder && recorder.state !== 'inactive') { + recorder.onstop = null; + recorder.stop(); + } + mediaRecorderRef.current = null; + audioChunksRef.current = []; + stopStream(); + }; + }, [stopStream]);Also applies to: 25-30
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx` at line 1, The hook useAudioRecording is missing an unmount teardown to release mic/audio resources; add a useEffect with a cleanup that checks and stops any active recorder (e.g., recorderRef or MediaRecorder), stops and nulls the media stream tracks (for each track call stop()), disconnects and closes any AudioContext and source nodes (e.g., audioContextRef, sourceNodeRef), removes any event listeners, and clear related refs (mediaStreamRef, recorderRef, audioChunksRef) so resources are freed if the component unmounts mid-recording. Ensure the cleanup runs once (empty deps) and mirrors logic in startRecording/stopRecording functions to avoid leaving tracks or audio contexts open.
95-99:⚠️ Potential issue | 🟠 MajorDerive filename extension from normalized MIME type.
This mismatch was already reported and is still present: non-mp4 formats are forced to
.webm, which can diverge from actual recorder output on Firefox/Safari.🎯 Proposed fix
const mimeType = recorder.mimeType || 'audio/webm'; - const ext = mimeType.includes('mp4') ? 'mp4' : 'webm'; + const normalizedMimeType = mimeType.split(';', 1)[0].toLowerCase(); + const extByMime: Record<string, string> = { + 'audio/webm': 'webm', + 'audio/ogg': 'ogg', + 'audio/mp4': 'mp4', + 'audio/m4a': 'm4a', + 'audio/x-m4a': 'm4a', + }; + const ext = extByMime[normalizedMimeType] ?? 'bin'; const blob = new Blob(audioChunksRef.current, { type: mimeType }); const form = new FormData(); form.append('audio', blob, `recording.${ext}`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx` around lines 95 - 99, The code forces non-mp4 formats to .webm by checking mimeType.includes('mp4'), which can mismatch actual recorder output (e.g., Firefox/Safari); normalize the MIME type (strip parameters like charset by using mimeType.split(';')[0].trim()) and derive the file extension from the normalized type's subtype (const subtype = normalized.split('/')[1]) with a small mapping for common audio types (mp4 -> mp4, webm -> webm, ogg -> ogg, mpeg/mp3 -> mp3) and a sensible fallback (e.g., webm); also use the normalized MIME when creating the Blob (blob = new Blob(..., { type: normalized })) so the appended FormData filename (recording.{ext}) and blob type match the actual MIME.
🧹 Nitpick comments (1)
src/backend/conversations/settings.py (1)
758-767: Consider aligning theenviron_namewith the setting name for consistency.The setting
AUDIO_MAX_SIZEusesenviron_name="MAX_AUDIO_UPLOAD_BYTES", which differs from the Django setting name. This inconsistency could cause confusion when configuring the application. Consider usingAUDIO_MAX_SIZEas the env var name for consistency with other settings.Also note that the 512MB default is quite large for audio files—even a 1-hour uncompressed WAV at 44.1kHz stereo is ~635MB. For a POC this is fine, but you may want to tune this for production.
♻️ Suggested alignment
AUDIO_MAX_SIZE = values.PositiveIntegerValue( 512 * 1024 * 1024, # 512 MB - environ_name="MAX_AUDIO_UPLOAD_BYTES", + environ_name="AUDIO_MAX_SIZE", environ_prefix=None, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/backend/conversations/settings.py` around lines 758 - 767, The environment variable name for the AUDIO_MAX_SIZE setting is inconsistent: change the values.PositiveIntegerValue(...) call for AUDIO_MAX_SIZE so its environ_name matches the Django setting name (use "AUDIO_MAX_SIZE" instead of "MAX_AUDIO_UPLOAD_BYTES"), leaving environ_prefix=None; verify the default 512 * 1024 * 1024 remains or adjust the default value in AUDIO_MAX_SIZE if you want a smaller production-safe cap, and ensure AUDIO_TRANSCRIPTION_BACKEND's environ_name is left as-is or similarly aligned if desired.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/backend/chat/input/albert_audio_backend.py`:
- Line 26: The current use of urljoin with an absolute path causes any path in
settings.ALBERT_API_URL to be dropped (e.g., "/api" lost); update the
construction of self._transcriptions_endpoint to concatenate safely by
stripping/trimming slashes instead of using urljoin with a leading slash—use
settings.ALBERT_API_URL.rstrip('/') + '/v1/audio/transcriptions' (or build the
path component without a leading slash before joining) so the base URL's path
prefix is preserved; apply the same pattern wherever urljoin is used with a
leading-path literal.
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 421-431: The keydown handler calls confirmRecording() which
returns a Promise but is not handled; update the handleKeyDown in the useEffect
(the function referencing recordingState and confirmRecording) to handle the
returned Promise—either prefix the call with void (void confirmRecording()) or
explicitly call confirmRecording().catch(err => {/* handle or log error */}) so
any rejection is handled and no unhandled promise rejection occurs.
- Around line 554-561: The linter complains because startRecording and
confirmRecording return Promises while the prop types expect void; fix this by
passing arrow functions that discard the returned Promise using the void
operator instead of forwarding the async functions directly—update the props
onStartRecording and onConfirmRecording to call void startRecording(...) and
void confirmRecording(...) via arrow wrappers (keeping recordingState, volume,
cancelRecording, etc. unchanged) so TypeScript's strict Promise checks are
satisfied.
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 86: The confirmRecording function is marked async but contains no direct
awaits (all async work happens inside recorder.onstop), which triggers
require-await; remove the async keyword from the useCallback declaration for
confirmRecording and keep the existing recorder.onstop handler and returned
value unchanged so behavior stays the same (locate the useCallback named
confirmRecording and delete the async modifier).
- Around line 31-35: The stopVisualization callback currently calls
audioCtxRef.current?.close() without handling the returned Promise; update the
stopVisualization implementation (the useCallback named stopVisualization in
useAudioRecording.tsx) to explicitly handle the Promise from
AudioContext.close()—either make stopVisualization async and await
audioCtxRef.current.close() inside a try/catch, or call
audioCtxRef.current.close().then(...).catch(...)—and only set
audioCtxRef.current = null and setVolume(0) after successfully closing (or in
finally) to avoid unhandled rejections and ensure proper cleanup.
- Around line 107-109: Create a typed interface for the transcription response
(e.g., interface TranscriptionResponse { text?: string; error?: any }) and cast
the parsed JSON to it (const data = await res.json() as TranscriptionResponse)
so member access is typed; keep the original error-checking logic by still using
if (data.error) onTranscriptionErrorRef.current?.(); and before calling
onTranscriptionRef.current ensure type-safety with if (typeof data.text ===
'string') onTranscriptionRef.current(data.text). Use the identifiers data, res,
onTranscriptionRef.current, and onTranscriptionErrorRef.current in
useAudioRecording.tsx.
---
Duplicate comments:
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Line 1: The hook useAudioRecording is missing an unmount teardown to release
mic/audio resources; add a useEffect with a cleanup that checks and stops any
active recorder (e.g., recorderRef or MediaRecorder), stops and nulls the media
stream tracks (for each track call stop()), disconnects and closes any
AudioContext and source nodes (e.g., audioContextRef, sourceNodeRef), removes
any event listeners, and clear related refs (mediaStreamRef, recorderRef,
audioChunksRef) so resources are freed if the component unmounts mid-recording.
Ensure the cleanup runs once (empty deps) and mirrors logic in
startRecording/stopRecording functions to avoid leaving tracks or audio contexts
open.
- Around line 95-99: The code forces non-mp4 formats to .webm by checking
mimeType.includes('mp4'), which can mismatch actual recorder output (e.g.,
Firefox/Safari); normalize the MIME type (strip parameters like charset by using
mimeType.split(';')[0].trim()) and derive the file extension from the normalized
type's subtype (const subtype = normalized.split('/')[1]) with a small mapping
for common audio types (mp4 -> mp4, webm -> webm, ogg -> ogg, mpeg/mp3 -> mp3)
and a sensible fallback (e.g., webm); also use the normalized MIME when creating
the Blob (blob = new Blob(..., { type: normalized })) so the appended FormData
filename (recording.{ext}) and blob type match the actual MIME.
---
Nitpick comments:
In `@src/backend/conversations/settings.py`:
- Around line 758-767: The environment variable name for the AUDIO_MAX_SIZE
setting is inconsistent: change the values.PositiveIntegerValue(...) call for
AUDIO_MAX_SIZE so its environ_name matches the Django setting name (use
"AUDIO_MAX_SIZE" instead of "MAX_AUDIO_UPLOAD_BYTES"), leaving
environ_prefix=None; verify the default 512 * 1024 * 1024 remains or adjust the
default value in AUDIO_MAX_SIZE if you want a smaller production-safe cap, and
ensure AUDIO_TRANSCRIPTION_BACKEND's environ_name is left as-is or similarly
aligned if desired.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2e76ae4d-6162-4f44-9be0-1d3ba2e6db55
⛔ Files ignored due to path filters (1)
yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (25)
CHANGELOG.mdsrc/backend/chat/exceptions.pysrc/backend/chat/input/__init__.pysrc/backend/chat/input/albert_audio_backend.pysrc/backend/chat/input/base_audio_backend.pysrc/backend/chat/tests/input/__init__.pysrc/backend/chat/tests/input/test_albert_audio_backend.pysrc/backend/chat/tests/views/test_audio_transcription.pysrc/backend/chat/views.pysrc/backend/conversations/settings.pysrc/backend/core/api/viewsets.pysrc/backend/core/tests/test_api_config.pysrc/backend/core/urls.pysrc/frontend/apps/conversations/src/core/config/api/useConfig.tsxsrc/frontend/apps/conversations/src/features/chat/components/AudioRecordButton.tsxsrc/frontend/apps/conversations/src/features/chat/components/InputChat.tsxsrc/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsxsrc/frontend/apps/conversations/src/features/chat/components/__tests__/InputChatAction.test.tsxsrc/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsxsrc/frontend/apps/conversations/src/features/header/components/ButtonToggleLeftPanel.tsxsrc/frontend/apps/conversations/src/features/header/components/Header.tsxsrc/frontend/apps/conversations/src/features/language/components/LanguagePicker.tsxsrc/frontend/apps/conversations/src/i18n/translations.jsonsrc/frontend/apps/conversations/src/pages/globals.csssrc/frontend/apps/conversations/tsconfig.json
💤 Files with no reviewable changes (2)
- src/frontend/apps/conversations/src/features/header/components/Header.tsx
- src/frontend/apps/conversations/src/features/chat/components/tests/InputChatAction.test.tsx
✅ Files skipped from review due to trivial changes (9)
- src/frontend/apps/conversations/src/features/header/components/ButtonToggleLeftPanel.tsx
- src/frontend/apps/conversations/src/features/language/components/LanguagePicker.tsx
- CHANGELOG.md
- src/backend/chat/exceptions.py
- src/frontend/apps/conversations/tsconfig.json
- src/frontend/apps/conversations/src/pages/globals.css
- src/frontend/apps/conversations/src/core/config/api/useConfig.tsx
- src/frontend/apps/conversations/src/i18n/translations.json
- src/backend/chat/tests/views/test_audio_transcription.py
🚧 Files skipped from review as they are similar to previous changes (3)
- src/backend/chat/input/base_audio_backend.py
- src/frontend/apps/conversations/src/features/chat/components/InputChatAction.tsx
- src/backend/chat/views.py
| """ | ||
|
|
||
| def __init__(self): | ||
| self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions") |
There was a problem hiding this comment.
urljoin with an absolute path discards the base URL's path.
urljoin("https://albert.example.com/api", "/v1/audio/transcriptions") returns "https://albert.example.com/v1/audio/transcriptions" — the /api portion is lost. If ALBERT_API_URL ever includes a path prefix, this will break.
🛠️ Suggested fix using rstrip/lstrip
def __init__(self):
- self._transcriptions_endpoint = urljoin(settings.ALBERT_API_URL, "/v1/audio/transcriptions")
+ base = settings.ALBERT_API_URL.rstrip("/")
+ self._transcriptions_endpoint = f"{base}/v1/audio/transcriptions"
self._headers = {"Authorization": f"Bearer {settings.ALBERT_API_KEY}"}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/backend/chat/input/albert_audio_backend.py` at line 26, The current use
of urljoin with an absolute path causes any path in settings.ALBERT_API_URL to
be dropped (e.g., "/api" lost); update the construction of
self._transcriptions_endpoint to concatenate safely by stripping/trimming
slashes instead of using urljoin with a leading slash—use
settings.ALBERT_API_URL.rstrip('/') + '/v1/audio/transcriptions' (or build the
path component without a leading slash before joining) so the base URL's path
prefix is preserved; apply the same pattern wherever urljoin is used with a
leading-path literal.
| const stopVisualization = useCallback(() => { | ||
| cancelAnimationFrame(animFrameRef.current); | ||
| audioCtxRef.current?.close(); | ||
| audioCtxRef.current = null; | ||
| setVolume(0); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
head -50 src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | cat -nRepository: suitenumerique/conversations
Length of output: 2123
🌐 Web query:
AudioContext.close() Promise Web API documentation
💡 Result:
The AudioContext.close method is part of the Web Audio API. It closes the AudioContext, releasing system audio resources it uses, suspending audio time progression, and stopping audio data processing. It does not automatically release all AudioContext-created objects (unless other references are gone). The returned Promise resolves with undefined once all AudioContext-creation-blocking resources are released. Calling close on an OfflineAudioContext throws INVALID_STATE_ERR. Syntax: close: Promise Example: audioCtx.close.then( => { // Context closed, resources released }); This is well-supported across browsers since April 2021. Official documentation: MDN Web Docs and Web Audio API specification (webaudio.github.io/web-audio-api/#dom-audiocontext-close, referenced in MDN).
Citations:
- 1: https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/close
- 2: https://developer.mozilla.org/docs/Web/API/AudioContext/close
- 3: https://dev.mozilla.org/en-US/docs/Web/API/AudioContext/close
- 4: https://developer.mozilla.org/en-US/docs/Web/API/AudioContext
- 5: https://www.w3.org/TR/webaudio-1.1/
Handle AudioContext.close() Promise explicitly.
Line 33 creates a floating Promise—AudioContext.close() returns a Promise<void>, which should be handled to prevent potential unhandled rejections and lint failures.
✅ Proposed fix
const stopVisualization = useCallback(() => {
cancelAnimationFrame(animFrameRef.current);
- audioCtxRef.current?.close();
- audioCtxRef.current = null;
+ const ctx = audioCtxRef.current;
+ audioCtxRef.current = null;
+ if (ctx) void ctx.close().catch(() => undefined);
setVolume(0);
}, []);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const stopVisualization = useCallback(() => { | |
| cancelAnimationFrame(animFrameRef.current); | |
| audioCtxRef.current?.close(); | |
| audioCtxRef.current = null; | |
| setVolume(0); | |
| const stopVisualization = useCallback(() => { | |
| cancelAnimationFrame(animFrameRef.current); | |
| const ctx = audioCtxRef.current; | |
| audioCtxRef.current = null; | |
| if (ctx) void ctx.close().catch(() => undefined); | |
| setVolume(0); |
🧰 Tools
🪛 GitHub Check: lint-front
[failure] 33-33:
Promises must be awaited, end with a call to .catch, end with a call to .then with a rejection handler or be explicitly marked as ignored with the void operator
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 31 - 35, The stopVisualization callback currently calls
audioCtxRef.current?.close() without handling the returned Promise; update the
stopVisualization implementation (the useCallback named stopVisualization in
useAudioRecording.tsx) to explicitly handle the Promise from
AudioContext.close()—either make stopVisualization async and await
audioCtxRef.current.close() inside a try/catch, or call
audioCtxRef.current.close().then(...).catch(...)—and only set
audioCtxRef.current = null and setVolume(0) after successfully closing (or in
finally) to avoid unhandled rejections and ensure proper cleanup.
| const data = await res.json(); | ||
| if (data.error) onTranscriptionErrorRef.current?.(); | ||
| if (data.text) onTranscriptionRef.current(data.text); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's read the file and see the actual code at lines 107-109
cat -n src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | sed -n '100,120p'Repository: suitenumerique/conversations
Length of output: 751
🏁 Script executed:
# Let's also check if there are any type definitions or interfaces related to transcription
rg "transcription|Transcription" src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx -A 2 -B 2Repository: suitenumerique/conversations
Length of output: 1170
🏁 Script executed:
# Check for tsconfig or eslint configuration to understand the lint rules
fd -e tsconfig.json -e .eslintrc -e eslint.config.js src/frontend/apps/conversations | head -20Repository: suitenumerique/conversations
Length of output: 501
🏁 Script executed:
# Check the full context of the function containing lines 107-109
wc -l src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsxRepository: suitenumerique/conversations
Length of output: 153
🏁 Script executed:
# Find tsconfig.json files
fd --full-path tsconfig.jsonRepository: suitenumerique/conversations
Length of output: 194
🏁 Script executed:
# Check the tsconfig at the project root or relevant directory
cat tsconfig.json 2>/dev/null || cat src/tsconfig.json 2>/dev/null || cat src/frontend/tsconfig.json 2>/dev/null || trueRepository: suitenumerique/conversations
Length of output: 54
🏁 Script executed:
# Look for any type definitions or response structures for the transcribe API
rg "transcribe" src/frontend/apps/conversations -A 5 -B 2 | head -60Repository: suitenumerique/conversations
Length of output: 909
🏁 Script executed:
# Check if there are tests that might reveal the expected response structure
fd -e test.ts -e test.tsx -e spec.ts src/frontend/apps/conversations/src/features/chat | xargs grep -l "transcribe" 2>/dev/null || echo "No test files found"Repository: suitenumerique/conversations
Length of output: 610
🏁 Script executed:
# Read the tsconfig.json for the conversations app
cat src/frontend/apps/conversations/tsconfig.jsonRepository: suitenumerique/conversations
Length of output: 771
🏁 Script executed:
# Find and read fetchAPI definition
rg "fetchAPI" src/frontend/apps/conversations -A 10 -B 5 | head -80Repository: suitenumerique/conversations
Length of output: 7629
🏁 Script executed:
# Search for any type definitions for API responses in the entire frontend
fd --full-path -e 'types.ts' -e 'types.d.ts' src/frontend | head -10Repository: suitenumerique/conversations
Length of output: 54
🏁 Script executed:
# Look for API response interfaces or types
rg "interface.*Response|type.*Response" src/frontend/apps/conversations --type ts --type tsx | head -20Repository: suitenumerique/conversations
Length of output: 99
🏁 Script executed:
# Find the api module definition
fd --full-path '*/api/index.ts' src/frontend/apps/conversationsRepository: suitenumerique/conversations
Length of output: 365
🏁 Script executed:
# Search for fetchAPI definition more broadly
rg "export.*fetchAPI|function fetchAPI|const fetchAPI" src/frontend -A 15 | head -100Repository: suitenumerique/conversations
Length of output: 2882
🏁 Script executed:
# Look for the api directory structure
fd --full-path -type d 'api' src/frontend/apps/conversationsRepository: suitenumerique/conversations
Length of output: 242
🏁 Script executed:
# Read the fetchAPI implementation
cat -n src/frontend/apps/conversations/src/api/fetchApi.tsRepository: suitenumerique/conversations
Length of output: 975
🏁 Script executed:
# Look for eslint configuration
fd --glob '.eslintrc*' src/frontend/apps/conversationsRepository: suitenumerique/conversations
Length of output: 279
🏁 Script executed:
# Check for eslint rules in package.json
cat src/frontend/apps/conversations/package.json 2>/dev/null | grep -A 20 '"eslintConfig"' || echo "No eslintConfig in package.json"Repository: suitenumerique/conversations
Length of output: 103
🏁 Script executed:
# Look at the actual current code again with more context to understand if there are actual lint issues
cat -n src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx | sed -n '95,115p'Repository: suitenumerique/conversations
Length of output: 991
🏁 Script executed:
# Check if there's an eslint config file with -H flag to show hidden files
fd --hidden -glob '.eslintrc*' src/frontend/apps/conversationsRepository: suitenumerique/conversations
Length of output: 193
🏁 Script executed:
# Look at the entire useAudioRecording.tsx file to understand the full context
cat src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsxRepository: suitenumerique/conversations
Length of output: 4377
Type the transcription response before member access.
Lines 107-109 currently assign await res.json() to data with type any, causing unsafe member access and unsafe argument issues. Line 109 passes data.text (type any) to a function expecting string, which violates strict type checking.
The proposed fix has a semantic issue: changing if (data.error) to if (typeof data.error === 'string') alters the error-checking logic. A simpler fix preserves the original behavior while adding type safety:
Suggested fix
- const data = await res.json();
+ const data = (await res.json()) as Record<string, unknown>;
if (data.error) onTranscriptionErrorRef.current?.();
if (data.text) onTranscriptionRef.current(data.text as string);Alternatively, define a proper interface for the transcription response and cast to it instead of Record<string, unknown>.
🧰 Tools
🪛 GitHub Check: lint-front
[failure] 109-109:
Unsafe member access .text on an any value
[failure] 109-109:
Unsafe argument of type any assigned to a parameter of type string
[failure] 109-109:
Unsafe member access .text on an any value
[failure] 108-108:
Unsafe member access .error on an any value
[failure] 107-107:
Unsafe assignment of an any value
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`
around lines 107 - 109, Create a typed interface for the transcription response
(e.g., interface TranscriptionResponse { text?: string; error?: any }) and cast
the parsed JSON to it (const data = await res.json() as TranscriptionResponse)
so member access is typed; keep the original error-checking logic by still using
if (data.error) onTranscriptionErrorRef.current?.(); and before calling
onTranscriptionRef.current ensure type-safety with if (typeof data.text ===
'string') onTranscriptionRef.current(data.text). Use the identifiers data, res,
onTranscriptionRef.current, and onTranscriptionErrorRef.current in
useAudioRecording.tsx.
Signed-off-by: natoromano <nathanaelpromano@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx (2)
21-25: Avoid mutating refs during render.Assigning to
onTranscriptionRef.current/onTranscriptionErrorRef.currentin the component body runs on every render and mutates during the render phase. Under React 19's concurrent rendering, a render can be discarded, leaving refs pointing at a stale callback from an abandoned render. Move the sync into a layout effect so it only runs on committed renders.♻️ Proposed refactor
-import { useCallback, useRef, useState } from 'react'; +import { useCallback, useLayoutEffect, useRef, useState } from 'react'; @@ - const onTranscriptionRef = useRef(onTranscription); - onTranscriptionRef.current = onTranscription; - - const onTranscriptionErrorRef = useRef(onTranscriptionError); - onTranscriptionErrorRef.current = onTranscriptionError; + const onTranscriptionRef = useRef(onTranscription); + const onTranscriptionErrorRef = useRef(onTranscriptionError); + useLayoutEffect(() => { + onTranscriptionRef.current = onTranscription; + onTranscriptionErrorRef.current = onTranscriptionError; + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx` around lines 21 - 25, The refs onTranscriptionRef and onTranscriptionErrorRef are being mutated during render; move the assignments into a committed-phase effect by replacing the direct assigns with a useLayoutEffect that sets onTranscriptionRef.current = onTranscription and onTranscriptionErrorRef.current = onTranscriptionError so updates only occur after commit; ensure useLayoutEffect is imported and used inside the same hook (useAudioRecording / the component that declares these refs) to avoid render-phase mutation and stale callbacks from discarded renders.
93-117:recorder.onstopis assigned an async function; its rejection is unhandled.
MediaRecorder.onstophas avoidreturn type, so the returned Promise from this async handler is never awaited or caught by the browser. Thetry/catchinside coversfetchAPIandres.json(), but anything thrown synchronously before thetry(e.g.,new Blobunder extreme memory pressure, orstopStreamthrowing) becomes an unhandled rejection. Not a blocker — thetrycovers the main failure modes — but worth a small scope tightening.♻️ Optional hardening
- recorder.onstop = async () => { - stopStream(); - setRecordingState('transcribing'); - - const mimeType = recorder.mimeType || 'audio/webm'; - ... - try { - ... - } catch { - onTranscriptionErrorRef.current?.(); - } finally { - setRecordingState('idle'); - } - }; + recorder.onstop = () => { + void (async () => { + stopStream(); + setRecordingState('transcribing'); + try { + ... + } catch { + onTranscriptionErrorRef.current?.(); + } finally { + setRecordingState('idle'); + } + })(); + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx` around lines 93 - 117, The onstop handler assigned to recorder.onstop is async so any rejection from code outside the inner try/catch (e.g., stopStream(), new Blob(...)) can become an unhandled rejection; fix by making recorder.onstop a synchronous function that immediately wraps the async work in a .catch-safe block (either a try/catch around the entire body or call an async IIFE and attach .catch), ensuring all exceptions call onTranscriptionErrorRef.current?.() and setRecordingState('idle'); update the handler that references recorder.onstop, stopStream, setRecordingState, audioChunksRef, new Blob, fetchAPI('transcribe/') and res.json() so every possible throw is caught.src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx (1)
384-395: Directly mutating a controlled textarea's DOM value is fragile.The
textareais a controlled input (value={input ?? ''}on Line 506), so settingtextarea.value = textis immediately overwritten on the next render unless the synthesizedhandleTextareaChangecall manages to flush throughhandleInputChangefirst. It happens to work because you build a fake event whosetarget.valuecarries the new text, but relying on React re-reading that value out of a mutated DOM node is brittle (it breaks ifhandleInputChangeis ever memoized on a stale input, or if the parent batches state updates differently). Prefer routing transcribed text through the same state pipeline that the parent already owns.♻️ Proposed refactor
If
handleInputChangeultimately calls asetInput, lift that setter (or expose anappendInput(text)) up and call it here instead of faking an event. At minimum, keep the fake-event approach but skip the DOM mutation:- const handleAudioTranscription = useCallback( - (text: string) => { - const textarea = textareaRef.current; - if (!textarea) return; - textarea.value = text; - handleTextareaChange({ - target: textarea, - } as React.ChangeEvent<HTMLTextAreaElement>); - textarea.focus(); - }, - [handleTextareaChange], - ); + const handleAudioTranscription = useCallback( + (text: string) => { + const textarea = textareaRef.current; + if (!textarea) return; + handleTextareaChange({ + target: { ...textarea, value: text, style: textarea.style, scrollHeight: textarea.scrollHeight }, + } as unknown as React.ChangeEvent<HTMLTextAreaElement>); + textarea.focus(); + }, + [handleTextareaChange], + );The cleaner long-term fix is to have the parent expose a dedicated setter for the input string and call it directly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx` around lines 384 - 395, The current handleAudioTranscription mutates the DOM node (textarea.value = text) which is fragile for a controlled input; instead route the transcribed text through the component's state updater used by the parent (do not write to textarea.value). Update handleAudioTranscription to call the parent's input setter or an exposed helper (e.g., setInput or appendInput) so the new text flows through the controlled value pipeline; if that setter isn't available, at minimum stop mutating textareaRef.current and invoke handleTextareaChange with a synthetic event whose target.value is the text (keeping textareaRef only for focus), referencing handleAudioTranscription, textareaRef, handleTextareaChange and the controlled value prop (value={input ?? ''}) to locate the places to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 421-431: The current document-level Enter handler intercepts all
Enter presses while recordingState === 'recording' which can hijack other
inputs; change the listener to target only the chat input element instead of
document by attaching the keydown handler to the chat textarea/input element
(use a ref to the input or the component container) or by checking the
event.target inside handleKeyDown (ensure e.target is the chat textarea or
within the chat container via e.target instanceof HTMLElement &&
e.target.closest('.your-chat-input-class')). Keep the same key checks and
preventDefault only when the target matches, attach with
inputRef.current.addEventListener and remove on cleanup, and update references
to recordingState and confirmRecording accordingly.
---
Nitpick comments:
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`:
- Around line 384-395: The current handleAudioTranscription mutates the DOM node
(textarea.value = text) which is fragile for a controlled input; instead route
the transcribed text through the component's state updater used by the parent
(do not write to textarea.value). Update handleAudioTranscription to call the
parent's input setter or an exposed helper (e.g., setInput or appendInput) so
the new text flows through the controlled value pipeline; if that setter isn't
available, at minimum stop mutating textareaRef.current and invoke
handleTextareaChange with a synthetic event whose target.value is the text
(keeping textareaRef only for focus), referencing handleAudioTranscription,
textareaRef, handleTextareaChange and the controlled value prop (value={input ??
''}) to locate the places to change.
In
`@src/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsx`:
- Around line 21-25: The refs onTranscriptionRef and onTranscriptionErrorRef are
being mutated during render; move the assignments into a committed-phase effect
by replacing the direct assigns with a useLayoutEffect that sets
onTranscriptionRef.current = onTranscription and onTranscriptionErrorRef.current
= onTranscriptionError so updates only occur after commit; ensure
useLayoutEffect is imported and used inside the same hook (useAudioRecording /
the component that declares these refs) to avoid render-phase mutation and stale
callbacks from discarded renders.
- Around line 93-117: The onstop handler assigned to recorder.onstop is async so
any rejection from code outside the inner try/catch (e.g., stopStream(), new
Blob(...)) can become an unhandled rejection; fix by making recorder.onstop a
synchronous function that immediately wraps the async work in a .catch-safe
block (either a try/catch around the entire body or call an async IIFE and
attach .catch), ensuring all exceptions call onTranscriptionErrorRef.current?.()
and setRecordingState('idle'); update the handler that references
recorder.onstop, stopStream, setRecordingState, audioChunksRef, new Blob,
fetchAPI('transcribe/') and res.json() so every possible throw is caught.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 548e3f97-d7bf-47e9-b7cd-c02c28ebd38e
📒 Files selected for processing (4)
src/frontend/apps/conversations/src/features/chat/components/InputChat.tsxsrc/frontend/apps/conversations/src/features/chat/hooks/useAudioRecording.tsxsrc/frontend/apps/conversations/src/features/chat/types.tsxsrc/frontend/apps/conversations/src/i18n/translations.json
✅ Files skipped from review due to trivial changes (2)
- src/frontend/apps/conversations/src/features/chat/types.tsx
- src/frontend/apps/conversations/src/i18n/translations.json
| useEffect(() => { | ||
| if (recordingState !== 'recording') return; | ||
| const handleKeyDown = (e: KeyboardEvent) => { | ||
| if (e.key === 'Enter' && !e.ctrlKey && !e.shiftKey) { | ||
| e.preventDefault(); | ||
| void confirmRecording(); | ||
| } | ||
| }; | ||
| document.addEventListener('keydown', handleKeyDown); | ||
| return () => document.removeEventListener('keydown', handleKeyDown); | ||
| }, [recordingState, confirmRecording]); |
There was a problem hiding this comment.
Document-level Enter handler is too broad.
While recordingState === 'recording', any Enter keypress on the page (e.g., inside another input, a modal, a button) will be intercepted, have preventDefault() called, and trigger confirmRecording(). Scope this to a more specific target (e.g., only when focus is on the textarea or on body) to avoid accidentally hijacking unrelated Enter presses.
🩹 Proposed fix
const handleKeyDown = (e: KeyboardEvent) => {
if (e.key === 'Enter' && !e.ctrlKey && !e.shiftKey) {
+ const target = e.target as HTMLElement | null;
+ const tag = target?.tagName;
+ const isInteractive =
+ tag === 'INPUT' || tag === 'TEXTAREA' || target?.isContentEditable;
+ if (isInteractive && target !== textareaRef.current) return;
e.preventDefault();
void confirmRecording();
}
};🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis
[failure] 426-426: Remove this use of the "void" operator.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/frontend/apps/conversations/src/features/chat/components/InputChat.tsx`
around lines 421 - 431, The current document-level Enter handler intercepts all
Enter presses while recordingState === 'recording' which can hijack other
inputs; change the listener to target only the chat input element instead of
document by attaching the keydown handler to the chat textarea/input element
(use a ref to the input or the component container) or by checking the
event.target inside handleKeyDown (ensure e.target is the chat textarea or
within the chat container via e.target instanceof HTMLElement &&
e.target.closest('.your-chat-input-class')). Keep the same key checks and
preventDefault only when the target matches, attach with
inputRef.current.addEventListener and remove on cleanup, and update references
to recordingState and confirmRecording accordingly.
Signed-off-by: natoromano <nathanaelpromano@gmail.com>
|
|
par curiosité ça passe en prod quand ? J'ai très hate de tester et de pouvoir potentiellement enfin annuler mon abonnement ChatGPT |



Purpose
Wanted to build a small POC to see if audio prompting with Albert API (whisper large v3) worked well as-is, turns out it does. So I figured I'd make it a PR, but do whatever you want with it!
cc @eliott07
Summary by CodeRabbit
New Features
Tests
Chores