Add dashboard diarize action #3
I rebased this branch with your recent changes.

@abn What do you think?

@abn I also added a way to keep the diarization model always loaded in VRAM.

Awesome @PsychOsmosis! Let me try to get to this over the weekend (that magical weekend 😄)
Pull request overview
This PR adds a “diarize” action for already-transcribed jobs in the dashboard UI and introduces a backend “diarization-only” flow, including support for a persistent (resident) diarization worker that can be loaded/unloaded (and optionally loaded at server startup).
Changes:
- Frontend: adds a diarize action in the transcription list (desktop + swipe actions) and reuses the existing profile dialog to run diarization-only with optional forced diarization settings.
- Backend: adds /api/v1/transcription/:id/diarize plus DiarizationOnly job handling to load an existing transcript and merge new diarization results without re-transcribing.
- Adds a persistent diarization worker manager (PyAnnote + Sortformer) with new API routes and UI controls, plus a user setting to request loading a diarization model at startup.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| web/frontend/src/features/transcription/components/AudioFilesTable.tsx | Adds a dashboard “Diarize” action and submits diarization-only jobs against completed transcripts. |
| web/frontend/src/features/settings/components/ProfileSettings.tsx | Adds a user setting to choose a diarization model to keep loaded at startup. |
| web/frontend/src/components/ui/swipeable-item.tsx | Adds an optional “Diarize” swipe action and dynamically sizes the action area. |
| web/frontend/src/components/transcription/TranscriptionConfigDialog.tsx | Extends WhisperXParams typing to include diarization_only. |
| web/frontend/src/components/TranscribeDDialog.tsx | Generalizes dialog copy/labels and adds forceDiarization to show diarization settings even if profile disables them. |
| web/frontend/src/components/Header.tsx | Adds UI + polling + load/unload dialog for the persistent diarization worker. |
| internal/transcription/unified_service.go | Implements diarization-only processing by loading an existing transcript and optionally running diarization separately. |
| internal/transcription/queue_integration.go | Exposes persistent diarization worker operations via the unified job processor. |
| internal/transcription/adapters/sortformer_adapter.go | Adds persistent-worker execution path and embeds/copies the worker script. |
| internal/transcription/adapters/pyannote_adapter.go | Adds persistent-worker execution path and relaxes HF token requirement when the resident model is loaded. |
| internal/transcription/adapters/py/pyannote/pyannote_worker.py | Adds a long-lived PyAnnote diarization worker process (stdin/stdout JSON protocol). |
| internal/transcription/adapters/py/nvidia/sortformer_worker.py | Adds a long-lived Sortformer diarization worker process (stdin/stdout JSON protocol). |
| internal/transcription/adapters/persistent_diarization_manager.go | Adds the Go manager that owns the single resident worker process and request protocol. |
| internal/models/transcription.go | Adds diarization_only to job parameters and startup_diarization_model to users. |
| internal/api/router.go | Registers the new transcription diarize route and diarization-worker management routes. |
| internal/api/handlers.go | Implements StartDiarization and persistent diarization worker load/unload/status endpoints + user setting validation. |
| cmd/server/main.go | Loads a persistent diarization model at startup based on user settings and resolves HF token from a profile. |
    func (w *persistentDiarizationWorker) Stop(ctx context.Context) error {
        w.stopMu.Lock()
        if w.stopped {
            w.stopMu.Unlock()
            return nil
        }
        w.stopped = true
        w.stopMu.Unlock()

        _ = w.protocolEnc.Encode(map[string]interface{}{
            "id":     fmt.Sprintf("shutdown-%d", time.Now().UnixNano()),
            "action": "shutdown",
        })
        hfToken := strings.TrimSpace(*profile.Parameters.HfToken)
        params["hf_token"] = hfToken
        if err := os.Setenv("HF_TOKEN", hfToken); err != nil {
            logger.Warn("Failed to set HF_TOKEN from default transcription profile", "user_id", userID, "profile_id", profileID, "error", err)
            return
        }

        logger.Info("Resolved HF_TOKEN from default transcription profile for startup diarization", "user_id", userID, "profile_id", profileID)
    }
You can remove that if you want. I only included it to avoid leaving my HF_TOKEN in plaintext in my docker-compose.yml.
    users, _, err := userRepo.List(ctx, 0, 100)
    if err != nil {
        logger.Warn("Failed to load startup diarization preference", "error", err)
        return
    }

    for _, user := range users {
        modelID := strings.TrimSpace(strings.ToLower(user.StartupDiarizationModel))
        if modelID == "" || modelID == "none" {
            continue
        }
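
The loop above trims and lower-cases each user's stored preference and skips empty or "none" values. A small sketch isolates that rule so it can be checked on its own. The `startupModel` and `firstStartupModel` helpers are hypothetical, and "first usable preference wins" is an assumption, since the excerpt does not show what happens after the first match.

```go
package main

import (
	"fmt"
	"strings"
)

// startupModel normalises a stored preference the same way the excerpt does
// (trim, lower-case) and reports whether a model should be loaded at all.
func startupModel(raw string) (string, bool) {
	modelID := strings.TrimSpace(strings.ToLower(raw))
	if modelID == "" || modelID == "none" {
		return "", false
	}
	return modelID, true
}

// firstStartupModel returns the first usable preference in list order.
// Assumption: the first match wins, because only one resident worker exists.
func firstStartupModel(prefs []string) (string, bool) {
	for _, raw := range prefs {
		if modelID, ok := startupModel(raw); ok {
			return modelID, true
		}
	}
	return "", false
}

func main() {
	prefs := []string{"", " None ", "  PyAnnote ", "sortformer"}
	modelID, ok := firstStartupModel(prefs)
	fmt.Println(modelID, ok)
}
```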
    // Load starts a resident diarization worker and waits until the model is loaded.
    func (m *PersistentDiarizationManager) Load(ctx context.Context, modelID string, params map[string]interface{}) (PersistentDiarizationStatus, error) {
        modelID = normalizePersistentDiarizationModel(modelID)
        if modelID == "" {
            return m.Status(), fmt.Errorf("unsupported diarization model")
        }