✨(summary) extended support for all video / audio files by FloChehab · Pull Request #1358 · suitenumerique/meet

FloChehab · 2026-05-21T11:05:55Z

Removed constraint on file extension
Infer audio / video streams from the media with ffmpeg
Infer the correct processed audio file extension based on actual codec to avoid ffmpeg errors.

cameledev

Looks mostly good to me.

One minor structure question/suggestion.

Otherwise, just naming stuff

FloChehab · 2026-05-28T14:00:54Z

Looks mostly good to me.

One minor structure question/suggestion.

Otherwise, just naming stuff

Thanks for the review!
I'll try to clean things by the end of day :)

lebaudantoine

I think the current structure evolved in a way that makes the file service harder to extend safely.

My initial choice was to encapsulate file utilities inside a service class to support interchangeability of implementations. However, in practice we’ve ended up continuously modifying the same implementation for new features, which increases regression risk for v1 and reduces flexibility.

In hindsight, it may have been better to introduce a stable interface with separate implementations (e.g. v1/v2) and a factory or dependency injection layer to select between them, instead of evolving a single service over time.

We already adapt the behavior of the service with the iflogic base on the file url/path.

This would likely have preserved v1 behavior while allowing faster iteration on v2 without affecting existing logic.

We can’t assume all summary callers will require the same file service implementation. This suggests we may need a more explicit abstraction (e.g. interface + multiple implementations) rather than a single tightly coupled service.

As a general rule, I think we should try to keep clear boundaries between implementations rather than mixing extension logic into a single evolving class.

I think we should clarify the ownership boundaries between module-level utilities and service classes. Right now the split is not always consistent, which makes it harder to understand where new logic should live.

lebaudantoine · 2026-05-28T13:47:08Z

+        transcription_res = WhisperXResponse(
+            **transcribe_audio(  # type: ignore
+                task_id=job_id,
+                cloud_storage_url=payload.cloud_storage_url,
+                language=payload.language,
+                raises=True,
+            ).model_dump()
+        )


not related with this PR, can't transcribe_audio returns directly a WhisperXResponse instance?

There is indeed work to be done around using from pydantic models, especially in the helpers by @cameledev. I left that for another PR.

lebaudantoine · 2026-05-28T14:12:11Z

+    except BaseException as e:
+        if isinstance(e, FileNotFoundError):
+            logger.error("ffmpeg not found. Please install ffmpeg.")
+        elif isinstance(e, subprocess.CalledProcessError):
+            logger.error("Audio extraction failed: %s", e.stderr.decode())
+        else:
+            logger.error("Unexpected error during audio extraction: %s", e)
+
+        if output_path.exists():
+            os.remove(output_path)
+        raise RuntimeError("Failed to extract audio from file") from e


in terms of python style I found the initial syntax with several except easier to scan/parse.
maybe, extract the cleanup in a function?

I think this is to be reworked as a bigger clean of those file handles

FloChehab · 2026-05-28T20:28:41Z

Regarding your big comment on the file service @lebaudantoine, in #1362 I get rid of old "read from minio" and other stuff from the file service. From my PoV summary was considered kind of experimental in regards to other hosters community (there is a bunch of forced posthog in config for instance).
So I think we are still ok in regards to merging those changes.

* Removed constraint on file extension * Infer audio/video streams from the media with ffmpeg * Infer the correct processed audio file extension based on actual codec to avoid ffmpeg errors We need to support more extensions and make audio extraction dynamic, as we shipped transcript in production and it led to user complaints requesting more formats.

sonarqubecloud · 2026-05-30T21:51:25Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

FloChehab force-pushed the feat/summary-extend-file-support branch from 7c99781 to 571fe7f Compare May 21, 2026 11:06

FloChehab commented May 21, 2026

View reviewed changes

Comment thread src/summary/summary/core/shared_models.py

FloChehab marked this pull request as ready for review May 21, 2026 11:07

FloChehab requested a review from lebaudantoine May 25, 2026 15:50

cameledev reviewed May 28, 2026

View reviewed changes

lebaudantoine reviewed May 28, 2026

View reviewed changes

FloChehab requested a review from lebaudantoine May 28, 2026 21:11

lebaudantoine force-pushed the feat/summary-extend-file-support branch from 1392009 to 919b7f3 Compare May 30, 2026 21:50

lebaudantoine approved these changes May 30, 2026

View reviewed changes

lebaudantoine merged commit ec688e7 into main May 30, 2026
29 checks passed

Conversation

FloChehab commented May 21, 2026

Uh oh!

Uh oh!

cameledev left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

FloChehab commented May 28, 2026

Uh oh!

lebaudantoine left a comment

Choose a reason for hiding this comment

Uh oh!

lebaudantoine May 28, 2026

Choose a reason for hiding this comment

Uh oh!

FloChehab May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lebaudantoine May 28, 2026

Choose a reason for hiding this comment

Uh oh!

FloChehab May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

FloChehab commented May 28, 2026

Uh oh!

sonarqubecloud Bot commented May 30, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants