Skip to content

✨(summary) extended support for all video / audio files#1358

Merged
lebaudantoine merged 1 commit into
mainfrom
feat/summary-extend-file-support
May 30, 2026
Merged

✨(summary) extended support for all video / audio files#1358
lebaudantoine merged 1 commit into
mainfrom
feat/summary-extend-file-support

Conversation

@FloChehab
Copy link
Copy Markdown
Collaborator

  • Removed constraint on file extension
  • Infer audio / video streams from the media with ffmpeg
  • Infer the correct processed audio file extension based on actual codec to avoid ffmpeg errors.

@FloChehab FloChehab force-pushed the feat/summary-extend-file-support branch from 7c99781 to 571fe7f Compare May 21, 2026 11:06
Comment thread src/summary/summary/core/shared_models.py
@FloChehab FloChehab marked this pull request as ready for review May 21, 2026 11:07
@FloChehab FloChehab requested a review from lebaudantoine May 25, 2026 15:50
Copy link
Copy Markdown
Collaborator

@cameledev cameledev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good to me.

One minor structure question/suggestion.

Otherwise, just naming stuff

Comment thread src/summary/summary/core/file_service.py
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/shared_models.py
@FloChehab
Copy link
Copy Markdown
Collaborator Author

Looks mostly good to me.

One minor structure question/suggestion.

Otherwise, just naming stuff

Thanks for the review!
I'll try to clean things by the end of day :)

Copy link
Copy Markdown
Collaborator

@lebaudantoine lebaudantoine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current structure evolved in a way that makes the file service harder to extend safely.

My initial choice was to encapsulate file utilities inside a service class to support interchangeability of implementations. However, in practice we’ve ended up continuously modifying the same implementation for new features, which increases regression risk for v1 and reduces flexibility.

In hindsight, it may have been better to introduce a stable interface with separate implementations (e.g. v1/v2) and a factory or dependency injection layer to select between them, instead of evolving a single service over time.

We already adapt the behavior of the service with the iflogic base on the file url/path.

This would likely have preserved v1 behavior while allowing faster iteration on v2 without affecting existing logic.

We can’t assume all summary callers will require the same file service implementation. This suggests we may need a more explicit abstraction (e.g. interface + multiple implementations) rather than a single tightly coupled service.

As a general rule, I think we should try to keep clear boundaries between implementations rather than mixing extension logic into a single evolving class.

I think we should clarify the ownership boundaries between module-level utilities and service classes. Right now the split is not always consistent, which makes it harder to understand where new logic should live.

Comment on lines +549 to +556
transcription_res = WhisperXResponse(
**transcribe_audio( # type: ignore
task_id=job_id,
cloud_storage_url=payload.cloud_storage_url,
language=payload.language,
raises=True,
).model_dump()
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related with this PR, can't transcribe_audio returns directly a WhisperXResponse instance?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is indeed work to be done around using from pydantic models, especially in the helpers by @cameledev. I left that for another PR.

Comment thread src/summary/summary/core/shared_models.py
Comment thread src/summary/summary/core/celery_worker.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
Comment on lines +259 to +269
except BaseException as e:
if isinstance(e, FileNotFoundError):
logger.error("ffmpeg not found. Please install ffmpeg.")
elif isinstance(e, subprocess.CalledProcessError):
logger.error("Audio extraction failed: %s", e.stderr.decode())
else:
logger.error("Unexpected error during audio extraction: %s", e)

if output_path.exists():
os.remove(output_path)
raise RuntimeError("Failed to extract audio from file") from e
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in terms of python style I found the initial syntax with several except easier to scan/parse.
maybe, extract the cleanup in a function?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is to be reworked as a bigger clean of those file handles

Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py
Comment thread src/summary/summary/core/file_service.py Outdated
Comment thread src/summary/summary/core/file_service.py Outdated
@FloChehab
Copy link
Copy Markdown
Collaborator Author

Regarding your big comment on the file service @lebaudantoine, in #1362 I get rid of old "read from minio" and other stuff from the file service. From my PoV summary was considered kind of experimental in regards to other hosters community (there is a bunch of forced posthog in config for instance).
So I think we are still ok in regards to merging those changes.

@FloChehab FloChehab requested a review from lebaudantoine May 28, 2026 21:11
* Removed constraint on file extension
* Infer audio/video streams from the media with ffmpeg
* Infer the correct processed audio file extension based on actual
  codec to avoid ffmpeg errors

We need to support more extensions and make audio extraction dynamic,
as we shipped transcript in production and it led to user complaints
requesting more formats.
@lebaudantoine lebaudantoine force-pushed the feat/summary-extend-file-support branch from 1392009 to 919b7f3 Compare May 30, 2026 21:50
@sonarqubecloud
Copy link
Copy Markdown

@lebaudantoine lebaudantoine merged commit ec688e7 into main May 30, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants