# Copilot/update run action step #32
**Changes from all commits:** 3dca90c, efbad45, dc341ae, 06ee9c6, 21dfb8c, 0ff30d9, be2e8bd, a2deb51
**New file** — GitHub Actions workflow (+30 lines):

```yaml
name: Build and Test

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  build:
    runs-on: ubuntu-latest

    permissions:
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Build
        run: npm run build
```
This file was deleted.
Deleted content (149 lines):
# Palestinian AI Voice Broadcasting Platform

## What This Does

This platform creates authentic Arabic speech for broadcasting, trained specifically on Palestinian dialect and Quranic pronunciation patterns. It combines multiple voice datasets to produce natural-sounding news announcements and can generate synchronized video content.

## Getting Started

**Hardware needed:**
- Computer with Python 3.8 or newer
- At least 8 GB of memory (more is better for model training)
- Optional: an NVIDIA GPU makes training much faster

**Installation steps:**

Run the setup script:
```bash
./setup.sh
```

The script handles ffmpeg installation and Python package configuration automatically.
## How To Use This System

### Step 1: Collect Voice Data

Your audio samples go into six specialized folders. Each needs a `metadata.csv` file mapping audio filenames to their Arabic transcriptions, using a pipe-delimited format: `audiofile.wav|النص العربي`

Reference the `.csv.example` files in each folder to see the expected format.

**The six dataset categories:**
- `dataset_quran` → Quranic verses with proper tajweed
- `dataset_speaker` → General broadcaster voice samples
- `dataset_speaker_news` → Formal news presentation style
- `dataset_speaker_palestinian` → Authentic Palestinian colloquial speech
- `dataset_speaker_realistic` → Natural conversational patterns
- `dataset_authority` → Official statement delivery style
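The pipe-delimited metadata format above is easy to validate before training. A minimal sketch, assuming a strict one-pipe-per-line format; the helper name and error policy are illustrative, not part of the project:

```python
def parse_metadata(text):
    """Parse pipe-delimited metadata lines of the form `file.wav|transcription`.

    Returns a list of (filename, transcription) pairs; raises ValueError
    on malformed rows so bad data is caught before training starts.
    """
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.split("|", 1)
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            raise ValueError(f"line {lineno}: expected 'file.wav|text', got {line!r}")
        pairs.append((parts[0].strip(), parts[1].strip()))
    return pairs

sample = "clip001.wav|النص العربي\nclip002.wav|نص آخر\n"
rows = parse_metadata(sample)
print(len(rows), rows[0][0])  # → 2 clip001.wav
```

A check like this catches missing transcriptions early, before a long training run fails midway.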
### Step 2: Build Your Model

Run the training process:
```bash
python train.py
```

The trainer consolidates all your datasets and fine-tunes the XTTS v2 foundation model. Output goes to `models/final_broadcast_model/`. Expect this to take time, possibly hours, depending on your dataset size and hardware.
### Step 3: Create Audio Content

Generate test audio:
```bash
python generate.py
```

The system produces `output/demo.wav` with professional audio treatment, including frequency filtering and broadcast loudness normalization (EBU R128, targeting -16 LUFS).
### Step 4: Launch Broadcasting (Optional)

To enable video with lip synchronization, you need the Wav2Lip tool installed and an anchor presenter image at `input/anchor.jpg`.

Start the broadcast generator:
```bash
python run_tv_channel.py &
```

Start the web interface:
```bash
python tv_server.py
```

View the output at `http://localhost:3000`.
## Programming Interface

Import the voice generation function:

```python
from generate import generate_voice

generated_file = generate_voice(
    text="نص عربي هنا",
    style="news",
    output_name="output_filename.wav"
)
```

The function returns the path to the generated audio file.
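For multi-segment workflows, calls like the one above can be wrapped in a small batch helper. This is a sketch, not part of the project API: the helper and its naming scheme are assumptions, and the generator function is injected so the sketch runs without the trained model (in the project it would be `generate_voice`):

```python
def generate_batch(texts, style="news", generate=None):
    """Generate one audio file per text, returning the output paths in order."""
    paths = []
    for i, text in enumerate(texts):
        # Hypothetical naming scheme for per-segment output files.
        name = f"segment_{i:03d}.wav"
        paths.append(generate(text=text, style=style, output_name=name))
    return paths

# Stub standing in for generate_voice: echoes the path it would return.
def fake_generate(text, style, output_name):
    return f"output/{output_name}"

paths = generate_batch(["نص أول", "نص ثان"], generate=fake_generate)
print(paths)  # → ['output/segment_000.wav', 'output/segment_001.wav']
```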
## File Organization

- `train.py` → Dataset merging and model fine-tuning
- `generate.py` → Speech synthesis with audio processing
- `run_tv_channel.py` → Automated segment generation loop
- `tv_server.py` → HTTP streaming endpoint (Flask-based)
- `requirements.txt` → Python dependencies list
- `setup.sh` → Automated environment configuration

Generated content appears in `output/`, trained models in `models/`, and source recordings in `dataset_*/wavs/`.
## Audio Quality Tips

**Recording standards:**
- Sample at 16000 Hz minimum
- Remove background noise before use
- Normalize volume across all samples
- Use a consistent microphone setup

**Dataset composition:**
- Target 100+ recordings per category for good results
- Balance formal and informal speaking styles
- Mix different sentence structures and lengths
- Include Arabic diacritics in the Quran dataset for accuracy
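The sample-rate requirement above can be verified programmatically before training. A minimal sketch using only the standard library; the helper name is an assumption, and the project does not ship this check:

```python
import struct
import wave

def check_sample_rate(path, minimum=16000):
    """Return True if the WAV file meets the minimum sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() >= minimum

# Write a short 22050 Hz mono test file so the check can run anywhere.
with wave.open("sample.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)          # 16-bit samples
    wav.setframerate(22050)
    wav.writeframes(struct.pack("<100h", *([0] * 100)))  # 100 silent frames

print(check_sample_rate("sample.wav"))  # → True
```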
## Configuration Changes

**Switch reference voice:**
Modify the `REFERENCE_WAV` path in `generate.py` to point at your preferred sample.

**Adjust audio filters:**
Edit the `af_chain` variable in `generate.py` to customize frequency response and loudness targets.

**Update news content:**
Modify the `NEWS_HEADLINES` array in `run_tv_channel.py` to change the broadcast content pool.
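As an illustration of editing `af_chain`: the value is an ordinary ffmpeg filtergraph string, comma-separated filters with `key=value` options. In this sketch only the `loudnorm` settings are confirmed by this PR's diff; the highpass/lowpass values are assumed placeholders:

```python
# Build a hypothetical ffmpeg audio-filter chain. Only the loudnorm
# portion matches the project's actual af_chain; the rest is illustrative.
filters = [
    "highpass=f=80",                  # assumed: cut low-frequency rumble
    "lowpass=f=12000",                # assumed: tame harsh highs
    "loudnorm=I=-16:LRA=11:TP=-1.5",  # EBU R128 at -16 LUFS (from the diff)
]
af_chain = ",".join(filters)
print(af_chain)
```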
## Common Issues

**Training fails with a memory error:**
Add more RAM or reduce the batch size parameter.

**Cannot find model:**
Complete the training step before attempting generation.

**Video generation skips lip sync:**
Verify the Wav2Lip installation and that the checkpoint file exists at `models/wav2lip.pth`.

**FFmpeg command not found:**
Install it via your package manager (apt, brew, yum) or download it from the official site.
## Web Server Endpoints

- Root `/` serves the Arabic RTL player interface
- `/stream` provides direct media access
- `/health` returns JSON status for monitoring
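A monitoring script for the `/health` endpoint can be sketched with the standard library alone. The payload shape here (`{"audio": ..., "video": ...}`) is an assumption based on the `"audio": false` example mentioned later in this PR's review discussion, and the stub server stands in for `tv_server.py` so the sketch is runnable anywhere:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def check_health(url):
    """Fetch a /health endpoint and return its parsed JSON status."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)

class StubHealth(BaseHTTPRequestHandler):
    """Stand-in for tv_server.py's /health handler (assumed payload shape)."""
    def do_GET(self):
        body = json.dumps({"audio": False, "video": False}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), StubHealth)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
status = check_health(f"http://127.0.0.1:{server.server_address[1]}/health")
server.shutdown()
print(status)  # → {'audio': False, 'video': False}
```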
## Technical Notes

The system merges diverse Arabic speech patterns to capture an authentic Palestinian broadcasting style while maintaining clear pronunciation through Quranic conditioning. Audio post-processing applies industry-standard loudness normalization suitable for broadcast transmission.

Model training adapts the multilingual XTTS v2 architecture with your custom Arabic datasets. The generator uses your reference voice sample to guide prosody and timbre during synthesis.
**File: `broadcast-ai/generate.py`**
```diff
@@ -85,16 +85,23 @@ def generate_voice(
         "loudnorm=I=-16:LRA=11:TP=-1.5"
     )
 
-    subprocess.run(
-        [
-            "ffmpeg", "-y",
-            "-i", raw_path,
-            "-af", af_chain,
-            final_path,
-        ],
-        check=True,
-        capture_output=True,
-    )
+    try:
+        subprocess.run(
+            [
+                "ffmpeg", "-y",
+                "-i", raw_path,
+                "-af", af_chain,
+                final_path,
+            ],
+            check=True,
+            capture_output=True,
+        )
+    except (subprocess.CalledProcessError, FileNotFoundError) as e:
+        error_type = "not found" if isinstance(e, FileNotFoundError) else "failed"
+        print(f"[WARN] ffmpeg {error_type} - skipping audio post-processing")
+        print(f"[WARN] Using raw audio output instead")
+        # If ffmpeg fails or is not installed, use the raw audio file
+        final_path = raw_path
```
**Comment on lines +99 to +104**

> suggestion: ffmpeg errors are logged without stderr, which may reduce diagnosability when post-processing fails.

**Comment** (Severity: Major)

> Suggestion: Falling back by setting `final_path = raw_path` breaks the function contract and returns a shared temporary filename instead of the requested `output_name`. Keep `final_path` as the requested destination and move the raw audio there when ffmpeg fails.
Suggested change:

```diff
-        final_path = raw_path
+        os.replace(raw_path, final_path)
```
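The reviewer's proposed fix can be sketched in isolation. The helper name and file paths here are hypothetical; in the PR the logic lives inline in `generate_voice`:

```python
import os
import tempfile

def fallback_to_raw(raw_path, final_path):
    """Move raw audio to the requested destination when post-processing fails,
    preserving the contract of returning the caller's `output_name` path."""
    os.replace(raw_path, final_path)  # atomic rename within one filesystem
    return final_path

# Demonstrate with throwaway files standing in for real audio.
tmp = tempfile.mkdtemp()
raw = os.path.join(tmp, "raw.wav")
final = os.path.join(tmp, "tv_audio.wav")
with open(raw, "wb") as f:
    f.write(b"RIFF")  # placeholder bytes, not a real WAV
result = fallback_to_raw(raw, final)
print(os.path.basename(result), os.path.exists(raw))  # → tv_audio.wav False
```

Because the raw file is moved rather than aliased, repeated generations no longer overwrite a shared `raw.wav`, and downstream consumers that expect `output/tv_audio.wav` keep working.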
Steps of Reproduction ✅
1. Start the TV channel loop by running `python broadcast-ai/run_tv_channel.py` which
calls `run()` in `broadcast-ai/run_tv_channel.py:120-145`, and separately start the
streaming server with `python broadcast-ai/tv_server.py` which serves media from
`broadcast-ai/tv_server.py:100-113`.
2. On a machine where `ffmpeg` is NOT installed on the PATH, `_generate_segment()` at
`broadcast-ai/run_tv_channel.py:80-88` calls `generate_voice(headline, style="news",
output_name="tv_audio.wav")`. In `generate_voice()` (`broadcast-ai/generate.py:45-107`),
the `subprocess.run([... "ffmpeg" ...], check=True, ...)` at
`broadcast-ai/generate.py:88-98` raises `FileNotFoundError`, triggering the `except` block
at `broadcast-ai/generate.py:99-104`.
3. Inside that `except` block, the code executes the fallback `final_path = raw_path` at
`broadcast-ai/generate.py:104`, so `generate_voice()` returns `"output/raw.wav"` instead
of the requested `"output/tv_audio.wav"` (which was originally set at
`broadcast-ai/generate.py:66`). `_generate_segment()` therefore returns `"output/raw.wav"`
as `audio_path` when Wav2Lip is unavailable, due to the early returns at
`broadcast-ai/run_tv_channel.py:85-91`.
4. The streaming server in `broadcast-ai/tv_server.py` is hard-coded to look for
`AUDIO_FALLBACK = "output/tv_audio.wav"` at `broadcast-ai/tv_server.py:23-25`. When only
`"output/raw.wav"` exists (ffmpeg missing) and Wav2Lip is also missing or its checkpoint
is missing so no video is produced (`broadcast-ai/run_tv_channel.py:85-91`), the `/stream`
endpoint at `broadcast-ai/tv_server.py:100-113` finds neither `VIDEO_PATH` nor
`AUDIO_FALLBACK` and returns `404`, while the `/health` endpoint at
`broadcast-ai/tv_server.py:116-124` reports `"audio": false`. Users see "no broadcast
currently" despite valid audio being generated, because the fallback changed the return
path from the requested `output_name` to the shared temporary `raw.wav`.

**Prompt for AI Agent 🤖**
This is a comment left during a code review.
**Path:** broadcast-ai/generate.py
**Line:** 104:104
**Comment:**
*Logic Error: Falling back by setting `final_path` to `raw_path` breaks the function contract and returns a shared temporary filename (`output/raw.wav`) instead of the requested `output_name`. This causes callers to lose the expected output path and makes repeated generations overwrite the same fallback file. Keep `final_path` as the requested destination and move/copy the raw audio there when ffmpeg fails.
Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
**File: `broadcast-ai/run_tv_channel.py`**

```diff
@@ -56,20 +56,25 @@ def _ensure_anchor_video():
         return
 
     os.makedirs("input", exist_ok=True)
-    subprocess.run(
-        [
-            "ffmpeg", "-y",
-            "-loop", "1",
-            "-i", anchor_img,
-            "-c:v", "libx264",
-            "-t", "15",
-            "-pix_fmt", "yuv420p",
-            ANCHOR_VIDEO,
-        ],
-        check=True,
-        capture_output=True,
-    )
-    print(f"[ANCHOR] Created {ANCHOR_VIDEO}")
+    try:
+        subprocess.run(
+            [
+                "ffmpeg", "-y",
+                "-loop", "1",
+                "-i", anchor_img,
+                "-c:v", "libx264",
+                "-t", "15",
+                "-pix_fmt", "yuv420p",
+                ANCHOR_VIDEO,
+            ],
+            check=True,
+            capture_output=True,
+        )
+        print(f"[ANCHOR] Created {ANCHOR_VIDEO}")
+    except subprocess.CalledProcessError as e:
+        print(f"[ERROR] Failed to create anchor video: {e}")
+    except FileNotFoundError:
+        print(f"[ERROR] ffmpeg not found - cannot create anchor video")
 
 
 def _generate_segment(headline: str) -> str | None:
```
```diff
@@ -85,18 +90,26 @@ def _generate_segment(headline: str) -> str | None:
         print("[WARN] Wav2Lip checkpoint not found — skipping lip-sync")
         return audio_path
 
-    subprocess.run(
-        [
-            "python", WAV2LIP_INFERENCE,
-            "--checkpoint_path", WAV2LIP_CHECKPOINT,
-            "--face", ANCHOR_VIDEO,
-            "--audio", audio_path,
-            "--outfile", VIDEO_OUTPUT,
-        ],
-        check=True,
-    )
-
-    return VIDEO_OUTPUT
+    try:
+        subprocess.run(
+            [
+                "python", WAV2LIP_INFERENCE,
```
**Comment on lines +94 to +96** (Severity: Major)

> Suggestion: The Wav2Lip subprocess is invoked with a hardcoded `python` executable, which can point to a missing or wrong interpreter (for example, systems that only provide `python3` or active virtualenvs), causing lip-sync to fail every time and always falling back to audio. Use the current interpreter to preserve environment compatibility.

Suggested change:

```diff
-        subprocess.run(
-            [
-                "python", WAV2LIP_INFERENCE,
+        import sys
+        subprocess.run(
+            [
+                sys.executable, WAV2LIP_INFERENCE,
```
Steps of Reproduction ✅
1. Execute the TV channel loop via `python3 broadcast-ai/run_tv_channel.py`, which
triggers the `run()` entry point and its infinite loop (see
`broadcast-ai/run_tv_channel.py`, lines 119–146 in the final PR state).
2. Ensure that `WAV2LIP_INFERENCE` (`Wav2Lip/inference.py`) and `WAV2LIP_CHECKPOINT`
(`models/wav2lip.pth`) exist so `_generate_segment()` takes the Wav2Lip path (function
body around lines 80–113 in `broadcast-ai/run_tv_channel.py`).
3. In `_generate_segment()`, when it reaches the Wav2Lip call, it executes
`subprocess.run([... "python", WAV2LIP_INFERENCE, ...], check=True)` at lines 94–103 in
`broadcast-ai/run_tv_channel.py`; on a system where `python` is missing or points to a
different interpreter than the one running `run_tv_channel.py` (for example, only
`python3` exists or dependencies are installed only in a virtualenv), this subprocess
fails with `FileNotFoundError` or `CalledProcessError`.
4. The failure is caught by the surrounding `except` blocks in `_generate_segment()`
(lines 105–112), which log warnings and return `audio_path` instead of `VIDEO_OUTPUT`, so
the main loop logs a "Segment ready" message but only ever produces audio-only output,
never the intended lip-synced video.

**Prompt for AI Agent 🤖**
This is a comment left during a code review.
**Path:** broadcast-ai/run_tv_channel.py
**Line:** 94:96
**Comment:**
*Possible Bug: The Wav2Lip subprocess is invoked with a hardcoded `python` executable, which can point to a missing or wrong interpreter (for example, systems that only provide `python3` or active virtualenvs), causing lip-sync to fail every time and always falling back to audio. Use the current interpreter to preserve environment compatibility.
Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
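The distinction the reviewer draws matters because `sys.executable` is the absolute path of the interpreter running the current script, so a child process launched with it inherits the same virtualenv and installed packages, while a bare `"python"` resolves through PATH. A minimal, standalone demonstration (not project code):

```python
import subprocess
import sys

# Launch a child Python using the *current* interpreter rather than
# whatever "python" happens to resolve to on PATH (if anything).
result = subprocess.run(
    [sys.executable, "-c", "import sys; print('.'.join(map(str, sys.version_info[:2])))"],
    capture_output=True, text=True, check=True,
)
# The child reports the same Python version as the parent, confirming
# both run under the same interpreter.
print(result.stdout.strip() == f"{sys.version_info[0]}.{sys.version_info[1]}")  # → True
```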
**Review comment**

> suggestion (bug_risk): Catching a bare `Exception` in the main loop may hide useful debugging information. To keep robustness without losing debuggability, either log the full traceback (e.g., via `traceback.print_exc()` or the `logging` module) or narrow the `except` to specific, expected exception types so unexpected failures still surface during development.

Suggested implementation:

```python
        except Exception as exc:
            print(f"[TV] Unexpected error during segment generation ({type(exc).__name__}): {exc}")
            # Print full traceback to aid debugging while keeping the loop robust
            traceback.print_exc()
            print("[TV] Continuing to next segment...")
```

To fully implement this change, ensure `traceback` is imported at the top of `broadcast-ai/run_tv_channel.py`:

- Add `import traceback` near the other imports.

If the file already uses a logging framework instead of `print`, you may want to replace `traceback.print_exc()` with `logger.exception(...)` following your existing logging conventions.