30 changes: 30 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,30 @@
name: Build and Test

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  build:

    runs-on: ubuntu-latest

    permissions:
      contents: read

    steps:
    - uses: actions/checkout@v4

    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '20'
        cache: 'npm'

    - name: Install dependencies
      run: npm ci

    - name: Build
      run: npm run build
20 changes: 0 additions & 20 deletions .github/workflows/jekyll-docker.yml

This file was deleted.

149 changes: 149 additions & 0 deletions broadcast-ai/README.md
@@ -0,0 +1,149 @@
# Palestinian AI Voice Broadcasting Platform

## What This Does

This platform generates Arabic speech for broadcasting, using a model trained specifically on Palestinian dialect and Quranic pronunciation patterns. It combines multiple voice datasets to produce natural-sounding news announcements and can generate synchronized video content.

## Getting Started

**System requirements:**
- A computer with Python 3.8 or newer
- At least 8 GB of RAM (more helps for model training)
- Optional: an NVIDIA GPU speeds up training considerably

**Installation steps:**

Run the setup script:
```bash
./setup.sh
```

The script handles ffmpeg installation and Python package configuration automatically.

## How To Use This System

### Step 1: Collect Voice Data

Your audio samples go into six specialized folders. Each folder needs a `metadata.csv` file that maps audio filenames to their Arabic transcriptions in pipe-delimited format: `audiofile.wav|النص العربي`

Reference the `.csv.example` files in each folder to see the expected format.
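
For illustration, a minimal `metadata.csv` might look like this (the filenames and sentences are placeholders, not files shipped with the project):

```
intro_001.wav|مرحباً بكم في نشرة الأخبار
report_002.wav|ننتقل الآن إلى مراسلنا في الميدان
```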

**The six dataset categories:**
- `dataset_quran` → Quranic verses with proper tajweed
- `dataset_speaker` → General broadcaster voice samples
- `dataset_speaker_news` → Formal news presentation style
- `dataset_speaker_palestinian` → Authentic Palestinian colloquial speech
- `dataset_speaker_realistic` → Natural conversational patterns
- `dataset_authority` → Official statement delivery style

### Step 2: Build Your Model

Run the training process:
```bash
python train.py
```

The trainer consolidates all of your datasets and fine-tunes the XTTS v2 foundation model. Output goes to `models/final_broadcast_model/`. Expect this to take time, possibly hours, depending on your dataset size and hardware.
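
The consolidation step can be pictured roughly like the sketch below. This is an illustrative outline, not the actual contents of `train.py`; only the folder layout and pipe-delimited metadata format come from this README.

```python
import csv
from pathlib import Path

# Illustrative sketch: gather every dataset_*/metadata.csv into one sample list
# before fine-tuning. The real train.py may organize this differently.
def collect_samples(root: Path = Path(".")) -> list[tuple[Path, str]]:
    samples = []
    for meta in sorted(root.glob("dataset_*/metadata.csv")):
        with meta.open(encoding="utf-8") as fh:
            for row in csv.reader(fh, delimiter="|"):
                if len(row) >= 2:
                    wav = meta.parent / "wavs" / row[0]
                    samples.append((wav, row[1]))  # (audio path, Arabic transcript)
    return samples
```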

### Step 3: Create Audio Content

Generate test audio:
```bash
python generate.py
```

The system produces `output/demo.wav` with professional audio treatment, including frequency filtering and broadcast loudness normalization (EBU R128, targeting -16 LUFS).
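
The post-processing is an ffmpeg pass; a rough approximation of what `generate.py` runs is shown below. The loudness target matches the one above, but the exact filter chain and paths live in `generate.py`.

```python
import subprocess

# Approximate post-processing step: loudness-normalize the raw synthesis output
# to the broadcast target. Paths here are examples.
subprocess.run(
    [
        "ffmpeg", "-y",
        "-i", "output/raw.wav",                  # raw TTS output (example path)
        "-af", "loudnorm=I=-16:LRA=11:TP=-1.5",  # EBU R128 loudness normalization
        "output/demo.wav",
    ],
    check=True,
)
```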

### Step 4: Launch Broadcasting (Optional)

To enable video with lip synchronization, you need the Wav2Lip tool installed and an anchor presenter image at `input/anchor.jpg`.

Start the broadcast generator:
```bash
python run_tv_channel.py &
```

Start the web interface:
```bash
python tv_server.py
```

Open http://localhost:3000 in a browser to view the output.

## Programming Interface

Import the voice generation function:

```python
from generate import generate_voice

generated_file = generate_voice(
    text="نص عربي هنا",
    style="news",
    output_name="output_filename.wav"
)
```

The function returns the path to your generated audio file.

## File Organization

- `train.py` → Dataset merging and model fine-tuning
- `generate.py` → Speech synthesis with audio processing
- `run_tv_channel.py` → Automated segment generation loop
- `tv_server.py` → HTTP streaming endpoint (Flask-based)
- `requirements.txt` → Python dependencies list
- `setup.sh` → Automated environment configuration

Generated content appears in `output/`, trained models in `models/`, source recordings in `dataset_*/wavs/`.

## Audio Quality Tips

**Recording standards:**
- Sample at 16000 Hz minimum
- Remove background noise before use
- Normalize volume across all samples
- Use consistent microphone setup

**Dataset composition:**
- Target 100+ recordings per category for good results
- Balance formal and informal speaking styles
- Mix different sentence structures and lengths
- Include Arabic diacritics for Quran dataset accuracy
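
Before training, it can help to verify that every metadata row points at an existing recording and carries a transcript. The helper below is a quick illustrative check, not part of the repository's tooling.

```python
from pathlib import Path

# Illustrative sanity check for one dataset folder: every metadata.csv row should
# reference an existing wav file and include a non-empty Arabic transcript.
def check_dataset(folder: str) -> int:
    problems = 0
    meta = Path(folder) / "metadata.csv"
    for line in meta.read_text(encoding="utf-8").splitlines():
        name, _, text = line.partition("|")
        wav = Path(folder) / "wavs" / name.strip()
        if not wav.exists() or not text.strip():
            print(f"[CHECK] problem in {folder}: {line}")
            problems += 1
    return problems

print(check_dataset("dataset_quran"), "problematic rows")
```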

## Configuration Changes

**Switch reference voice:**
Modify `REFERENCE_WAV` path in `generate.py` to point at your preferred sample.

**Adjust audio filters:**
Edit the `af_chain` variable in `generate.py` to customize frequency response and loudness targets.

**Update news content:**
Modify the `NEWS_HEADLINES` array in `run_tv_channel.py` to change the broadcast content pool.
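
For example, the two most common edits look roughly like this; the values are placeholders, so check the actual variable definitions in each file before changing them.

```python
# In generate.py — point the cloning reference at your preferred sample (example path)
REFERENCE_WAV = "dataset_speaker_news/wavs/anchor_reference.wav"

# In run_tv_channel.py — replace or extend the headline pool (placeholder strings)
NEWS_HEADLINES = [
    "عاجل: افتتاح مركز ثقافي جديد في رام الله",
    "الطقس: أجواء معتدلة متوقعة في عموم المناطق",
]
```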

## Common Issues

**Training fails with memory error:**
Either add more RAM or reduce the batch size parameter.

**Cannot find model:**
Complete the training step first before attempting generation.

**Video generation skips lip sync:**
Verify Wav2Lip installation and checkpoint file presence at `models/wav2lip.pth`.

**FFmpeg command not found:**
Install it via your package manager (apt, brew, yum) or download it from the official site.

## Web Server Endpoints

- Root `/` serves the Arabic RTL player interface
- `/stream` provides direct media access
- `/health` returns JSON status for monitoring
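
Once `tv_server.py` is running, a quick way to confirm it is up is to poll the monitoring endpoint; the exact JSON fields depend on `tv_server.py`, so this snippet only prints whatever comes back.

```python
import json
import urllib.request

# Poll the health endpoint of the local broadcast server (port 3000 per this README).
with urllib.request.urlopen("http://localhost:3000/health") as resp:
    print(json.loads(resp.read().decode("utf-8")))
```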

## Technical Notes

The system merges diverse Arabic speech patterns to capture authentic Palestinian broadcasting style while maintaining clear pronunciation through Quranic conditioning. Audio post-processing applies industry-standard loudness normalization suitable for broadcast transmission.

Model training adapts the multilingual XTTS v2 architecture with your custom Arabic datasets. The generator uses your reference voice sample to guide prosody and timbre during synthesis.
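
As a rough illustration of reference-guided synthesis with XTTS v2, the snippet below drives the stock Coqui TTS package directly; it is not necessarily how `generate.py` is implemented, and the paths, text, and model choice are placeholders.

```python
from TTS.api import TTS

# Illustrative XTTS v2 call: the speaker_wav reference guides prosody and timbre.
# A fine-tuned model from models/ would normally replace the stock checkpoint.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text="هذه نشرة الأخبار",
    speaker_wav="dataset_speaker_news/wavs/anchor_reference.wav",
    language="ar",
    file_path="output/example.wav",
)
```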
27 changes: 17 additions & 10 deletions broadcast-ai/generate.py
@@ -85,16 +85,23 @@ def generate_voice(
"loudnorm=I=-16:LRA=11:TP=-1.5"
)

    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", raw_path,
            "-af", af_chain,
            final_path,
        ],
        check=True,
        capture_output=True,
    )
    try:
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-i", raw_path,
                "-af", af_chain,
                final_path,
            ],
            check=True,
            capture_output=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        error_type = "not found" if isinstance(e, FileNotFoundError) else "failed"
        print(f"[WARN] ffmpeg {error_type} - skipping audio post-processing")
        print(f"[WARN] Using raw audio output instead")
        # If ffmpeg fails or is not installed, use the raw audio file
        final_path = raw_path
Comment on lines +99 to +104

suggestion: ffmpeg errors are logged without stderr, which may reduce diagnosability when post-processing fails.

The fallback to raw_path makes sense, but since you're already using capture_output=True, it would help to include ffmpeg’s stderr in the warning (e.g., e.stderr.decode('utf-8', errors='ignore') for CalledProcessError) so failures are easier to debug.

Suggested change
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        error_type = "not found" if isinstance(e, FileNotFoundError) else "failed"
        print(f"[WARN] ffmpeg {error_type} - skipping audio post-processing")
        print(f"[WARN] Using raw audio output instead")
        # If ffmpeg fails or is not installed, use the raw audio file
        final_path = raw_path
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        error_type = "not found" if isinstance(e, FileNotFoundError) else "failed"
        print(f"[WARN] ffmpeg {error_type} - skipping audio post-processing")
        # Include stderr for diagnosability when ffmpeg is present but fails
        if isinstance(e, subprocess.CalledProcessError) and e.stderr:
            stderr_text = e.stderr.decode("utf-8", errors="ignore")
            if stderr_text.strip():
                print("[WARN] ffmpeg stderr:")
                print(stderr_text)
        print(f"[WARN] Using raw audio output instead")
        # If ffmpeg fails or is not installed, use the raw audio file
        final_path = raw_path


P1: Preserve requested output filename on ffmpeg fallback

When post-processing fails, this assigns final_path to output/raw.wav, which ignores the caller’s output_name and causes downstream breakage in the fallback path: run_tv_channel.py requests tv_audio.wav, but tv_server.py only serves output/tv_audio.wav as audio fallback, so /health can report no media even though synthesis succeeded. It also makes repeated generations overwrite a single raw.wav file instead of producing per-request outputs.



Suggestion: Falling back by setting final_path to raw_path breaks the function contract and returns a shared temporary filename (output/raw.wav) instead of the requested output_name. This causes callers to lose the expected output path and makes repeated generations overwrite the same fallback file. Keep final_path as the requested destination and move/copy the raw audio there when ffmpeg fails. [logic error]

Severity Level: Major ⚠️
- ❌ TV audio-only fallback fails when ffmpeg and Wav2Lip absent.
- ⚠️ CLI `generate.py` returns unexpected path ignoring `output_name`.
- ⚠️ Raw audio saved under shared temp name `output/raw.wav`.
- ⚠️ Future consumers relying on `output_name` receive wrong path.
Suggested change
        final_path = raw_path
        os.replace(raw_path, final_path)
Steps of Reproduction ✅
1. Start the TV channel loop by running `python broadcast-ai/run_tv_channel.py` which
calls `run()` in `broadcast-ai/run_tv_channel.py:120-145`, and separately start the
streaming server with `python broadcast-ai/tv_server.py` which serves media from
`broadcast-ai/tv_server.py:100-113`.

2. On a machine where `ffmpeg` is NOT installed on the PATH, `_generate_segment()` at
`broadcast-ai/run_tv_channel.py:80-88` calls `generate_voice(headline, style="news",
output_name="tv_audio.wav")`. In `generate_voice()` (`broadcast-ai/generate.py:45-107`),
the `subprocess.run([... "ffmpeg" ...], check=True, ...)` at
`broadcast-ai/generate.py:88-98` raises `FileNotFoundError`, triggering the `except` block
at `broadcast-ai/generate.py:99-104`.

3. Inside that `except` block, the code executes the fallback `final_path = raw_path` at
`broadcast-ai/generate.py:104`, so `generate_voice()` returns `"output/raw.wav"` instead
of the requested `"output/tv_audio.wav"` (which was originally set at
`broadcast-ai/generate.py:66`). `_generate_segment()` therefore returns `"output/raw.wav"`
as `audio_path` when Wav2Lip is unavailable, due to the early returns at
`broadcast-ai/run_tv_channel.py:85-91`.

4. The streaming server in `broadcast-ai/tv_server.py` is hard-coded to look for
`AUDIO_FALLBACK = "output/tv_audio.wav"` at `broadcast-ai/tv_server.py:23-25`. When only
`"output/raw.wav"` exists (ffmpeg missing) and Wav2Lip is also missing or its checkpoint
is missing so no video is produced (`broadcast-ai/run_tv_channel.py:85-91`), the `/stream`
endpoint at `broadcast-ai/tv_server.py:100-113` finds neither `VIDEO_PATH` nor
`AUDIO_FALLBACK` and returns `404`, while the `/health` endpoint at
`broadcast-ai/tv_server.py:116-124` reports `"audio": false`. Users see "no broadcast
currently" despite valid audio being generated, because the fallback changed the return
path from the requested `output_name` to the shared temporary `raw.wav`.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** broadcast-ai/generate.py
**Line:** 104:104
**Comment:**
	*Logic Error: Falling back by setting `final_path` to `raw_path` breaks the function contract and returns a shared temporary filename (`output/raw.wav`) instead of the requested `output_name`. This causes callers to lose the expected output path and makes repeated generations overwrite the same fallback file. Keep `final_path` as the requested destination and move/copy the raw audio there when ffmpeg fails.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.


    print(f"[GEN] {style} → {final_path}")
    return final_path
71 changes: 44 additions & 27 deletions broadcast-ai/run_tv_channel.py
@@ -56,20 +56,25 @@ def _ensure_anchor_video():
        return

    os.makedirs("input", exist_ok=True)
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-loop", "1",
            "-i", anchor_img,
            "-c:v", "libx264",
            "-t", "15",
            "-pix_fmt", "yuv420p",
            ANCHOR_VIDEO,
        ],
        check=True,
        capture_output=True,
    )
    print(f"[ANCHOR] Created {ANCHOR_VIDEO}")
    try:
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-loop", "1",
                "-i", anchor_img,
                "-c:v", "libx264",
                "-t", "15",
                "-pix_fmt", "yuv420p",
                ANCHOR_VIDEO,
            ],
            check=True,
            capture_output=True,
        )
        print(f"[ANCHOR] Created {ANCHOR_VIDEO}")
    except subprocess.CalledProcessError as e:
        print(f"[ERROR] Failed to create anchor video: {e}")
    except FileNotFoundError:
        print(f"[ERROR] ffmpeg not found - cannot create anchor video")


def _generate_segment(headline: str) -> str | None:
@@ -85,18 +85,26 @@ def _generate_segment(headline: str) -> str | None:
        print("[WARN] Wav2Lip checkpoint not found — skipping lip-sync")
        return audio_path

    subprocess.run(
        [
            "python", WAV2LIP_INFERENCE,
            "--checkpoint_path", WAV2LIP_CHECKPOINT,
            "--face", ANCHOR_VIDEO,
            "--audio", audio_path,
            "--outfile", VIDEO_OUTPUT,
        ],
        check=True,
    )

    return VIDEO_OUTPUT
    try:
        subprocess.run(
            [
                "python", WAV2LIP_INFERENCE,
Comment on lines +94 to +96

Suggestion: The Wav2Lip subprocess is invoked with a hardcoded python executable, which can point to a missing or wrong interpreter (for example, systems that only provide python3 or active virtualenvs), causing lip-sync to fail every time and always falling back to audio. Use the current interpreter to preserve environment compatibility. [possible bug]

Severity Level: Major ⚠️
- ⚠️ Lip-synced video generation fails on many Python3-only systems.
- ⚠️ Broadcast loop silently degrades to audio-only segments.
Suggested change
        subprocess.run(
            [
                "python", WAV2LIP_INFERENCE,
        import sys
        subprocess.run(
            [
                sys.executable, WAV2LIP_INFERENCE,
Steps of Reproduction ✅
1. Execute the TV channel loop via `python3 broadcast-ai/run_tv_channel.py`, which
triggers the `run()` entry point and its infinite loop (see
`broadcast-ai/run_tv_channel.py`, lines 119–146 in the final PR state).

2. Ensure that `WAV2LIP_INFERENCE` (`Wav2Lip/inference.py`) and `WAV2LIP_CHECKPOINT`
(`models/wav2lip.pth`) exist so `_generate_segment()` takes the Wav2Lip path (function
body around lines 80–113 in `broadcast-ai/run_tv_channel.py`).

3. In `_generate_segment()`, when it reaches the Wav2Lip call, it executes
`subprocess.run([... "python", WAV2LIP_INFERENCE, ...], check=True)` at lines 94–103 in
`broadcast-ai/run_tv_channel.py`; on a system where `python` is missing or points to a
different interpreter than the one running `run_tv_channel.py` (for example, only
`python3` exists or dependencies are installed only in a virtualenv), this subprocess
fails with `FileNotFoundError` or `CalledProcessError`.

4. The failure is caught by the surrounding `except` blocks in `_generate_segment()`
(lines 105–112), which log warnings and return `audio_path` instead of `VIDEO_OUTPUT`, so
the main loop logs a "Segment ready" message but only ever produces audio-only output,
never the intended lip-synced video.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** broadcast-ai/run_tv_channel.py
**Line:** 94:96
**Comment:**
	*Possible Bug: The Wav2Lip subprocess is invoked with a hardcoded `python` executable, which can point to a missing or wrong interpreter (for example, systems that only provide `python3` or active virtualenvs), causing lip-sync to fail every time and always falling back to audio. Use the current interpreter to preserve environment compatibility.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

                "--checkpoint_path", WAV2LIP_CHECKPOINT,
                "--face", ANCHOR_VIDEO,
                "--audio", audio_path,
                "--outfile", VIDEO_OUTPUT,
            ],
            check=True,
        )
        return VIDEO_OUTPUT
    except subprocess.CalledProcessError as e:
        print(f"[WARN] Wav2Lip processing failed: {e}")
        print(f"[WARN] Falling back to audio only")
        return audio_path
    except FileNotFoundError:
        print(f"[WARN] Python interpreter or Wav2Lip script not found")
        print(f"[WARN] Falling back to audio only")
        return audio_path


# ---------------------------------------------------------------------------
@@ -119,8 +132,12 @@ def run():
            result = _generate_segment(headline)
            if result:
                print(f"[TV] Segment ready: {result}")
        except KeyboardInterrupt:
            print("[TV] Shutting down broadcast loop")
            break
        except Exception as exc:
            print(f"[TV] Segment failed: {exc}")
            print(f"[TV] Unexpected error during segment generation ({type(exc).__name__}): {exc}")
            print("[TV] Continuing to next segment...")
Comment on lines 138 to +140

suggestion (bug_risk): Catching bare Exception in the main loop may hide useful debugging information.

To keep robustness without losing debuggability, either log the full traceback (e.g., via traceback.print_exc() or the logging module) or narrow the except to specific, expected exception types so unexpected failures still surface during development.

Suggested implementation:

        except Exception as exc:
            print(f"[TV] Unexpected error during segment generation ({type(exc).__name__}): {exc}")
            # Print full traceback to aid debugging while keeping the loop robust
            traceback.print_exc()
            print("[TV] Continuing to next segment...")

To fully implement this change, ensure traceback is imported at the top of broadcast-ai/run_tv_channel.py:

  1. Add import traceback near the other imports, e.g.:
    • import traceback

If the file already uses a logging framework instead of print, you may want to replace traceback.print_exc() with logger.exception(...) following your existing logging conventions.


time.sleep(LOOP_PAUSE_SECONDS)
