
test: add E2E test suite (transcription + UI) with CI jobs #172

Draft
BW-Projects wants to merge 2 commits into JuergenFleiss:develop from BW-Projects:feature_e2e_tests


BW-Projects commented May 29, 2026

Implements the layered test setup proposed in #171. Opened as a draft
for discussion.

Verified green in a fork preview; all jobs run in parallel.

What this adds

Tests (tests/, split into core/ and ui/):

  • tests/core/test_transcription_e2e.py — drives the engine via the CLI
    (aTrain_core transcribe … --model tiny --device cpu) on a short clip,
    asserting the output files exist and the transcript is non-empty.
    Two cases: speaker detection off and on (the "on" case asserts a
    SPEAKER_ label — a smoke check the diarization path ran, not an exact
    speaker count).
  • tests/ui/test_boot_serve.py — starts aTrain start --no-native and
    polls :8080 for HTTP 200 (app boots + serves, headless).
  • tests/ui/test_click_through.py — NiceGUI User-fixture: renders the
    real page and transcribes through the app's real wiring
    (start_transcription → run.cpu_bound → finished dialog), in-process,
    no browser.
  • tests/fixtures/sample_short.mp3 — a ~3 s public-domain LibriVox clip
    (trimmed from sample_data/SampleAudio.mp3).
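The boot-serve check described above boils down to "start the process, poll until HTTP 200, always clean up". A minimal sketch of that pattern using only the standard library follows; the helper names `wait_for_http_200` and `boots_and_serves` are hypothetical and not taken from the PR, and the actual test may be structured differently.

```python
import subprocess
import time
import urllib.error
import urllib.request


def wait_for_http_200(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet (or a non-200 answer); keep polling
        time.sleep(interval)
    return False


def boots_and_serves(command: list[str], url: str, timeout: float = 30.0) -> bool:
    """Start `command`, poll `url`, and always stop the process afterwards."""
    process = subprocess.Popen(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return wait_for_http_200(url, timeout=timeout)
    finally:
        process.terminate()
        try:
            process.wait(timeout=10)
        except subprocess.TimeoutExpired:
            process.kill()  # escalate if the process ignores terminate()
            process.wait()
```

In this PR's setting the call would presumably look like `boots_and_serves(["aTrain", "start", "--no-native"], "http://127.0.0.1:8080")`; the host part of the URL is an assumption, since the description only names port 8080. Polling against a deadline rather than sleeping for a fixed time keeps the test fast when the app boots quickly and tolerant when the runner is slow.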

CI (.github/workflows/ci.yml), two parallel jobs:

  • e2e (core pipeline) — installs aTrain_core standalone against the
    PyTorch CPU index (~1.5 GB), runs the transcription tests. ~55 s.
  • e2e (app, full stack) — uv sync --locked of the shipped cu128
    stack (+ GTK build deps), runs the transcription + UI tests against the
    exact dependency set we ship. ~1m57s. (On a GPU-less runner cu128 torch
    just runs on CPU — no GPU runner needed.)
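The remark about cu128 torch on a GPU-less runner rests on a CUDA build of torch importing normally without a GPU, with `torch.cuda.is_available()` simply returning `False`. A small sketch of that fallback follows; `pick_device` is a hypothetical helper for illustration, and the tests in this PR pass `--device cpu` explicitly instead of auto-detecting.

```python
def pick_device() -> str:
    """Return "cuda" when torch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without torch
    except ImportError:
        return "cpu"
    # On a runner without a GPU this is False even for a CUDA build of torch.
    return "cuda" if torch.cuda.is_available() else "cpu"
```

On a standard hosted CI runner this returns `"cpu"`, which is why the full-stack job does not need a GPU runner.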

Config: tests/** ruff ignore (pytest asserts + invoking the CLI by
name), asyncio_mode = auto (the User fixture is async), .pytest_cache
gitignored.

Notes / open items (see #171 for the rationale)

  • The e2e (core pipeline) job installs aTrain_core@develop (off-lock;
    aTrain_core has no lock of its own). Kept simple for now; a SHA pin can
    come with the monorepo (see #145, "Decide long-term pinning strategy for aTrain_core dependency").
  • Unit tests (torch-/NiceGUI-free helpers) are a separate follow-up PR.
  • A nightly for a full-size model on a longer sample is optional/later.

Test plan

  • All five jobs (ruff / bandit / pip-audit / both e2e jobs) green.
  • e2e (core pipeline) passes the speaker-off and speaker-on cases.
  • e2e (app, full stack) passes boot-serve + click-through on the
    cu128 stack.
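The speaker-off and speaker-on cases share one kind of check: output exists, the transcript is non-empty, and (speaker-on only) a `SPEAKER_` label appears somewhere. A rough sketch under stated assumptions follows: the function name `check_transcription_output` is hypothetical, and so is the idea that transcripts are `.txt` files somewhere under the output directory; the real output layout of aTrain_core is not shown in this PR description.

```python
from pathlib import Path


def check_transcription_output(output_dir: Path, expect_speakers: bool) -> None:
    """Assert that a run left a non-empty transcript, with speaker labels if requested."""
    # Assumption: transcripts are written as .txt files below output_dir.
    transcripts = sorted(output_dir.rglob("*.txt"))
    assert transcripts, f"no transcript files found under {output_dir}"

    text = "\n".join(path.read_text(encoding="utf-8") for path in transcripts)
    assert text.strip(), "transcript is empty"

    if expect_speakers:
        # Smoke check only: some SPEAKER_ label is present; the count is not asserted.
        assert "SPEAKER_" in text, "no SPEAKER_ label in transcript"
```

Asserting only on the label's presence keeps the test stable on a 3-second clip, where an exact speaker count from diarization would be brittle.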

Maps to BSI IT-Grundschutz

  • CON.8 §3.2.5 (Funktionstests und Sicherheitstests) — the E2E tests are
    the "Funktionstest" half, complementing the automated code analysis
    (ruff / bandit / pip-audit) already in place.

Related

cc @gerardo-navarro

Bjoern Werner added 2 commits May 29, 2026 09:31
Transcription E2E (tests/core): tiny model on CPU, asserts output files and
a non-empty transcript, with speaker detection both off and on (asserts a
SPEAKER_ label). UI E2E (tests/ui): a boot-serve smoke plus a NiceGUI
User-fixture click-through that transcribes through the app's real wiring.
Adds a ~3 s public-domain sample clip, a tests/** ruff ignore, and the
pytest asyncio-auto config.

Two parallel jobs: `e2e (core pipeline)` installs aTrain_core with CPU torch
(~1.5 GB) and runs the transcription tests; `e2e (app, full stack)` installs
the locked cu128 app and runs the transcription + UI tests against the
shipped dependency set.
JuergenFleiss (Owner) commented

Looks good to me, and the way it is set up makes sense.
