127 changes: 127 additions & 0 deletions docs/2026-05-07-session-handoff.md
@@ -0,0 +1,127 @@
# 2026-05-07 Session Handoff

## Current Focus

Phase 5 is blocked on trustworthy audio-to-tab mapping. Resolve that
before any further video-calibration or `lambda_vision` tuning.

The key finding from today is that the highres audio backend is strong on
GuitarSet onset and pitch detection, but the default pitch-to-string/fret
decoding is weak. A position prior learned from the GuitarSet train split
improves Tab F1 substantially, but it should not become an unconditional
production default yet.

## What Changed Today

- Added a raw GuitarSet audio-only evaluator:
  - `tabvision/tabvision/eval/guitarset_audio.py`
  - `tabvision/scripts/eval/guitarset_audio_eval.py`
  - `tabvision/tests/unit/test_guitarset_audio_eval.py`
- Added learned pitch-position prior support:
  - `tabvision/tabvision/fusion/position_prior.py`
  - `tabvision/tests/unit/test_position_prior.py`
- Added a Modal L4 GPU runner for full highres eval:
  - `tabvision/scripts/eval/guitarset_highres_modal.py`
- Recorded the Phase 5 position-prior decision in `docs/DECISIONS.md`.

The existing GuitarSet TFRecords were inspected and are insufficient for
Tab F1 because they do not retain string/fret labels. Raw JAMS/WAV is the
correct source for Tab F1.

## Metric Snapshot

Full GuitarSet validation split is player `05` (60 tracks).

Modal L4 highres eval, full validation:

| Run | Onset F1 | Pitch F1 | Tab F1 |
| --- | ---: | ---: | ---: |
| no position prior | `0.9218` | `0.9022` | `0.3878` |
| GuitarSet train prior | `0.9218` | `0.9022` | `0.6104` |

Delta from prior: `+22.26 pp` Tab F1, with onset/pitch unchanged.

Per-track effect: 51/60 improved, 8/60 regressed, 1/60 unchanged. Mean
track Tab F1 moved from `0.347` to `0.589`.
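
The arithmetic behind the delta and the per-track tally can be sketched as follows. This is an illustration, not the evaluator's code: `delta_pp`, `classify`, and the `tol` tie threshold are hypothetical names, and the sample rows assume the 3-row CSV in this PR is the no-prior smoke run (its aggregate Tab F1 of 0.336 matches that run).

```python
# Aggregate Tab F1 values quoted above for the full validation split.
TAB_F1_NO_PRIOR = 0.3878
TAB_F1_WITH_PRIOR = 0.6104


def delta_pp(before: float, after: float) -> float:
    """Difference between two F1 scores, in percentage points."""
    return round((after - before) * 100, 2)


def classify(before: float, after: float, tol: float = 1e-6) -> str:
    """Label one track's Tab F1 change (tol is an assumed tie threshold)."""
    if after > before + tol:
        return "improved"
    if after < before - tol:
        return "regressed"
    return "unchanged"


# Per-track Tab F1 for the three tracks present in both CSVs in this PR,
# as (no prior, GuitarSet train prior).
SAMPLE = {
    "05_BN1-129-Eb_comp": (0.221477, 0.932886),
    "05_BN1-129-Eb_solo": (0.111111, 0.377778),
    "05_BN1-147-Gb_comp": (0.612245, 0.571429),
}

labels = {track: classify(b, a) for track, (b, a) in SAMPLE.items()}
print(delta_pp(TAB_F1_NO_PRIOR, TAB_F1_WITH_PRIOR))
print(labels)
```

On these three rows, `05_BN1-147-Gb_comp` is a regression: it already scored above 0.6 without the prior and drops slightly with it.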

Reports:

- `tabvision-server/tools/outputs/guitarset_audio_eval-highres-validation-none-2026-05-07.md`
- `tabvision-server/tools/outputs/guitarset_audio_eval-highres-validation-guitarset-train-2026-05-07.md`

## Promotion Decision

Do not make the GuitarSet train prior an unconditional production default
yet.

Recommended path:

1. Promote the position prior as a versioned/configured production option.
2. Create a checked-in prior artifact or generator so raw GuitarSet is not
required at runtime.
3. Classify the 8 validation regressions, especially the SS/comp cases
where no-prior was already strong.
4. Run the home-video Phase 5 benchmark with and without the prior.
5. Make it the default only if the home-video benchmark has no regression
and the GuitarSet regressions are understood or accepted.
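
The gating conditions in this list could be encoded as an explicit check so the default cannot flip silently. A minimal sketch, assuming hypothetical names (`PromotionEvidence`, `can_be_default`) that do not exist in the repo:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionEvidence:
    """Hypothetical record of the evidence the recommended path asks for."""

    prior_artifact_checked_in: bool       # step 2: no raw GuitarSet at runtime
    guitarset_regressions_resolved: bool  # step 3: classified, then understood or accepted
    home_video_regressed: bool            # step 4: benchmark result with prior on vs. off


def can_be_default(evidence: PromotionEvidence) -> bool:
    """Step 5: allow the prior as default only when every gate is satisfied."""
    return (
        evidence.prior_artifact_checked_in
        and evidence.guitarset_regressions_resolved
        and not evidence.home_video_regressed
    )


# State at handoff: none of the gates has been cleared yet.
at_handoff = PromotionEvidence(
    prior_artifact_checked_in=False,
    guitarset_regressions_resolved=False,
    home_video_regressed=False,
)
print(can_be_default(at_handoff))
```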

## Modal State

The Modal GPU path is now working.

Useful commands:

```bash
tabvision-server/venv/bin/modal run tabvision/scripts/eval/guitarset_highres_modal.py --limit 1
tabvision-server/venv/bin/modal run tabvision/scripts/eval/guitarset_highres_modal.py
tabvision-server/venv/bin/modal run tabvision/scripts/eval/guitarset_highres_modal.py --position-prior none
```

The Modal volume `tabvision-guitarset` has been hydrated with 360 raw
GuitarSet JAMS/WAV tracks from `taohu/guitarset`.

## Verification At Handoff

Last full fast suite:

```bash
cd tabvision
.venv/bin/python -m pytest -q
```

Result: `249 passed, 9 skipped`.

Focused checks after the eval/Modal changes:

```bash
cd tabvision
.venv/bin/python -m pytest tests/unit/test_guitarset_audio_eval.py tests/unit/test_position_prior.py -q
.venv/bin/python -m ruff check tabvision/eval/guitarset_audio.py tabvision/fusion/position_prior.py scripts/eval/guitarset_audio_eval.py scripts/eval/guitarset_highres_modal.py tests/unit/test_guitarset_audio_eval.py tests/unit/test_position_prior.py
```

Result: 11 focused tests passed; ruff passed.

## Worktree Note

The worktree is still dirty. Some dirty files predate this handoff. The
most relevant new files from this session are:

- `tabvision/tabvision/eval/guitarset_audio.py`
- `tabvision/tabvision/fusion/position_prior.py`
- `tabvision/scripts/eval/guitarset_audio_eval.py`
- `tabvision/scripts/eval/guitarset_highres_modal.py`
- `tabvision/tests/unit/test_guitarset_audio_eval.py`
- `tabvision/tests/unit/test_position_prior.py`
- `tabvision-server/tools/outputs/guitarset_audio_eval-highres-validation-none-2026-05-07.{md,csv}`
- `tabvision-server/tools/outputs/guitarset_audio_eval-highres-validation-guitarset-train-2026-05-07.{md,csv}`

There are also Phase 5 eval/fusion/video diagnostics and prior Phase 5
implementation changes in the worktree. Do not assume every dirty file
belongs to the GuitarSet work.

## Next Best Step

Build the production integration for the position prior as an explicit
config option/artifact, then run home-video Phase 5 with prior on/off.

Do not proceed to Phase 6 until Phase 5 is recorded and the user explicitly
says `proceed`.
42 changes: 42 additions & 0 deletions docs/DECISIONS.md
@@ -380,3 +380,45 @@ to validate; "the hand is around frets 3-6" is a stronger, more stable
visual prior for resolving audio's same-pitch string/fret ambiguity. Keeping
the signal as a prior lets audio and playability override it when the visual
evidence is weak or wrong.

---

## 2026-05-07 — Phase 5 GuitarSet pitch-to-tab bottleneck

**Phase:** 5 (audio-to-tab mapping)
**Decision tree:** GuitarSet audio-only diagnostic — if pitch F1 is good
but Tab F1 is bad, fix string/fret candidate selection before tuning video
calibration or `lambda_vision`.
**Branch taken:** **Add an optional learned pitch-position prior.** Raw
GuitarSet JAMS provide held-out string/fret labels; the evaluator now learns
`P(string,fret | pitch)` from train players and attaches it via the existing
2D `AudioEvent.fret_prior` path before audio-only Viterbi decode.
**Evidence:** On full validation, oracle gold-onset/gold-pitch events scored
only `0.4335` Tab F1 with the default decoder, proving the mapping is bad even
when audio extraction is perfect. On the first 10 validation tracks, about
two-thirds of same-pitch events landed on the wrong adjacent string/fret. Most
errors were low-fret equivalents such as G-string notes decoded on B or B-string
notes decoded on high E. A GuitarSet train-split prior raised oracle
full-validation Tab F1 to `0.6802`. On the 3-track highres smoke, Tab F1 moved
from `0.3356` to `0.7260` while onset F1 (`0.9692`) and pitch F1 (`0.9555`)
stayed unchanged.
**Reasoning:** This confirms the immediate bottleneck is pitch-to-position
ambiguity, not highres onset/pitch extraction and not Phase 5 vision weighting.
The prior is optional for now; it does not change public fusion APIs or the
default production decode until a full validation run and home-video check
justify promoting it.
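
To make the ambiguity concrete, the sketch below enumerates the string/fret candidates for one pitch and estimates `P(string,fret | pitch)` from labelled notes by simple counting. It is an illustration under stated assumptions, not the code in `position_prior.py`: the function names, the 0-based low-E-first string indexing, the 19-fret limit, and the toy training labels are all hypothetical. Only the standard-tuning MIDI pitches are fixed facts.

```python
from collections import Counter, defaultdict

# Standard-tuning open-string MIDI pitches, low E (index 0) to high E (index 5).
OPEN_STRINGS = (40, 45, 50, 55, 59, 64)
MAX_FRET = 19  # assumed fret range for this sketch


def candidates(pitch: int, max_fret: int = MAX_FRET) -> list[tuple[int, int]]:
    """All (string, fret) positions that sound the given MIDI pitch."""
    return [
        (string, pitch - open_pitch)
        for string, open_pitch in enumerate(OPEN_STRINGS)
        if 0 <= pitch - open_pitch <= max_fret
    ]


def learn_prior(labels):
    """Estimate P(string, fret | pitch) from (pitch, string, fret) labels."""
    counts = defaultdict(Counter)
    for pitch, string, fret in labels:
        counts[pitch][(string, fret)] += 1
    prior = {}
    for pitch, position_counts in counts.items():
        total = sum(position_counts.values())
        prior[pitch] = {pos: n / total for pos, n in position_counts.items()}
    return prior


# B3 (MIDI 59) can be the open B string or the G string at fret 4, among others.
b3_positions = candidates(59)

# Toy labels: two notes on the open B string, one on the G string at fret 4.
toy_prior = learn_prior([(59, 4, 0), (59, 4, 0), (59, 3, 4)])
print(b3_positions)
print(toy_prior[59])
```

A decoder that weights each candidate by such a prior has a reason to prefer one of two equally playable low-fret positions, which is the error pattern described in the evidence above.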

**Follow-up evidence:** A Modal L4 full-validation highres run completed on
2026-05-07. With no position prior: onset F1 `0.9218`, pitch F1 `0.9022`,
Tab F1 `0.3878`. With the GuitarSet train-split prior: onset F1 `0.9218`,
pitch F1 `0.9022`, Tab F1 `0.6104` (`+22.26 pp`). Per-track, 51/60 improved,
8/60 regressed, and 1/60 was unchanged. Mean track Tab F1 moved from `0.347`
to `0.589`.
**Promotion decision:** **Do not make this an unconditional production default
yet.** Promote the prior next as a versioned/configured production option, then
make it the default only after (a) a checked-in prior artifact is available
without requiring raw GuitarSet at runtime, (b) same-pitch regressions are
classified and reduced or accepted, and (c) the home-video Phase 5 benchmark
shows no regression. The full GuitarSet result is strong enough to justify the
production integration path, but the 8 regressed validation clips make a silent
global default premature.
@@ -0,0 +1,4 @@
track_id,backend,gold_notes,audio_events,decoded_events,onset_f1,pitch_f1,tab_f1,tab_tp,tab_fp,tab_fn
05_BN1-129-Eb_comp,highres,148,150,150,0.979866,0.973154,0.221477,33,117,115
05_BN1-129-Eb_solo,highres,44,46,46,0.933333,0.933333,0.111111,5,41,39
05_BN1-147-Gb_comp,highres,96,100,100,0.969388,0.938776,0.612245,60,40,36
@@ -0,0 +1,22 @@
# GuitarSet Audio Eval (highres)

Split: **validation**
Tracks: **3**
Gold notes: **288**
Audio events: **296**

## Aggregate

| Metric | Mean F1 | Micro P | Micro R | Micro F1 |
| --- | ---: | ---: | ---: | ---: |
| Onset | 0.961 | 0.956 | 0.983 | 0.969 |
| Pitch | 0.948 | 0.943 | 0.969 | 0.955 |
| Tab | 0.315 | 0.331 | 0.340 | 0.336 |

## Per Track

| Track | Gold | Audio | Decoded | Onset F1 | Pitch F1 | Tab F1 |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| `05_BN1-129-Eb_comp` | 148 | 150 | 150 | 0.980 | 0.973 | 0.221 |
| `05_BN1-129-Eb_solo` | 44 | 46 | 46 | 0.933 | 0.933 | 0.111 |
| `05_BN1-147-Gb_comp` | 96 | 100 | 100 | 0.969 | 0.939 | 0.612 |
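
The Mean F1 and Micro columns in the aggregate table can be reproduced from the per-track `tab_tp`, `tab_fp`, and `tab_fn` counts in the matching 3-row CSV. A minimal sketch of that arithmetic (the helper names are illustrative, not the evaluator's API):

```python
# (tab_tp, tab_fp, tab_fn) per track, copied from the 3-row CSV in this PR.
ROWS = {
    "05_BN1-129-Eb_comp": (33, 117, 115),
    "05_BN1-129-Eb_solo": (5, 41, 39),
    "05_BN1-147-Gb_comp": (60, 40, 36),
}


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Mean F1: average of per-track F1, so every track weighs the same.
mean_f1 = sum(f1(*counts) for counts in ROWS.values()) / len(ROWS)

# Micro scores: pool the counts first, so longer tracks weigh more.
tp = sum(c[0] for c in ROWS.values())
fp = sum(c[1] for c in ROWS.values())
fn = sum(c[2] for c in ROWS.values())
micro_p = tp / (tp + fp)
micro_r = tp / (tp + fn)
micro_f1 = f1(tp, fp, fn)

print(f"{mean_f1:.3f} {micro_p:.3f} {micro_r:.3f} {micro_f1:.3f}")
```

The pooled counts are 98 true positives, 198 false positives, and 190 false negatives, which gives the Tab row of the table.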
@@ -0,0 +1,61 @@
track_id,backend,gold_notes,audio_events,decoded_events,onset_f1,pitch_f1,tab_f1,tab_tp,tab_fp,tab_fn
05_BN1-129-Eb_comp,highres,148,150,150,0.979866,0.973154,0.932886,139,11,9
05_BN1-129-Eb_solo,highres,44,46,46,0.933333,0.933333,0.377778,17,29,27
05_BN1-147-Gb_comp,highres,96,100,100,0.969388,0.938776,0.571429,56,44,40
05_BN1-147-Gb_solo,highres,27,32,32,0.847458,0.847458,0.203390,6,26,21
05_BN2-131-B_comp,highres,202,203,203,0.987654,0.972840,0.809877,164,39,38
05_BN2-131-B_solo,highres,69,76,76,0.937931,0.937931,0.593103,43,33,26
05_BN2-166-Ab_comp,highres,173,162,162,0.955224,0.931343,0.859701,144,18,29
05_BN2-166-Ab_solo,highres,67,69,69,0.941176,0.926471,0.691176,47,22,20
05_BN3-119-G_comp,highres,180,178,178,0.977654,0.977654,0.720670,129,49,51
05_BN3-119-G_solo,highres,62,64,64,0.873016,0.873016,0.619048,39,25,23
05_BN3-154-E_comp,highres,185,180,180,0.958904,0.926027,0.695890,127,53,58
05_BN3-154-E_solo,highres,56,59,59,0.921739,0.904348,0.260870,15,44,41
05_Funk1-114-Ab_comp,highres,181,162,153,0.844311,0.838323,0.808383,135,18,46
05_Funk1-114-Ab_solo,highres,60,56,56,0.844828,0.793103,0.637931,37,19,23
05_Funk1-97-C_comp,highres,177,172,172,0.974212,0.951289,0.796562,139,33,38
05_Funk1-97-C_solo,highres,58,57,57,0.921739,0.921739,0.330435,19,38,39
05_Funk2-108-Eb_comp,highres,245,271,260,0.843564,0.807921,0.673267,170,90,75
05_Funk2-108-Eb_solo,highres,61,61,61,0.983607,0.983607,0.491803,30,31,31
05_Funk2-119-G_comp,highres,182,192,184,0.907104,0.890710,0.841530,154,30,28
05_Funk2-119-G_solo,highres,99,96,96,0.943590,0.923077,0.256410,25,71,74
05_Funk3-112-C#_comp,highres,212,206,206,0.928230,0.913876,0.837321,175,31,37
05_Funk3-112-C#_solo,highres,70,64,64,0.955224,0.955224,0.686567,46,18,24
05_Funk3-98-A_comp,highres,184,182,162,0.901734,0.895954,0.815029,141,21,43
05_Funk3-98-A_solo,highres,65,65,65,0.969231,0.969231,0.538462,35,30,30
05_Jazz1-130-D_comp,highres,141,118,118,0.880309,0.849421,0.687259,89,29,52
05_Jazz1-130-D_solo,highres,58,61,61,0.924370,0.890756,0.554622,33,28,25
05_Jazz1-200-B_comp,highres,175,168,163,0.852071,0.781065,0.491124,83,80,92
05_Jazz1-200-B_solo,highres,50,49,49,0.969697,0.969697,0.464646,23,26,27
05_Jazz2-110-Bb_comp,highres,252,252,242,0.910931,0.890688,0.781377,193,49,59
05_Jazz2-110-Bb_solo,highres,103,97,97,0.940000,0.940000,0.700000,70,27,33
05_Jazz2-187-F#_comp,highres,247,226,219,0.862661,0.836910,0.669528,156,63,91
05_Jazz2-187-F#_solo,highres,61,60,60,0.975207,0.975207,0.280992,17,43,44
05_Jazz3-137-Eb_comp,highres,235,231,227,0.857143,0.800866,0.683983,158,69,77
05_Jazz3-137-Eb_solo,highres,77,76,76,0.967320,0.967320,0.496732,38,38,39
05_Jazz3-150-C_comp,highres,227,226,219,0.910314,0.874439,0.291480,65,154,162
05_Jazz3-150-C_solo,highres,69,67,67,0.955882,0.955882,0.764706,52,15,17
05_Rock1-130-A_comp,highres,312,349,349,0.883510,0.877458,0.166415,55,294,257
05_Rock1-130-A_solo,highres,60,62,62,0.950820,0.950820,0.606557,37,25,23
05_Rock1-90-C#_comp,highres,277,276,276,0.936709,0.871609,0.867993,240,36,37
05_Rock1-90-C#_solo,highres,99,106,106,0.926829,0.907317,0.809756,83,23,16
05_Rock2-142-D_comp,highres,460,450,450,0.890110,0.841758,0.336264,153,297,307
05_Rock2-142-D_solo,highres,86,86,86,0.906977,0.906977,0.290698,25,61,61
05_Rock2-85-F_comp,highres,325,296,296,0.856683,0.821256,0.615137,191,105,134
05_Rock2-85-F_solo,highres,125,124,124,0.955823,0.939759,0.698795,87,37,38
05_Rock3-117-Bb_comp,highres,398,401,401,0.878598,0.868586,0.403004,161,240,237
05_Rock3-117-Bb_solo,highres,70,65,65,0.962963,0.962963,0.518519,35,30,35
05_Rock3-148-C_comp,highres,249,286,286,0.927103,0.923364,0.818692,219,67,30
05_Rock3-148-C_solo,highres,61,60,60,0.991736,0.991736,0.495868,30,30,31
05_SS1-100-C#_comp,highres,102,95,95,0.964467,0.964467,0.395939,39,56,63
05_SS1-100-C#_solo,highres,60,64,64,0.887097,0.870968,0.774194,48,16,12
05_SS1-68-E_comp,highres,143,148,148,0.982818,0.975945,0.783505,114,34,29
05_SS1-68-E_solo,highres,78,74,74,0.894737,0.881579,0.328947,25,49,53
05_SS2-107-Ab_comp,highres,222,227,227,0.971047,0.953229,0.726058,163,64,59
05_SS2-107-Ab_solo,highres,87,85,85,0.941860,0.941860,0.546512,47,38,40
05_SS2-88-F_comp,highres,196,190,190,0.963731,0.943005,0.621762,120,70,76
05_SS2-88-F_solo,highres,93,97,97,0.978947,0.968421,0.589474,56,41,37
05_SS3-84-Bb_comp,highres,211,215,215,0.985915,0.985915,0.671362,143,72,68
05_SS3-84-Bb_solo,highres,111,119,119,0.947826,0.947826,0.330435,38,81,73
05_SS3-98-C_comp,highres,199,187,187,0.937824,0.932642,0.735751,142,45,57
05_SS3-98-C_solo,highres,93,94,94,0.973262,0.973262,0.288770,27,67,66
@@ -0,0 +1,80 @@
# GuitarSet Audio Eval (highres)

Split: **validation**
Position prior: **guitarset-train**
Tracks: **60**
Gold notes: **8715**
Audio events: **8690**

## Aggregate

| Metric | Mean F1 | Micro P | Micro R | Micro F1 |
| --- | ---: | ---: | ---: | ---: |
| Onset | 0.930 | 0.928 | 0.916 | 0.922 |
| Pitch | 0.915 | 0.908 | 0.897 | 0.902 |
| Tab | 0.589 | 0.614 | 0.607 | 0.610 |

## Per Track

| Track | Gold | Audio | Decoded | Onset F1 | Pitch F1 | Tab F1 |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| `05_BN1-129-Eb_comp` | 148 | 150 | 150 | 0.980 | 0.973 | 0.933 |
| `05_BN1-129-Eb_solo` | 44 | 46 | 46 | 0.933 | 0.933 | 0.378 |
| `05_BN1-147-Gb_comp` | 96 | 100 | 100 | 0.969 | 0.939 | 0.571 |
| `05_BN1-147-Gb_solo` | 27 | 32 | 32 | 0.847 | 0.847 | 0.203 |
| `05_BN2-131-B_comp` | 202 | 203 | 203 | 0.988 | 0.973 | 0.810 |
| `05_BN2-131-B_solo` | 69 | 76 | 76 | 0.938 | 0.938 | 0.593 |
| `05_BN2-166-Ab_comp` | 173 | 162 | 162 | 0.955 | 0.931 | 0.860 |
| `05_BN2-166-Ab_solo` | 67 | 69 | 69 | 0.941 | 0.926 | 0.691 |
| `05_BN3-119-G_comp` | 180 | 178 | 178 | 0.978 | 0.978 | 0.721 |
| `05_BN3-119-G_solo` | 62 | 64 | 64 | 0.873 | 0.873 | 0.619 |
| `05_BN3-154-E_comp` | 185 | 180 | 180 | 0.959 | 0.926 | 0.696 |
| `05_BN3-154-E_solo` | 56 | 59 | 59 | 0.922 | 0.904 | 0.261 |
| `05_Funk1-114-Ab_comp` | 181 | 162 | 153 | 0.844 | 0.838 | 0.808 |
| `05_Funk1-114-Ab_solo` | 60 | 56 | 56 | 0.845 | 0.793 | 0.638 |
| `05_Funk1-97-C_comp` | 177 | 172 | 172 | 0.974 | 0.951 | 0.797 |
| `05_Funk1-97-C_solo` | 58 | 57 | 57 | 0.922 | 0.922 | 0.330 |
| `05_Funk2-108-Eb_comp` | 245 | 271 | 260 | 0.844 | 0.808 | 0.673 |
| `05_Funk2-108-Eb_solo` | 61 | 61 | 61 | 0.984 | 0.984 | 0.492 |
| `05_Funk2-119-G_comp` | 182 | 192 | 184 | 0.907 | 0.891 | 0.842 |
| `05_Funk2-119-G_solo` | 99 | 96 | 96 | 0.944 | 0.923 | 0.256 |
| `05_Funk3-112-C#_comp` | 212 | 206 | 206 | 0.928 | 0.914 | 0.837 |
| `05_Funk3-112-C#_solo` | 70 | 64 | 64 | 0.955 | 0.955 | 0.687 |
| `05_Funk3-98-A_comp` | 184 | 182 | 162 | 0.902 | 0.896 | 0.815 |
| `05_Funk3-98-A_solo` | 65 | 65 | 65 | 0.969 | 0.969 | 0.538 |
| `05_Jazz1-130-D_comp` | 141 | 118 | 118 | 0.880 | 0.849 | 0.687 |
| `05_Jazz1-130-D_solo` | 58 | 61 | 61 | 0.924 | 0.891 | 0.555 |
| `05_Jazz1-200-B_comp` | 175 | 168 | 163 | 0.852 | 0.781 | 0.491 |
| `05_Jazz1-200-B_solo` | 50 | 49 | 49 | 0.970 | 0.970 | 0.465 |
| `05_Jazz2-110-Bb_comp` | 252 | 252 | 242 | 0.911 | 0.891 | 0.781 |
| `05_Jazz2-110-Bb_solo` | 103 | 97 | 97 | 0.940 | 0.940 | 0.700 |
| `05_Jazz2-187-F#_comp` | 247 | 226 | 219 | 0.863 | 0.837 | 0.670 |
| `05_Jazz2-187-F#_solo` | 61 | 60 | 60 | 0.975 | 0.975 | 0.281 |
| `05_Jazz3-137-Eb_comp` | 235 | 231 | 227 | 0.857 | 0.801 | 0.684 |
| `05_Jazz3-137-Eb_solo` | 77 | 76 | 76 | 0.967 | 0.967 | 0.497 |
| `05_Jazz3-150-C_comp` | 227 | 226 | 219 | 0.910 | 0.874 | 0.291 |
| `05_Jazz3-150-C_solo` | 69 | 67 | 67 | 0.956 | 0.956 | 0.765 |
| `05_Rock1-130-A_comp` | 312 | 349 | 349 | 0.884 | 0.877 | 0.166 |
| `05_Rock1-130-A_solo` | 60 | 62 | 62 | 0.951 | 0.951 | 0.607 |
| `05_Rock1-90-C#_comp` | 277 | 276 | 276 | 0.937 | 0.872 | 0.868 |
| `05_Rock1-90-C#_solo` | 99 | 106 | 106 | 0.927 | 0.907 | 0.810 |
| `05_Rock2-142-D_comp` | 460 | 450 | 450 | 0.890 | 0.842 | 0.336 |
| `05_Rock2-142-D_solo` | 86 | 86 | 86 | 0.907 | 0.907 | 0.291 |
| `05_Rock2-85-F_comp` | 325 | 296 | 296 | 0.857 | 0.821 | 0.615 |
| `05_Rock2-85-F_solo` | 125 | 124 | 124 | 0.956 | 0.940 | 0.699 |
| `05_Rock3-117-Bb_comp` | 398 | 401 | 401 | 0.879 | 0.869 | 0.403 |
| `05_Rock3-117-Bb_solo` | 70 | 65 | 65 | 0.963 | 0.963 | 0.519 |
| `05_Rock3-148-C_comp` | 249 | 286 | 286 | 0.927 | 0.923 | 0.819 |
| `05_Rock3-148-C_solo` | 61 | 60 | 60 | 0.992 | 0.992 | 0.496 |
| `05_SS1-100-C#_comp` | 102 | 95 | 95 | 0.964 | 0.964 | 0.396 |
| `05_SS1-100-C#_solo` | 60 | 64 | 64 | 0.887 | 0.871 | 0.774 |
| `05_SS1-68-E_comp` | 143 | 148 | 148 | 0.983 | 0.976 | 0.784 |
| `05_SS1-68-E_solo` | 78 | 74 | 74 | 0.895 | 0.882 | 0.329 |
| `05_SS2-107-Ab_comp` | 222 | 227 | 227 | 0.971 | 0.953 | 0.726 |
| `05_SS2-107-Ab_solo` | 87 | 85 | 85 | 0.942 | 0.942 | 0.547 |
| `05_SS2-88-F_comp` | 196 | 190 | 190 | 0.964 | 0.943 | 0.622 |
| `05_SS2-88-F_solo` | 93 | 97 | 97 | 0.979 | 0.968 | 0.589 |
| `05_SS3-84-Bb_comp` | 211 | 215 | 215 | 0.986 | 0.986 | 0.671 |
| `05_SS3-84-Bb_solo` | 111 | 119 | 119 | 0.948 | 0.948 | 0.330 |
| `05_SS3-98-C_comp` | 199 | 187 | 187 | 0.938 | 0.933 | 0.736 |
| `05_SS3-98-C_solo` | 93 | 94 | 94 | 0.973 | 0.973 | 0.289 |