research: Replication on Gemma 4 26B-A4B #8

## What

Replicate the current probe pipeline on the Gemma 4 26B-A4B variant. The current probe is trained on E2B activations; 26B-A4B (Apache-2.0, runnable on-device) is expected to give a stronger signal, but activations must be re-extracted from the larger model.

## Why

Linear-probe performance generally improves with base-model quality: a stronger backbone tends to give a more linearly separable "vulnerable" direction. 26B-A4B is the natural next step that keeps us inside the on-device licensing story.

## Plan

1. Re-run `src/extract_token_activations.py` against 26B-A4B on the same CyberSecEval + SVEN dataset
2. Train one probe per layer and sweep over all layers
3. Compare per-layer AUC against the E2B baseline
4. Ship the best 26B-A4B probe variant alongside the E2B one (user selectable in the UI)
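
Steps 2 and 3 could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual training code: it swaps in a difference-of-class-means direction as a stand-in for whatever probe the pipeline really trains, and it runs on synthetic activations. The names `roc_auc`, `fit_mean_diff_probe`, `sweep_layers`, and `synthetic_layer` are hypothetical, and the output format of `src/extract_token_activations.py` is assumed to be reducible to one feature vector per sample per layer.

```python
import random
from statistics import fmean


def roc_auc(labels, scores):
    """AUC as P(random positive outscores random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_diff_probe(X, y):
    """Linear direction w = mean(positive) - mean(negative); score(x) = w . x."""
    dim = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return [fmean(x[d] for x in pos) - fmean(x[d] for x in neg) for d in range(dim)]


def score(w, X):
    """Project each sample onto the probe direction."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in X]


def sweep_layers(acts_by_layer, labels, train_frac=0.7, seed=0):
    """Fit one probe per layer on a shared train split; return {layer: held-out AUC}."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    y_train = [labels[i] for i in train]
    y_test = [labels[i] for i in test]
    results = {}
    for layer, X in acts_by_layer.items():
        w = fit_mean_diff_probe([X[i] for i in train], y_train)
        results[layer] = roc_auc(y_test, score(w, [X[i] for i in test]))
    return results


def synthetic_layer(labels, signal, dim=8, seed=0):
    """Synthetic stand-in for real activations: class signal lives in dimension 0."""
    rng = random.Random(seed)
    return [[rng.gauss(signal * y if d == 0 else 0.0, 1.0) for d in range(dim)]
            for y in labels]


# Synthetic data only: deeper "layers" are given a stronger class signal.
labels = [i % 2 for i in range(400)]
acts = {layer: synthetic_layer(labels, signal=1.0 * layer, seed=layer)
        for layer in range(4)}
aucs = sweep_layers(acts, labels)
best_layer = max(aucs, key=aucs.get)
for layer, auc in aucs.items():
    print(f"layer {layer}: AUC = {auc:.3f}")
```

Using the same train/test split for every layer, and ideally the same split for both models, keeps the per-layer AUC numbers directly comparable with the E2B baseline.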

## Definition of done

- Per-layer AUC numbers on the same dataset for 26B-A4B
- Best 26B-A4B probe ships alongside the E2B probe (UI dropdown to switch)
- AUC delta reported in `docs/` or the project README
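
One possible way to produce the reported delta is sketched below. It assumes the per-layer results for each model are available as a `{layer: AUC}` mapping; the function name `auc_delta_report` is hypothetical, and the numbers are placeholders for illustration only, not measured results.

```python
def auc_delta_report(baseline, candidate, baseline_name="E2B", candidate_name="26B-A4B"):
    """Return Markdown lines comparing the best layer of each model."""
    b_layer = max(baseline, key=baseline.get)
    c_layer = max(candidate, key=candidate.get)
    delta = candidate[c_layer] - baseline[b_layer]
    return [
        "| Model | Best layer | AUC |",
        "| --- | --- | --- |",
        f"| {baseline_name} | {b_layer} | {baseline[b_layer]:.3f} |",
        f"| {candidate_name} | {c_layer} | {candidate[c_layer]:.3f} |",
        "",
        f"AUC delta ({candidate_name} minus {baseline_name}): {delta:+.3f}",
    ]


# Placeholder numbers for illustration only, NOT measured results.
placeholder_e2b = {0: 0.50, 1: 0.60, 2: 0.70}
placeholder_26b = {0: 0.50, 1: 0.65, 2: 0.80, 3: 0.75}
report = auc_delta_report(placeholder_e2b, placeholder_26b)
print("\n".join(report))
```

Picking the best layer on the same data used to report its AUC gives an optimistic estimate; selecting the layer on a validation split and reporting on a separate test split would be cleaner.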
