Run first recurring Research Radar trial week #21

@t3chn

Goal: run the first recurring Research Radar trial week before selecting v0.8.

This is a recurring operational research process, not repo automation.

Scope:

  • no new task family;
  • no automation scripts in the public repo;
  • no public scheduled jobs;
  • no scraping or collectors in the repo;
  • no private eval material;
  • no customer data;
  • no raw feeds committed to the public repo;
  • no automatic GitHub issues unless explicitly approved.

Cadence:

  • weekday morning Daily Research Radar brief;
  • weekly synthesis checkpoint after the daily briefs;
  • monthly watchlist pruning later, only if the trial is useful.

Daily brief rules:

  • use v0.7 Research Radar watchlist, source map, repos, people, and query sets;
  • focus on benchmark mechanics, not generic AI news;
  • output brief only, not repo changes;
  • at most 5 recommended actions;
  • answer: Should Agent Bench Lab change direction today?
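
For illustration only, the daily brief rules above could be captured as a small local record, kept in private notes rather than the public repo. This is a minimal sketch: the names `DailyBrief` and `MAX_ACTIONS` and the field layout are hypothetical and not part of the v0.7 Research Radar material.

```python
from dataclasses import dataclass, field
from datetime import date

MAX_ACTIONS = 5  # from the rule limiting recommended actions to 5


@dataclass
class DailyBrief:
    """One weekday brief; a hypothetical record shape, not a repo artifact."""

    day: date
    change_direction: bool  # answer to "Should Agent Bench Lab change direction today?"
    rationale: str
    recommended_actions: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Monday is 0, so 5 and 6 are Saturday and Sunday.
        if self.day.weekday() >= 5:
            raise ValueError("daily briefs run on weekdays only")
        if len(self.recommended_actions) > MAX_ACTIONS:
            raise ValueError(f"at most {MAX_ACTIONS} recommended actions allowed")
```

Validating at construction time keeps an over-long action list or a weekend brief from reaching the weekly synthesis unnoticed.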

Weekly synthesis rules:

  • review daily briefs;
  • decide keep / modify / defer / stop;
  • propose issues or decision-log entries;
  • do not choose v0.8 out of inertia.

Success criteria:

  • 3 daily briefs completed;
  • 1 weekly synthesis completed;
  • weekly synthesis explicitly recommends keep / modify / defer / stop for each v0.8 candidate;
  • no repo writes from daily briefs;
  • no auto-created issues without approval;
  • no private eval material or customer data.
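
The success criteria above can be read as a checklist. The sketch below is one hedged way to evaluate it by hand at the end of the week, outside the public repo. `TrialWeek`, `unmet_criteria`, and all field names are hypothetical, and "3 daily briefs completed" is assumed here to mean at least 3.

```python
from dataclasses import dataclass, field

VERDICTS = {"keep", "modify", "defer", "stop"}


@dataclass
class TrialWeek:
    """Hand-filled summary of one trial week (hypothetical shape)."""

    daily_briefs: int
    weekly_syntheses: int
    verdicts: dict[str, str] = field(default_factory=dict)  # candidate -> verdict
    repo_writes_from_briefs: int = 0
    unapproved_auto_issues: int = 0
    private_or_customer_data: bool = False


def unmet_criteria(week: TrialWeek, candidates: list[str]) -> list[str]:
    """Return a description of each success criterion the week did not meet."""
    unmet = []
    if week.daily_briefs < 3:  # assumption: 3 is a minimum, not an exact count
        unmet.append("fewer than 3 daily briefs completed")
    if week.weekly_syntheses < 1:
        unmet.append("weekly synthesis not completed")
    missing = [c for c in candidates if week.verdicts.get(c) not in VERDICTS]
    if missing:
        unmet.append("no explicit verdict for: " + ", ".join(missing))
    if week.repo_writes_from_briefs > 0:
        unmet.append("daily briefs wrote to the repo")
    if week.unapproved_auto_issues > 0:
        unmet.append("issues were auto-created without approval")
    if week.private_or_customer_data:
        unmet.append("private eval material or customer data was included")
    return unmet
```

An empty result means every criterion was met. The candidate list is passed in rather than hard-coded so the check does not fix which v0.8 candidates are under review.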

Possible v0.8 candidates:

  • SEC-01 decision-grade / security overlay;
  • Private Eval Bundle manifest validator;
  • Report schema runtime v1;
  • lightweight Research Radar automation;
  • browser/replay standard;
  • MCP-local extension;
  • nothing yet.

Decision rule:
Do not pick v0.8 until weekly synthesis is complete.
