2 changes: 2 additions & 0 deletions .github/workflows/web.yml
@@ -32,6 +32,8 @@ jobs:
 cache-dependency-path: web/package-lock.json
 - name: Install dependencies
 run: npm ci
+- name: Derive repo facts
+run: npm run prebuild
 - name: Run ESLint
 run: npm run lint
 - name: TypeScript type check
12 changes: 7 additions & 5 deletions README.md
@@ -131,8 +131,8 @@ one you want isn't here, that's a good issue to open.
ChatGPT/Codex CLI login (working).

Routing is more than a base URL swap: `/reasoning` effort is translated into
-each provider's wire dialect, sub-agent tiers resolve per provider, and the
-system prompt's model facts are templated per-model instead of hardcoded.
+each provider's wire dialect, delegated Agent tiers resolve per provider, and
+the system prompt's model facts are templated per-model instead of hardcoded.
Switch mid-session with `/provider` and `/model`. The full registry —
credentials, base URLs, capability boundaries — lives in
[docs/PROVIDERS.md](docs/PROVIDERS.md).
@@ -231,11 +231,13 @@ The README is the short version. The rest is in docs and on
 - [User guide](docs/GUIDE.md) · [Install guide](docs/INSTALL.md) ·
 [Configuration](docs/CONFIGURATION.md) · [Provider registry](docs/PROVIDERS.md)
 - [Modes](docs/MODES.md) — Agent, Plan, and YOLO.
-- [Sub-agents](docs/SUBAGENTS.md) — roles, lifecycle, output contract, and
-recovery behavior.
+- [Agents and Workflows terminology](docs/ORCHESTRATION_TERMINOLOGY.md) —
+the public naming model for delegated work and durable orchestration.
+- [Agents](docs/SUBAGENTS.md) — delegated roles, lifecycle, output contract,
+and recovery behavior.
 - [Architecture](docs/ARCHITECTURE.md) — crate layout, runtime flow, tool system,
 extension points, and security model.
-- [Fleet](docs/FLEET.md) · [WhaleFlow authoring](docs/WHALEFLOW_AUTHORING.md) ·
+- [Agent control plane](docs/FLEET.md) · [Workflow authoring](docs/WHALEFLOW_AUTHORING.md) ·
 [MCP](docs/MCP.md) · [Runtime API](docs/RUNTIME_API.md) ·
 [Model Lab](docs/MODEL_LAB.md)
 - [Keybindings](docs/KEYBINDINGS.md) · [Sandbox & approvals](docs/SANDBOX.md)
3 changes: 2 additions & 1 deletion crates/tui/src/config.rs
@@ -5526,7 +5526,8 @@ pub fn active_provider_has_config_api_key(config: &Config) -> bool {
return crate::oauth::auth_file_path().exists();
}
if matches!(provider, ApiProvider::Huggingface)
-&& std::env::var("HF_TOKEN").is_ok_and(|k| !k.trim().is_empty())
+&& (std::env::var("HUGGINGFACE_API_KEY").is_ok_and(|k| !k.trim().is_empty())
+|| std::env::var("HF_TOKEN").is_ok_and(|k| !k.trim().is_empty()))
{
return true;
}
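
The widened check treats Hugging Face credentials as present when either `HUGGINGFACE_API_KEY` or `HF_TOKEN` holds a non-blank value. A minimal sketch of that pattern as a standalone helper follows; `first_non_empty`, `huggingface_key_in_env`, and the closure-based lookup are hypothetical names used for illustration and do not appear in `config.rs`.

```rust
/// Returns the first name whose looked-up value is present and not blank.
/// `lookup` stands in for `std::env::var(..).ok()` so the precedence logic
/// can be exercised without touching the process environment.
fn first_non_empty<'a, F>(names: &[&'a str], lookup: F) -> Option<&'a str>
where
    F: Fn(&str) -> Option<String>,
{
    names
        .iter()
        .copied()
        .find(|name| lookup(*name).is_some_and(|value| !value.trim().is_empty()))
}

/// Hypothetical wrapper mirroring the Hugging Face check in the hunk above:
/// either variable counts, and whitespace-only values are ignored.
fn huggingface_key_in_env() -> bool {
    first_non_empty(&["HUGGINGFACE_API_KEY", "HF_TOKEN"], |name| {
        std::env::var(name).ok()
    })
    .is_some()
}
```

Passing the lookup as a closure is a design choice for this sketch only: it keeps the "first non-blank variable wins" rule testable without mutating the real environment.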
9 changes: 9 additions & 0 deletions crates/tui/src/tui/ui/tests.rs
@@ -76,6 +76,7 @@ impl Drop for ConfigPathEnvGuard {

struct SettingsHomeGuard {
_tmp: TempDir,
+previous_config_path: Option<OsString>,
previous_home: Option<OsString>,
previous_userprofile: Option<OsString>,
_lock: MutexGuard<'static, ()>,
@@ -85,15 +86,19 @@ impl SettingsHomeGuard {
fn new() -> Self {
let lock = crate::test_support::lock_test_env();
let tmp = TempDir::new().expect("settings tempdir");
+let config_path = tmp.path().join(".codewhale").join("config.toml");
+let previous_config_path = std::env::var_os("DEEPSEEK_CONFIG_PATH");
let previous_home = std::env::var_os("HOME");
let previous_userprofile = std::env::var_os("USERPROFILE");
// Safety: test-only environment mutation guarded by a global mutex.
unsafe {
+std::env::set_var("DEEPSEEK_CONFIG_PATH", &config_path);
std::env::set_var("HOME", tmp.path());
std::env::set_var("USERPROFILE", tmp.path());
}
Self {
_tmp: tmp,
+previous_config_path,
previous_home,
previous_userprofile,
_lock: lock,
@@ -105,6 +110,10 @@ impl Drop for SettingsHomeGuard {
fn drop(&mut self) {
// Safety: test-only environment mutation guarded by a global mutex.
unsafe {
+match self.previous_config_path.take() {
+Some(previous) => std::env::set_var("DEEPSEEK_CONFIG_PATH", previous),
+None => std::env::remove_var("DEEPSEEK_CONFIG_PATH"),
+}
match self.previous_home.take() {
Some(previous) => std::env::set_var("HOME", previous),
None => std::env::remove_var("HOME"),
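
The guard above saves each variable before overwriting it, then restores or removes it on drop. The same save/restore pattern can be sketched for a single variable. `EnvVarGuard` is a hypothetical type, not the one in `tests.rs`, and it assumes the caller already serializes environment access (the real guard holds `lock_test_env()` for that purpose). The `unsafe` blocks are required in Rust edition 2024 and only trigger an unused-`unsafe` warning on earlier editions.

```rust
use std::ffi::{OsStr, OsString};

/// Hypothetical single-variable guard: sets `key` on creation and puts the
/// previous state back on drop. It does no locking of its own.
struct EnvVarGuard {
    key: &'static str,
    previous: Option<OsString>,
}

impl EnvVarGuard {
    fn set(key: &'static str, value: impl AsRef<OsStr>) -> Self {
        // Capture the old value first so it can be restored later.
        let previous = std::env::var_os(key);
        // Safety: assumes no other thread reads or writes the environment
        // while the guard is alive (for example, a global test mutex is held).
        unsafe { std::env::set_var(key, value) };
        Self { key, previous }
    }
}

impl Drop for EnvVarGuard {
    fn drop(&mut self) {
        // Safety: same assumption as in `set`.
        unsafe {
            match self.previous.take() {
                // The variable existed before: restore its old value.
                Some(previous) => std::env::set_var(self.key, previous),
                // The variable was unset before: remove it again.
                None => std::env::remove_var(self.key),
            }
        }
    }
}
```

Restoring `None` as a removal, rather than as an empty string, matters here because code such as the API-key check distinguishes "unset" from "set but blank".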
5 changes: 5 additions & 0 deletions docs/AGENT_RUNTIME.md
@@ -1,5 +1,10 @@
# The CodeWhale Agent Runtime — one durable substrate, familiar launchers

+> Public naming: CodeWhale exposes **Agents** for delegated work and
+> **Workflows** for durable multi-agent plans. `sub-agent`, `Fleet`, and
+> `WhaleFlow` remain implementation names. See
+> [Orchestration Terminology](ORCHESTRATION_TERMINOLOGY.md).
+
This document explains how sub-agents, the headless `exec` path, and Agent Fleet
relate. It exists because these had drifted into *two* parallel "worker"
systems, and the fix is to make the **fleet-backed worker run** the durable
34 changes: 20 additions & 14 deletions docs/FLEET.md
@@ -1,4 +1,8 @@
-# Agent Fleet
+# Agent Control Plane
+
+> Public naming: this is the **Agent control plane**. `Fleet` is the internal
+> scheduler/ledger/host-transport name and the current CLI namespace. See
+> [Orchestration Terminology](ORCHESTRATION_TERMINOLOGY.md).

Agent Fleet is the local-first control plane for durable multi-worker runs. It
is **not** a separate execution engine: a fleet worker is a headless
@@ -28,25 +32,27 @@ Fleet state is stored under the workspace in `.codewhale/fleet.jsonl`. Worker
logs and adapter logs are stored under `.codewhale/fleet/` and
`.codewhale/fleet-host/`.

-## Naming: Modes, WhaleFlow, Fleet, and Swarm
+## Naming: Agents, Workflows, Fleet, and Swarm

-These names describe different layers, not competing systems. Agent, Plan, and
-YOLO stay the permission/work modes. WhaleFlow is an orchestration overlay that
-can run on top of those modes when the task needs a continuous workflow.
+These names describe different layers, not competing product concepts. Agent,
+Plan, and YOLO stay the permission/work modes. Publicly, CodeWhale has
+**Agents** for delegated work and **Workflows** for durable multi-agent plans.

-- **WhaleFlow** is the repeatable workflow plan and user-facing orchestration
-overlay: a script/IR that decides which phases and agents run next, keeps
-intermediate results out of the main conversation, and can be inspected or
-rerun. A WhaleFlow run should have a visible progress view and a clear active
-header state instead of feeling like a hidden background task.
-- **Fleet** is the execution substrate: headless workers, local/SSH hosts,
+- **Agents** are delegated workers with roles, model routes, permissions,
+transcripts, and status.
+- **Workflows** are repeatable orchestration plans that decide which phases and
+Agents run next, keep intermediate results out of the main conversation, and
+can be inspected or rerun.
+- **Fleet** is the Agent control plane: headless workers, local/SSH hosts,
trust policy, leases, heartbeats, logs, receipts, and status APIs.
-- **Swarm** is the high-fanout behavior inside WhaleFlow. It is gated in
+- **WhaleFlow** is the Workflow engine: typed IR, authoring, validation, and
+replay.
+- **Swarm** is high-fanout Workflow behavior. It is gated in
v0.8.61: `/swarm` must not revive prompt-only sub-agent fanout. It should
-compile into a WhaleFlow-backed fleet run once the durable worker and goal
+compile into a Workflow-backed Agent run once the durable worker and goal
re-dispatch substrate is available.

-UI guidance: keep the main transcript calm. A WhaleFlow run should appear as a
+UI guidance: keep the main transcript calm. A Workflow run should appear as a
compact progress card plus Work/Agents sidebar rows with phase names, worker
counts, receipts, and nested indentation for child workers. Use the whale mark
sparingly as an active header/status signal; avoid repeating emoji-heavy rows
88 changes: 88 additions & 0 deletions docs/ORCHESTRATION_TERMINOLOGY.md
@@ -0,0 +1,88 @@
# Orchestration Terminology

CodeWhale should expose two orchestration concepts in user-facing copy:

1. **Agents**
2. **Workflows**

Everything else is an implementation layer, compatibility alias, or architecture
detail.

## Public Names

### Agents

An **Agent** is delegated work with its own role, lifecycle, model route, tool
permissions, transcript, and status.

Use **Agents** for:

- child or delegated work launched from a parent session
- background workers
- role-based scouts, reviewers, implementers, and verifiers
- local or remote workers launched by the durable control plane
- status/sidebar rows that show running delegated work

Public examples:

- "Open an Agent to review this diff."
- "Agents can run locally or remotely."
- "Agents report receipts, artifacts, and status back to the parent."

### Workflows

A **Workflow** is a repeatable multi-step plan that orchestrates agents and
control-flow nodes.

Use **Workflows** for:

- DAGs, phases, branches, reductions, loops, and tournaments
- replayable multi-agent plans
- teacher review and promotion gates
- durable orchestration that spans many agents or runs
- user-authored `.workflow.*`, Starlark, JSON, or TOML specs

Public examples:

- "Run a Workflow to audit the release."
- "Workflows orchestrate Agents through repeatable plans."
- "Workflow replay verifies the same plan without live model calls."

## Internal Names

| Internal name | Public framing | Notes |
|---|---|---|
| `sub-agent` / `subagent` | Agent, child Agent | Keep in code identifiers, config keys, compatibility docs, and protocol fields. Avoid as the headline product term. |
| `Fleet` | Agent control plane | Fleet is the scheduler, ledger, host transport, receipt store, and durable worker substrate for Agents. |
| `WhaleFlow` | Workflow engine | WhaleFlow is the Rust IR/compiler/replay engine behind Workflows. |
| `Workroom` | collaboration context | Workrooms organize threads, links, events, and shared visibility. They are not a third orchestration concept. |
| `/swarm` | high-fanout Workflow behavior | Keep gated or compatibility-only until it compiles into Workflow-backed Agent runs. |
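
The table above can be read as a lookup from internal name to public framing. A rough sketch, assuming a hypothetical `public_label` helper that is not part of CodeWhale and only restates the rows of the table:

```rust
/// Maps an internal orchestration name to the public framing listed in the
/// "Internal Names" table. Hypothetical helper for illustration only.
fn public_label(internal: &str) -> Option<&'static str> {
    // Compare case-insensitively so "Fleet" and "fleet" resolve the same way.
    match internal.to_ascii_lowercase().as_str() {
        "sub-agent" | "subagent" => Some("Agent"),
        "fleet" => Some("Agent control plane"),
        "whaleflow" => Some("Workflow engine"),
        "workroom" => Some("collaboration context"),
        "/swarm" => Some("high-fanout Workflow behavior"),
        // Anything else has no public framing defined in the table.
        _ => None,
    }
}
```

Returning `None` for unknown names, instead of echoing the input, keeps an unmapped internal term from leaking into user-facing copy unnoticed.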

## Naming Rules

- Prefer **Agents** and **Workflows** in website, README, wiki, release notes,
screenshots, and first-run UI.
- Use internal names only when explaining source modules, config compatibility,
protocol types, or migration details.
- When an internal name appears, define it through the two public names:
"Fleet is the Agent control plane" or "WhaleFlow is the Workflow engine."
- Do not present Fleet, WhaleFlow, Workrooms, sub-agents, and swarm as five
separate product concepts.
- Keep stable commands and config keys until a separate compatibility issue
intentionally renames them.

## Recommended Surface Map

| Surface | Preferred label | Compatibility details |
|---|---|---|
| Sidebar panel | Agents | Existing `/subagents` may remain as an alias. |
| Config UI section | Agents | Existing `[subagents]` keys remain stable. |
| Workflow authoring docs | Workflows | Mention WhaleFlow once as the engine name. |
| Fleet docs | Agent control plane | Keep `codewhale fleet` as the CLI implementation surface. |
| Workroom docs | Collaboration context | Keep workroom links/protocol language for architecture docs. |
| Slash command docs | `/agents`, `/workflows` direction | Existing `/agent`, `/subagents`, `/fleet`, `/swarm` require compatibility planning before renaming. |

## One-Sentence Product Description

CodeWhale has two orchestration concepts: **Agents** for delegated work, and
**Workflows** for durable multi-agent plans.
8 changes: 6 additions & 2 deletions docs/SUBAGENTS.md
@@ -1,4 +1,8 @@
-# Sub-Agents
+# Agents
+
+> Public naming: **Agents** are delegated workers. The codebase and some
+> compatibility surfaces still use `sub-agent` / `subagent` for the current
+> implementation. See [Orchestration Terminology](ORCHESTRATION_TERMINOLOGY.md).

Sub-agents are the user-facing vocabulary for nested worker assignments: a
parent launches a focused role (`explore`, `review`, `implementer`, `verifier`,
@@ -18,7 +22,7 @@ cutover completes. It can still be useful for short in-session delegation, but
if a child fails once on a transient provider timeout while an equivalent fleet
worker would retry from the ledger, that is a runtime unification gap. For work
that must survive provider hiccups, process restarts, sleep, or remote
-execution, prefer Fleet or a WhaleFlow-backed fleet run.
+execution, prefer the Agent control plane or a Workflow-backed Agent run.

Sub-agents inherit the parent's tool registry by default, but child agents are
leaf workers: they do not receive `agent` or nested lifecycle tools. `agent`
6 changes: 5 additions & 1 deletion docs/WHALEFLOW_AUTHORING.md
@@ -1,4 +1,8 @@
-# WhaleFlow Authoring
+# Workflow Authoring
+
+> Public naming: **Workflows** are the user-facing concept. `WhaleFlow` is the
+> internal Workflow engine and crate name. See
+> [Orchestration Terminology](ORCHESTRATION_TERMINOLOGY.md).

WhaleFlow has one runtime boundary: authored workflow source lowers to typed
Rust `WorkflowSpec`, Rust validates the IR, and the scheduler/headless worker
5 changes: 5 additions & 0 deletions docs/WORKROOM_ARCHITECTURE.md
@@ -1,5 +1,10 @@
# Workroom Architecture

+> Public naming: Workrooms are collaboration contexts. They organize threads,
+> links, events, and shared visibility; they are not a third orchestration
+> concept beside **Agents** and **Workflows**. See
+> [Orchestration Terminology](ORCHESTRATION_TERMINOLOGY.md).
+
## Purpose

Workrooms are CodeWhale's chat-native abstraction for durable, addressable
6 changes: 3 additions & 3 deletions web/app/[locale]/docs/page.tsx
@@ -1,6 +1,6 @@
import Link from "next/link";
import { Seal } from "@/components/seal";
-import { getFacts } from "@/lib/facts";
+import { getFacts, type ProviderFact } from "@/lib/facts";

export async function generateMetadata({ params }: { params: Promise<{ locale: string }> }) {
const { locale } = await params;
@@ -300,7 +300,7 @@ command = "~/.codewhale/hooks/pre.sh" # / message_submit / mode_change /
,目前共 {facts.providers.length} 个。
</p>
<div className="hairline-t hairline-b mt-5">
-{facts.providers.map((p) => (
+{facts.providers.map((p: ProviderFact) => (
<div key={p.id} className="grid md:grid-cols-12 gap-0 hairline-t py-3 px-4 hover:bg-paper-deep min-w-0">
<div className="md:col-span-3 font-display font-semibold">{p.label}</div>
<div className="md:col-span-3 font-mono text-[0.78rem] text-ink-soft break-words min-w-0">{p.id}</div>
@@ -591,7 +591,7 @@ command = "~/.codewhale/hooks/pre.sh" # / message_submit / mode_change /
in <code className="inline">crates/tui/src/config.rs</code> — currently {facts.providers.length} providers.
</p>
<div className="hairline-t hairline-b mt-5">
-{facts.providers.map((p) => (
+{facts.providers.map((p: ProviderFact) => (
<div key={p.id} className="grid md:grid-cols-12 gap-0 hairline-t py-3 px-4 hover:bg-paper-deep min-w-0">
<div className="md:col-span-3 font-display font-semibold">{p.label}</div>
<div className="md:col-span-3 font-mono text-[0.78rem] text-ink-soft break-words min-w-0">{p.id}</div>