Commit 6f5d11f

refactor: clean code but keep function (#3)
Signed-off-by: Chojan Shang <psiace@apache.org>
1 parent 094d1d9 commit 6f5d11f

24 files changed

Lines changed: 403 additions & 551 deletions

README.md

Lines changed: 6 additions & 16 deletions
@@ -8,15 +8,15 @@
 
 bubseek turns fragmented data across operational systems, repositories, and agent runtime traces into **explainable, actionable, and shareable insights** without heavy ETL. It keeps the Bub runtime and extension model while packaging a practical default distribution for real deployments.
 
-`bubseek` now boots through a single distribution entry point and targets SeekDB/OceanBase tape storage through the SQLAlchemy URL or `OCEANBASE_*` settings.
+`bubseek` packages a practical Bub distribution with SeekDB/OceanBase defaults, bundled channels, and builtin skills, without adding a second CLI surface on top of `bub`.
 
 ## Features
 
 - **Lightweight and on-demand** — Trigger analysis when needed instead of maintaining large offline pipelines.
 - **Explainability first** — Conclusions are returned together with agent reasoning context.
 - **Cloud-edge ready** — Supports distributed deployment and local execution boundaries.
 - **Agent observability** — Treats agent behavior as governed, inspectable runtime data.
-- **Bub-compatible** — Forwards Bub commands directly; no fork of the core runtime.
+- **Bub-compatible** — Uses Bub directly as the runtime and command surface; no fork of the core runtime.
 
 ## Quick start

@@ -26,23 +26,15 @@ Requires [uv](https://docs.astral.sh/uv/) (recommended) or pip, and Python 3.12+
 git clone https://github.com/ob-labs/bubseek.git
 cd bubseek
 uv sync
-uv run bubseek --help
-uv run bubseek chat
+uv run bub --help
+uv run bub chat
 ```
 
-If your runtime reads credentials from `.env`, bubseek forwards them to the Bub subprocess:
-
-```dotenv
-BUB_MODEL=openrouter:qwen/qwen3-coder-next
-BUB_API_KEY=sk-or-v1-...
-BUB_API_BASE=https://openrouter.ai/api/v1
-```
-
-Configure SeekDB or OceanBase before running `bubseek`, using `BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://...` or the matching `OCEANBASE_*` variables.
+Configure SeekDB or OceanBase before running `bubseek`, using `BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://...`.
 
 ## Add contrib
 
-Contrib packages remain standard Python packages. Add them as normal dependencies. The bundled channel extras resolve from GitHub-hosted `bub-contrib` packages instead of local workspace packages.
+Contrib packages remain standard Python packages. Add them as normal dependencies. bubseek ships its built-in channels and marimo support by default, and resolves bundled contrib packages from GitHub-hosted `bub-contrib` packages instead of local workspace packages.
 
 ```toml
 [project]
@@ -58,8 +50,6 @@ Then sync your environment:
 uv sync
 ```
 
-- Optional extras: Feishu `uv sync --extra feishu`, DingTalk `uv sync --extra dingtalk`, WeChat `uv sync --extra wechat`, Discord `uv sync --extra discord`, Marimo `uv sync --extra marimo`.
-
 ## Documentation
 
 ## Development

contrib/bubseek-marimo/README.md

Lines changed: 3 additions & 3 deletions
@@ -12,15 +12,15 @@ Marimo channel for Bub — native marimo dashboard with chat and insights index.
 ## Installation
 
 ```bash
-uv sync --extra marimo
+uv sync
 # or
-pip install bubseek[marimo]
+pip install .
 ```
 
 ## Gateway
 
 ```bash
-bubseek gateway --enable-channel marimo
+bub gateway --enable-channel marimo
 ```
 
 Open `http://localhost:2718/` — marimo gallery. Click **dashboard** for chat + index. The dashboard queues turns asynchronously and refreshes transcript events from the channel backend.

contrib/bubseek-marimo/scripts/verify_marimo.sh

Lines changed: 1 addition & 1 deletion
@@ -3,5 +3,5 @@
 # Requires: .env with OPENROUTER_API_KEY (or equivalent) for chat.
 set -e
 cd "$(dirname "$0")/../.."
-uv sync --extra marimo
+uv sync
 uv run pytest contrib/bubseek-marimo/tests/test_marimo_e2e.py -v "$@"

contrib/bubseek-marimo/src/bubseek_marimo/channel.py

Lines changed: 5 additions & 9 deletions
@@ -104,17 +104,13 @@ def _insights_dir(self) -> Path:
 
     def _tapestore_url(self) -> str:
         if resolve_tapestore_url is not None:
-            return resolve_tapestore_url(self._workspace_dir())
-        env = env_with_workspace_dotenv(self._workspace_dir()) if env_with_workspace_dotenv else self._marimo_env()
-        url = (env.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
+            url = resolve_tapestore_url(self._workspace_dir())
+        else:
+            env = env_with_workspace_dotenv(self._workspace_dir()) if env_with_workspace_dotenv else self._marimo_env()
+            url = (env.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
         if url:
             return url
-        host = (env.get("OCEANBASE_HOST") or "127.0.0.1").strip()
-        port = int((env.get("OCEANBASE_PORT") or "2881").strip())
-        user = (env.get("OCEANBASE_USER") or "root").strip()
-        password = env.get("OCEANBASE_PASSWORD") or ""
-        database = (env.get("OCEANBASE_DATABASE") or "bub").strip()
-        return f"mysql+oceanbase://{user}:{password}@{host}:{port}/{database}"
+        raise RuntimeError("BUB_TAPESTORE_SQLALCHEMY_URL is required for the marimo channel")
 
     def _ensure_seed_notebooks(self) -> None:
         insights_dir = self._insights_dir()
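
The `_tapestore_url` hunk above removes the `OCEANBASE_*` fallback: the method now returns a configured URL or raises. A minimal standalone sketch of that control flow, assuming the environment is passed in as a plain mapping and the resolver as an optional callable (`require_tapestore_url` and `ENV_KEY` are hypothetical names for illustration, not part of bubseek):

```python
from typing import Callable, Mapping, Optional

ENV_KEY = "BUB_TAPESTORE_SQLALCHEMY_URL"


def require_tapestore_url(
    env: Mapping[str, str],
    resolver: Optional[Callable[[], str]] = None,
) -> str:
    """Return a tapestore URL or raise; hypothetical stand-in for `_tapestore_url`."""
    if resolver is not None:
        # Prefer the resolver when one is available.
        url = resolver()
    else:
        # Otherwise read the single supported variable from the environment.
        url = (env.get(ENV_KEY) or "").strip()
    if url:
        return url
    # No OCEANBASE_* fallback: a missing URL is a hard error.
    raise RuntimeError(f"{ENV_KEY} is required for the marimo channel")
```

Under this sketch, an environment that sets only `OCEANBASE_HOST` and friends no longer yields a default `127.0.0.1:2881/bub` URL; it fails at construction time, which is what the new `test_marimo_channel_requires_explicit_tapestore_url` test asserts for the real channel.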

contrib/bubseek-marimo/tests/test_marimo_e2e.py

Lines changed: 33 additions & 1 deletion
@@ -12,6 +12,7 @@
 import sys
 import time
 from pathlib import Path
+from types import ModuleType
 from urllib.parse import urlsplit
 
 import pytest
@@ -39,6 +40,17 @@ async def _noop_handler(*_args, **_kwargs) -> None:
     return None
 
 
+def _stub_bubseek_oceanbase(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setitem(sys.modules, "bubseek.oceanbase", ModuleType("bubseek.oceanbase"))
+
+
+def _require_tapestore_url() -> str:
+    url = (os.environ.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
+    if not url:
+        pytest.skip("BUB_TAPESTORE_SQLALCHEMY_URL is required for marimo gateway tests")
+    return url
+
+
 def _port_ready(host: str, port: int, timeout: float = 2.0) -> bool:
     try:
         with socket.create_connection((host, port), timeout=timeout):
@@ -91,10 +103,12 @@ def _assert_notebook_loads(filename: str) -> tuple[int, str]:
 
 
 def test_workspace_resolution_priority(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
     from bubseek_marimo.channel import MarimoChannel
 
     marimo_workspace = tmp_path / "marimo-workspace"
     bubb_workspace = tmp_path / "bub-workspace"
+    monkeypatch.setenv("BUB_TAPESTORE_SQLALCHEMY_URL", "mysql+oceanbase://seek:secret@seekdb.example:2881/analytics")
     monkeypatch.setenv("BUB_MARIMO_WORKSPACE", str(marimo_workspace))
     monkeypatch.setenv("BUB_WORKSPACE_PATH", str(bubb_workspace))

@@ -105,8 +119,10 @@ def test_workspace_resolution_priority(monkeypatch, tmp_path) -> None:
 
 
 def test_workspace_resolution_falls_back_to_cwd(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
     from bubseek_marimo.channel import MarimoChannel
 
+    monkeypatch.setenv("BUB_TAPESTORE_SQLALCHEMY_URL", "mysql+oceanbase://seek:secret@seekdb.example:2881/analytics")
     monkeypatch.delenv("BUB_MARIMO_WORKSPACE", raising=False)
     monkeypatch.delenv("BUB_WORKSPACE_PATH", raising=False)
     monkeypatch.chdir(tmp_path)
@@ -120,13 +136,29 @@ def test_workspace_resolution_falls_back_to_cwd(monkeypatch, tmp_path) -> None:
     assert channel._insights_dir() == tmp_path.resolve() / "insights"
 
 
+def test_marimo_channel_requires_explicit_tapestore_url(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
+    from bubseek_marimo.channel import MarimoChannel
+
+    monkeypatch.delenv("BUB_TAPESTORE_SQLALCHEMY_URL", raising=False)
+    monkeypatch.delenv("BUB_MARIMO_WORKSPACE", raising=False)
+    monkeypatch.delenv("BUB_WORKSPACE_PATH", raising=False)
+    monkeypatch.chdir(tmp_path)
+    monkeypatch.setattr("bubseek_marimo.channel.discover_project_root", lambda start: None)
+    monkeypatch.setattr("bubseek_marimo.channel._discover_project_root_fallback", lambda start: None)
+
+    with pytest.raises(RuntimeError, match="BUB_TAPESTORE_SQLALCHEMY_URL is required"):
+        MarimoChannel(_noop_handler)
+
+
 @pytest.fixture(scope="module")
 def gateway_process():
     """Start gateway with marimo channel, yield process, cleanup on teardown."""
     global PORT, MARIMO_PORT
 
     workspace = REPO_ROOT
     env = os.environ.copy()
+    env["BUB_TAPESTORE_SQLALCHEMY_URL"] = _require_tapestore_url()
     PORT = _pick_free_port()
     MARIMO_PORT = _pick_free_port()
     while MARIMO_PORT == PORT:
@@ -142,7 +174,7 @@ def gateway_process():
         pytest.fail("uv executable is required for marimo gateway tests")
 
     proc = subprocess.Popen(  # noqa: S603
-        [uv_executable, "run", "bubseek", "gateway", "--enable-channel", "marimo"],
+        [uv_executable, "run", "bub", "gateway", "--enable-channel", "marimo"],
         cwd=str(REPO_ROOT),
         env=env,
         stdout=subprocess.DEVNULL,
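
The new tests register an empty `ModuleType` under `bubseek.oceanbase` via `monkeypatch.setitem(sys.modules, ...)`. The sketch below shows the general mechanism using only the standard library. The module name `hypothetical_pkg.oceanbase` is invented, and how `channel.py` actually imports `resolve_tapestore_url` is not visible in this diff, so the guarded `try`/`except ImportError` import is an assumption about the pattern rather than a copy of it:

```python
import importlib
import sys
from types import ModuleType

STUB_NAME = "hypothetical_pkg.oceanbase"  # invented name, for illustration only

# Register an empty module object. The import system consults sys.modules
# before running any finder, so this object is returned as-is.
stub = ModuleType(STUB_NAME)
sys.modules[STUB_NAME] = stub

loaded = importlib.import_module(STUB_NAME)

# The empty stub exposes no helpers, so a guarded import falls back to None.
try:
    from hypothetical_pkg.oceanbase import resolve_tapestore_url
except ImportError:
    resolve_tapestore_url = None

# Manual cleanup; pytest's monkeypatch.setitem undoes the change at teardown.
del sys.modules[STUB_NAME]
```

If the channel follows a guarded-import pattern like this, stubbing the module would push `_tapestore_url` onto the environment-variable branch, which would explain why the same tests also set or delete `BUB_TAPESTORE_SQLALCHEMY_URL`.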

contrib/bubseek-schedule/README.md

Lines changed: 2 additions & 2 deletions
@@ -37,7 +37,7 @@ dependencies = [
 - **`load_state` starts the scheduler** on the first inbound message. That way `bub chat` (CLI-only: only the `cli` channel is enabled) still persists jobs to SeekDB. Previously, `AsyncIOScheduler` was only started by the `schedule` channel, so CLI chat left jobs in memory-only `_pending_jobs` and **nothing was written to `apscheduler_jobs`**.
 - The channel name is `schedule`. Enabling it in `bub gateway` is optional for persistence; it still starts/stops the scheduler cleanly when you use gateway with that channel.
 - Jobs are persisted to:
-  - **OceanBase/SeekDB**: Same URL as the tape store (`BUB_TAPESTORE_SQLALCHEMY_URL` / `OCEANBASE_*`), table `apscheduler_jobs`.
+  - **OceanBase/SeekDB**: Same URL as the tape store (`BUB_TAPESTORE_SQLALCHEMY_URL`), table `apscheduler_jobs`.
 
 ## Provided Tools

@@ -47,7 +47,7 @@ dependencies = [
 
 ## Debug: job in chat but not in Marimo kanban / DB
 
-The gateway resolves the job store URL from `BUB_TAPESTORE_SQLALCHEMY_URL` or workspace `.env` (`OCEANBASE_*`). Marimo must use the **same** URL. If `insights/schedule_kanban.py` pointed at the default `127.0.0.1:2881/bub` while your `.env` uses another host/db, the table will look empty.
+The gateway resolves the job store URL from `BUB_TAPESTORE_SQLALCHEMY_URL` in the workspace `.env` or process environment. Marimo must use the **same** URL.
 
 From the bubseek repo root:
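
Because the gateway and Marimo must agree on the job store URL, one quick check is to compare the value in the workspace `.env` with the one in the process environment. The sketch below is a hand-rolled reader for simple `KEY=VALUE` lines only. It is not the loader Bub or bubseek use, it makes no claim about which source takes precedence, and the function names are hypothetical:

```python
from typing import Mapping, Optional

KEY = "BUB_TAPESTORE_SQLALCHEMY_URL"


def dotenv_value(text: str, key: str = KEY) -> Optional[str]:
    """Return the value of `key` from simple KEY=VALUE lines, or None."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"") or None
    return None


def urls_conflict(process_env: Mapping[str, str], dotenv_text: str) -> bool:
    """True only when both sources set the URL and the values differ."""
    env_url = (process_env.get(KEY) or "").strip()
    file_url = dotenv_value(dotenv_text)
    return bool(env_url) and file_url is not None and env_url != file_url
```

A `True` result points at the situation described above: the kanban notebook and the gateway are likely reading different databases, so the `apscheduler_jobs` table looks empty from one side.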

contrib/bubseek-schedule/src/tests/test_bubseek_schedule.py

Lines changed: 6 additions & 2 deletions
@@ -18,9 +18,11 @@ def _seekdb_url() -> str:
 def test_jobstore_roundtrip():
     """Test jobstore roundtrip via APScheduler on SeekDB/OceanBase."""
     from apscheduler.schedulers.background import BackgroundScheduler
+
+    url = _seekdb_url()
     from bubseek_schedule.jobstore import OceanBaseJobStore
 
-    store = OceanBaseJobStore(url=_seekdb_url(), tablename="apscheduler_jobs_test_roundtrip")
+    store = OceanBaseJobStore(url=url, tablename="apscheduler_jobs_test_roundtrip")
     scheduler = BackgroundScheduler(jobstores={"default": store})
     scheduler.start()

@@ -38,9 +40,11 @@ def test_jobstore_roundtrip():
 def test_jobstore_get_due_jobs():
     """Test get_due_jobs and get_next_run_time."""
     from apscheduler.schedulers.background import BackgroundScheduler
+
+    url = _seekdb_url()
     from bubseek_schedule.jobstore import OceanBaseJobStore
 
-    store = OceanBaseJobStore(url=_seekdb_url(), tablename="apscheduler_jobs_test_due")
+    store = OceanBaseJobStore(url=url, tablename="apscheduler_jobs_test_due")
     scheduler = BackgroundScheduler(jobstores={"default": store})
     scheduler.start()

docs/api-reference.md

Lines changed: 6 additions & 6 deletions
@@ -2,14 +2,14 @@
 
 The public Python surface is intentionally small.
 
-## bubseek
+## bubseek.config
 
-Package root. Re-exports the CLI entry function.
+Configuration helpers for resolving tapestore settings.
 
-::: bubseek
+::: bubseek.config
 
-## bubseek.__main__
+## bubseek.database
 
-CLI entry point. `main()` forwards CLI arguments and `.env` values to the `bub` subprocess.
+Database bootstrap helpers used by maintenance scripts.
 
-::: bubseek.__main__
+::: bubseek.database

docs/architecture.md

Lines changed: 3 additions & 4 deletions
@@ -4,10 +4,9 @@ This page explains what bubseek is responsible for, and what it deliberately lea
 
 ## What bubseek does
 
-- provides the `bubseek` executable as a single bootstrap entry point over `bub`
-- forwards `.env` values to the Bub subprocess
 - standardizes tape storage on SeekDB/OceanBase
 - ships a small set of builtin skills with the package
+- bundles a practical set of contrib channels and tools by default
 - pins a practical default Bub runtime version
 
 ## What bubseek does not do
@@ -25,7 +24,7 @@ Bub remains the runtime, command surface, and extension host.
 
 ### bubseek
 
-bubseek is the distribution layer: packaging, bootstrap behavior, runtime defaults, and builtin skills.
+bubseek is the distribution layer: packaging, runtime defaults, plugin wiring, and builtin skills.
 
 ### Python packaging

@@ -35,7 +34,7 @@ Python packaging handles dependency resolution, lockfiles, and installation. Con
 
 From a user perspective, the benefit is simple: there is less to learn.
 
-- run `bubseek` the same way you would run `bub`
+- run `bub`
 - add contrib the same way you add any Python dependency
 - use builtin skills without an extra sync step
 - treat generated marimo notebooks as runtime artifacts under `insights/`, not committed templates

docs/configuration.md

Lines changed: 19 additions & 16 deletions
@@ -30,22 +30,25 @@ dependencies = [
 ]
 ```
 
-If you do not want them installed by default, put them under `optional-dependencies` instead:
+For bubseek itself, the official distribution keeps its built-in channels and marimo support in the default dependency set:
 
 ```toml
-[project.optional-dependencies]
-feishu = ["bub-feishu"]
-dingtalk = ["bub-dingtalk"]
-wechat = ["bub-wechat"]
-discord = ["bub-discord"]
-marimo = ["bubseek-marimo"]
+[project]
+dependencies = [
+    "bub",
+    "bub-feishu",
+    "bub-dingtalk",
+    "bub-wechat",
+    "bub-discord",
+    "bubseek-marimo",
+]
 ```
 
-Install with: `uv sync --extra feishu` / `pip install bubseek[feishu]` (Feishu); `uv sync --extra dingtalk` / `pip install bubseek[dingtalk]` (DingTalk); `uv sync --extra wechat` / `pip install bubseek[wechat]` ([WeChat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat)); `uv sync --extra discord` / `pip install bubseek[discord]` ([Discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord)); `uv sync --extra marimo` / `pip install bubseek[marimo]` (Marimo channel with bundled notebook skills).
+Install with the normal project sync or package install: `uv sync` / `pip install .`.
 
 ## Runtime credentials
 
-bubseek forwards `.env` values to the Bub subprocess. Bub reads `BUB_*` variables (see [Bub deployment](https://github.com/bubbuild/bub/blob/main/docs/deployment.md)).
+Bub reads `BUB_*` variables directly (see [Bub deployment](https://github.com/bubbuild/bub/blob/main/docs/deployment.md)).
 
 **Minimal OpenRouter setup:**

@@ -68,21 +71,21 @@ BUB_API_BASE=https://openrouter.ai/api/v1
 | `BUB_TELEGRAM_ALLOW_CHATS` | Comma-separated chat allowlist |
 | `BUB_SEARCH_OLLAMA_API_KEY` | Required for web.search tool (bundled) |
 | `BUB_SEARCH_OLLAMA_API_BASE` | Ollama API base (default: `https://ollama.com/api`) |
-| `BUB_FEISHU_APP_ID` | Required for Feishu channel (optional extra: `bubseek[feishu]`) |
+| `BUB_FEISHU_APP_ID` | Required for Feishu channel |
 | `BUB_FEISHU_APP_SECRET` | Required for Feishu channel |
-| `BUB_DINGTALK_CLIENT_ID` | AppKey for DingTalk channel (optional extra: `bubseek[dingtalk]`) |
+| `BUB_DINGTALK_CLIENT_ID` | AppKey for DingTalk channel |
 | `BUB_DINGTALK_CLIENT_SECRET` | AppSecret for DingTalk channel |
 | `BUB_DINGTALK_ALLOW_USERS` | Comma-separated staff_ids, or `*` for all |
-| WeChat token file | After `bub login wechat`, credentials live under `~/.bub/wechat_token.json` (optional extra: `bubseek[wechat]`); see [bub-wechat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat) |
-| `BUB_DISCORD_TOKEN` | Discord bot token (optional extra: `bubseek[discord]`); see [bub-discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord) |
+| WeChat token file | After `bub login wechat`, credentials live under `~/.bub/wechat_token.json`; see [bub-wechat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat) |
+| `BUB_DISCORD_TOKEN` | Discord bot token; see [bub-discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord) |
 | `BUB_DISCORD_ALLOW_USERS` | Optional comma-separated allowlist (user id / username / global name) |
 | `BUB_DISCORD_ALLOW_CHANNELS` | Optional comma-separated channel id allowlist |
 | `BUB_MARIMO_HOST` | Marimo channel bind host (default: `127.0.0.1`) |
 | `BUB_MARIMO_PORT` | Marimo channel bind port (default: `2718`) |
 | `BUB_MARIMO_WORKSPACE` | Workspace for insights (default: `BUB_WORKSPACE_PATH` or `.`) |
 | `BUB_TAPESTORE_SQLALCHEMY_URL` | SQLAlchemy tape store URL (bundled) |
 
-When `BUB_TAPESTORE_SQLALCHEMY_URL` is unset, bubseek builds a SeekDB/OceanBase URL from the `OCEANBASE_*` variables. Set either the full `mysql+oceanbase://...` URL or the `OCEANBASE_*` fields before running.
+Set `BUB_TAPESTORE_SQLALCHEMY_URL` to the full `mysql+oceanbase://...` URL before running any tapestore-backed features.
 
 ## Builtin skills
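
Since the full URL is now the only supported setting, it can help to sanity-check its shape before starting the gateway. The following is a minimal sketch using `urllib.parse.urlsplit`. The helper name `describe_tapestore_url` is hypothetical, and treating `mysql+oceanbase` as the only valid scheme is an assumption drawn from the documented example, not a rule confirmed by bubseek:

```python
from urllib.parse import urlsplit


def describe_tapestore_url(url: str) -> dict:
    """Split a tapestore URL into host, port, and database (illustrative helper)."""
    parts = urlsplit(url.strip())
    # Assumption: the documented `mysql+oceanbase://` scheme is the only one accepted.
    if parts.scheme != "mysql+oceanbase":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("URL must include a host and a database name")
    return {"host": parts.hostname, "port": parts.port, "database": database}
```

For the URL used in the tests in this commit, `mysql+oceanbase://seek:secret@seekdb.example:2881/analytics`, the helper reports host `seekdb.example`, port `2881`, and database `analytics`.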

@@ -93,14 +96,14 @@ bubseek also vendors skills at build time via `pdm-build-skills`; these are merg
 - `friendly-python` and `piglet` from [PsiACE/skills](https://github.com/PsiACE/skills)
 - `plugin-creator` from [bub-contrib/.agents/skills/plugin-creator](https://github.com/bubbuild/bub-contrib/tree/main/.agents/skills/plugin-creator)
 
-The optional `bubseek[marimo]` extra provides:
+The bundled marimo support provides:
 - **MarimoChannel** — inbound WebSocket for gateway; chat dashboard at `http://0.0.0.0:2718/`
 - **marimo skill** — output data insights as marimo `.py` notebooks; index of charts in `{workspace}/insights/`
 - References [marimo-team/skills](https://github.com/marimo-team/skills) marimo-notebook conventions
 
 The dashboard and index are generated into `{workspace}/insights/` at runtime from one canonical template source. They should not be hand-edited inside the repository.
 
-Run `bubseek gateway --enable-channel marimo` to enable the marimo dashboard.
+Run `bub gateway --enable-channel marimo` to enable the marimo dashboard.
 
 ## Advanced: downstream skill packaging
