verl-project · tardis-key · May 20, 2026 · May 6, 2026 · tardis-key · May 11, 2026
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -11,7 +11,7 @@ repos:
     rev: "v1.17.0"
     hooks:
       - id: mypy
-        additional_dependencies: [types-requests]
+        additional_dependencies: [types-PyYAML, types-requests]
 
   - repo: local
     hooks:

diff --git a/experimental/README.md b/experimental/README.md
@@ -0,0 +1,153 @@
+# RL-Insight Monitor
+
+RL-Insight Monitor provides an observability stack for RL training metrics and traces based on Prometheus, Tempo, and Grafana.
+
+It has two parts:
+
+- `rl-insight server ...`: manage the observability Docker stack.
+- `rl_insight`: training-side Python APIs for metrics and traces.
+
+## Quickstart
+
+### 1. Install
+
+From the repository root:
+
+```bash
+pip install -r requirements.txt
+pip install -e .
+```
+
+### 2. Start the observability stack
+
+Default foreground mode:
+
+```bash
+rl-insight server start
+```
+
+This mode starts Docker Compose without streaming its logs, keeps the CLI attached, and stops the whole stack when you press `Ctrl+C`.
+
+Grafana is provisioned automatically with Prometheus and Tempo datasources and an empty starter dashboard. The datasource URLs follow the published Prometheus and Tempo ports set in the server config.
+
+Background mode:
+
+```bash
+rl-insight server start --detach
+```
+
+Foreground mode with compose/container logs attached:
+
+```bash
+rl-insight server start --attach-logs
+```
+
+Use a custom config file:
+
+```bash
+rl-insight server start --config path/to/config.yaml
+```
+
+Stop the stack explicitly from another terminal:
+
+```bash
+rl-insight server stop
+```
+
+After startup, the CLI prints:
+
+- Prometheus config file path
+- Trainer OTLP traces URL
+- Prometheus, Tempo, and Grafana access URLs
+
+### 3. Initialize the training side
+
+```python
+import os
+import ray
+import rl_insight as insight
+
+os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "http://<server-ip>:4318/v1/traces"
+
+ray.init(address="auto", namespace="rl-insight-monitor")
+insight.init()
+```
+
+Notes:
+
+- The Ray namespace passed to `ray.init` (`rl-insight-monitor`) is the namespace in which the monitor hub actor is looked up.
+- The `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` environment variable takes precedence over the `otel.traces_endpoint` key passed to `insight.init(config)`.
+
+### 4. Emit metrics and traces
+
+```python
+import rl_insight as insight
+
+insight.metric_count("train_step_total", amount=1, worker="trainer_0")
+insight.metric_value("reward_mean", value=1.23, worker="trainer_0")
+insight.metric_distribution("step_latency_ms", value=42.5, worker="trainer_0")
+
+with insight.trace_state("rollout", state_lane_id="trainer_0", step=10):
+    run_rollout()
+
+@insight.trace_op("update_model", stage="optimizer")
+def update_model(batch):
+    ...
+```
+
+## APIs
+
+| API | Purpose |
+|---|---|
+| `init(config=None)` | Initialize training-side monitoring |
+| `close()` | Reset monitor state in the current process |
+| `metric_count()` | Report a counter |
+| `metric_value()` | Report a gauge |
+| `metric_distribution()` | Report a histogram |
+| `trace_state()` | Report a state interval |
+| `trace_op()` | Decorator for operation latency traces |
+
+## CLI Reference
+
+### `rl-insight server start`
+
+| Argument | Default | Description |
+|---|---:|---|
+| `--detach` | `false` | Start in background and return immediately |
+| `--attach-logs` | `false` | Run in foreground and stream compose/container logs |
+| `--config` | `experimental/config/services/config.yaml` | Server config file path |
+| `--log-level` | `INFO` | Python log level |
+
+### `rl-insight server stop`
+
+| Argument | Default | Description |
+|---|---:|---|
+| `--config` | `experimental/config/services/config.yaml` | Server config file path |
+| `--log-level` | `INFO` | Python log level |
+
+## Server YAML
+
+| Key | Default | Description |
+|---|---:|---|
+| `server.backend` | `docker_compose` | Stack startup backend |
+| `server.compose_file` | `docker-compose.yaml` | Compose file path |
+| `server.project_name` | `rl-insight-monitor` | Compose project name |
+| `prometheus.prometheus_port` | `9090` | Prometheus HTTP port |
+| `prometheus.config_file` | `prometheus.yml` | Prometheus config file |
+| `tempo.query_port` | `3200` | Tempo query port |
+| `otel.traces_endpoint` | `http://127.0.0.1:4318/v1/traces` | Trainer trace export endpoint |
+| `grafana.port` | `3000` | Grafana HTTP port |
+| `grafana.provisioning_dir` | `provisioning` | Grafana provisioning directory mounted into the container |
+| `grafana.dashboards_dir` | `dashboards` | Grafana dashboard JSON directory mounted into the container |
+
+## `insight.init(config)`
+
+| Key | Default | Description |
+|---|---:|---|
+| `namespace` | `rl_insight_monitor` | Metrics and trace namespace |
+| `backend.type` | `ray` | Currently only `ray` is supported |
+| `prometheus.metrics_report_port` | `9092` | Monitor hub `/metrics` port |
+| `prometheus.prometheus_port` | `9090` | Prometheus HTTP port used for reload |
+| `prometheus.config_file` | bundled absolute path | Prometheus config file to rewrite |
+| `prometheus.reload.mode` | `ray` | `ray` or `none` |
+| `otel.traces_endpoint` | `http://127.0.0.1:4318/v1/traces` | Trainer trace export endpoint |
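
Reviewer sketch (not part of the patch): the README says `insight.init(config)` takes nested keys with defaults, and that `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` overrides `otel.traces_endpoint`. The snippet below is one way that resolution order could work, using only the standard library. `DEFAULTS`, `deep_merge`, and `resolve_config` are hypothetical names chosen for illustration. The real logic lives in `experimental/config`, which this part of the diff does not show, and may differ.

```python
import copy
import os

# Defaults copied from the `insight.init(config)` table in the README.
# `prometheus.config_file` is left out because its default is a bundled
# absolute path that the README does not spell out.
DEFAULTS = {
    "namespace": "rl_insight_monitor",
    "backend": {"type": "ray"},
    "prometheus": {
        "metrics_report_port": 9092,
        "prometheus_port": 9090,
        "reload": {"mode": "ray"},
    },
    "otel": {"traces_endpoint": "http://127.0.0.1:4318/v1/traces"},
}

ENV_TRACES_ENDPOINT = "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"


def deep_merge(base, overrides):
    """Return a copy of `base` with `overrides` applied key by key (hypothetical helper)."""
    merged = copy.deepcopy(base)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so a partial override keeps sibling defaults.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_config(config=None, environ=None):
    """Apply user config over defaults, then let the env var win for the traces endpoint."""
    environ = os.environ if environ is None else environ
    resolved = deep_merge(DEFAULTS, config)
    endpoint = environ.get(ENV_TRACES_ENDPOINT)
    if endpoint:
        resolved["otel"]["traces_endpoint"] = endpoint
    return resolved
```

Under these assumptions, `resolve_config({"prometheus": {"metrics_report_port": 9100}}, environ={})` changes only the hub port and leaves every other default in place, which matches how the dotted keys in the table read.
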
diff --git a/experimental/__init__.py b/experimental/__init__.py
@@ -0,0 +1,51 @@
+# Copyright (c) 2026 verl-project authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Experimental online monitoring: Ray hub, Prometheus ``/metrics``, and OTLP trace export."""
+
+from .api import (
+    close,
+    init,
+    metric_count,
+    metric_distribution,
+    metric_value,
+    trace_op,
+    trace_state,
+)
+from .config import (
+    MONITOR_HUB_ACTOR_NAME,
+    MONITOR_RAY_NAMESPACE,
+    load_monitor_config,
+    load_server_config_file,
+    resolve_monitor_stack_paths,
+)
+from .utils import PROMETHEUS_SCRAPE_JOB_NAME, update_prometheus_config
+
+
+__all__ = [
+    "close",
+    "init",
+    "load_monitor_config",
+    "load_server_config_file",
+    "MONITOR_HUB_ACTOR_NAME",
+    "MONITOR_RAY_NAMESPACE",
+    "metric_count",
+    "metric_distribution",
+    "metric_value",
+    "PROMETHEUS_SCRAPE_JOB_NAME",
+    "resolve_monitor_stack_paths",
+    "trace_op",
+    "trace_state",
+    "update_prometheus_config",
+]
diff --git a/experimental/__main__.py b/experimental/__main__.py
@@ -0,0 +1,21 @@
+# Copyright (c) 2026 verl-project authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from .cli import main
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
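
Reviewer sketch (not part of the patch): `__main__.py` relies on `main()` returning an integer exit status, which `raise SystemExit(...)` then hands to the interpreter. The body of `cli.main` is not shown in this part of the diff, so the parser below is only an assumption of how the documented `rl-insight server start|stop` flags could be wired up with `argparse`. `build_parser` and the `print` placeholder are hypothetical.

```python
import argparse

# Default taken from the CLI reference table in the README.
DEFAULT_CONFIG = "experimental/config/services/config.yaml"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="rl-insight")
    groups = parser.add_subparsers(dest="group", required=True)
    server = groups.add_parser("server", help="manage the observability stack")
    actions = server.add_subparsers(dest="action", required=True)

    start = actions.add_parser("start")
    start.add_argument("--detach", action="store_true")
    start.add_argument("--attach-logs", action="store_true")
    stop = actions.add_parser("stop")

    # Both subcommands share these two options per the README tables.
    for sub in (start, stop):
        sub.add_argument("--config", default=DEFAULT_CONFIG)
        sub.add_argument("--log-level", default="INFO")
    return parser


def main(argv=None) -> int:
    """Parse arguments and return a process exit status (0 means success)."""
    args = build_parser().parse_args(argv)
    # Placeholder for the real start/stop logic, which this sketch does not model.
    print(f"{args.group} {args.action} config={args.config}")
    return 0
```

Returning the status from `main` rather than calling `sys.exit` inside it keeps the function easy to call from tests, for example `main(["server", "stop"])`, while the `__main__` guard remains the only place that ends the process.
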