From ba8c5b31b7122ddc02d16cf447ab9855031c956b Mon Sep 17 00:00:00 2001
From: hlin99 <hlin99@users.noreply.github.com>
Date: Mon, 6 Apr 2026 21:01:15 +0800
Subject: [PATCH] =?UTF-8?q?docs:=20unified=20structure=20=E2=80=94=20LICEN?=
 =?UTF-8?q?SE,=20README,=20CONTRIBUTING,=20bot/?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add Apache 2.0 LICENSE
- Rewrite README.md (concise, one-screen, links to guide)
- Add CONTRIBUTING.md
- Move bot files to bot/ (DESIGN_PRINCIPLES, DEV_LOOP, REVIEW_POLICY, iterations/)
- Add documentation update rules to bot/AUTHOR_POLICY.md
---
 CONTRIBUTING.md                     |  33 +++++
 LICENSE                             | 202 ++++++++++++++++++++++++++++
 README.md                           |  61 +++++----
 REVIEW_POLICY.md                    |  62 ---------
 bot/AUTHOR_POLICY.md                |  55 ++++++++
 bot/BOT_POLICY.md                   |  23 ++++
 bot/DESIGN_PRINCIPLES.md            |  37 +++++
 bot/DEV_LOOP.md                     |  34 +++++
 bot/ENTRY.md                        |  18 +++
 bot/REVIEW_POLICY.md                |  32 +++++
 {docs => bot}/iterations/current.md |   0
 docs/DESIGN_PRINCIPLES.md           |  90 -------------
 docs/DEV_LOOP.md                    |  85 ------------
 13 files changed, 466 insertions(+), 266 deletions(-)
 create mode 100644 CONTRIBUTING.md
 create mode 100644 LICENSE
 delete mode 100644 REVIEW_POLICY.md
 create mode 100644 bot/AUTHOR_POLICY.md
 create mode 100644 bot/BOT_POLICY.md
 create mode 100644 bot/DESIGN_PRINCIPLES.md
 create mode 100644 bot/DEV_LOOP.md
 create mode 100644 bot/ENTRY.md
 create mode 100644 bot/REVIEW_POLICY.md
 rename {docs => bot}/iterations/current.md (100%)
 delete mode 100644 docs/DESIGN_PRINCIPLES.md
 delete mode 100644 docs/DEV_LOOP.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..31c5fea
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,33 @@
+# Contributing to xPyD-bench
+
+## Development Setup
+
+```bash
+git clone https://github.com/xPyD-hub/xPyD-bench
+cd xPyD-bench
+pip install -e ".[dev]"
+```
+
+## Running Tests
+
+```bash
+pytest tests/ -q
+```
+
+## Code Style
+
+- Python 3.10+
+- Ruff for linting: `ruff check xpyd_bench/ tests/`
+- All PRs must pass CI (lint + tests on 3.10/3.11/3.12 + integration trigger)
+
+## PR Process
+
+1. Create a branch from `main`
+2. Make changes, add tests
+3. Push and open PR
+4. CI runs: lint + unit tests + integration tests (via trigger)
+5. Review and merge
+
+## Bot Development
+
+See [bot/](bot/) for automated development policies and iteration records.
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..d645695
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
diff --git a/README.md b/README.md
index d0a47a1..0438b70 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,17 @@
-📖 **[完整使用指南 → docs/guide.md](docs/guide.md)**
-
 # xPyD-bench
 
-Benchmarking & PD ratio planning tool for [xPyD-proxy](https://github.com/xPyD-hub/xPyD-proxy).
+**Benchmarking & PD ratio planning tool for LLM inference endpoints.**
 
-## Features
+xPyD-bench measures the performance of OpenAI-compatible LLM serving endpoints with detailed latency, throughput, and quality metrics. Built as a superset of vLLM bench with full CLI compatibility.
 
-- **`xpyd-bench`** — Benchmark xPyD proxy with configurable concurrency, request patterns, and both `/v1/completions` and `/v1/chat/completions` endpoints
+## Key Features
 
-For PD ratio planning, see [xPyD-plan](https://github.com/xPyD-hub/xPyD-plan).
+- **vLLM bench compatible CLI** — drop-in replacement, same arguments
+- **Rich metrics** — TTFT, TPOT, ITL, P50/P90/P95/P99, throughput
+- **Flexible load patterns** — constant, burst, ramp, poisson, custom
+- **Multiple datasets** — JSONL, CSV, JSON, synthetic generation
+- **Advanced analysis** — comparison, regression detection, SLA validation, cost estimation
+- **Reports** — JSON, CSV, Markdown, HTML dashboard, JUnit XML, Prometheus
 
 ## Install
 
@@ -16,37 +19,37 @@ For PD ratio planning, see [xPyD-plan](https://github.com/xPyD-hub/xPyD-plan).
 pip install xpyd-bench
 ```
 
-## Quick Start
+Or as part of the full xPyD toolkit:
+
+```bash
+pip install xpyd
+```
 
-### Benchmark
+## Quick Start
 
 ```bash
-# Run benchmark against a running xPyD proxy
-xpyd-bench --target http://localhost:8080 \
-           --endpoint chat \
-           --concurrency 16 \
-           --num-requests 200 \
-           --output results.json
-
-# Use completion endpoint
-xpyd-bench --target http://localhost:8080 \
-           --endpoint completion \
-           --concurrency 8 \
-           --num-requests 100
+# Benchmark a running endpoint
+xpyd-bench --base-url http://localhost:8080 \
+           --model my-model \
+           --dataset-name random \
+           --num-prompts 100
+
+# Compare two runs
+xpyd-bench compare baseline.json candidate.json
 ```
 
-## Configuration
+## Part of xPyD
 
-See [examples/](examples/) for sample configs and scenarios.
+xPyD-bench is part of the [xPyD ecosystem](https://github.com/xPyD-hub/xPyD) for PD-disaggregated LLM serving:
 
-## Output Metrics
+| Component | Description |
+|-----------|-------------|
+| [xpyd-proxy](https://github.com/xPyD-hub/xPyD-proxy) | Prefill-Decode disaggregated proxy |
+| [xpyd-sim](https://github.com/xPyD-hub/xPyD-sim) | OpenAI-compatible inference simulator |
+| **xpyd-bench** | Benchmarking & planning tool |
 
-- **TTFT** — Time to first token
-- **TPS** — Tokens per second (per request & aggregate)
-- **Latency** — P50 / P90 / P99 end-to-end latency
-- **Throughput** — Total requests/sec and tokens/sec
-- **Error rate** — Failed requests count and percentage
+📖 **[Full Guide →](docs/guide.md)** | 💡 **[Examples →](examples/)** | 🏗️ **[Contributing →](CONTRIBUTING.md)**
 
 ## License
 
-TBD
+Apache 2.0 — see [LICENSE](LICENSE)
diff --git a/REVIEW_POLICY.md b/REVIEW_POLICY.md
deleted file mode 100644
index 4e3ee8a..0000000
--- a/REVIEW_POLICY.md
+++ /dev/null
@@ -1,62 +0,0 @@
-# Review Policy
-
-## Roles
-
-| Role | GitHub Account | Action |
-|------|---------------|--------|
-| Implementer | `hlin99` | Write code, submit PRs, fix issues |
-| Reviewer 1 | `hlin99-Review-Bot` | Review PRs: approve / request changes / close |
-| Reviewer 2 | `hlin99-Review-BotX` | Review PRs: approve / request changes / close |
-
-## Timing
-
-| Parameter | Value |
-|-----------|-------|
-| Iteration interval | 10 minutes |
-| PR wait for review | max 15 minutes |
-| Fix after request changes | max 10 minutes |
-| Reviewer check frequency | every 5 minutes |
-| Reviewer response deadline | 15 minutes after assign |
-| Reviewer timeout action | close PR (iteration failed) |
-| Total round timeout | 1 hour from PR creation |
-| Round timeout action | close PR (iteration failed) |
-
-## Review Criteria
-
-Reviewers evaluate each PR on two dimensions:
-
-### 1. Idea Value
-- Is the direction/approach valuable for the project?
-- Does it align with the project goals?
-- **If NO → close PR immediately** (one close = PR rejected)
-
-### 2. Code Quality
-- Is the code correct?
-- Are tests included/passing?
-- Is `docs/iterations/current.md` updated with clear description?
-- Does `docs/guide.md` reflect changes (if applicable)?
-- **If idea is good but code has issues → request changes**
-
-## Decision Rules
-
-| Scenario | Action |
-|----------|--------|
-| Both reviewers approve | Auto-merge |
-| One approves, one requests changes | Implementer fixes, reviewers re-review |
-| Either reviewer closes | PR closed, iteration failed |
-| Both approve after fixes | Auto-merge |
-| Timeout (15min no review) | PR closed, iteration failed |
-| Total timeout (1 hour) | PR closed, iteration failed |
-
-## Iteration Record
-
-Every PR MUST update `docs/iterations/current.md` with:
-- What was done this iteration
-- Result: merged / closed (with reason)
-- Reviewer scores/comments summary
-
-## Auto-Merge Requirements
-
-- 2 approvals from designated reviewers
-- CI passes (all checks green)
-- No unresolved review comments
diff --git a/bot/AUTHOR_POLICY.md b/bot/AUTHOR_POLICY.md
new file mode 100644
index 0000000..9f28aa9
--- /dev/null
+++ b/bot/AUTHOR_POLICY.md
@@ -0,0 +1,55 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains precise rules that must be read in full. -->
+
+# Author Policy — xPyD-bench
+
+Rules for the bot that writes code and submits PRs.
+
+## Identity
+
+| Role | GitHub Account |
+|------|---------------|
+| Author | `hlin99` |
+
+## Before Coding
+
+1. Pull latest main: `git pull origin main`
+2. Create feature branch: `git checkout -b <type>/<short-description>`
+3. Read [DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md) for architecture constraints.
+
+## Code Quality
+
+1. Run lint: `ruff check xpyd_bench tests`
+2. Run tests: `pytest tests/ -q`
+3. Rebase before push: `git pull --rebase origin main`
+
+## PR Submission
+
+1. **One PR per task.** Don't bundle unrelated changes.
+2. **Descriptive title.** Format: `type: short description` (e.g., `feat: add SLA validation`).
+3. **PR body must include:** what changed, why, test coverage, breaking changes.
+4. **All CI must pass** before requesting review.
+
+## Responding to Review
+
+1. Fix all blockers before re-requesting review.
+2. Reply to every comment — "Fixed in <commit>" or explain disagreement with evidence.
+3. Push new commits (don't force push over reviewer comments).
+4. **Never force push.** If the branch is too messy, close the PR and open a new one.
+
+## Documentation Updates
+
+Every PR must update relevant documentation:
+
+| Change Type | Update |
+|---|---|
+| New feature / CLI argument | `docs/guide.md` — add usage section |
+| Architecture change | `docs/architecture.md` — update descriptions |
+| Design decision | `docs/design.md` — append decision record |
+| Quick Start affected | `README.md` — update (keep it one screen max, link to guide.md) |
+| PR completed | `bot/iterations/current.md` — append summary |
+
+`docs/guide.md` is the source of truth for how to use the tool.
+`docs/architecture.md` and `docs/design.md` are append-only — never delete history.
+
+When current iteration is complete, rename `bot/iterations/current.md` to `YYYY-MM-DD-<topic>.md` and create a fresh `current.md`.
diff --git a/bot/BOT_POLICY.md b/bot/BOT_POLICY.md
new file mode 100644
index 0000000..934ebfc
--- /dev/null
+++ b/bot/BOT_POLICY.md
@@ -0,0 +1,23 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains precise rules that must be read in full. -->
+
+# Bot Policy — xPyD-bench
+
+## Language
+- **English only** — all code, docs, issues, PRs, comments on GitHub must be in English. No Chinese characters.
+
+## Code Rules
+- All changes go through PR. Never push directly to main.
+- Every PR must have tests. No untested code.
+- CI must be 100% green before merge. No skips allowed.
+- No test may be skipped. If a test can't run, fix it or remove it.
+- Rebase to latest main before pushing.
+
+## Testing
+- Unit tests in `tests/` — pure bench logic, no external dependencies.
+- Integration tests in [xPyD-integration](https://github.com/xPyD-hub/xPyD-integration) — cross-component tests.
+
+## Architecture
+- Bench is a pure client tool. No server components.
+- All inference backend interaction goes through xPyD-integration tests.
+- Follow vLLM bench CLI compatibility (see [DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md)).
diff --git a/bot/DESIGN_PRINCIPLES.md b/bot/DESIGN_PRINCIPLES.md
new file mode 100644
index 0000000..69ff7f9
--- /dev/null
+++ b/bot/DESIGN_PRINCIPLES.md
@@ -0,0 +1,37 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains precise rules that must be read in full. -->
+
+# xPyD-bench Design Principles
+
+## Core Positioning
+A comprehensive benchmarking tool for LLM inference endpoints, built as an enhancement on top of vLLM bench.
+
+## CLI Compatibility
+- **CLI arguments must be fully compatible with vLLM bench** — users can switch from `python benchmark_serving.py` to `xpyd-bench` without changing their command line
+- Extended features beyond vLLM bench CLI should be configured via **YAML config file** (`--config config.yaml`)
+- Basic usage = CLI only (vLLM bench compatible), advanced usage = CLI + YAML
+
+## Alignment with vLLM Bench
+- CLI arguments must align with vLLM bench where applicable
+- Output format must align with vLLM bench
+- We build incremental improvements on top of vLLM bench, not a replacement
+
+## Areas of Enhancement
+- **Full OpenAI API coverage**: every parameter matters — all 4 input formats, temperature, top_k, top_p, frequency_penalty, presence_penalty, stop sequences, logprobs, etc. No omissions.
+- **Flexible request rate patterns**: vLLM bench only supports per-second rate. Support per-5s, per-10s, burst patterns, ramp-up/ramp-down, Poisson distribution, custom patterns.
+- **Rich dataset input**: support JSONL, JSON, CSV — let users bring their own data easily.
+- **Extended metrics**: beyond what vLLM bench provides.
+
+## Architecture
+- Bench is a **pure client tool** — it sends requests and measures responses.
+- No built-in server or simulator. Backend simulation is handled by [xPyD-sim](https://github.com/xPyD-hub/xPyD-sim).
+- Integration tests (bench + sim, bench + proxy + sim) live in [xPyD-integration](https://github.com/xPyD-hub/xPyD-integration).
+- Code in `xpyd_bench/`, tests in `tests/`.
+
+## Code Organization
+- `xpyd_bench/bench/` — core benchmark runner, metrics, rate patterns
+- `xpyd_bench/reporting/` — output formats (JSON, CSV, HTML, Prometheus)
+- `xpyd_bench/scenarios/` — preset configurations
+- `xpyd_bench/distributed/` — multi-worker coordination
+- `xpyd_bench/plugins/` — backend plugin system
+- `tests/` — unit tests only (no external dependencies)
diff --git a/bot/DEV_LOOP.md b/bot/DEV_LOOP.md
new file mode 100644
index 0000000..06b4eb2
--- /dev/null
+++ b/bot/DEV_LOOP.md
@@ -0,0 +1,34 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains operational steps that must be followed exactly. -->
+
+# Development Loop — xPyD-bench
+
+Autonomous iteration loop. References policies for rules — this file only describes the operational workflow.
+
+## Rules
+
+All rules are defined in policy files. Read them first:
+- [BOT_POLICY.md](BOT_POLICY.md) — hard constraints
+- [AUTHOR_POLICY.md](AUTHOR_POLICY.md) — code quality, PR process, doc updates
+- [REVIEW_POLICY.md](REVIEW_POLICY.md) — timing, review standards
+
+## Setup (every iteration)
+```bash
+git config user.email "tony.lin@intel.com"
+git config user.name "hlin99"
+```
+
+## Each Iteration
+
+1. Pull latest code
+2. Read `ROADMAP.md` — find the next incomplete milestone
+3. Read `DESIGN_PRINCIPLES.md` — follow the rules
+4. Check open issues/PRs — handle unmerged PRs first
+5. Create GitHub Issue: problem, solution, acceptance criteria, tests
+6. Create branch, implement code + tests
+7. Verify locally (lint + tests per AUTHOR_POLICY.md)
+8. Push, create PR, request review
+9. Wait for review (timing per REVIEW_POLICY.md)
+10. Fix review comments, iterate until approved
+11. Merge and update `bot/iterations/current.md`
+12. Next iteration
diff --git a/bot/ENTRY.md b/bot/ENTRY.md
new file mode 100644
index 0000000..3cf7125
--- /dev/null
+++ b/bot/ENTRY.md
@@ -0,0 +1,18 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains precise rules that must be read in full. -->
+<!-- Skipping or abbreviating any section may cause policy violations. -->
+
+# Bot Entry Point
+
+Read this file first when starting any automated task on this repo.
+
+## Required Reading (in order)
+
+1. **[BOT_POLICY.md](BOT_POLICY.md)** — Hard rules. Must follow.
+2. **[AUTHOR_POLICY.md](AUTHOR_POLICY.md)** — Rules for writing code and submitting PRs.
+3. **[REVIEW_POLICY.md](REVIEW_POLICY.md)** — Rules for reviewing PRs.
+4. **[DESIGN_PRINCIPLES.md](DESIGN_PRINCIPLES.md)** — Architecture constraints and design rules.
+5. **[DEV_LOOP.md](DEV_LOOP.md)** — Development workflow (operational steps, references policies above).
+6. **[iterations/current.md](iterations/current.md)** — Current task context.
+
+Files 1-4 are mandatory for all repos. Files 5-6 are repo-specific.
diff --git a/bot/REVIEW_POLICY.md b/bot/REVIEW_POLICY.md
new file mode 100644
index 0000000..9448264
--- /dev/null
+++ b/bot/REVIEW_POLICY.md
@@ -0,0 +1,32 @@
+<!-- CRITICAL: DO NOT SUMMARIZE OR COMPRESS THIS FILE -->
+<!-- This file contains precise rules that must be read in full. -->
+
+# Review Policy — xPyD-bench
+
+## Roles
+
+| Role | GitHub Account | Action |
+|------|---------------|--------|
+| Implementer | `hlin99` | Write code, submit PRs, fix issues |
+| Reviewer 1 | `hlin99-Review-Bot` | Review PRs: approve / request changes / close |
+| Reviewer 2 | `hlin99-Review-BotX` | Review PRs: approve / request changes / close |
+
+## Timing Parameters
+
+These are the single source of truth for all timing values:
+
+| Parameter | Value |
+|-----------|-------|
+| Iteration interval | 10 minutes |
+| PR wait for review | max 15 minutes |
+| Fix after request changes | max 10 minutes |
+| Reviewer check frequency | every 5 minutes |
+| Reviewer response deadline | 15 minutes after assign |
+| Reviewer timeout action | close PR (iteration failed) |
+
+## Review Standards
+
+- At least 1 approval required to merge.
+- Blockers (🔴) must be fixed. No exceptions.
+- Yellow (🟡) issues should be fixed unless the author provides a good reason.
+- All CI checks must pass.
diff --git a/docs/iterations/current.md b/bot/iterations/current.md
similarity index 100%
rename from docs/iterations/current.md
rename to bot/iterations/current.md
diff --git a/docs/DESIGN_PRINCIPLES.md b/docs/DESIGN_PRINCIPLES.md
deleted file mode 100644
index 01853b7..0000000
--- a/docs/DESIGN_PRINCIPLES.md
+++ /dev/null
@@ -1,90 +0,0 @@
-# xPyD-bench Design Principles
-
-## Core Positioning
-A comprehensive benchmarking tool for LLM inference endpoints, built as an enhancement on top of vLLM bench.
-
-## CLI Compatibility
-- **CLI arguments must be fully compatible with vLLM bench** — users can switch from `python benchmark_serving.py` to `xpyd-bench` without changing their command line
-- Extended features beyond vLLM bench CLI should be configured via **YAML config file** (`--config config.yaml`)
-- Basic usage = CLI only (vLLM bench compatible), advanced usage = CLI + YAML
-
-## Alignment with vLLM Bench
-- CLI arguments must align with vLLM bench where applicable
-- Output format must align with vLLM bench
-- We build incremental improvements on top of vLLM bench, not a replacement
-
-## Areas of Enhancement (think creatively, these are examples)
-- **Full OpenAI API coverage**: every parameter matters — all 4 input formats, temperature, top_k, top_p, frequency_penalty, presence_penalty, stop sequences, logprobs, etc. No omissions.
-- **Flexible request rate patterns**: vLLM bench only supports per-second rate. Support per-5s, per-10s, burst patterns, ramp-up/ramp-down, Poisson distribution, custom patterns.
-- **Rich dataset input**: support JSONL, JSON, CSV — let users bring their own data easily.
-- **Extended metrics**: beyond what vLLM bench provides.
-- Think about what else users need that vLLM bench doesn't offer.
-
-## Dummy Server — vLLM Boundary Rule (IMPORTANT)
-
-The dummy server simulates a **vLLM backend** for bench validation. It must stay strictly within vLLM's API surface:
-
-### Hard Rules
-1. **Only implement features that vLLM actually supports.** If vLLM doesn't have it, the dummy server must not have it.
-2. **OpenAI API parameters**: only those that vLLM's OpenAI-compatible server accepts (including vLLM extensions like `best_of`, `top_k`, `min_p`, etc.).
-3. **Response format**: must match vLLM's response structure, including vLLM-specific fields (`stop_reason`, `service_tier`, `kv_transfer_params`).
-4. **No test-only hacks**: features like gzip decompression, rate-limit simulation (429/X-RateLimit headers), custom header echo, speculative decoding metadata injection, or online /v1/batches API do NOT belong in the dummy server — they are not vLLM behaviors.
-
-### API Compatibility Levels
-
-The dummy server implements two levels of API compatibility, matching xPyD-sim:
-
-**Level 1: OpenAI API Spec**
-- Accept and validate all OpenAI API parameters
-- Response format matches OpenAI spec
-- Parameter range validation (temperature, top_p, penalties)
-- response_format (json_object / json_schema)
-- Embedding encoding_format (float / base64)
-
-**Level 2: vLLM Backend Extensions**
-- Accept all vLLM-specific sampling params without error
-- Response includes vLLM fields: `stop_reason`, `service_tier`
-- base64 encoding uses little-endian byte order
-
-### What Belongs Here vs. Elsewhere
-| Need | Where to implement |
-|---|---|
-| Simulating vLLM inference behavior | ✅ Dummy server |
-| Testing bench's own features (compression, rate-limit tracking, header injection) | ❌ NOT dummy server — use a separate test fixture/middleware |
-| Features vLLM doesn't support | ❌ NOT dummy server |
-
-### API Compatibility Levels
-
-The dummy server implements two levels of API compatibility, matching xPyD-sim:
-
-**Level 1: OpenAI API Spec**
-- Accept and validate all OpenAI API parameters
-- Response format matches OpenAI spec
-- Parameter range validation (temperature, top_p, penalties)
-- response_format (json_object / json_schema)
-- Embedding encoding_format (float / base64, little-endian byte order)
-
-**Level 2: vLLM Backend Extensions**
-- Accept all vLLM-specific sampling params without error
-- Response includes vLLM fields: `stop_reason`, `service_tier`
-- Responses match vLLM's shape, not just OpenAI's
-
-### Co-Evolution with xPyD-sim
-- The dummy server will eventually be **replaced by xPyD-sim** as the canonical vLLM simulator.
-- Any feature added to the dummy must also exist (or be planned) in xPyD-sim.
-- When in doubt, check vLLM's source: `vllm/entrypoints/openai/` is the reference.
-- Code must be decoupled from bench — separate module, no imports between them.
-
-## Principles
-- **Independent thinking**: reference vLLM bench for alignment, but design our own enhancements
-- **Data-driven**: all metrics from real measurements
-- **User-friendly**: easy CLI, sensible defaults, clear output
-- **Rigorous**: every parameter, every edge case matters
-
-## Rules
-- Committer must be `hlin99 <tony.lin@intel.com>`
-- All code, docs, issues, PRs in English
-- Commit messages: conventional commits format
-- Code in `xpyd_bench/`, tests in `tests/`
-- Dummy server in `xpyd_bench/dummy/` (decoupled from bench code)
-- Follow pyproject.toml ruff/isort config
diff --git a/docs/DEV_LOOP.md b/docs/DEV_LOOP.md
deleted file mode 100644
index a889be4..0000000
--- a/docs/DEV_LOOP.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# Development Loop
-
-Autonomous infinite loop. Runs until explicitly stopped.
-
-## Setup (every iteration)
-```bash
-git config user.email "tony.lin@intel.com"
-git config user.name "hlin99"
-```
-
-## Each Iteration
-
-1. Pull latest code
-2. Read `ROADMAP.md` — find the next incomplete milestone
-3. Read `docs/DESIGN_PRINCIPLES.md` and follow its rules
-4. Check open issues/PRs — handle unmerged PRs first (fix CI failures, address review comments)
-5. If no milestone left, create new ones (see Phase 2 below)
-6. Create GitHub Issue: problem, solution, acceptance criteria, tests
-7. Create branch, implement code + tests
-8. Pass lint: `ruff check src tests && isort --check src tests`
-9. Update `docs/iterations/current.md` with what you did this iteration
-10. Create PR (body contains `Closes #N`)
-11. Wait for CI green. Fix failures. Never merge red CI.
-12. **Wait for reviewer bots** — do NOT self-merge. Two reviewer bots (`hlin99-Review-Bot` and `hlin99-Review-BotX`) will be auto-assigned.
-13. Handle review result:
-    - **2 approvals** → auto-merge → update ROADMAP.md → go to step 1
-    - **request changes** → fix code, push to same PR → wait for re-review (max 10 min to fix)
-    - **closed by reviewer** → iteration failed → push update to `docs/iterations/current.md` on main recording the failure (what was attempted, why rejected, reviewer comments) → go to step 1 with a different task
-14. Go to step 1
-
-## Review Rules (see REVIEW_POLICY.md)
-
-- 2 reviewer bots are auto-assigned on PR creation
-- Either reviewer can close the PR (idea rejected) — one close = PR dead
-- Both must approve for merge
-- Reviewer timeout: 15 minutes → PR auto-closed
-- Total round timeout: 1 hour → PR auto-closed
-- Implementer (hlin99) must NEVER approve or merge their own PR
-
-## Timing
-
-| Parameter | Value |
-|-----------|-------|
-| Iteration interval | 10 minutes |
-| PR wait for review | max 15 minutes |
-| Fix after request changes | max 10 minutes |
-| Total round timeout | 1 hour |
-
-## Deliverables (every iteration)
-
-Every PR MUST include:
-- Code changes (if any)
-- Tests for new code
-- Updated `docs/iterations/current.md` describing what was done
-
-## Rules
-- Committer must be `hlin99 <tony.lin@intel.com>` — always set git config before any commit
-- All code, docs, issues, PRs in English
-- Commit messages: conventional commits format
-- Never self-merge — wait for reviewer bots
-
-## Phase 1: Roadmap-Driven
-Follow ROADMAP.md milestones in order.
-
-## Phase 2: Continuous Evolution
-When all milestones are done:
-1. Review the project — find limitations, improvements, new scenarios
-2. Create new milestones in ROADMAP.md
-3. Return to Phase 1
-
-## Iteration Tracking
-
-`docs/iterations/current.md` must maintain a running log at the bottom:
-
-```markdown
-## Iteration History
-
-| # | Date | Task | Result | Reviewer Comments |
-|---|------|------|--------|-------------------|
-| 1 | 2026-04-06 | Added X feature | ✅ merged | Both approved |
-| 2 | 2026-04-06 | Refactored Y | ❌ closed | BotX: idea not valuable |
-| 3 | 2026-04-06 | Fixed Z bug | ✅ merged | Bot requested changes, fixed |
-```
-
-This table is the source of truth for iteration success/failure rate.