docs: update README for current ModelExpress capabilities#258
docs: update README for current ModelExpress capabilities#258ganeshku1 wants to merge 17 commits into
Conversation
WalkthroughModelExpress documentation is substantially rewritten to reposition the project as an intelligent model distribution control layer, emphasizing coordinated downloads, air-gapped deployment patterns, and distributed state coordination. README narrative and features are updated; new sections cover provider patterns and deployment scenarios. DEPLOYMENT.md reference is refreshed. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
107-122: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winConvert ASCII art to mermaid diagram.
The architecture diagram uses ASCII art with box-drawing characters. As per coding guidelines, markdown files should use mermaid diagrams instead of ASCII art for better maintainability and rendering.
🎨 Proposed mermaid diagram
-``` - ┌─────────────────────────────────────────────────────────────────┐ - │ ModelExpress Server │ - │ Health • Model • P2P Metadata • Redis/K8s CRD backends │ - └──────────────────────┬──────────────────────────────────────────┘ - │ - ┌─────────────────┼─────────────────┐ - │ metadata │ │ metadata - ▼ │ ▼ - ┌──────────────────┐ │ ┌──────────────────┐ - │ Source (vLLM) │ RDMA │ │ Target (vLLM) │ - │ mx loader │════════►│ │ mx loader │ - │ Load → NIXL │ NIXL │ │ Receive → FP8 │ - │ Publish metadata│ │ │ Serve inference │ - └──────────────────┘ │ └──────────────────┘ -``` +```mermaid +graph TB + Server["ModelExpress Server<br/>Health • Model • P2P Metadata<br/>Redis/K8s CRD backends"] + Source["Source (vLLM)<br/>mx loader<br/>Load → NIXL<br/>Publish metadata"] + Target["Target (vLLM)<br/>mx loader<br/>Receive → FP8<br/>Serve inference"] + + Server -->|metadata| Source + Server -->|metadata| Target + Source ==>|RDMA/NIXL| Target +```As per coding guidelines: "Use mermaid diagrams instead of ASCII art in markdown files."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 107 - 122, Replace the ASCII art block with a mermaid diagram: create a ```mermaid``` code block using graph TB and define nodes named Server, Source, and Target (matching the proposed labels) and connect them with Server -->|metadata| Source, Server -->|metadata| Target, and Source ==>|RDMA/NIXL| Target; ensure each node includes the same multi-line content (use HTML <br/> or sublabels) so the diagram content and edge labels match the original ASCII diagram.
🧹 Nitpick comments (1)
docs/DEPLOYMENT.md (1)
8-8: 💤 Low valueConsider rephrasing to reduce repetition.
Three consecutive sentences begin with "For," which creates minor repetition. Consider consolidating:
♻️ Suggested rewording
-User-facing guide for configuring and deploying ModelExpress. For architecture details, see [`ARCHITECTURE.md`](ARCHITECTURE.md). For development setup, see [`../CONTRIBUTING.md`](../CONTRIBUTING.md). For a concise overview of offline operation, see the air-gapped section in [`../README.md`](../README.md). +User-facing guide for configuring and deploying ModelExpress. See [`ARCHITECTURE.md`](ARCHITECTURE.md) for architecture details, [`../CONTRIBUTING.md`](../CONTRIBUTING.md) for development setup, and the air-gapped section in [`../README.md`](../README.md) for offline operation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/DEPLOYMENT.md` at line 8, The three consecutive sentences starting "For architecture details, ...", "For development setup, ...", and "For a concise overview..." are repetitive; rewrite that fragment so the three links are combined into a single sentence (e.g., "See ARCHITECTURE.md for architecture details, CONTRIBUTING.md for development setup, and the air-gapped section in README.md for offline operation."), replacing the three "For ..." sentences with one concise, comma-separated sentence while preserving each link and context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@README.md`:
- Around line 107-122: Replace the ASCII art block with a mermaid diagram:
create a ```mermaid``` code block using graph TB and define nodes named Server,
Source, and Target (matching the proposed labels) and connect them with Server
-->|metadata| Source, Server -->|metadata| Target, and Source ==>|RDMA/NIXL|
Target; ensure each node includes the same multi-line content (use HTML <br/> or
sublabels) so the diagram content and edge labels match the original ASCII
diagram.
---
Nitpick comments:
In `@docs/DEPLOYMENT.md`:
- Line 8: The three consecutive sentences starting "For architecture details,
...", "For development setup, ...", and "For a concise overview..." are
repetitive; rewrite that fragment so the three links are combined into a single
sentence (e.g., "See ARCHITECTURE.md for architecture details, CONTRIBUTING.md
for development setup, and the air-gapped section in README.md for offline
operation."), replacing the three "For ..." sentences with one concise,
comma-separated sentence while preserving each link and context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a94d7af6-3689-4318-9b16-eea7ffa19b53
📒 Files selected for processing (2)
README.mddocs/DEPLOYMENT.md
… item - Update MLA known issue to reflect merged adopt_hidden_tensors workaround and verified correct P2P transfer for Kimi-K2.5-NVFP4; add GLM-5.1 to blocked model list; correct fallback chain (not disk-only) - Add MX_SKIP_FEATURE_CHECK to configuration table - Add MLA P2P transfer as active roadmap item Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ASCII layout was misaligned; Mermaid renders correctly on GitHub and is the project standard per CLAUDE.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace minimal 3-node diagram with one that covers: - External model sources (HF, NGC, GCS) with one-time download / ingress reduction - ModelExpress Server with Redis / K8s CRD backend - Source pod load pipeline (cache → post-process → NIXL → publish metadata) - Target pod ordered fallback chain (RDMA → ModelStreamer → GDS → Default) - Scale impact callout: 1 download → N pods at ~15 s each via P2P Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace single crowded diagram with two clean diagrams: - Phase 1: external download and cache (sources → server → cache) - Phase 2: autoscale and rolling update (source pod → RDMA → N target pods with ordered fallback chain) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace compact Model Store Providers table with full breakdown of server-side providers (HF, NGC, GCS) and ModelStreamer backends (S3/S3-compatible, GCS, Azure Blob, local filesystem, HF cache) with URI format and auth notes for each - Add GDS entry with activation conditions - Update SGLang row from "coming soon" to in-progress GDS PR link Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
DCO check fails |
Summary
Testing
Summary by CodeRabbit