Skip to content

docs: update README for current ModelExpress capabilities#258

Open
ganeshku1 wants to merge 17 commits into
mainfrom
docs/readme-storage-support
Open

docs: update README for current ModelExpress capabilities#258
ganeshku1 wants to merge 17 commits into
mainfrom
docs/readme-storage-support

Conversation

@ganeshku1
Copy link
Copy Markdown
Member

@ganeshku1 ganeshku1 commented Apr 30, 2026

Summary

  • update the top-level README to reflect shipped ModelExpress capabilities on latest main
  • clarify ingress reduction, distributed registry, provider support, ModelStreamer, GDS, and air-gapped operation
  • link the deployment guide back to the air-gapped overview
  • keep roadmap items limited to work that is not yet merged

Testing

  • git diff --check -- README.md docs/DEPLOYMENT.md

Summary by CodeRabbit

  • Documentation
    • Repositioned ModelExpress documentation to emphasize intelligent model distribution, coordinated downloads, and multiple model source support.
    • Added comprehensive deployment guidance including air-gapped operations, backend prerequisites, and provider patterns.
    • Enhanced offline operation documentation reference.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 30, 2026

Walkthrough

ModelExpress documentation is substantially rewritten to reposition the project as an intelligent model distribution control layer, emphasizing coordinated downloads, air-gapped deployment patterns, and distributed state coordination. README narrative and features are updated; new sections cover provider patterns and deployment scenarios. DEPLOYMENT.md reference is refreshed.

Changes

Cohort / File(s) Summary
Documentation Repositioning
README.md
Comprehensive rewrite repositioning ModelExpress as distribution/control layer; architecture narrative updated to reflect coordinated downloads, multiple model sources, direct GPU loading; new sections for air-gapped guidance, provider patterns, and deployment examples added; some roadmap priorities removed.
Deployment Guide Updates
docs/DEPLOYMENT.md
Introductory reference updated to include additional link to air-gapped operation section.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A rabbit hops through words with care,
Repositioning clouds into the air,
With offline paths and downloads sincere,
ModelExpress's purpose grows crystal clear! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: updating the README to reflect current ModelExpress capabilities. It matches the primary focus of the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

107-122: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Convert ASCII art to mermaid diagram.

The architecture diagram uses ASCII art with box-drawing characters. As per coding guidelines, markdown files should use mermaid diagrams instead of ASCII art for better maintainability and rendering.

🎨 Proposed mermaid diagram
-```
-                    ┌─────────────────────────────────────────────────────────────────┐
-                    │                    ModelExpress Server                          │
-                    │   Health • Model • P2P Metadata • Redis/K8s CRD backends        │
-                    └──────────────────────┬──────────────────────────────────────────┘
-
-                         ┌─────────────────┼─────────────────┐
-                         │ metadata        │                 │ metadata
-                         ▼                 │                 ▼
-              ┌──────────────────┐         │       ┌──────────────────┐
-              │  Source (vLLM)   │  RDMA   │       │  Target (vLLM)   │
-              │  mx loader       │════════►│       │  mx loader       │
-              │  Load → NIXL     │  NIXL   │       │  Receive → FP8   │
-              │  Publish metadata│         │       │  Serve inference │
-              └──────────────────┘         │       └──────────────────┘
-```
+```mermaid
+graph TB
+    Server["ModelExpress Server<br/>Health • Model • P2P Metadata<br/>Redis/K8s CRD backends"]
+    Source["Source (vLLM)<br/>mx loader<br/>Load → NIXL<br/>Publish metadata"]
+    Target["Target (vLLM)<br/>mx loader<br/>Receive → FP8<br/>Serve inference"]
+    
+    Server -->|metadata| Source
+    Server -->|metadata| Target
+    Source ==>|RDMA/NIXL| Target
+```

As per coding guidelines: "Use mermaid diagrams instead of ASCII art in markdown files."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 107 - 122, Replace the ASCII art block with a mermaid
diagram: create a ```mermaid``` code block using graph TB and define nodes named
Server, Source, and Target (matching the proposed labels) and connect them with
Server -->|metadata| Source, Server -->|metadata| Target, and Source
==>|RDMA/NIXL| Target; ensure each node includes the same multi-line content
(use HTML <br/> or sublabels) so the diagram content and edge labels match the
original ASCII diagram.
🧹 Nitpick comments (1)
docs/DEPLOYMENT.md (1)

8-8: 💤 Low value

Consider rephrasing to reduce repetition.

Three consecutive sentences begin with "For," which creates minor repetition. Consider consolidating:

♻️ Suggested rewording
-User-facing guide for configuring and deploying ModelExpress. For architecture details, see [`ARCHITECTURE.md`](ARCHITECTURE.md). For development setup, see [`../CONTRIBUTING.md`](../CONTRIBUTING.md). For a concise overview of offline operation, see the air-gapped section in [`../README.md`](../README.md).
+User-facing guide for configuring and deploying ModelExpress. See [`ARCHITECTURE.md`](ARCHITECTURE.md) for architecture details, [`../CONTRIBUTING.md`](../CONTRIBUTING.md) for development setup, and the air-gapped section in [`../README.md`](../README.md) for offline operation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/DEPLOYMENT.md` at line 8, The three consecutive sentences starting "For
architecture details, ...", "For development setup, ...", and "For a concise
overview..." are repetitive; rewrite that fragment so the three links are
combined into a single sentence (e.g., "See ARCHITECTURE.md for architecture
details, CONTRIBUTING.md for development setup, and the air-gapped section in
README.md for offline operation."), replacing the three "For ..." sentences with
one concise, comma-separated sentence while preserving each link and context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@README.md`:
- Around line 107-122: Replace the ASCII art block with a mermaid diagram:
create a ```mermaid``` code block using graph TB and define nodes named Server,
Source, and Target (matching the proposed labels) and connect them with Server
-->|metadata| Source, Server -->|metadata| Target, and Source ==>|RDMA/NIXL|
Target; ensure each node includes the same multi-line content (use HTML <br/> or
sublabels) so the diagram content and edge labels match the original ASCII
diagram.

---

Nitpick comments:
In `@docs/DEPLOYMENT.md`:
- Line 8: The three consecutive sentences starting "For architecture details,
...", "For development setup, ...", and "For a concise overview..." are
repetitive; rewrite that fragment so the three links are combined into a single
sentence (e.g., "See ARCHITECTURE.md for architecture details, CONTRIBUTING.md
for development setup, and the air-gapped section in README.md for offline
operation."), replacing the three "For ..." sentences with one concise,
comma-separated sentence while preserving each link and context.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a94d7af6-3689-4318-9b16-eea7ffa19b53

📥 Commits

Reviewing files that changed from the base of the PR and between b0c94ed and 1bd782c.

📒 Files selected for processing (2)
  • README.md
  • docs/DEPLOYMENT.md

… item

- Update MLA known issue to reflect merged adopt_hidden_tensors workaround
  and verified correct P2P transfer for Kimi-K2.5-NVFP4; add GLM-5.1 to
  blocked model list; correct fallback chain (not disk-only)
- Add MX_SKIP_FEATURE_CHECK to configuration table
- Add MLA P2P transfer as active roadmap item

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ASCII layout was misaligned; Mermaid renders correctly on GitHub
and is the project standard per CLAUDE.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace minimal 3-node diagram with one that covers:
- External model sources (HF, NGC, GCS) with one-time download / ingress reduction
- ModelExpress Server with Redis / K8s CRD backend
- Source pod load pipeline (cache → post-process → NIXL → publish metadata)
- Target pod ordered fallback chain (RDMA → ModelStreamer → GDS → Default)
- Scale impact callout: 1 download → N pods at ~15 s each via P2P

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace single crowded diagram with two clean diagrams:
- Phase 1: external download and cache (sources → server → cache)
- Phase 2: autoscale and rolling update (source pod → RDMA → N target pods
  with ordered fallback chain)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace compact Model Store Providers table with full breakdown of
  server-side providers (HF, NGC, GCS) and ModelStreamer backends
  (S3/S3-compatible, GCS, Azure Blob, local filesystem, HF cache)
  with URI format and auth notes for each
- Add GDS entry with activation conditions
- Update SGLang row from "coming soon" to in-progress GDS PR link

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zhengluo-nv
Copy link
Copy Markdown
Contributor

DCO check fails

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants