docs: update README for current ModelExpress capabilities by ganeshku1 · Pull Request #258 · ai-dynamo/modelexpress

ganeshku1 · 2026-04-30T16:57:40Z

Summary

- Update the top-level README to reflect shipped ModelExpress capabilities on latest main
- Clarify ingress reduction, distributed registry, provider support, ModelStreamer, GDS, and air-gapped operation
- Link the deployment guide back to the air-gapped overview
- Keep roadmap items limited to work that is not yet merged

Testing

git diff --check -- README.md docs/DEPLOYMENT.md

Summary by CodeRabbit

Documentation
- Repositioned ModelExpress documentation to emphasize intelligent model distribution, coordinated downloads, and multiple model source support.
- Added comprehensive deployment guidance including air-gapped operations, backend prerequisites, and provider patterns.
- Enhanced offline operation documentation reference.

coderabbitai · 2026-04-30T17:01:54Z

Walkthrough

ModelExpress documentation is substantially rewritten to reposition the project as an intelligent model distribution control layer, emphasizing coordinated downloads, air-gapped deployment patterns, and distributed state coordination. README narrative and features are updated; new sections cover provider patterns and deployment scenarios. DEPLOYMENT.md reference is refreshed.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Repositioning `README.md` | Comprehensive rewrite repositioning ModelExpress as distribution/control layer; architecture narrative updated to reflect coordinated downloads, multiple model sources, direct GPU loading; new sections for air-gapped guidance, provider patterns, and deployment examples added; some roadmap priorities removed. |
| Deployment Guide Updates `docs/DEPLOYMENT.md` | Introductory reference updated to include additional link to air-gapped operation section. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A rabbit hops through words with care,
Repositioning clouds into the air,
With offline paths and downloads sincere,
ModelExpress's purpose grows crystal clear! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: updating the README to reflect current ModelExpress capabilities. It matches the primary focus of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

README.md (1)

107-122: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Convert ASCII art to mermaid diagram.

The architecture diagram uses ASCII art with box-drawing characters. As per coding guidelines, markdown files should use mermaid diagrams instead of ASCII art for better maintainability and rendering.

🎨 Proposed mermaid diagram

-```
-                    ┌─────────────────────────────────────────────────────────────────┐
-                    │                    ModelExpress Server                          │
-                    │   Health • Model • P2P Metadata • Redis/K8s CRD backends        │
-                    └──────────────────────┬──────────────────────────────────────────┘
-                                           │
-                         ┌─────────────────┼─────────────────┐
-                         │ metadata        │                 │ metadata
-                         ▼                 │                 ▼
-              ┌──────────────────┐         │       ┌──────────────────┐
-              │  Source (vLLM)   │  RDMA   │       │  Target (vLLM)   │
-              │  mx loader       │════════►│       │  mx loader       │
-              │  Load → NIXL     │  NIXL   │       │  Receive → FP8   │
-              │  Publish metadata│         │       │  Serve inference │
-              └──────────────────┘         │       └──────────────────┘
-```
+```mermaid
+graph TB
+    Server["ModelExpress Server<br/>Health • Model • P2P Metadata<br/>Redis/K8s CRD backends"]
+    Source["Source (vLLM)<br/>mx loader<br/>Load → NIXL<br/>Publish metadata"]
+    Target["Target (vLLM)<br/>mx loader<br/>Receive → FP8<br/>Serve inference"]
+
+    Server -->|metadata| Source
+    Server -->|metadata| Target
+    Source ==>|RDMA/NIXL| Target
+```

As per coding guidelines: "Use mermaid diagrams instead of ASCII art in markdown files."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 107 - 122, Replace the ASCII art block with a mermaid
diagram: create a ```mermaid``` code block using graph TB and define nodes named
Server, Source, and Target (matching the proposed labels) and connect them with
Server -->|metadata| Source, Server -->|metadata| Target, and Source
==>|RDMA/NIXL| Target; ensure each node includes the same multi-line content
(use HTML <br/> or sublabels) so the diagram content and edge labels match the
original ASCII diagram.

🧹 Nitpick comments (1)

docs/DEPLOYMENT.md (1)

8-8: 💤 Low value

Consider rephrasing to reduce repetition.

Three consecutive sentences begin with "For," which creates minor repetition. Consider consolidating:

♻️ Suggested rewording

-User-facing guide for configuring and deploying ModelExpress. For architecture details, see [`ARCHITECTURE.md`](ARCHITECTURE.md). For development setup, see [`../CONTRIBUTING.md`](../CONTRIBUTING.md). For a concise overview of offline operation, see the air-gapped section in [`../README.md`](../README.md).
+User-facing guide for configuring and deploying ModelExpress. See [`ARCHITECTURE.md`](ARCHITECTURE.md) for architecture details, [`../CONTRIBUTING.md`](../CONTRIBUTING.md) for development setup, and the air-gapped section in [`../README.md`](../README.md) for offline operation.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/DEPLOYMENT.md` at line 8, The three consecutive sentences starting "For
architecture details, ...", "For development setup, ...", and "For a concise
overview..." are repetitive; rewrite that fragment so the three links are
combined into a single sentence (e.g., "See ARCHITECTURE.md for architecture
details, CONTRIBUTING.md for development setup, and the air-gapped section in
README.md for offline operation."), replacing the three "For ..." sentences with
one concise, comma-separated sentence while preserving each link and context.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a94d7af6-3689-4318-9b16-eea7ffa19b53

📥 Commits

Reviewing files that changed from the base of the PR and between b0c94ed and 1bd782c.

📒 Files selected for processing (2)

README.md
docs/DEPLOYMENT.md

… item
- Update MLA known issue to reflect merged adopt_hidden_tensors workaround and verified correct P2P transfer for Kimi-K2.5-NVFP4; add GLM-5.1 to blocked model list; correct fallback chain (not disk-only)
- Add MX_SKIP_FEATURE_CHECK to configuration table
- Add MLA P2P transfer as active roadmap item

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ASCII layout was misaligned; Mermaid renders correctly on GitHub and is the project standard per CLAUDE.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace minimal 3-node diagram with one that covers:
- External model sources (HF, NGC, GCS) with one-time download / ingress reduction
- ModelExpress Server with Redis / K8s CRD backend
- Source pod load pipeline (cache → post-process → NIXL → publish metadata)
- Target pod ordered fallback chain (RDMA → ModelStreamer → GDS → Default)
- Scale impact callout: 1 download → N pods at ~15 s each via P2P

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
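
The ordered fallback chain named in the commit message above (RDMA → ModelStreamer → GDS → Default) can be illustrated with a small sketch. This is only an illustration of "try each strategy in priority order and fall through on failure"; every name here (`load_with_fallback`, `LoadUnavailable`, the four loader stubs, and their failure reasons) is hypothetical and is not a ModelExpress API.

```python
from typing import Callable, Sequence

# Hypothetical loader type: takes a model name, returns a short description.
Loader = Callable[[str], str]


class LoadUnavailable(Exception):
    """Raised by a loader that cannot serve this request (assumed convention)."""


def load_with_fallback(model: str, chain: Sequence[tuple[str, Loader]]) -> tuple[str, str]:
    """Try each (name, loader) in order and return the first success."""
    errors = []
    for name, loader in chain:
        try:
            return name, loader(model)
        except LoadUnavailable as exc:
            # Remember why this strategy was skipped, then try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all loaders failed: " + "; ".join(errors))


# Stub loaders standing in for the four strategies; the failure reasons are made up.
def rdma_p2p(model: str) -> str:
    raise LoadUnavailable("no source pod has published metadata")


def model_streamer(model: str) -> str:
    raise LoadUnavailable("no object-store URI configured")


def gds(model: str) -> str:
    return f"{model} loaded via GDS"


def default_loader(model: str) -> str:
    return f"{model} loaded via default loader"


CHAIN = [
    ("rdma", rdma_p2p),
    ("model_streamer", model_streamer),
    ("gds", gds),
    ("default", default_loader),
]

if __name__ == "__main__":
    print(load_with_fallback("demo-model", CHAIN))
```

In this sketch the first two strategies decline, so the call resolves to the GDS stub; reordering `CHAIN` changes the priority without touching the loaders.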

Replace single crowded diagram with two clean diagrams:
- Phase 1: external download and cache (sources → server → cache)
- Phase 2: autoscale and rolling update (source pod → RDMA → N target pods with ordered fallback chain)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Replace compact Model Store Providers table with full breakdown of server-side providers (HF, NGC, GCS) and ModelStreamer backends (S3/S3-compatible, GCS, Azure Blob, local filesystem, HF cache) with URI format and auth notes for each
- Add GDS entry with activation conditions
- Update SGLang row from "coming soon" to in-progress GDS PR link

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

zhengluo-nv · 2026-05-07T00:39:59Z

DCO check fails
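
A DCO check generally fails when at least one commit in the PR lacks a `Signed-off-by:` trailer. As a rough sketch of that test, assuming the check only looks for a trailer line of the form `Signed-off-by: Name <email>` (the actual bot may apply stricter rules, such as matching the commit author), the helper names and sample commits below are hypothetical:

```python
import re

# A trailer line such as: Signed-off-by: Jane Dev <jane@example.com>
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_signoff(message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by trailer."""
    return SIGNOFF_RE.search(message) is not None


def unsigned_commits(commits: dict[str, str]) -> list[str]:
    """Return the ids of commits whose messages lack a sign-off, in input order."""
    return [sha for sha, msg in commits.items() if not has_signoff(msg)]


if __name__ == "__main__":
    # Hypothetical commit ids and messages, for illustration only.
    sample = {
        "aaa1111": "docs: example change\n\nSigned-off-by: Jane Dev <jane@example.com>",
        "bbb2222": "docs: another change\n\nCo-Authored-By: Helper <helper@example.com>",
    }
    print(unsigned_commits(sample))
```

A `Co-Authored-By:` trailer alone does not count as a sign-off in this sketch. Missing trailers are usually added with `git commit -s` for new commits or `git rebase --signoff` for existing ones, followed by a force push of the branch.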

ganeshku1 added 9 commits April 30, 2026 11:09

docs: refine README storage and provider support

ca4cffc

docs: clarify README ingress reduction

6653470

docs: align README with latest mainline features

e5e3f65

docs: add air-gapped README guidance

5934f9c

docs: link deployment guide to air-gapped overview

f672d51

docs: add ingress problem statement to README intro

5288458

docs: clarify README download responsibility

6c1f350

docs: rewrite README in product style

0b5cb05

docs: restore roadmap and storage wording

1bd782c

pull-request-size Bot added the size/L label Apr 30, 2026

ganeshku1 had a problem deploying to GITLAB April 30, 2026 16:57 — with GitHub Actions Error

github-actions Bot added the docs label Apr 30, 2026

docs: broaden README product framing

861b3b4

ganeshku1 had a problem deploying to GITLAB April 30, 2026 17:00 — with GitHub Actions Failure

coderabbitai Bot reviewed Apr 30, 2026

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:06 — with GitHub Actions Failure

docs: replace ASCII architecture diagram with Mermaid flowchart

1890e7e

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:17 — with GitHub Actions Failure

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:20 — with GitHub Actions Error

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:22 — with GitHub Actions Error

docs: replace inaccurate IB bandwidth with ConnectX / fast interconnect

11ddb6e

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:24 — with GitHub Actions Error

docs: remove benchmark numbers from architecture diagram

8b8057e

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:25 — with GitHub Actions Error

ganeshku1 had a problem deploying to GITLAB May 1, 2026 04:28 — with GitHub Actions Failure

ganeshku1 temporarily deployed to GITLAB May 7, 2026 20:24 — with GitHub Actions Inactive

docs: update README for current ModelExpress capabilities #258
ganeshku1 wants to merge 17 commits into main from docs/readme-storage-support
