From 7ced141bf8792ba1114afdb326c601f3f4ff003e Mon Sep 17 00:00:00 2001 From: mesutoezdil Date: Thu, 7 May 2026 22:11:15 +0200 Subject: [PATCH 1/3] docs: comprehensive style cleanup across versioned docs and i18n MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Replace emoji: ✅/❌ with Yes/No in device-supported tables, remove 💖 from ladder/adopters, ⚠️ from volcano-vgpu and upgrade docs, 👉 from kcd-beijing blog - Replace triple-asterisk formatting (***text***) with **text** - Fix HAMi capitalization in protocol.md (HAMI -> HAMi) - Remove marketing words: robust, seamlessly - Remove first-person: We/we/our throughout contributor guides, user guides, developer docs, and faq across all versioned snapshots - Remove filler words: just, Note that, Please note, It's worth noting, In summary - Replace HAMi now supports pattern for device vendor guides Signed-off-by: mesutoezdil --- docs/contributor/contributing.md | 2 +- docs/contributor/github-workflow.md | 2 +- docs/contributor/governance.md | 2 +- .../index.md | 18 ++++++++--------- .../current/installation/upgrade.md | 2 +- .../current/userguide/device-supported.md | 20 +++++++++---------- .../enable-enflame-gcu-sharing.md | 6 +++--- .../enable-illuvatar-gpu-sharing.md | 8 ++++---- .../version-v1.3.0/contributor/ladder.md | 2 +- .../userguide/device-supported.md | 14 ++++++------- .../userguide/device-supported.md | 14 ++++++------- .../userguide/device-supported.md | 18 ++++++++--------- .../userguide/device-supported.md | 16 +++++++-------- .../enable-enflame-gcu-sharing.md | 6 +++--- .../enable-illuvatar-gpu-sharing.md | 8 ++++---- .../nvidia-gpu/how-to-use-volcano-vgpu.md | 2 +- .../userguide/device-supported.md | 20 +++++++++---------- .../enable-enflame-gcu-sharing.md | 6 +++--- .../enable-illuvatar-gpu-sharing.md | 8 ++++---- .../nvidia-gpu/how-to-use-volcano-vgpu.md | 2 +- .../userguide/device-supported.md | 20 +++++++++---------- .../enable-enflame-gcu-sharing.md | 6 
+++--- .../enable-illuvatar-gpu-sharing.md | 8 ++++---- .../nvidia-gpu/how-to-use-volcano-vgpu.md | 2 +- .../contributor/contributing.md | 12 +++++------ .../version-v1.3.0/contributor/governance.md | 2 +- .../version-v1.3.0/contributor/ladder.md | 8 ++++---- .../bash-auto-completion-on-linux.md | 2 +- .../version-v1.3.0/developers/dynamic-mig.md | 6 +++--- .../version-v1.3.0/developers/protocol.md | 2 +- .../version-v1.3.0/developers/scheduling.md | 10 +++++----- versioned_docs/version-v1.3.0/faq/faq.md | 7 +++---- versioned_docs/version-v1.3.0/roadmap.md | 2 +- .../troubleshooting/troubleshooting.md | 2 +- .../enable-cambricon-mlu-sharing.md | 10 +++++----- .../version-v1.3.0/userguide/configure.md | 2 +- .../userguide/device-supported.md | 14 ++++++------- .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../metax-device/enable-metax-gpu-schedule.md | 2 +- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 12 +++++------ .../examples/specify-card-type-to-use.md | 2 +- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.4.1/contributor/governance.md | 2 +- .../version-v2.4.1/contributor/ladder.md | 8 ++++---- .../bash-auto-completion-on-linux.md | 2 +- .../version-v2.4.1/developers/dynamic-mig.md | 6 +++--- .../version-v2.4.1/developers/protocol.md | 2 +- .../version-v2.4.1/developers/scheduling.md | 10 +++++----- versioned_docs/version-v2.4.1/faq/faq.md | 7 +++---- versioned_docs/version-v2.4.1/roadmap.md | 2 +- .../troubleshooting/troubleshooting.md | 2 +- .../enable-cambricon-mlu-sharing.md | 10 +++++----- .../version-v2.4.1/userguide/configure.md | 2 +- .../userguide/device-supported.md | 14 ++++++------- .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../metax-device/enable-metax-gpu-schedule.md | 2 +- 
.../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 12 +++++------ .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.5.0/contributor/adopters.md | 6 +++--- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.5.0/contributor/governance.md | 2 +- .../version-v2.5.0/contributor/ladder.md | 8 ++++---- .../bash-auto-completion-on-linux.md | 2 +- .../version-v2.5.0/developers/dynamic-mig.md | 6 +++--- .../version-v2.5.0/developers/protocol.md | 2 +- .../version-v2.5.0/developers/scheduling.md | 10 +++++----- versioned_docs/version-v2.5.0/faq/faq.md | 7 +++---- versioned_docs/version-v2.5.0/roadmap.md | 2 +- .../troubleshooting/troubleshooting.md | 2 +- .../enable-cambricon-mlu-sharing.md | 10 +++++----- .../version-v2.5.0/userguide/configure.md | 2 +- .../userguide/device-supported.md | 16 +++++++-------- .../enable-enflame-gpu-sharing.md | 12 +++++------ .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../enable-iluvatar-gpu-sharing.md | 12 +++++------ .../metax-device/enable-metax-gpu-schedule.md | 2 +- .../metax-device/enable-metax-gpu-sharing.md | 8 ++++---- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 12 +++++------ .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.5.1/contributor/adopters.md | 6 +++--- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.5.1/contributor/governance.md | 2 +- .../version-v2.5.1/contributor/ladder.md | 8 ++++---- .../version-v2.5.1/developers/dynamic-mig.md | 6 +++--- .../version-v2.5.1/developers/protocol.md | 2 +- .../version-v2.5.1/developers/scheduling.md | 10 +++++----- 
.../key-features/device-sharing.md | 2 +- .../troubleshooting/troubleshooting.md | 2 +- .../enable-cambricon-mlu-sharing.md | 10 +++++----- .../userguide/device-supported.md | 18 ++++++++--------- .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../metax-device/enable-metax-gpu-schedule.md | 2 +- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.6.0/contributor/adopters.md | 6 +++--- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.6.0/contributor/governance.md | 2 +- .../version-v2.6.0/contributor/ladder.md | 8 ++++---- .../version-v2.6.0/developers/dynamic-mig.md | 6 +++--- .../version-v2.6.0/developers/protocol.md | 2 +- .../version-v2.6.0/developers/scheduling.md | 14 ++++++------- versioned_docs/version-v2.6.0/faq/faq.md | 4 ++-- .../key-features/device-sharing.md | 2 +- .../troubleshooting-copy/troubleshooting.md | 2 +- .../troubleshooting/troubleshooting.md | 2 +- .../enable-cambricon-mlu-sharing.md | 12 +++++------ .../userguide/device-supported.md | 18 ++++++++--------- .../enable-enflame-gcu-sharing.md | 12 +++++------ .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../enable-illuvatar-gpu-sharing.md | 12 +++++------ .../metax-gpu/enable-metax-gpu-schedule.md | 2 +- .../metax-sgpu/enable-metax-gpu-sharing.md | 8 ++++---- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.7.0/contributor/adopters.md | 6 +++--- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.7.0/contributor/governance.md | 2 +- 
.../version-v2.7.0/contributor/ladder.md | 8 ++++---- .../version-v2.7.0/developers/dynamic-mig.md | 6 +++--- .../developers/kunlunxin-topology.md | 2 +- .../version-v2.7.0/developers/protocol.md | 2 +- .../version-v2.7.0/developers/scheduling.md | 10 +++++----- versioned_docs/version-v2.7.0/faq/faq.md | 4 ++-- .../key-features/device-sharing.md | 2 +- .../userguide/device-supported.md | 20 +++++++++---------- .../enable-enflame-gcu-sharing.md | 12 +++++------ .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../enable-illuvatar-gpu-sharing.md | 12 +++++------ .../kunlunxin-device/enable-kunlunxin-vxpu.md | 6 +++--- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.8.0/contributor/adopters.md | 6 +++--- .../contributor/contribute-docs.md | 14 ++++++------- .../contributor/contributing.md | 12 +++++------ .../contributor/github-workflow.md | 2 +- .../version-v2.8.0/contributor/governance.md | 2 +- .../version-v2.8.0/contributor/ladder.md | 8 ++++---- .../version-v2.8.0/developers/dynamic-mig.md | 6 +++--- .../developers/kunlunxin-topology.md | 2 +- .../version-v2.8.0/developers/protocol.md | 2 +- .../version-v2.8.0/developers/scheduling.md | 10 +++++----- versioned_docs/version-v2.8.0/faq/faq.md | 4 ++-- .../key-features/device-sharing.md | 2 +- .../userguide/device-supported.md | 20 +++++++++---------- .../enable-enflame-gcu-sharing.md | 12 +++++------ .../hygon-device/enable-hygon-dcu-sharing.md | 10 +++++----- .../enable-illuvatar-gpu-sharing.md | 12 +++++------ .../kunlunxin-device/enable-kunlunxin-vxpu.md | 6 +++--- .../userguide/monitoring/device-allocation.md | 2 +- .../enable-mthreads-gpu-sharing.md | 8 ++++---- .../nvidia-device/dynamic-mig-support.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- 174 files changed, 599 insertions(+), 602 deletions(-) diff --git 
a/docs/contributor/contributing.md b/docs/contributor/contributing.md index 726b5d41..5686eff9 100644 --- a/docs/contributor/contributing.md +++ b/docs/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations diff --git a/docs/contributor/github-workflow.md b/docs/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/docs/contributor/github-workflow.md +++ b/docs/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/docs/contributor/governance.md b/docs/contributor/governance.md index 5cbbfb25..ca077537 100644 --- a/docs/contributor/governance.md +++ b/docs/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. 
* Participation: Responsibilities within the project are earned through diff --git a/i18n/zh/docusaurus-plugin-content-blog/kcd-beijing-2026-dra-gpu-scheduling/index.md b/i18n/zh/docusaurus-plugin-content-blog/kcd-beijing-2026-dra-gpu-scheduling/index.md index 3a66c562..3043661a 100644 --- a/i18n/zh/docusaurus-plugin-content-blog/kcd-beijing-2026-dra-gpu-scheduling/index.md +++ b/i18n/zh/docusaurus-plugin-content-blog/kcd-beijing-2026-dra-gpu-scheduling/index.md @@ -69,7 +69,7 @@ HAMi 社区不仅受邀进行了技术分享,也在现场设立了展台,与 - 多卡组合 - 拓扑(NUMA / NVLink) -👉 这直接导致: +这直接导致: - 调度逻辑外溢(extender / sidecar) - 系统复杂度上升 @@ -91,7 +91,7 @@ DRA 的核心优势是: PPT 里有一页非常关键,很多人会忽略: -### 👉 DRA 请求长这样 +### DRA 请求长这样 ```yaml spec: @@ -119,7 +119,7 @@ resources: nvidia.com/gpu: 1 ``` -👉 结论非常明确: +结论非常明确: > **DRA 是能力升级,但 UX 明显退化。** @@ -127,7 +127,7 @@ resources: 这是这次分享最有价值的部分之一: -### 👉 Webhook 自动生成 ResourceClaim +### Webhook 自动生成 ResourceClaim HAMi 的做法不是让用户"直接用 DRA",而是: @@ -178,7 +178,7 @@ DRA driver 并不只是"注册资源",而是完整 lifecycle 管理: - 环境变量管理 - 临时目录(cache / lock) -👉 这意味着: +这意味着: > **GPU 调度已经进入 runtime orchestration 层,而不是简单资源分配。** @@ -191,7 +191,7 @@ PPT 中给出了一个很关键的 benchmark: - HAMi(传统):最高 ~42,000 - HAMi-DRA:显著下降(~30%+ 改善) -👉 这说明: +这说明: > **DRA 的资源预绑定机制,可以减少调度阶段冲突和重试** @@ -211,7 +211,7 @@ PPT 中给出了一个很关键的 benchmark: - ResourceClaim:资源分配 - → **资源视角是第一等公民** -👉 这带来的变化: +这带来的变化: > **Observability 从"推导"变成"直接建模"** @@ -227,7 +227,7 @@ PPT 提出了一个非常关键的未来方向: - PCI bus ID - GPU attributes -👉 这其实是一个更大的叙事: +这其实是一个更大的叙事: > **DRA 是 heterogeneous compute abstraction 的起点** @@ -249,7 +249,7 @@ PPT 提出了一个非常关键的未来方向: - 调度逻辑 → 资源声明 -👉 本质上: +本质上: > **Kubernetes 正在进化为 AI Infra Control Plane** diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/installation/upgrade.md b/i18n/zh/docusaurus-plugin-content-docs/current/installation/upgrade.md index 13366edd..e8eb43fd 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/installation/upgrade.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/installation/upgrade.md @@ -40,7 +40,7 
@@ kubectl get all -n kube-system -l app=hami -o yaml > hami-state-backup.yaml ### 3. 清理运行中的工作负载 -⚠️ **关键提醒:** 升级前必须停止或重新调度所有 GPU 工作负载。在存在运行任务的情况下升级,可能导致段错误(segmentation fault)或不可预测行为。 +**关键提醒:** 升级前必须停止或重新调度所有 GPU 工作负载。在存在运行任务的情况下升级,可能导致段错误(segmentation fault)或不可预测行为。 **优雅清理 GPU 工作负载:** diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/device-supported.md index 20b98df2..6a7770ad 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/device-supported.md @@ -7,13 +7,13 @@ HAMi 支持的设备如下表所示: | 生产商 | 制造商 | 类型 | 内存隔离 | 核心隔离 | 多卡支持 | |-------|-------|-----|---------|--------|---------| -| GPU | NVIDIA | 全部 | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | Huawei Ascend | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | 全部 | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| XPU | Kunlunxin | P800 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | \ No newline at end of file +| GPU | NVIDIA | 全部 | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| NPU | Huawei Ascend | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | 全部 | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| XPU | Kunlunxin | P800 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/enflame-device/enable-enflame-gcu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/enflame-device/enable-enflame-gcu-sharing.md index 55a9f13f..15c6e444 100644 --- 
a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -8,11 +8,11 @@ linktitle: GPU 共享 本组件支持复用燧原 GCU 设备 (S60),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***百分比切片能力***: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 +**百分比切片能力**: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GCU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GCU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gcushare-device-plugin 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index ae671d5e..e7258589 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -8,13 +8,13 @@ translated: true 本组件支持复用天数智芯 GPU 设备 (MR-V100、BI-V150、BI-V100),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***可限制分配的显存大小***: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的显存大小**: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***可限制分配的算力核组比例***: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的算力核组比例**: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GPU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GPU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gpu-manager 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md index f79195c0..7ace3e75 100644 --- 
a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md @@ -47,7 +47,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added [to the AUTHORS list](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it.
diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/device-supported.md index 73144619..39ba619d 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/device-supported.md @@ -6,10 +6,10 @@ The view of device supported by HAMi is shown in this table below: | Production | manufactor | Type |MemoryIsolation | CoreIsolation | MultiCard support | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/userguide/device-supported.md index 1465e994..6ce8c199 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/userguide/device-supported.md @@ -7,10 +7,10 @@ HAMi 支持的设备如下表所示: | 类型 | 制造商 | 型号 | 内存隔离 | 核心隔离 | 多卡支持 | |-----|-------|-----|---------|---------|--------| -| GPU | 英伟达 | 全部 | ✅ | ✅ | ✅ | -| MLU | 寒武纪 | 370, 590 | ✅ | ✅ | ❌ | -| DCU | 中科海光 | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | 华为昇腾 | 910B, 910B3, 310P | ✅ | ✅ 
| ❌ | -| GPU | 天数智芯 | 全部 | ✅ | ✅ | ❌ | -| GPU | 摩尔线程 | MTT S4000 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | +| GPU | 英伟达 | 全部 | Yes | Yes | Yes | +| MLU | 寒武纪 | 370, 590 | Yes | Yes | No | +| DCU | 中科海光 | Z100, Z100L | Yes | Yes | No | +| NPU | 华为昇腾 | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | 天数智芯 | 全部 | Yes | Yes | No | +| GPU | 摩尔线程 | MTT S4000 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/userguide/device-supported.md index 5c9f1f72..e15a042a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/userguide/device-supported.md @@ -7,12 +7,12 @@ HAMi支持的设备视图如下表所示: | 生产商 | 制造商 | 类型 | 内存隔离 | 核心隔离 | 多卡支持 | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | 全部 | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | 全部 | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | \ No newline at end of file +| GPU | NVIDIA | 全部 | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | 全部 | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/device-supported.md 
b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/device-supported.md index d37fbc30..90f47542 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/device-supported.md @@ -7,11 +7,11 @@ HAMi支持的设备视图如下表所示: | 生产商 | 制造商 | 类型 | 内存隔离 | 核心隔离 | 多卡支持 | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | 全部 | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | 全部 | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | \ No newline at end of file +| GPU | NVIDIA | 全部 | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | 全部 | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index a549f754..9d65fae6 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -7,11 +7,11 @@ title: 启用燧原 GPU 共享 本组件支持复用燧原 GCU 设备 (S60),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***百分比切片能力***: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 +**百分比切片能力**: 你现在可以用百分比来申请一个 GCU 
切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GCU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GCU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gcushare-device-plugin 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 03829747..9b08227b 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -7,13 +7,13 @@ title: 启用天数智芯 GPU 共享 本组件支持复用天数智芯 GPU 设备 (MR-V100、BI-V150、BI-V100),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***可限制分配的显存大小***: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的显存大小**: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***可限制分配的算力核组比例***: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的算力核组比例**: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GPU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GPU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gpu-manager 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md index 3e59d28f..bc32f350 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md @@ -127,7 +127,7 @@ EOF 你可以在容器内使用 `nvidia-smi` 验证设备显存使用情况: -> **⚠️ 警告:** +> **警告:** > 如果你在使用 device plugin 配合 NVIDIA 镜像时未显式请求 GPU, > 那么该节点上所有 GPU 都会暴露在你的容器中。 > 容器中使用的 vGPU 数量不能超过该节点上的 GPU 总数。 
diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/device-supported.md index bd82234c..304f98d4 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/device-supported.md @@ -7,13 +7,13 @@ HAMi支持的设备视图如下表所示: | 生产商 | 制造商 | 类型 | 内存隔离 | 核心隔离 | 多卡支持 | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | 全部 | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | 全部 | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| XPU | Kunlunxin | P800 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | \ No newline at end of file +| GPU | NVIDIA | 全部 | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | 全部 | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| XPU | Kunlunxin | P800 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index a549f754..9d65fae6 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -7,11 +7,11 @@ 
title: 启用燧原 GPU 共享 本组件支持复用燧原 GCU 设备 (S60),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***百分比切片能力***: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 +**百分比切片能力**: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GCU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GCU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gcushare-device-plugin 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 8d1b173a..d15a3082 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -7,13 +7,13 @@ title: 启用天数智芯 GPU 共享 本组件支持复用天数智芯 GPU 设备 (MR-V100、BI-V150、BI-V100),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***可限制分配的显存大小***: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的显存大小**: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***可限制分配的算力核组比例***: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的算力核组比例**: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GPU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GPU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gpu-manager 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md index 3e59d28f..bc32f350 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md +++ 
b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md @@ -127,7 +127,7 @@ EOF 你可以在容器内使用 `nvidia-smi` 验证设备显存使用情况: -> **⚠️ 警告:** +> **警告:** > 如果你在使用 device plugin 配合 NVIDIA 镜像时未显式请求 GPU, > 那么该节点上所有 GPU 都会暴露在你的容器中。 > 容器中使用的 vGPU 数量不能超过该节点上的 GPU 总数。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/device-supported.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/device-supported.md index d11cb466..a195027a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/device-supported.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/device-supported.md @@ -7,13 +7,13 @@ HAMi支持的设备视图如下表所示: | 生产商 | 制造商 | 类型 | 内存隔离 | 核心隔离 | 多卡支持 | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | 全部 | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | 全部 | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| XPU | Kunlunxin | P800 | ✅ | ✅ | ❌ | -| DPU | Teco | 检查中 | 进行中 | 进行中 | ❌ | \ No newline at end of file +| GPU | NVIDIA | 全部 | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | 全部 | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| XPU | Kunlunxin | P800 | Yes | Yes | No | +| DPU | Teco | 检查中 | 进行中 | 进行中 | No | \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md 
b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index 91303632..0dea3f0a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -8,11 +8,11 @@ linktitle: GPU 共享 本组件支持复用燧原 GCU 设备 (S60),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***百分比切片能力***: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 +**百分比切片能力**: 你现在可以用百分比来申请一个 GCU 切片(例如 20%),本组件会确保任务使用的显存和算力不会超过这个百分比对应的数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GCU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GCU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gcushare-device-plugin 即可使用 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index ae671d5e..e7258589 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -8,13 +8,13 @@ translated: true 本组件支持复用天数智芯 GPU 设备 (MR-V100、BI-V150、BI-V100),并为此提供以下几种与 vGPU 类似的复用功能,包括: -***GPU 共享***: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 +**GPU 共享**: 每个任务可以只占用一部分显卡,多个任务可以共享一张显卡 -***可限制分配的显存大小***: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的显存大小**: 你现在可以用显存值(例如 3000M)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***可限制分配的算力核组比例***: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 +**可限制分配的算力核组比例**: 你现在可以用算力比例(例如 60%)来分配 GPU,本组件会确保任务使用的显存不会超过分配数值 -***设备 UUID 选择***: 你可以通过注解指定使用或排除特定的 GPU 设备 +**设备 UUID 选择**: 你可以通过注解指定使用或排除特定的 GPU 设备 **部署说明**: 部署本组件后,只需要部署厂家提供的 gpu-manager 即可使用 diff --git 
a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md index a67d805c..554e2abf 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/userguide/volcano-vgpu/nvidia-gpu/how-to-use-volcano-vgpu.md @@ -128,7 +128,7 @@ EOF 你可以在容器内使用 `nvidia-smi` 验证设备显存使用情况: -> **⚠️ 警告:** +> **警告:** > 如果你在使用 device plugin 配合 NVIDIA 镜像时未显式请求 GPU, > 那么该节点上所有 GPU 都会暴露在你的容器中。 > 容器中使用的 vGPU 数量不能超过该节点上的 GPU 总数。 diff --git a/versioned_docs/version-v1.3.0/contributor/contributing.md b/versioned_docs/version-v1.3.0/contributor/contributing.md index fab6d31c..0c88de85 100644 --- a/versioned_docs/version-v1.3.0/contributor/contributing.md +++ b/versioned_docs/version-v1.3.0/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. 
+Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. +Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue. The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v1.3.0/contributor/governance.md b/versioned_docs/version-v1.3.0/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v1.3.0/contributor/governance.md +++ b/versioned_docs/version-v1.3.0/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. 
Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. * Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v1.3.0/contributor/ladder.md b/versioned_docs/version-v1.3.0/contributor/ladder.md index 68ee4d27..88d8a2b8 100644 --- a/versioned_docs/version-v1.3.0/contributor/ladder.md +++ b/versioned_docs/version-v1.3.0/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This docs different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. 
@@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added [here](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer.
-When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. ## Removing Maintainers diff --git a/versioned_docs/version-v1.3.0/developers/bash-auto-completion-on-linux.md b/versioned_docs/version-v1.3.0/developers/bash-auto-completion-on-linux.md index 489f46c4..6b22baf8 100644 --- a/versioned_docs/version-v1.3.0/developers/bash-auto-completion-on-linux.md +++ b/versioned_docs/version-v1.3.0/developers/bash-auto-completion-on-linux.md @@ -47,4 +47,4 @@ Both approaches are equivalent. After reloading your shell, karmadactl autocompl ## Enable kubectl-karmada autocompletion Currently, kubectl plugins do not support autocomplete, but it is already planned in [Command line completion for kubectl plugins](https://github.com/kubernetes/kubernetes/issues/74178). -We will update the documentation as soon as it does. +Documentation will be updated when support is added. diff --git a/versioned_docs/version-v1.3.0/developers/dynamic-mig.md b/versioned_docs/version-v1.3.0/developers/dynamic-mig.md index ccdde8a9..65f5c93b 100644 --- a/versioned_docs/version-v1.3.0/developers/dynamic-mig.md +++ b/versioned_docs/version-v1.3.0/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. 
The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so MPS and MIG are preferred. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but the current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed. ## Targets @@ -150,7 +150,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: HAMi dynamic MIG procedure flowchart showing task scheduling process -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit.
You can always change the content of that configMap, and restart vc-scheduler to customize. If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v1.3.0/developers/protocol.md b/versioned_docs/version-v1.3.0/developers/protocol.md index 1206de7b..382f0490 100644 --- a/versioned_docs/version-v1.3.0/developers/protocol.md +++ b/versioned_docs/version-v1.3.0/developers/protocol.md @@ -6,7 +6,7 @@ title: Protocol design ### Device Registration -In order to perform more accurate scheduling, the HAMI scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc +In order to perform more accurate scheduling, the HAMi scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc. However, the device-plugin device registration API does not provide corresponding parameter acquisition, so HAMi-device-plugin stores these supplementary information in the node annotations during registering for the scheduler to read, as the following figure shows: diff --git a/versioned_docs/version-v1.3.0/developers/scheduling.md b/versioned_docs/version-v1.3.0/developers/scheduling.md index 02270146..7f8ce100 100644 --- a/versioned_docs/version-v1.3.0/developers/scheduling.md +++ b/versioned_docs/version-v1.3.0/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +The scheduler adds a `node-scheduler-policy` and `gpu-scheduler-policy` to the config; using this policy, the scheduler can implement node `binpack` or `spread` or GPU `binpack` or `spread`.
and use can set Pod annotation to change this default policy, use `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay scheduler config. ### User Stories @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/versioned_docs/version-v1.3.0/faq/faq.md b/versioned_docs/version-v1.3.0/faq/faq.md index c52d50a2..c7403c18 100644 --- a/versioned_docs/version-v1.3.0/faq/faq.md +++ b/versioned_docs/version-v1.3.0/faq/faq.md @@ -19,15 +19,14 @@ Both of them are used to hold the propagation declaration, but they have differe `kube-controller-manager` is composed of a bunch of controllers, Karmada inherits some controllers from it to keep a consistent user experience and behavior. -It's worth noting that not all controllers are needed by Karmada, for the recommended controllers please +Not all controllers are needed by Karmada; for the recommended controllers please ## Can I install Karmada in a Kubernetes cluster and reuse the kube-apiserver as Karmada apiserver? The quick answer is `yes`.
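The binpack and spread scoring worked through in the scheduling.md hunk above can be sketched in a few lines. This is an illustrative sketch of the documented formulas only, not HAMi's actual scheduler code; the function names are hypothetical:

```python
# Sketch of the node/GPU scoring formulas from the scheduling proposal
# (hypothetical helper names, not HAMi's real implementation).
# A score is the fraction of a resource used after placing the task, scaled
# to 10. Binpack picks the highest score (pack busy nodes); Spread picks
# the lowest (prefer idle nodes).

def node_score(used: int, requested: int, total: int) -> float:
    # Node score: ((used + requested) / total) * 10
    return (used + requested) / total * 10

def gpu_score(used_core: int, req_core: int, total_core: int,
              used_mem: int, req_mem: int, total_mem: int) -> float:
    # GPU score sums the core-usage and memory-usage ratios before scaling.
    return ((used_core + req_core) / total_core
            + (used_mem + req_mem) / total_mem) * 10

# Numbers from the document's examples:
n1 = node_score(1, 3, 4)   # Node1: ((1+3)/4) * 10 = 10
n2 = node_score(1, 2, 4)   # Node2: ((1+2)/4) * 10 = 7.5
g1 = gpu_score(20, 10, 100, 1000, 2000, 8000)  # GPU1: 6.75
g2 = gpu_score(20, 70, 100, 1000, 6000, 8000)  # GPU2: 17.75

nodes = {"Node1": n1, "Node2": n2}
binpack_pick = max(nodes, key=nodes.get)  # Binpack selects Node1
spread_pick = min(nodes, key=nodes.get)   # Spread selects Node2
```

Applied to the GPU example with the same rule, binpack would pick GPU2 (17.75) and spread would pick GPU1 (6.75), matching the hunks above.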
In that case, you can save the effort to deploy -[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and just -share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters -can be inherited seamlessly. We do have some users using Karmada in this way. +[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters +can be inherited. Some users run Karmada this way. There are some things you should consider before doing so: diff --git a/versioned_docs/version-v1.3.0/roadmap.md b/versioned_docs/version-v1.3.0/roadmap.md index a50c8db3..8793e529 100644 --- a/versioned_docs/version-v1.3.0/roadmap.md +++ b/versioned_docs/version-v1.3.0/roadmap.md @@ -6,7 +6,7 @@ title: Karmada Roadmap This document defines a high level roadmap for Karmada development and upcoming releases. Community and contributor involvement is vital for successfully implementing all desired items for each release. -We hope that the items listed below will inspire further engagement from the community to keep karmada progressing and shipping exciting and valuable features. +The items below are intended to inspire further community engagement to keep HAMi progressing and shipping exciting and valuable features. ## 2022 H1 diff --git a/versioned_docs/version-v1.3.0/troubleshooting/troubleshooting.md b/versioned_docs/version-v1.3.0/troubleshooting/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v1.3.0/troubleshooting/troubleshooting.md +++ b/versioned_docs/version-v1.3.0/troubleshooting/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. 
- Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. -We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`. If you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start; there are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`. - Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index ae498abe..92b3f2fb 100644 --- a/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,15 +4,15 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: -***MLU sharing***: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks.
-***Device Memory Control***: MLUs can be allocated with certain device memory size on certain type(i.e 370) and have made it that it does not exceed the boundary. +**Device Memory Control**: MLUs can be allocated a specified device memory size on certain types (i.e. 370), and usage will not exceed that boundary. -***MLU Type Specification***: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. +**MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v1.3.0/userguide/configure.md b/versioned_docs/version-v1.3.0/userguide/configure.md index 8e3a34f3..536c889e 100644 --- a/versioned_docs/version-v1.3.0/userguide/configure.md +++ b/versioned_docs/version-v1.3.0/userguide/configure.md @@ -18,7 +18,7 @@ You can update these configurations using one of the following methods: 2. Modify Helm Chart: Update the corresponding values in the [ConfigMap](https://raw.githubusercontent.com/archlitchi/HAMi/refs/heads/master/charts/hami/templates/scheduler/device-configmap.yaml), then reapply the Helm Chart to regenerate the ConfigMap. * `nvidia.deviceMemoryScaling:` - Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature).
For NVIDIA GPU with *M* memory, if we set `nvidia.deviceMemoryScaling` argument to *S*, vGPUs split by this GPU will totally get `S * M` memory in Kubernetes with our device plugin. + Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if `nvidia.deviceMemoryScaling` is set to *S*, vGPUs split from this GPU will get a total of `S * M` memory in Kubernetes with the HAMi device plugin. * `nvidia.deviceSplitCount:` Integer type, by default: equals 10. Maximum tasks assigned to a simple GPU device. * `nvidia.migstrategy:` diff --git a/versioned_docs/version-v1.3.0/userguide/device-supported.md b/versioned_docs/version-v1.3.0/userguide/device-supported.md index 73144619..39ba619d 100644 --- a/versioned_docs/version-v1.3.0/userguide/device-supported.md +++ b/versioned_docs/version-v1.3.0/userguide/device-supported.md @@ -6,10 +6,10 @@ The view of device supported by HAMi is shown in this table below: | Production | manufactor | Type |MemoryIsolation | CoreIsolation | MultiCard support | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/versioned_docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md
b/versioned_docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index a90f4086..64fd849b 100644 --- a/versioned_docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated a specified device memory size on certain types (i.e. Z100), and usage will not exceed that boundary. -***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated a certain percentage of device cores (i.e. hygon.com/dcucores:60 indicates this container uses 60% of the compute cores of this device) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations.
## Prerequisites diff --git a/versioned_docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md b/versioned_docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md index 164f7403..2b3cf90f 100644 --- a/versioned_docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md +++ b/versioned_docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md @@ -2,7 +2,7 @@ title: Enable Metax GPU topology-aware scheduling --- -**We now support metax.com/gpu by implementing topo-awareness among metax GPUs**: +**HAMi now supports metax.com/gpu by implementing topo-awareness among metax GPUs**: When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship. This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/versioned_docs/version-v1.3.0/userguide/monitoring/device-allocation.md b/versioned_docs/version-v1.3.0/userguide/monitoring/device-allocation.md index 94bbf0a5..7060687e 100644 --- a/versioned_docs/version-v1.3.0/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v1.3.0/userguide/monitoring/device-allocation.md @@ -21,4 +21,4 @@ It contains the following metrics: | GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. 
\ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that part, see real-time device usage. \ No newline at end of file diff --git a/versioned_docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 9942c982..1b548b3f 100644 --- a/versioned_docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated a specified device memory size on certain types (i.e. MTT S4000), and usage will not exceed that boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated limited compute cores on certain types (i.e. MTT S4000), and usage will not exceed that boundary.
## Important Notes diff --git a/versioned_docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md index 13dd62d1..5f2a1c80 100644 --- a/versioned_docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md @@ -4,17 +4,17 @@ title: Enable dynamic-mig feature ## Introduction -**We now support dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: +**HAMi now supports dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: -***Dynamic MIG instance management***: User don't need to operate on GPU node, using 'nvidia-smi -i 0 -mig 1' or other command to manage MIG instance, all will be done by HAMi-device-plugin. +**Dynamic MIG instance management**: Users don't need to operate on the GPU node, using 'nvidia-smi -i 0 -mig 1' or other commands to manage MIG instances; all will be done by HAMi-device-plugin. -***Dynamic MIG Adjustment***: Each MIG device managed by HAMi will dynamically adjust their MIG template according to tasks submitted when necessary. +**Dynamic MIG Adjustment**: Each MIG device managed by HAMi will dynamically adjust its MIG template according to tasks submitted when necessary. -***Device MIG Observation***: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information. user can get a clear overview of MIG nodes. +**Device MIG Observation**: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information. Users can get a clear overview of MIG nodes. -***Compatible with HAMi-core nodes***: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node if not appointed manually by using `nvidia.com/vgpu-mode` annotation.
+**Compatible with HAMi-core nodes**: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node if not appointed manually by using `nvidia.com/vgpu-mode` annotation. -***Unified API with HAMi-core***: Zero work needs to be done to make the job compatible with dynamic-mig feature. +**Unified API with HAMi-core**: Zero work needs to be done to make the job compatible with dynamic-mig feature. ## Prerequisites diff --git a/versioned_docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index 397e984f..946f0e50 100644 --- a/versioned_docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types; use a comma to separate them. In this example, the job targets A100 or V100.* diff --git a/versioned_docs/version-v2.4.1/contributor/contribute-docs.md b/versioned_docs/version-v2.4.1/contributor/contribute-docs.md index 5fe0302e..7ebbb69c 100644 --- a/versioned_docs/version-v2.4.1/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.4.1/contributor/contribute-docs.md @@ -9,12 +9,12 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. - 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown.
If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. -- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The docs use Markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. +- The site uses [Docusaurus 2](https://docusaurus.io/), a modern static website generator. ## Setup @@ -85,7 +85,7 @@ title: A doc with tags ## secondary title ``` -The top section between two lines of --- is the Front Matter section. Here we define a couple of entries which tell Docusaurus how to handle the article: +The top section between two lines of --- is the Front Matter section. These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `

` in a HTML document or `# ` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document (without the extension) related to the root docs directory. @@ -101,7 +101,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -187,5 +187,5 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +Newly supplemented documents for each version are synchronized to the latest version on its release date, and the documents of the old version will not be modified. +For errata found in the documentation, fixes are applied with every release. diff --git a/versioned_docs/version-v2.4.1/contributor/contributing.md b/versioned_docs/version-v2.4.1/contributor/contributing.md index fab6d31c..0c88de85 100644 --- a/versioned_docs/version-v2.4.1/contributor/contributing.md +++ b/versioned_docs/version-v2.4.1/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi!
## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome: fixing documentation, reporting bugs, or writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. +Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue.
The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.4.1/contributor/github-workflow.md b/versioned_docs/version-v2.4.1/contributor/github-workflow.md index 2018d45e..e582f7f6 100644 --- a/versioned_docs/version-v2.4.1/contributor/github-workflow.md +++ b/versioned_docs/version-v2.4.1/contributor/github-workflow.md @@ -107,7 +107,7 @@ in a few cycles. ### 6 Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.4.1/contributor/governance.md b/versioned_docs/version-v2.4.1/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.4.1/contributor/governance.md +++ b/versioned_docs/version-v2.4.1/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. 
* Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.4.1/contributor/ladder.md b/versioned_docs/version-v2.4.1/contributor/ladder.md index 26ea756f..14aff5d7 100644 --- a/versioned_docs/version-v2.4.1/contributor/ladder.md +++ b/versioned_docs/version-v2.4.1/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This docs different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. @@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 
💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added to the [HAMi AUTHORS list](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. -When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi.
## Removing Maintainers diff --git a/versioned_docs/version-v2.4.1/developers/bash-auto-completion-on-linux.md b/versioned_docs/version-v2.4.1/developers/bash-auto-completion-on-linux.md index 489f46c4..6b22baf8 100644 --- a/versioned_docs/version-v2.4.1/developers/bash-auto-completion-on-linux.md +++ b/versioned_docs/version-v2.4.1/developers/bash-auto-completion-on-linux.md @@ -47,4 +47,4 @@ Both approaches are equivalent. After reloading your shell, karmadactl autocompl ## Enable kubectl-karmada autocompletion Currently, kubectl plugins do not support autocomplete, but it is already planned in [Command line completion for kubectl plugins](https://github.com/kubernetes/kubernetes/issues/74178). -We will update the documentation as soon as it does. +Documentation will be updated when support is added. diff --git a/versioned_docs/version-v2.4.1/developers/dynamic-mig.md b/versioned_docs/version-v2.4.1/developers/dynamic-mig.md index fd22875b..8872a832 100644 --- a/versioned_docs/version-v2.4.1/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.4.1/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include: time-slice, MPS and MIG.
The context switch for time slice sharing would waste some time, so MPS and MIG are preferred. The GPU MIG profile is variable; the user could acquire the MIG device in the profile definition, but the current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed. ## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize.
If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.4.1/developers/protocol.md b/versioned_docs/version-v2.4.1/developers/protocol.md index 0a02dba9..6e680bd6 100644 --- a/versioned_docs/version-v2.4.1/developers/protocol.md +++ b/versioned_docs/version-v2.4.1/developers/protocol.md @@ -6,7 +6,7 @@ title: Protocol design ### Device Registration -In order to perform more accurate scheduling, the HAMI scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc +In order to perform more accurate scheduling, the HAMi scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, NUMA number, etc. However, the device-plugin device registration API does not provide corresponding parameter acquisition, so HAMi-device-plugin stores these supplementary information in the node annotations during registering for the scheduler to read, as the following figure shows: diff --git a/versioned_docs/version-v2.4.1/developers/scheduling.md b/versioned_docs/version-v2.4.1/developers/scheduling.md index 02270146..7f8ce100 100644 --- a/versioned_docs/version-v2.4.1/developers/scheduling.md +++ b/versioned_docs/version-v2.4.1/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +A `node-scheduler-policy` and a `gpu-scheduler-policy` are added to the config; the scheduler uses these policies to implement node-level `binpack` or `spread` and GPU-level `binpack` or `spread`.
Users can set a Pod annotation to change this default policy, using `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay the scheduler config. ### User Stories @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/versioned_docs/version-v2.4.1/faq/faq.md b/versioned_docs/version-v2.4.1/faq/faq.md index c52d50a2..c7403c18 100644 --- a/versioned_docs/version-v2.4.1/faq/faq.md +++ b/versioned_docs/version-v2.4.1/faq/faq.md @@ -19,15 +19,14 @@ Both of them are used to hold the propagation declaration, but they have differe `kube-controller-manager` is composed of a bunch of controllers, Karmada inherits some controllers from it to keep a consistent user experience and behavior. -It's worth noting that not all controllers are needed by Karmada, for the recommended controllers please +Not all controllers are needed by Karmada; for the recommended controllers, please ## Can I install Karmada in a Kubernetes cluster and reuse the kube-apiserver as Karmada apiserver? The quick answer is `yes`.
In that case, you can save the effort to deploy -[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and just -share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters -can be inherited seamlessly. We do have some users using Karmada in this way. +[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters +can be inherited. Some users run Karmada this way. There are some things you should consider before doing so: diff --git a/versioned_docs/version-v2.4.1/roadmap.md b/versioned_docs/version-v2.4.1/roadmap.md index a50c8db3..8793e529 100644 --- a/versioned_docs/version-v2.4.1/roadmap.md +++ b/versioned_docs/version-v2.4.1/roadmap.md @@ -6,7 +6,7 @@ title: Karmada Roadmap This document defines a high level roadmap for Karmada development and upcoming releases. Community and contributor involvement is vital for successfully implementing all desired items for each release. -We hope that the items listed below will inspire further engagement from the community to keep karmada progressing and shipping exciting and valuable features. +The items below are intended to inspire further community engagement to keep HAMi progressing and shipping exciting and valuable features. ## 2022 H1 diff --git a/versioned_docs/version-v2.4.1/troubleshooting/troubleshooting.md b/versioned_docs/version-v2.4.1/troubleshooting/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v2.4.1/troubleshooting/troubleshooting.md +++ b/versioned_docs/version-v2.4.1/troubleshooting/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. 
- Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. -We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`. If you use image version `v2.3.9`, `device-plugin` may fail to start; there are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`. - Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index ae498abe..92b3f2fb 100644 --- a/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,15 +4,15 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: -***MLU sharing***: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. +**MLU sharing**: Each task can allocate a portion of an MLU instead of a whole MLU card, so an MLU can be shared among multiple tasks.
-***Device Memory Control***: MLUs can be allocated with certain device memory size on certain type(i.e 370) and have made it that it does not exceed the boundary. +**Device Memory Control**: MLUs can be allocated a certain device memory size on a certain type (e.g. 370), and the allocation is guaranteed not to exceed that boundary. -***MLU Type Specification***: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. +**MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.4.1/userguide/configure.md b/versioned_docs/version-v2.4.1/userguide/configure.md index 8e3a34f3..536c889e 100644 --- a/versioned_docs/version-v2.4.1/userguide/configure.md +++ b/versioned_docs/version-v2.4.1/userguide/configure.md @@ -18,7 +18,7 @@ You can update these configurations using one of the following methods: 2. Modify Helm Chart: Update the corresponding values in the [ConfigMap](https://raw.githubusercontent.com/archlitchi/HAMi/refs/heads/master/charts/hami/templates/scheduler/device-configmap.yaml), then reapply the Helm Chart to regenerate the ConfigMap. * `nvidia.deviceMemoryScaling:` - Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature).
For NVIDIA GPU with *M* memory, if we set `nvidia.deviceMemoryScaling` argument to *S*, vGPUs split by this GPU will totally get `S * M` memory in Kubernetes with our device plugin. + Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if `nvidia.deviceMemoryScaling` is set to *S*, vGPUs split from this GPU will get a total of `S * M` memory in Kubernetes with the HAMi device plugin. * `nvidia.deviceSplitCount:` Integer type, by default: equals 10. Maximum tasks assigned to a simple GPU device. * `nvidia.migstrategy:` diff --git a/versioned_docs/version-v2.4.1/userguide/device-supported.md b/versioned_docs/version-v2.4.1/userguide/device-supported.md index 73144619..39ba619d 100644 --- a/versioned_docs/version-v2.4.1/userguide/device-supported.md +++ b/versioned_docs/version-v2.4.1/userguide/device-supported.md @@ -6,10 +6,10 @@ The view of device supported by HAMi is shown in this table below: | Production | manufactor | Type |MemoryIsolation | CoreIsolation | MultiCard support | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/versioned_docs/version-v2.4.1/userguide/hygon-device/enable-hygon-dcu-sharing.md
b/versioned_docs/version-v2.4.1/userguide/hygon-device/enable-hygon-dcu-sharing.md index a90f4086..64fd849b 100644 --- a/versioned_docs/version-v2.4.1/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.4.1/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of a DCU instead of a whole DCU card, so a DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated a certain device memory size on a certain type (e.g. Z100), and the allocation is guaranteed not to exceed that boundary. -***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated a certain percentage of device cores (e.g. hygon.com/dcucores:60 indicates this container uses 60% of this device's compute cores) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations.
## Prerequisites diff --git a/versioned_docs/version-v2.4.1/userguide/metax-device/enable-metax-gpu-schedule.md b/versioned_docs/version-v2.4.1/userguide/metax-device/enable-metax-gpu-schedule.md index 130992f4..5dc21839 100644 --- a/versioned_docs/version-v2.4.1/userguide/metax-device/enable-metax-gpu-schedule.md +++ b/versioned_docs/version-v2.4.1/userguide/metax-device/enable-metax-gpu-schedule.md @@ -2,7 +2,7 @@ title: Enable Metax GPU topology-aware scheduling --- -**We now support metax.com/gpu by implementing topo-awareness among metax GPUs**: +**HAMi now supports metax.com/gpu by implementing topo-awareness among metax GPUs**: When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship. This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/versioned_docs/version-v2.4.1/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.4.1/userguide/monitoring/device-allocation.md index 94bbf0a5..7060687e 100644 --- a/versioned_docs/version-v2.4.1/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.4.1/userguide/monitoring/device-allocation.md @@ -21,4 +21,4 @@ It contains the following metrics: | GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. 
\ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that part, see real-time device usage. \ No newline at end of file diff --git a/versioned_docs/version-v2.4.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.4.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 9942c982..1b548b3f 100644 --- a/versioned_docs/version-v2.4.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.4.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated a certain device memory size on a certain type (e.g. MTT S4000), and the allocation is guaranteed not to exceed that boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated limited compute cores on a certain type (e.g. MTT S4000), and the allocation is guaranteed not to exceed that boundary.
## Important Notes diff --git a/versioned_docs/version-v2.4.1/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.4.1/userguide/nvidia-device/dynamic-mig-support.md index 13dd62d1..5f2a1c80 100644 --- a/versioned_docs/version-v2.4.1/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.4.1/userguide/nvidia-device/dynamic-mig-support.md @@ -4,17 +4,17 @@ title: Enable dynamic-mig feature ## Introduction -**We now support dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: +**HAMi now supports dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: -***Dynamic MIG instance management***: User don't need to operate on GPU node, using 'nvidia-smi -i 0 -mig 1' or other command to manage MIG instance, all will be done by HAMi-device-plugin. +**Dynamic MIG instance management**: Users don't need to operate on the GPU node (e.g. running 'nvidia-smi -i 0 -mig 1' or other commands) to manage MIG instances; HAMi-device-plugin handles all of it. -***Dynamic MIG Adjustment***: Each MIG device managed by HAMi will dynamically adjust their MIG template according to tasks submitted when necessary. +**Dynamic MIG Adjustment**: Each MIG device managed by HAMi will dynamically adjust its MIG template according to submitted tasks when necessary. -***Device MIG Observation***: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information. user can get a clear overview of MIG nodes. +**Device MIG Observation**: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information, so users can get a clear overview of MIG nodes. -***Compatible with HAMi-core nodes***: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node if not appointed manually by using `nvidia.com/vgpu-mode` annotation.
+**Compatible with HAMi-core nodes**: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node if not appointed manually by using the `nvidia.com/vgpu-mode` annotation. -***Unified API with HAMi-core***: Zero work needs to be done to make the job compatible with dynamic-mig feature. +**Unified API with HAMi-core**: Zero work needs to be done to make the job compatible with the dynamic-mig feature. ## Prerequisites diff --git a/versioned_docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md index 397e984f..946f0e50 100644 --- a/versioned_docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types, separated by commas. In this example, the job targets A100 or V100.* diff --git a/versioned_docs/version-v2.5.0/contributor/adopters.md b/versioned_docs/version-v2.5.0/contributor/adopters.md index 8c521c76..0d87e5eb 100644 --- a/versioned_docs/version-v2.5.0/contributor/adopters.md +++ b/versioned_docs/version-v2.5.0/contributor/adopters.md @@ -1,12 +1,12 @@ # HAMi Adopters -So you and your organisation are using HAMi? That's great. We would love to hear from you! 💖 +HAMi is used in production by the organisations listed below. ## Adding yourself [Here](https://github.com/Project-HAMi/website/blob/master/src/pages/adopters.mdx) lists the organisations who adopted the HAMi project in production. -You just need to add an entry for your company and upon merging it will automatically be added to our website.
+Add an entry for your company - it will be added to the website once the PR merges. To add your organisation follow these steps: @@ -25,4 +25,4 @@ To add your organisation follow these steps: 6. Push the commit with `git push origin main`. 7. Open a Pull Request to [HAMi-io/website](https://github.com/Project-HAMi/website) and a preview build will turn up. -Thanks a lot for being part of our community - we very much appreciate it! +Thanks to all adopters for being part of the community! diff --git a/versioned_docs/version-v2.5.0/contributor/contribute-docs.md b/versioned_docs/version-v2.5.0/contributor/contribute-docs.md index f095e61a..6b2ba2d5 100644 --- a/versioned_docs/version-v2.5.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.5.0/contributor/contribute-docs.md @@ -9,14 +9,14 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, +- The docs use markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. -- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The site uses [Docusaurus 2](https://docusaurus.io/), a modern static website generator. ## Setup @@ -88,7 +88,7 @@ title: A doc with tags ``` The top section between two lines of --- is the Front Matter section.
-Here we define a couple of entries which tell Docusaurus how to handle the article: +These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document @@ -106,7 +106,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -202,6 +202,6 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version +Newly supplemented documents of each version are synchronized to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +For errata found in the documentation, fixes are applied with every release.
## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. +Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue. 
The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.5.0/contributor/github-workflow.md b/versioned_docs/version-v2.5.0/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/versioned_docs/version-v2.5.0/contributor/github-workflow.md +++ b/versioned_docs/version-v2.5.0/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.5.0/contributor/governance.md b/versioned_docs/version-v2.5.0/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.5.0/contributor/governance.md +++ b/versioned_docs/version-v2.5.0/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. 
* Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.5.0/contributor/ladder.md b/versioned_docs/version-v2.5.0/contributor/ladder.md index 26ea756f..14aff5d7 100644 --- a/versioned_docs/version-v2.5.0/contributor/ladder.md +++ b/versioned_docs/version-v2.5.0/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This docs different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. @@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 
💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added to the [HAMi AUTHORS list](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. -When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi.
## Removing Maintainers diff --git a/versioned_docs/version-v2.5.0/developers/bash-auto-completion-on-linux.md b/versioned_docs/version-v2.5.0/developers/bash-auto-completion-on-linux.md index 489f46c4..6b22baf8 100644 --- a/versioned_docs/version-v2.5.0/developers/bash-auto-completion-on-linux.md +++ b/versioned_docs/version-v2.5.0/developers/bash-auto-completion-on-linux.md @@ -47,4 +47,4 @@ Both approaches are equivalent. After reloading your shell, karmadactl autocompl ## Enable kubectl-karmada autocompletion Currently, kubectl plugins do not support autocomplete, but it is already planned in [Command line completion for kubectl plugins](https://github.com/kubernetes/kubernetes/issues/74178). -We will update the documentation as soon as it does. +Documentation will be updated when support is added. diff --git a/versioned_docs/version-v2.5.0/developers/dynamic-mig.md b/versioned_docs/version-v2.5.0/developers/dynamic-mig.md index fd22875b..8872a832 100644 --- a/versioned_docs/version-v2.5.0/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.5.0/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS and MIG.
The context switch for time-slice sharing wastes some time, so MPS and MIG are preferred. The GPU MIG profile is variable: the user can request a MIG device via the profile definition, but the current implementation only defines dedicated profiles ahead of the user requirement. That limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed. ## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin iterates over the templates defined in the configMap `hami-scheduler-device` and selects the first available template that fits. You can change the content of that configMap and restart vc-scheduler to customize.
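As a sketch of what "iterating over templates" means, one entry in the `hami-scheduler-device` configMap might look like the following. The field names (`knownMigGeometries`, `models`, per-geometry `name`/`memory`/`count`) and the memory values are assumptions for illustration; check the configMap shipped with your HAMi release before editing.

```yaml
# Hypothetical sketch of a MIG geometry template; field names and
# values are assumptions, not the authoritative schema.
knownMigGeometries:
  - models: ["A100-PCIE-40GB"]
    allowedGeometries:
      - geometries:
          - name: 1g.5gb    # smallest slice, up to 7 per GPU
            memory: 4864
            count: 7
      - geometries:
          - name: 2g.10gb   # larger slice, up to 3 per GPU
            memory: 9856
            count: 3
```

The scheduler would walk these entries in order and pick the first geometry whose free slices can satisfy the pending task's memory request.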
If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.5.0/developers/protocol.md b/versioned_docs/version-v2.5.0/developers/protocol.md index 0a02dba9..6e680bd6 100644 --- a/versioned_docs/version-v2.5.0/developers/protocol.md +++ b/versioned_docs/version-v2.5.0/developers/protocol.md @@ -6,7 +6,7 @@ title: Protocol design ### Device Registration -In order to perform more accurate scheduling, the HAMI scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc +In order to perform more accurate scheduling, the HAMi scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc However, the device-plugin device registration API does not provide corresponding parameter acquisition, so HAMi-device-plugin stores these supplementary information in the node annotations during registering for the scheduler to read, as the following figure shows: diff --git a/versioned_docs/version-v2.5.0/developers/scheduling.md b/versioned_docs/version-v2.5.0/developers/scheduling.md index 02270146..7f8ce100 100644 --- a/versioned_docs/version-v2.5.0/developers/scheduling.md +++ b/versioned_docs/version-v2.5.0/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +A `node-scheduler-policy` and a `gpu-scheduler-policy` are added to the config, letting the scheduler implement node-level `binpack` or `spread` and GPU-level `binpack` or `spread`.
and use can set Pod annotation to change this default policy, use `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay scheduler config. ### User Stories @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/versioned_docs/version-v2.5.0/faq/faq.md b/versioned_docs/version-v2.5.0/faq/faq.md index c52d50a2..c7403c18 100644 --- a/versioned_docs/version-v2.5.0/faq/faq.md +++ b/versioned_docs/version-v2.5.0/faq/faq.md @@ -19,15 +19,14 @@ Both of them are used to hold the propagation declaration, but they have differe `kube-controller-manager` is composed of a bunch of controllers, Karmada inherits some controllers from it to keep a consistent user experience and behavior. -It's worth noting that not all controllers are needed by Karmada, for the recommended controllers please +Not all controllers are needed by Karmada; for the recommended controllers please ## Can I install Karmada in a Kubernetes cluster and reuse the kube-apiserver as Karmada apiserver? The quick answer is `yes`. In that case, you can save the effort to deploy
In that case, you can save the effort to deploy -[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and just -share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters -can be inherited seamlessly. We do have some users using Karmada in this way. +[karmada-apiserver](https://github.com/karmada-io/karmada/blob/master/artifacts/deploy/karmada-apiserver.yaml) and share the APIServer between Kubernetes and Karmada. In addition, the high availability capabilities in the origin clusters +can be inherited. Some users run Karmada this way. There are some things you should consider before doing so: diff --git a/versioned_docs/version-v2.5.0/roadmap.md b/versioned_docs/version-v2.5.0/roadmap.md index a50c8db3..8793e529 100644 --- a/versioned_docs/version-v2.5.0/roadmap.md +++ b/versioned_docs/version-v2.5.0/roadmap.md @@ -6,7 +6,7 @@ title: Karmada Roadmap This document defines a high level roadmap for Karmada development and upcoming releases. Community and contributor involvement is vital for successfully implementing all desired items for each release. -We hope that the items listed below will inspire further engagement from the community to keep karmada progressing and shipping exciting and valuable features. +The items below are intended to inspire further community engagement to keep HAMi progressing and shipping exciting and valuable features. ## 2022 H1 diff --git a/versioned_docs/version-v2.5.0/troubleshooting/troubleshooting.md b/versioned_docs/version-v2.5.0/troubleshooting/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v2.5.0/troubleshooting/troubleshooting.md +++ b/versioned_docs/version-v2.5.0/troubleshooting/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. 
- Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. -We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`; if you use image version `v2.3.9`, `device-plugin` may fail to start. There are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`. - Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index ae498abe..92b3f2fb 100644 --- a/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,15 +4,15 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: -***MLU sharing***: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. +**MLU sharing**: Each task can allocate a portion of MLU instead of a whole MLU card, so an MLU can be shared among multiple tasks.
-***Device Memory Control***: MLUs can be allocated with certain device memory size on certain type(i.e 370) and have made it that it does not exceed the boundary. +**Device Memory Control**: MLUs can be allocated a certain device memory size on certain types (i.e. 370), and HAMi ensures usage does not exceed the boundary. -***MLU Type Specification***: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. +**MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/configure.md b/versioned_docs/version-v2.5.0/userguide/configure.md index 8e3a34f3..536c889e 100644 --- a/versioned_docs/version-v2.5.0/userguide/configure.md +++ b/versioned_docs/version-v2.5.0/userguide/configure.md @@ -18,7 +18,7 @@ You can update these configurations using one of the following methods: 2. Modify Helm Chart: Update the corresponding values in the [ConfigMap](https://raw.githubusercontent.com/archlitchi/HAMi/refs/heads/master/charts/hami/templates/scheduler/device-configmap.yaml), then reapply the Helm Chart to regenerate the ConfigMap. * `nvidia.deviceMemoryScaling:` - Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature).
For NVIDIA GPU with *M* memory, if we set `nvidia.deviceMemoryScaling` argument to *S*, vGPUs split by this GPU will totally get `S * M` memory in Kubernetes with our device plugin. + Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if `nvidia.deviceMemoryScaling` is set to *S*, vGPUs split from this GPU will get a total of `S * M` memory in Kubernetes with the HAMi device plugin. * `nvidia.deviceSplitCount:` Integer type, by default: equals 10. Maximum tasks assigned to a simple GPU device. * `nvidia.migstrategy:` diff --git a/versioned_docs/version-v2.5.0/userguide/device-supported.md b/versioned_docs/version-v2.5.0/userguide/device-supported.md index 4a055c8c..44507a4b 100644 --- a/versioned_docs/version-v2.5.0/userguide/device-supported.md +++ b/versioned_docs/version-v2.5.0/userguide/device-supported.md @@ -6,11 +6,11 @@ The view of device supported by HAMi is shown in this table below: | Production | manufactor | Type |MemoryIsolation | CoreIsolation | MultiCard support | |-------------|------------|-------------|-----------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L, K100 | ✅ | ✅ | ❌ | -| Ascend | Huawei | 910B, 910B2, 910B3, 910B4, 310P | ✅ | ✅ | ❌ | -| GPU | iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L, K100 | Yes | Yes | No | +| Ascend | Huawei | 910B, 910B2, 910B3, 910B4, 310P | Yes | Yes | No | +| GPU | iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress |
No | diff --git a/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md index 4761c98c..ae019334 100644 --- a/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md @@ -4,19 +4,19 @@ title: Enable Enflame GCU sharing ## Introduction -**We now support sharing on enflame.com/gcu(i.e S60) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports sharing on enflame.com/gcu (i.e. S60) by implementing most device-sharing features as nvidia-GPU**, including: -***GCU sharing***: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. +**GCU sharing**: Each task can allocate a portion of GCU instead of a whole GCU card, so a GCU can be shared among multiple tasks. -***Device Memory and Core Control***: GCUs can be allocated with certain percentage of device memory and core, we make sure that it does not exceed the boundary. +**Device Memory and Core Control**: GCUs can be allocated a certain percentage of device memory and cores; HAMi ensures usage does not exceed the boundary. -***Device UUID Selection***: You can specify which GCU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your GPU jobs will be automatically supported after installation.
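A minimal pod requesting a shared GCU could be sketched as follows. This is an illustrative sketch: the resource names `enflame.com/vgcu` and `enflame.com/vgcu-percentage` are assumptions and should be verified against the Enflame guide for your release.

```yaml
# Hypothetical sketch - confirm resource names with your
# Enflame device-plugin documentation before use.
apiVersion: v1
kind: Pod
metadata:
  name: gcu-shared-pod
spec:
  containers:
    - name: app
      image: ubuntu:22.04
      command: ["sleep", "infinity"]
      resources:
        limits:
          enflame.com/vgcu: 1             # one shared GCU (assumed name)
          enflame.com/vgcu-percentage: 20 # 20% of memory/cores (assumed)
```

With a percentage-based request like this, up to five such containers could share one S60 without exceeding its memory and core boundary.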
## Prerequisites -* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider, gcushare has two components: gcushare-scheduler-plugin and gcushare-device-plugin, we only need gcushare-device-plugin here ) +* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider; gcushare has two components, gcushare-scheduler-plugin and gcushare-device-plugin, and only gcushare-device-plugin is needed here) * driver version >= 1.2.3.14 * kubernetes >= 1.24 * enflame-container-toolkit >=2.0.50 diff --git a/versioned_docs/version-v2.5.0/userguide/hygon-device/enable-hygon-dcu-sharing.md b/versioned_docs/version-v2.5.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index a90f4086..64fd849b 100644 --- a/versioned_docs/version-v2.5.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features, as with NVIDIA GPUs**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of a DCU instead of a whole DCU card, so a DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated with a certain device memory size on a certain type (i.e. Z100); HAMi ensures it does not exceed the boundary.
-***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated with a certain percentage of device cores (i.e. hygon.com/dcucores:60 indicates this container uses 60% of the compute cores of this device) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or avoid for a certain task by setting the "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md index 615adcce..d37fcefe 100644 --- a/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md @@ -4,17 +4,17 @@ title: Enable Iluvatar GCU sharing ## Introduction -**We now support iluvatar.ai/gpu(i.e MR-V100、BI-V150、BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports iluvatar.ai/gpu (i.e. MR-V100, BI-V150, BI-V100) by implementing most device-sharing features, as with NVIDIA GPUs**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary.
+**Device Memory Control**: GPUs can be allocated with a certain device memory size; HAMi ensures it does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores; HAMi ensures it does not exceed the boundary. -***Device UUID Selection***: You can specify which GPU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-schedule.md b/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-schedule.md index 130992f4..5dc21839 100644 --- a/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-schedule.md +++ b/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-schedule.md @@ -2,7 +2,7 @@ title: Enable Metax GPU topology-aware scheduling --- -**We now support metax.com/gpu by implementing topo-awareness among metax GPUs**: +**HAMi now supports metax.com/gpu by implementing topo-awareness among metax GPUs**: When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship.
This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-sharing.md index 479c454d..d57c6806 100644 --- a/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/metax-device/enable-metax-gpu-sharing.md @@ -4,7 +4,7 @@ title: Enable Metax GPU sharing ## Introduction -We support metax.com/gpu as follows: +HAMi supports metax.com/gpu as follows: - support metax.com/gpu by implementing most device-sharing features as nvidia-GPU - support metax.com/gpu by implementing topo-awareness among metax GPUs @@ -13,11 +13,11 @@ We support metax.com/gpu as follows: device-sharing features include the following: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated with a certain device memory size; HAMi ensures it does not exceed the boundary.
-***Device compute core limitation***: GPUs can be allocated with certain percentage of device core(60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: GPUs can be allocated with a certain percentage of device cores (60 indicates this container uses 60% of the compute cores of this device) ### Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.5.0/userguide/monitoring/device-allocation.md index 94bbf0a5..7060687e 100644 --- a/versioned_docs/version-v2.5.0/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.5.0/userguide/monitoring/device-allocation.md @@ -21,4 +21,4 @@ It contains the following metrics: | GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. \ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that, see real-time device usage.
\ No newline at end of file diff --git a/versioned_docs/version-v2.5.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 9942c982..1b548b3f 100644 --- a/versioned_docs/version-v2.5.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features, as with NVIDIA GPUs**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated with a certain device memory size on a certain type (i.e. MTT S4000); HAMi ensures it does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores on a certain type (i.e. MTT S4000); HAMi ensures it does not exceed the boundary.
## Important Notes diff --git a/versioned_docs/version-v2.5.0/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.5.0/userguide/nvidia-device/dynamic-mig-support.md index 13dd62d1..5f2a1c80 100644 --- a/versioned_docs/version-v2.5.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.5.0/userguide/nvidia-device/dynamic-mig-support.md @@ -4,17 +4,17 @@ title: Enable dynamic-mig feature ## Introduction -**We now support dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: +**HAMi now supports dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: -***Dynamic MIG instance management***: User don't need to operate on GPU node, using 'nvidia-smi -i 0 -mig 1' or other command to manage MIG instance, all will be done by HAMi-device-plugin. +**Dynamic MIG instance management**: Users don't need to operate on the GPU node (e.g., running 'nvidia-smi -i 0 -mig 1') to manage MIG instances; this is all handled by HAMi-device-plugin. -***Dynamic MIG Adjustment***: Each MIG device managed by HAMi will dynamically adjust their MIG template according to tasks submitted when necessary. +**Dynamic MIG Adjustment**: Each MIG device managed by HAMi will dynamically adjust its MIG template according to submitted tasks when necessary. -***Device MIG Observation***: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information. user can get a clear overview of MIG nodes. +**Device MIG Observation**: Each MIG instance generated by HAMi will be shown in scheduler-monitor, including task information. Users can get a clear overview of MIG nodes. -***Compatible with HAMi-core nodes***: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node if not appointed manually by using `nvidia.com/vgpu-mode` annotation.
+**Compatible with HAMi-core nodes**: HAMi can manage a unified GPU pool of `HAMi-core node` and `mig node`. A task can be scheduled to either node unless manually assigned using the `nvidia.com/vgpu-mode` annotation. -***Unified API with HAMi-core***: Zero work needs to be done to make the job compatible with dynamic-mig feature. +**Unified API with HAMi-core**: Zero work needs to be done to make a job compatible with the dynamic-mig feature. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.5.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index 397e984f..946f0e50 100644 --- a/versioned_docs/version-v2.5.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.5.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types, separated by commas. In this example, the job targets A100 or V100.* diff --git a/versioned_docs/version-v2.5.1/contributor/adopters.md b/versioned_docs/version-v2.5.1/contributor/adopters.md index 8c521c76..0d87e5eb 100644 --- a/versioned_docs/version-v2.5.1/contributor/adopters.md +++ b/versioned_docs/version-v2.5.1/contributor/adopters.md @@ -1,12 +1,12 @@ # HAMi Adopters -So you and your organisation are using HAMi? That's great. We would love to hear from you! 💖 +HAMi is used in production by the organisations listed below. ## Adding yourself [Here](https://github.com/Project-HAMi/website/blob/master/src/pages/adopters.mdx) lists the organisations who adopted the HAMi project in production. -You just need to add an entry for your company and upon merging it will automatically be added to our website.
+Add an entry for your company; it will be added to the website once the PR merges. To add your organisation follow these steps: @@ -25,4 +25,4 @@ To add your organisation follow these steps: 6. Push the commit with `git push origin main`. 7. Open a Pull Request to [HAMi-io/website](https://github.com/Project-HAMi/website) and a preview build will turn up. -Thanks a lot for being part of our community - we very much appreciate it! +Thanks to all adopters for being part of the community! diff --git a/versioned_docs/version-v2.5.1/contributor/contribute-docs.md b/versioned_docs/version-v2.5.1/contributor/contribute-docs.md index f095e61a..6b2ba2d5 100644 --- a/versioned_docs/version-v2.5.1/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.5.1/contributor/contribute-docs.md @@ -9,14 +9,14 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, +- The docs use Markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. -- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The site is built with [Docusaurus 2](https://docusaurus.io/), a modern static site generator. ## Setup @@ -88,7 +88,7 @@ title: A doc with tags ``` The top section between two lines of --- is the Front Matter section.
-Here we define a couple of entries which tell Docusaurus how to handle the article: +These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document @@ -106,7 +106,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -202,6 +202,6 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version +Newly supplemented documents for each version are synchronized to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +For errata found in the documentation, fixes are applied with every release. diff --git a/versioned_docs/version-v2.5.1/contributor/contributing.md b/versioned_docs/version-v2.5.1/contributor/contributing.md index fab6d31c..0c88de85 100644 --- a/versioned_docs/version-v2.5.1/contributor/contributing.md +++ b/versioned_docs/version-v2.5.1/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi!
## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. +Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue. 
The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.5.1/contributor/github-workflow.md b/versioned_docs/version-v2.5.1/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/versioned_docs/version-v2.5.1/contributor/github-workflow.md +++ b/versioned_docs/version-v2.5.1/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.5.1/contributor/governance.md b/versioned_docs/version-v2.5.1/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.5.1/contributor/governance.md +++ b/versioned_docs/version-v2.5.1/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. 
* Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.5.1/contributor/ladder.md b/versioned_docs/version-v2.5.1/contributor/ladder.md index 26ea756f..14aff5d7 100644 --- a/versioned_docs/version-v2.5.1/contributor/ladder.md +++ b/versioned_docs/version-v2.5.1/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This docs different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. @@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 
💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added to the [HAMi AUTHORS list](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. -When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi.
## Removing Maintainers diff --git a/versioned_docs/version-v2.5.1/developers/dynamic-mig.md b/versioned_docs/version-v2.5.1/developers/dynamic-mig.md index fd22875b..8872a832 100644 --- a/versioned_docs/version-v2.5.1/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.5.1/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS, and MIG. Since the context switch for time-slice sharing wastes some time, MPS and MIG are preferred. The GPU MIG profile is variable: the user can acquire a MIG device via the profile definition, but the current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed.
## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin iterates over the templates defined in the configMap `hami-scheduler-device` and finds the first available template that fits. You can always change the content of that configMap and restart vc-scheduler to customize. If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.5.1/developers/protocol.md b/versioned_docs/version-v2.5.1/developers/protocol.md index 0a02dba9..6e680bd6 100644 --- a/versioned_docs/version-v2.5.1/developers/protocol.md +++ b/versioned_docs/version-v2.5.1/developers/protocol.md @@ -6,7 +6,7 @@ title: Protocol design ### Device Registration -In order to perform more accurate scheduling, the HAMI scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, numa number, etc +In order to perform more accurate scheduling, the HAMi scheduler needs to perceive the specifications of the device during device registration, including UUID, video memory, computing power, model, NUMA number, etc. However, the device-plugin device registration API does not provide corresponding parameter acquisition, so HAMi-device-plugin stores these supplementary information in the node annotations during registering for the scheduler to read, as the following figure shows: diff --git
a/versioned_docs/version-v2.5.1/developers/scheduling.md b/versioned_docs/version-v2.5.1/developers/scheduling.md index 02270146..7f8ce100 100644 --- a/versioned_docs/version-v2.5.1/developers/scheduling.md +++ b/versioned_docs/version-v2.5.1/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +A `node-scheduler-policy` and `gpu-scheduler-policy` are added to the config, so the scheduler can implement node-level or GPU-level `binpack` or `spread`, and use can set Pod annotation to change this default policy, use `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay scheduler config. ### User Stories @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`.
diff --git a/versioned_docs/version-v2.5.1/key-features/device-sharing.md b/versioned_docs/version-v2.5.1/key-features/device-sharing.md index 2ac077b2..ef9d6c17 100644 --- a/versioned_docs/version-v2.5.1/key-features/device-sharing.md +++ b/versioned_docs/version-v2.5.1/key-features/device-sharing.md @@ -2,7 +2,7 @@ title: Device sharing --- -HAMi offers robust device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, +HAMi provides device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, maximizing the utilization of heterogeneous AI computing resources. HAMi's device sharing enables: diff --git a/versioned_docs/version-v2.5.1/troubleshooting/troubleshooting.md b/versioned_docs/version-v2.5.1/troubleshooting/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v2.5.1/troubleshooting/troubleshooting.md +++ b/versioned_docs/version-v2.5.1/troubleshooting/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. - Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. -We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +- The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`; if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start. There are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`.
- Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index ae498abe..92b3f2fb 100644 --- a/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,15 +4,15 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: -***MLU sharing***: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. +**MLU sharing**: Each task can allocate a portion of an MLU instead of a whole MLU card, so an MLU can be shared among multiple tasks. -***Device Memory Control***: MLUs can be allocated with certain device memory size on certain type(i.e 370) and have made it that it does not exceed the boundary. +**Device Memory Control**: MLUs can be allocated a specific device memory size on certain types (e.g. 370), and the allocation will not exceed that boundary. -***MLU Type Specification***: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. +**MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. 
All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.5.1/userguide/device-supported.md b/versioned_docs/version-v2.5.1/userguide/device-supported.md index e0732292..61ab18e6 100644 --- a/versioned_docs/version-v2.5.1/userguide/device-supported.md +++ b/versioned_docs/version-v2.5.1/userguide/device-supported.md @@ -6,12 +6,12 @@ The table below lists the devices supported by HAMi: | Type | Manufactor | Models | MemoryIsolation | CoreIsolation | MultiCard Support | |------|------------|------|-----------------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | Huawei Ascend | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | Iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| NPU | Huawei Ascend | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | Iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/versioned_docs/version-v2.5.1/userguide/hygon-device/enable-hygon-dcu-sharing.md b/versioned_docs/version-v2.5.1/userguide/hygon-device/enable-hygon-dcu-sharing.md index a90f4086..64fd849b 100644 --- 
a/versioned_docs/version-v2.5.1/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.5.1/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of a DCU instead of a whole DCU card, so a DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated a specific device memory size on certain types (e.g. Z100), and the allocation will not exceed that boundary. -***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated a certain percentage of device cores (e.g. hygon.com/dcucores:60 indicates this container uses 60% of the device's compute cores) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. 
## Prerequisites diff --git a/versioned_docs/version-v2.5.1/userguide/metax-device/enable-metax-gpu-schedule.md b/versioned_docs/version-v2.5.1/userguide/metax-device/enable-metax-gpu-schedule.md index 130992f4..5dc21839 100644 --- a/versioned_docs/version-v2.5.1/userguide/metax-device/enable-metax-gpu-schedule.md +++ b/versioned_docs/version-v2.5.1/userguide/metax-device/enable-metax-gpu-schedule.md @@ -2,7 +2,7 @@ title: Enable Metax GPU topology-aware scheduling --- -**We now support metax.com/gpu by implementing topo-awareness among metax GPUs**: +**HAMi now supports metax.com/gpu by implementing topo-awareness among metax GPUs**: When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship. This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/versioned_docs/version-v2.5.1/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.5.1/userguide/monitoring/device-allocation.md index 94bbf0a5..7060687e 100644 --- a/versioned_docs/version-v2.5.1/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.5.1/userguide/monitoring/device-allocation.md @@ -21,4 +21,4 @@ It contains the following metrics: | GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. 
\ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that, see real-time device usage. \ No newline at end of file diff --git a/versioned_docs/version-v2.5.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.5.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 9942c982..1b548b3f 100644 --- a/versioned_docs/version-v2.5.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.5.1/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated a specific device memory size on certain types (e.g. MTT S4000), and the allocation will not exceed that boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated a limited number of compute cores on certain types (e.g. MTT S4000), and the allocation will not exceed that boundary. 
## Important Notes diff --git a/versioned_docs/version-v2.5.1/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.5.1/userguide/nvidia-device/dynamic-mig-support.md index bea5435b..954348ee 100644 --- a/versioned_docs/version-v2.5.1/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.5.1/userguide/nvidia-device/dynamic-mig-support.md @@ -130,7 +130,7 @@ nvidia: :::note Helm installations and updates will follow the configuration specified in this file, overriding the default Helm settings. -Please note that HAMi will identify and use the first MIG template that matches the job, in the order defined in this configMap. +HAMi identifies and uses the first MIG template that matches the job, in the order defined in this configMap. ::: ## Running MIG jobs diff --git a/versioned_docs/version-v2.5.1/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.5.1/userguide/nvidia-device/examples/specify-card-type-to-use.md index 397e984f..946f0e50 100644 --- a/versioned_docs/version-v2.5.1/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.5.1/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types, separated by commas. In this example, the job targets A100 or V100.* diff --git a/versioned_docs/version-v2.6.0/contributor/adopters.md b/versioned_docs/version-v2.6.0/contributor/adopters.md index 8c521c76..0d87e5eb 100644 --- a/versioned_docs/version-v2.6.0/contributor/adopters.md +++ b/versioned_docs/version-v2.6.0/contributor/adopters.md @@ -1,12 +1,12 @@ # HAMi Adopters -So you and your organisation are using HAMi? That's great. We would love to hear from you! 
💖 +HAMi is used in production by the organisations listed below. ## Adding yourself [Here](https://github.com/Project-HAMi/website/blob/master/src/pages/adopters.mdx) lists the organisations who adopted the HAMi project in production. -You just need to add an entry for your company and upon merging it will automatically be added to our website. +Add an entry for your company - it will be added to the website once the PR merges. To add your organisation follow these steps: @@ -25,4 +25,4 @@ To add your organisation follow these steps: 6. Push the commit with `git push origin main`. 7. Open a Pull Request to [HAMi-io/website](https://github.com/Project-HAMi/website) and a preview build will turn up. -Thanks a lot for being part of our community - we very much appreciate it! +Thanks to all adopters for being part of the community! diff --git a/versioned_docs/version-v2.6.0/contributor/contribute-docs.md b/versioned_docs/version-v2.6.0/contributor/contribute-docs.md index f095e61a..6b2ba2d5 100644 --- a/versioned_docs/version-v2.6.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.6.0/contributor/contribute-docs.md @@ -9,14 +9,14 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. - 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, +- The docs use Markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. 
-- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The site uses [Docusaurus 2](https://docusaurus.io/), a modern static website generator. ## Setup @@ -88,7 +88,7 @@ title: A doc with tags ``` The top section between two lines of --- is the Front Matter section. -Here we define a couple of entries which tell Docusaurus how to handle the article: +These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document @@ -106,7 +106,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -202,6 +202,6 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version +Newly supplemented documents for each version are synchronized to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +For errata found in the documentation, fixes are applied with every release. 
diff --git a/versioned_docs/version-v2.6.0/contributor/contributing.md b/versioned_docs/version-v2.6.0/contributor/contributing.md index 32af0a21..bcc5cbf2 100644 --- a/versioned_docs/version-v2.6.0/contributor/contributing.md +++ b/versioned_docs/version-v2.6.0/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. 
+Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. ### Work on an issue -When you are willing to take on an issue, just reply on the issue. The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ## File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.6.0/contributor/github-workflow.md b/versioned_docs/version-v2.6.0/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/versioned_docs/version-v2.6.0/contributor/github-workflow.md +++ b/versioned_docs/version-v2.6.0/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.6.0/contributor/governance.md b/versioned_docs/version-v2.6.0/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.6.0/contributor/governance.md +++ b/versioned_docs/version-v2.6.0/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. 
-* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. * Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.6.0/contributor/ladder.md b/versioned_docs/version-v2.6.0/contributor/ladder.md index 26ea756f..14aff5d7 100644 --- a/versioned_docs/version-v2.6.0/contributor/ladder.md +++ b/versioned_docs/version-v2.6.0/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This docs different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. 
@@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added to the [HAMi AUTHORS list](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. 
-When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. ## Removing Maintainers diff --git a/versioned_docs/version-v2.6.0/developers/dynamic-mig.md b/versioned_docs/version-v2.6.0/developers/dynamic-mig.md index fd22875b..8872a832 100644 --- a/versioned_docs/version-v2.6.0/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.6.0/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS, and MIG. Because the context switch for time-slice sharing wastes time, MPS and MIG are preferred. The GPU MIG profile is variable: the user can acquire a MIG device per the profile definition, but the current implementation only defines a dedicated profile ahead of the user's requirement. That limits the usage of MIG. 
The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed. ## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin iterates over the templates defined in the configMap `hami-scheduler-device` and selects the first available template that fits. You can change the content of that configMap and restart vc-scheduler to customize. 
If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.6.0/developers/protocol.md b/versioned_docs/version-v2.6.0/developers/protocol.md index 54d04056..ba994abb 100644 --- a/versioned_docs/version-v2.6.0/developers/protocol.md +++ b/versioned_docs/version-v2.6.0/developers/protocol.md @@ -31,7 +31,7 @@ hami.io/node-nvidia-register: GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec,10,32768, In this example, this node has two different AI devices, 2 Nvidia-V100 GPUs, and 2 Cambircon 370-X4 MLUs -Note that a device node may become unavailable due to hardware or network failure, if a node hasn't registered in last 5 minutes, scheduler will mark that node as 'unavailable'. +A device node may become unavailable due to hardware or network failure. If a node hasn't registered in the last 5 minutes, the scheduler marks that node as 'unavailable'. Since system clock on scheduler node and 'device' node may not align properly, scheduler node will patch the following device node annotations every 30s diff --git a/versioned_docs/version-v2.6.0/developers/scheduling.md b/versioned_docs/version-v2.6.0/developers/scheduling.md index d80d3957..04b8c784 100644 --- a/versioned_docs/version-v2.6.0/developers/scheduling.md +++ b/versioned_docs/version-v2.6.0/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread` or `topology-aware`. The `topology-aware` policy only takes effect with Nvidia GPUs. +A `node-scheduler-policy` and `gpu-scheduler-policy` are added to the config; with them the scheduler can implement node `binpack` or `spread` or GPU `binpack` or `spread` or `topology-aware`. 
The `topology-aware` policy only takes effect with Nvidia GPUs. User can set Pod annotation to change this default policy, use `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay scheduler config. @@ -105,7 +105,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -125,7 +125,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -148,7 +148,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -167,7 +167,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. #### Topology-aware @@ -231,7 +231,7 @@ gpu2 score: 100 + 200 + 200 = 500 gpu3 score: 200 + 100 + 200 = 500 ``` -Therefore, when a **Pod requests only one GPU**, we randomly select either **gpu0** or **gpu1**. +Therefore, when a **Pod requests only one GPU**, the scheduler randomly selects either **gpu0** or **gpu1**. ###### More than one GPU @@ -253,4 +253,4 @@ For example: If a Pod requests 3 GPUs, take **gpu0, gpu1, gpu2** as an example. (gpu1, gpu2, gpu3) totalScore: 200 + 100 + 200 = 500 ``` -Therefore, when a **Pod requests 3 GPUs**, we allocate **gpu1, gpu2, gpu3**. +Therefore, when a **Pod requests 3 GPUs**, the scheduler allocates **gpu1, gpu2, gpu3**. 
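The topology-aware selection described in the hunks above can be sketched in Python. This is a hypothetical illustration, not HAMi's source: the pairwise link scores are reconstructed from the example values, and the rules "pick a least-connected GPU for a single-GPU request" and "pick the group with the highest internal score for multi-GPU requests" are inferred from the example (ties, such as gpu0 vs gpu1, are broken arbitrarily here):

```python
from itertools import combinations

# Pairwise link scores between GPUs, reconstructed from the example above
# (symmetric; a higher value means a better interconnect in this sketch).
PAIR = {
    frozenset(p): s
    for p, s in {
        ("gpu0", "gpu1"): 100, ("gpu0", "gpu2"): 100, ("gpu0", "gpu3"): 200,
        ("gpu1", "gpu2"): 200, ("gpu1", "gpu3"): 100, ("gpu2", "gpu3"): 200,
    }.items()
}

def gpu_score(gpu, gpus):
    # A single GPU's score is the sum of its links to every other GPU.
    return sum(PAIR[frozenset((gpu, other))] for other in gpus if other != gpu)

def combo_score(combo):
    # A group's score is the sum of the link scores inside the group.
    return sum(PAIR[frozenset(pair)] for pair in combinations(combo, 2))

def allocate(gpus, n):
    if n == 1:
        # Prefer a least-connected GPU, leaving well-connected groups intact.
        low = min(gpu_score(g, gpus) for g in gpus)
        return sorted(g for g in gpus if gpu_score(g, gpus) == low)
    # For n > 1, pick the combination with the highest internal score.
    return max(combinations(sorted(gpus), n), key=combo_score)

gpus = ["gpu0", "gpu1", "gpu2", "gpu3"]
print(allocate(gpus, 1))               # ['gpu0', 'gpu1'] - the tied candidates
print(combo_score(allocate(gpus, 3)))  # 500, the best 3-GPU group score
```

Note that with these scores the 3-GPU request has two groups tied at 500 ((gpu0, gpu2, gpu3) and (gpu1, gpu2, gpu3)); the worked example picks the latter, and this sketch simply returns whichever tied group it encounters first.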
diff --git a/versioned_docs/version-v2.6.0/faq/faq.md b/versioned_docs/version-v2.6.0/faq/faq.md index 6a9b72f4..f41a45ff 100644 --- a/versioned_docs/version-v2.6.0/faq/faq.md +++ b/versioned_docs/version-v2.6.0/faq/faq.md @@ -42,7 +42,7 @@ A vGPU is a logical instance of a physical GPU created using virtualization, all 4. **Design Intent** The design of vGPU aims to **allow one GPU to be shared by multiple tasks**, rather than letting one task occupy multiple vGPUs on the same GPU. The purpose of vGPU overcommitment is to improve GPU utilization, not to increase resource allocation for individual tasks. -## HAMi's `nvidia.com/priority` field only supports two levels. How can we implement multi-level, user-defined priority-based scheduling for a queue of jobs, especially when cluster resources are limited? +## HAMi's `nvidia.com/priority` field only supports two levels. How to implement multi-level, user-defined priority-based scheduling for a queue of jobs, especially when cluster resources are limited? **TL;DR** @@ -63,7 +63,7 @@ However, achieving multi-level priority scheduling **is feasible**. The recommen 1. HAMi integrates with Volcano via the [volcano-vgpu-device-plugin](https://github.com/Project-HAMi/volcano-vgpu-device-plugin). 2. It continues to manage the vGPU sharing and its own two-level runtime priority for tasks contending on the *same physical GPU*, as described earlier. -In summary, while HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption. 
+While HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption. ## Integration with Other Open-Source Tools diff --git a/versioned_docs/version-v2.6.0/key-features/device-sharing.md b/versioned_docs/version-v2.6.0/key-features/device-sharing.md index 2ac077b2..ef9d6c17 100644 --- a/versioned_docs/version-v2.6.0/key-features/device-sharing.md +++ b/versioned_docs/version-v2.6.0/key-features/device-sharing.md @@ -2,7 +2,7 @@ title: Device sharing --- -HAMi offers robust device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, +HAMi provides device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, maximizing the utilization of heterogeneous AI computing resources. HAMi's device sharing enables: diff --git a/versioned_docs/version-v2.6.0/troubleshooting-copy/troubleshooting.md b/versioned_docs/version-v2.6.0/troubleshooting-copy/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v2.6.0/troubleshooting-copy/troubleshooting.md +++ b/versioned_docs/version-v2.6.0/troubleshooting-copy/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. - Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. 
-- We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +- The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`. If you use the image version `v2.3.9`, `device-plugin` may fail to start; there are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`. - Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v2.6.0/troubleshooting/troubleshooting.md b/versioned_docs/version-v2.6.0/troubleshooting/troubleshooting.md index f4e9a15a..f99bfb8a 100644 --- a/versioned_docs/version-v2.6.0/troubleshooting/troubleshooting.md +++ b/versioned_docs/version-v2.6.0/troubleshooting/troubleshooting.md @@ -6,6 +6,6 @@ title: Troubleshooting - Currently, A100 MIG can be supported in only "none" and "mixed" modes. - Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead. - Only computing tasks are currently supported; video codec processing is not supported. -- We change `device-plugin` env var name from `NodeName` to `NODE_NAME`, if you use the image version `v2.3.9`, you may encounter the situation that `device-plugin` cannot start, there are two ways to fix it: +- The `device-plugin` env var name changed from `NodeName` to `NODE_NAME`. If you use the image version `v2.3.9`, `device-plugin` may fail to start; there are two ways to fix it: - Manually execute `kubectl edit daemonset` to modify the `device-plugin` env var from `NodeName` to `NODE_NAME`.
- Upgrade to the latest version using helm, the latest version of `device-plugin` image version is `v2.3.10`, execute `helm upgrade hami hami/hami -n kube-system`, it will be fixed automatically. diff --git a/versioned_docs/version-v2.6.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.6.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index d5fe450d..477574a6 100644 --- a/versioned_docs/version-v2.6.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,15 +4,15 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most of the same device-sharing features as NVIDIA GPUs**, including: -***MLU sharing***: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. +**MLU sharing**: Each task can allocate a portion of an MLU instead of a whole MLU card, so an MLU can be shared among multiple tasks. -***Device Memory Control***: MLUs can be allocated with certain device memory size and guarantee it that it does not exceed the boundary. +**Device Memory Control**: MLUs can be allocated a certain device memory size, and HAMi guarantees usage does not exceed that boundary. -***Device Core Control***: MLUs can be allocated with certain compute cores and guarantee it that it does not exceed the boundary. +**Device Core Control**: MLUs can be allocated a certain number of compute cores, and HAMi guarantees usage does not exceed that boundary. -***MLU Type Specification***: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations.
+**MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. ## Prerequisites @@ -38,7 +38,7 @@ cnmon set -c 1 -smlu on These two parameters represent enabling the dynamic smlu function and setting the minimum allocable memory unit to 256 MB, respectively. You can refer to the document from device provider for more details -* Deploy the cambricon-device-plugin you just specified +* Deploy the cambricon-device-plugin you specified ``` kubectl apply -f cambricon-device-plugin-daemonset.yaml diff --git a/versioned_docs/version-v2.6.0/userguide/device-supported.md b/versioned_docs/version-v2.6.0/userguide/device-supported.md index e0732292..61ab18e6 100644 --- a/versioned_docs/version-v2.6.0/userguide/device-supported.md +++ b/versioned_docs/version-v2.6.0/userguide/device-supported.md @@ -6,12 +6,12 @@ The table below lists the devices supported by HAMi: | Type | Manufactor | Models | MemoryIsolation | CoreIsolation | MultiCard Support | |------|------------|------|-----------------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | Huawei Ascend | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | Iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| NPU | Huawei Ascend | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | Iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff 
--git a/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index 3f3eecbf..7a1bfec9 100644 --- a/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -5,19 +5,19 @@ title: Enable Enflame GPU Sharing ## Introduction -**We now support sharing on enflame.com/gcu(i.e S60) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports sharing on enflame.com/gcu (i.e. S60) by implementing most of the same device-sharing features as NVIDIA GPUs**, including: -***GCU sharing***: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. +**GCU sharing**: Each task can allocate a portion of a GCU instead of a whole GCU card, so a GCU can be shared among multiple tasks. -***Device Memory and Core Control***: GCUs can be allocated with certain percentage of device memory and core, we make sure that it does not exceed the boundary. +**Device Memory and Core Control**: GCUs can be allocated a certain percentage of device memory and cores; HAMi ensures usage does not exceed that boundary. -***Device UUID Selection***: You can specify which GCU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your GPU jobs will be automatically supported after installation.
## Prerequisites -* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider, gcushare has two components: gcushare-scheduler-plugin and gcushare-device-plugin, we only need gcushare-device-plugin here ) +* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider; gcushare has two components, gcushare-scheduler-plugin and gcushare-device-plugin, and only gcushare-device-plugin is needed here) * driver version >= 1.2.3.14 * kubernetes >= 1.24 * enflame-container-toolkit >=2.0.50 diff --git a/versioned_docs/version-v2.6.0/userguide/hygon-device/enable-hygon-dcu-sharing.md b/versioned_docs/version-v2.6.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index 292040f3..96fc2ffa 100644 --- a/versioned_docs/version-v2.6.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most of the same device-sharing features as NVIDIA GPUs**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of a DCU instead of a whole DCU card, so a DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated a certain device memory size on certain types (i.e. Z100), and HAMi guarantees usage does not exceed that boundary.
-***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated a certain percentage of device cores (i.e. hygon.com/dcucores:60 indicates this container uses 60% of this device's compute cores) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. ## Prerequisites diff --git a/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 1de8daac..8fb912f3 100644 --- a/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -5,17 +5,17 @@ title: Enable Illuvatar GPU Sharing ## Introduction -**We now support iluvatar.ai/gpu(i.e MR-V100、BI-V150、BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports iluvatar.ai/gpu (i.e. MR-V100, BI-V150, BI-V100) by implementing most of the same device-sharing features as NVIDIA GPUs**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary.
+**Device Memory Control**: GPUs can be allocated a certain device memory size, and HAMi guarantees usage does not exceed that boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated a limited number of compute cores, and HAMi guarantees usage does not exceed that boundary. -***Device UUID Selection***: You can specify which GPU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.6.0/userguide/metax-device/metax-gpu/enable-metax-gpu-schedule.md b/versioned_docs/version-v2.6.0/userguide/metax-device/metax-gpu/enable-metax-gpu-schedule.md index da0b5726..0cdf6ff6 100644 --- a/versioned_docs/version-v2.6.0/userguide/metax-device/metax-gpu/enable-metax-gpu-schedule.md +++ b/versioned_docs/version-v2.6.0/userguide/metax-device/metax-gpu/enable-metax-gpu-schedule.md @@ -4,7 +4,7 @@ title: Enable Metax GPU topology-aware scheduling ## Introduction -**we now support metax.com/gpu by implementing topo-awareness among metax GPUs** +**HAMi now supports metax.com/gpu with topology awareness among Metax GPUs** When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship.
This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/versioned_docs/version-v2.6.0/userguide/metax-device/metax-sgpu/enable-metax-gpu-sharing.md b/versioned_docs/version-v2.6.0/userguide/metax-device/metax-sgpu/enable-metax-gpu-sharing.md index 0b8d9132..386237d3 100644 --- a/versioned_docs/version-v2.6.0/userguide/metax-device/metax-sgpu/enable-metax-gpu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/metax-device/metax-sgpu/enable-metax-gpu-sharing.md @@ -5,13 +5,13 @@ translated: true ## Introduction -**we now support metax.com/gpu by implementing most device-sharing features as nvidia-GPU**, device-sharing features include the following: +**HAMi now supports metax.com/gpu with most of the same device-sharing features as NVIDIA GPUs**, including the following: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated a certain device memory size, and HAMi guarantees usage does not exceed that boundary.
-***Device compute core limitation***: GPUs can be allocated with certain percentage of device core(60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: GPUs can be allocated a certain percentage of device cores (60 indicates this container uses 60% of this device's compute cores) ### Prerequisites diff --git a/versioned_docs/version-v2.6.0/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.6.0/userguide/monitoring/device-allocation.md index 94bbf0a5..7060687e 100644 --- a/versioned_docs/version-v2.6.0/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.6.0/userguide/monitoring/device-allocation.md @@ -21,4 +21,4 @@ It contains the following metrics: | GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. \ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that, see real-time device usage.
\ No newline at end of file diff --git a/versioned_docs/version-v2.6.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.6.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index eda503ad..7ccf3dce 100644 --- a/versioned_docs/version-v2.6.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most of the same device-sharing features as NVIDIA GPUs**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of a GPU instead of a whole GPU card, so a GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated a certain device memory size on certain types (i.e. MTT S4000), and HAMi guarantees usage does not exceed that boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated a limited number of compute cores on certain types (i.e. MTT S4000), and HAMi guarantees usage does not exceed that boundary.
## Important Notes diff --git a/versioned_docs/version-v2.6.0/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.6.0/userguide/nvidia-device/dynamic-mig-support.md index bea5435b..954348ee 100644 --- a/versioned_docs/version-v2.6.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.6.0/userguide/nvidia-device/dynamic-mig-support.md @@ -130,7 +130,7 @@ nvidia: :::note Helm installations and updates will follow the configuration specified in this file, overriding the default Helm settings. -Please note that HAMi will identify and use the first MIG template that matches the job, in the order defined in this configMap. +HAMi identifies and uses the first MIG template that matches the job, in the order defined in this configMap. ::: ## Running MIG jobs diff --git a/versioned_docs/version-v2.6.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.6.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index dd7239fd..a05bf560 100644 --- a/versioned_docs/version-v2.6.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.6.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* \ No newline at end of file +> **NOTICE:** *You can assign this task to multiple GPU types, separated by commas. In this example, the job targets A100 or V100.* \ No newline at end of file diff --git a/versioned_docs/version-v2.7.0/contributor/adopters.md b/versioned_docs/version-v2.7.0/contributor/adopters.md index 8c521c76..0d87e5eb 100644 --- a/versioned_docs/version-v2.7.0/contributor/adopters.md +++ b/versioned_docs/version-v2.7.0/contributor/adopters.md @@ -1,12 +1,12 @@ # HAMi Adopters -So you and your organisation are using HAMi?
That's great. We would love to hear from you! 💖 +HAMi is used in production by the organisations listed below. ## Adding yourself [Here](https://github.com/Project-HAMi/website/blob/master/src/pages/adopters.mdx) lists the organisations who adopted the HAMi project in production. -You just need to add an entry for your company and upon merging it will automatically be added to our website. +Add an entry for your company; it will be added to the website once the PR merges. To add your organisation follow these steps: @@ -25,4 +25,4 @@ To add your organisation follow these steps: 6. Push the commit with `git push origin main`. 7. Open a Pull Request to [HAMi-io/website](https://github.com/Project-HAMi/website) and a preview build will turn up. -Thanks a lot for being part of our community - we very much appreciate it! +Thanks to all adopters for being part of the community! diff --git a/versioned_docs/version-v2.7.0/contributor/contribute-docs.md b/versioned_docs/version-v2.7.0/contributor/contribute-docs.md index 58f4b0d4..b463684c 100644 --- a/versioned_docs/version-v2.7.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.7.0/contributor/contribute-docs.md @@ -9,14 +9,14 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. - 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, +- The docs use Markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial.
-- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The site uses [Docusaurus 2](https://docusaurus.io/), a modern static website generator. ## Setup @@ -88,7 +88,7 @@ title: A doc with tags ``` The top section between two lines of --- is the Front Matter section. -Here we define a couple of entries which tell Docusaurus how to handle the article: +These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document @@ -106,7 +106,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -202,6 +202,6 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version +Newly supplemented documents of each version are synchronized to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +For errata found in the documentation, fixes are applied with every release.
diff --git a/versioned_docs/version-v2.7.0/contributor/contributing.md b/versioned_docs/version-v2.7.0/contributor/contributing.md index fab6d31c..0c88de85 100644 --- a/versioned_docs/version-v2.7.0/contributor/contributing.md +++ b/versioned_docs/version-v2.7.0/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. 
+Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue. The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.7.0/contributor/github-workflow.md b/versioned_docs/version-v2.7.0/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/versioned_docs/version-v2.7.0/contributor/github-workflow.md +++ b/versioned_docs/version-v2.7.0/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.7.0/contributor/governance.md b/versioned_docs/version-v2.7.0/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.7.0/contributor/governance.md +++ b/versioned_docs/version-v2.7.0/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. 
-* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. * Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.7.0/contributor/ladder.md b/versioned_docs/version-v2.7.0/contributor/ladder.md index 50a277f9..faac0fd1 100644 --- a/versioned_docs/version-v2.7.0/contributor/ladder.md +++ b/versioned_docs/version-v2.7.0/contributor/ladder.md @@ -6,7 +6,7 @@ This docs different ways to get involved and level up within the project. You ca ## Contributor Ladder -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. @@ -47,7 +47,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. 
We wouldn't be where we are today without your contributions. Thank you! 💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. As long as you contribute to HAMi, your name will be added [here](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -128,7 +128,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ## An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -142,7 +142,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. -When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi.
## Removing Maintainers diff --git a/versioned_docs/version-v2.7.0/developers/dynamic-mig.md b/versioned_docs/version-v2.7.0/developers/dynamic-mig.md index fd22875b..8872a832 100644 --- a/versioned_docs/version-v2.7.0/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.7.0/developers/dynamic-mig.md @@ -10,8 +10,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS, and MIG. The context switch for time-slice sharing wastes some time, so MPS and MIG are preferred. The GPU MIG profile is variable: the user can acquire a MIG device in the profile definition, but the current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed.
## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.7.0/developers/kunlunxin-topology.md b/versioned_docs/version-v2.7.0/developers/kunlunxin-topology.md index fb66c728..05331eed 100644 --- a/versioned_docs/version-v2.7.0/developers/kunlunxin-topology.md +++ b/versioned_docs/version-v2.7.0/developers/kunlunxin-topology.md @@ -30,7 +30,7 @@ The selection process is shown below: ## Score In the scoring phase, all filtered nodes are evaluated and scored to select the optimal one -for scheduling. We introduce a metric called **MTF** (Minimized Tasks to Fill), +for scheduling. The metric used is **MTF** (Minimized Tasks to Fill), which quantifies how well a node can accommodate future tasks after allocation.
The table below shows examples of XPU occupation and proper MTF values: diff --git a/versioned_docs/version-v2.7.0/developers/protocol.md b/versioned_docs/version-v2.7.0/developers/protocol.md index 2d3bbd03..fdd826f6 100644 --- a/versioned_docs/version-v2.7.0/developers/protocol.md +++ b/versioned_docs/version-v2.7.0/developers/protocol.md @@ -31,7 +31,7 @@ hami.io/node-nvidia-register: GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec,10,32768, In this example, this node has two different AI devices, 2 Nvidia-V100 GPUs, and 2 Cambircon 370-X4 MLUs -Note that a device node may become unavailable due to hardware or network failure, if a node hasn't registered in last 5 minutes, scheduler will mark that node as 'unavailable'. +A device node may become unavailable due to hardware or network failure; if a node hasn't registered in the last 5 minutes, the scheduler will mark that node as 'unavailable'. Since system clock on scheduler node and 'device' node may not align properly, scheduler node will patch the following device node annotations every 30s diff --git a/versioned_docs/version-v2.7.0/developers/scheduling.md b/versioned_docs/version-v2.7.0/developers/scheduling.md index 02270146..7f8ce100 100644 --- a/versioned_docs/version-v2.7.0/developers/scheduling.md +++ b/versioned_docs/version-v2.7.0/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +The config adds a `node-scheduler-policy` and a `gpu-scheduler-policy`; using these, the scheduler can implement node-level `binpack` or `spread` and GPU-level `binpack` or `spread`, and users can set Pod annotations to change this default policy, using `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to overlay the scheduler config.
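A minimal sketch of the proposed policy scoring, using the formulas from the worked examples in the user stories (illustrative only, not HAMi source):

```python
def node_score(used, requested, total):
    # Node score: fraction of the resource in use after placement, scaled to 0-10.
    return (used + requested) / total * 10


def gpu_score(core_used, core_req, mem_used, mem_req, mem_total):
    # GPU score combines core utilization (out of 100 cores) with memory utilization.
    return ((core_used + core_req) / 100 + (mem_used + mem_req) / mem_total) * 10


def pick(scores, policy):
    # binpack favors the most-utilized candidate; spread favors the least-utilized.
    best = max if policy == "binpack" else min
    return best(scores, key=scores.get)


nodes = {"Node1": node_score(1, 3, 4), "Node2": node_score(1, 2, 4)}  # 10 vs 7.5
gpus = {
    "GPU1": gpu_score(20, 10, 1000, 2000, 8000),  # 6.75
    "GPU2": gpu_score(20, 70, 1000, 6000, 8000),  # 17.75
}
print(pick(nodes, "binpack"), pick(nodes, "spread"))  # Node1 Node2
```

The same `pick` applied to the GPU scores selects `GPU2` under `binpack` and `GPU1` under `spread`, matching the examples below.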
### User Stories @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/versioned_docs/version-v2.7.0/faq/faq.md b/versioned_docs/version-v2.7.0/faq/faq.md index 067ca2b8..91d95689 100644 --- a/versioned_docs/version-v2.7.0/faq/faq.md +++ b/versioned_docs/version-v2.7.0/faq/faq.md @@ -42,7 +42,7 @@ A vGPU is a logical instance of a physical GPU created using virtualization, all 4. **Design Intent** The design of vGPU aims to **allow one GPU to be shared by multiple tasks**, rather than letting one task occupy multiple vGPUs on the same GPU. The purpose of vGPU overcommitment is to improve GPU utilization, not to increase resource allocation for individual tasks. -## HAMi's `nvidia.com/priority` field only supports two levels. How can we implement multi-level, user-defined priority-based scheduling for a queue of jobs, especially when cluster resources are limited? +## HAMi's `nvidia.com/priority` field only supports two levels. How to implement multi-level, user-defined priority-based scheduling for a queue of jobs, especially when cluster resources are limited?
**TL;DR** @@ -63,7 +63,7 @@ However, achieving multi-level priority scheduling **is feasible**. The recommen 1. HAMi integrates with Volcano via the [volcano-vgpu-device-plugin](https://github.com/Project-HAMi/volcano-vgpu-device-plugin). 2. It continues to manage the vGPU sharing and its own two-level runtime priority for tasks contending on the *same physical GPU*, as described earlier. -In summary, while HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption. +While HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption. ## Integration with Other Open-Source Tools diff --git a/versioned_docs/version-v2.7.0/key-features/device-sharing.md b/versioned_docs/version-v2.7.0/key-features/device-sharing.md index 29bed9d2..19797b9e 100644 --- a/versioned_docs/version-v2.7.0/key-features/device-sharing.md +++ b/versioned_docs/version-v2.7.0/key-features/device-sharing.md @@ -2,7 +2,7 @@ title: Device sharing --- -HAMi offers robust device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, +HAMi provides device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, maximizing the utilization of heterogeneous AI computing resources. 
## Device Sharing {#device-sharing} diff --git a/versioned_docs/version-v2.7.0/userguide/device-supported.md b/versioned_docs/version-v2.7.0/userguide/device-supported.md index c76adc36..446a8e39 100644 --- a/versioned_docs/version-v2.7.0/userguide/device-supported.md +++ b/versioned_docs/version-v2.7.0/userguide/device-supported.md @@ -6,13 +6,13 @@ The table below lists the devices supported by HAMi: | Type | Manufactor | Models | MemoryIsolation | CoreIsolation | MultiCard Support | |------|------------|------|-----------------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | Huawei Ascend | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | Iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| XPU | Kunlunxin | P800 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| NPU | Huawei Ascend | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | Iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| XPU | Kunlunxin | P800 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index a8a3026f..73820996 100644 --- a/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -5,19 +5,19 @@ title: Enable Enflame GPU Sharing ## Introduction -**We now support sharing on 
enflame.com/gcu(i.e S60) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports sharing on enflame.com/gcu (i.e., S60) by implementing most device-sharing features as nvidia-GPU**, including: -***GCU sharing***: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. +**GCU sharing**: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. -***Device Memory and Core Control***: GCUs can be allocated with certain percentage of device memory and core, we make sure that it does not exceed the boundary. +**Device Memory and Core Control**: GCUs can be allocated with a certain percentage of device memory and core; HAMi ensures it does not exceed the boundary. -***Device UUID Selection***: You can specify which GCU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation.
## Prerequisites -* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider, gcushare has two components: gcushare-scheduler-plugin and gcushare-device-plugin, we only need gcushare-device-plugin here ) +* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider; gcushare has two components, gcushare-scheduler-plugin and gcushare-device-plugin, and only gcushare-device-plugin is needed here) * driver version >= 1.2.3.14 * kubernetes >= 1.24 * enflame-container-toolkit >=2.0.50 diff --git a/versioned_docs/version-v2.7.0/userguide/hygon-device/enable-hygon-dcu-sharing.md b/versioned_docs/version-v2.7.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index c0a950fa..0fb0f489 100644 --- a/versioned_docs/version-v2.7.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated with a certain device memory size on a certain type (i.e., Z100), and HAMi ensures it does not exceed the boundary.
-***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated with a certain percentage of device cores (i.e., hygon.com/dcucores:60 indicates this container uses 60% of the device's compute cores) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting the "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. ## Prerequisites diff --git a/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 9771b6bf..833a3ecc 100644 --- a/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -5,17 +5,17 @@ title: Enable Illuvatar GPU Sharing ## Introduction -**We now support iluvatar.ai/gpu(i.e MR-V100、BI-V150、BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports iluvatar.ai/gpu (i.e., MR-V100, BI-V150, BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary.
+**Device Memory Control**: GPUs can be allocated with a certain device memory size, and HAMi ensures it does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores, and HAMi ensures it does not exceed the boundary. -***Device UUID Selection***: You can specify which GPU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.7.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md b/versioned_docs/version-v2.7.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md index f1a254d6..0a4f8b9b 100644 --- a/versioned_docs/version-v2.7.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md +++ b/versioned_docs/version-v2.7.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md @@ -6,11 +6,11 @@ title: Enable Kunlunxin VXPU This component supports multiplexing Kunlunxin XPU devices (P800-OAM) and provides the following vGPU-like multiplexing capabilities, Special thanks for rise-union and kunlunxin for contributing: -***XPU Sharing***: Each task can occupy only a portion of the device, allowing multiple tasks to share a single XPU +**XPU Sharing**: Each task can occupy only a portion of the device, allowing multiple tasks to share a single XPU -***Memory Allocation Limits***: You can now allocate XPUs using memory values (e.g., 24576M), and the component ensures that tasks do not exceed the allocated memory limit +**Memory Allocation Limits**: You
can now allocate XPUs using memory values (e.g., 24576M), and the component ensures that tasks do not exceed the allocated memory limit -***Device UUID Selection***: You can specify to use or exclude specific XPU devices through annotations +**Device UUID Selection**: You can specify which XPU devices to use or exclude through annotations ## Prerequisites diff --git a/versioned_docs/version-v2.7.0/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.7.0/userguide/monitoring/device-allocation.md index 03388648..f615db70 100644 --- a/versioned_docs/version-v2.7.0/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.7.0/userguide/monitoring/device-allocation.md @@ -24,4 +24,4 @@ It contains the following metrics: | QuotaUsed | resourcequota usage for a certain device | `{quotaName="nvidia.com/gpucores", quotanamespace="default",limit="200",zone="vGPU"}` 100 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods (This metric will be deprecated in v2.8.0, use vGPUMemoryAllocated and vGPUCoreAllocated instead.)| `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. \ No newline at end of file +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that part, see real-time device usage.
\ No newline at end of file diff --git a/versioned_docs/version-v2.7.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.7.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index abbe2ca2..4f6d07a4 100644 --- a/versioned_docs/version-v2.7.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated with a certain device memory size on a certain type (i.e., MTT S4000), and HAMi ensures it does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores on a certain type (i.e., MTT S4000), and HAMi ensures it does not exceed the boundary.
## Important Notes diff --git a/versioned_docs/version-v2.7.0/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.7.0/userguide/nvidia-device/dynamic-mig-support.md index bea5435b..954348ee 100644 --- a/versioned_docs/version-v2.7.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.7.0/userguide/nvidia-device/dynamic-mig-support.md @@ -130,7 +130,7 @@ nvidia: :::note Helm installations and updates will follow the configuration specified in this file, overriding the default Helm settings. -Please note that HAMi will identify and use the first MIG template that matches the job, in the order defined in this configMap. +HAMi identifies and uses the first MIG template that matches the job, in the order defined in this configMap. ::: ## Running MIG jobs diff --git a/versioned_docs/version-v2.7.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.7.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index 397e984f..946f0e50 100644 --- a/versioned_docs/version-v2.7.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.7.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types; use a comma to separate them. In this example, the job targets A100 or V100* diff --git a/versioned_docs/version-v2.8.0/contributor/adopters.md b/versioned_docs/version-v2.8.0/contributor/adopters.md index 8c521c76..0d87e5eb 100644 --- a/versioned_docs/version-v2.8.0/contributor/adopters.md +++ b/versioned_docs/version-v2.8.0/contributor/adopters.md @@ -1,12 +1,12 @@ # HAMi Adopters -So you and your organisation are using HAMi? That's great. We would love to hear from you!
💖 +HAMi is used in production by the organisations listed below. ## Adding yourself [Here](https://github.com/Project-HAMi/website/blob/master/src/pages/adopters.mdx) lists the organisations who adopted the HAMi project in production. -You just need to add an entry for your company and upon merging it will automatically be added to our website. +Add an entry for your company - it will be added to the website once the PR merges. To add your organisation follow these steps: @@ -25,4 +25,4 @@ To add your organisation follow these steps: 6. Push the commit with `git push origin main`. 7. Open a Pull Request to [HAMi-io/website](https://github.com/Project-HAMi/website) and a preview build will turn up. -Thanks a lot for being part of our community - we very much appreciate it! +Thanks to all adopters for being part of the community! diff --git a/versioned_docs/version-v2.8.0/contributor/contribute-docs.md b/versioned_docs/version-v2.8.0/contributor/contribute-docs.md index 179e55a8..c084d8f6 100644 --- a/versioned_docs/version-v2.8.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.8.0/contributor/contribute-docs.md @@ -9,14 +9,14 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. - 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, +- The docs use markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial.
-- We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. +- The site uses [Docusaurus 2](https://docusaurus.io/), a modern static website generator. ## Setup @@ -88,7 +88,7 @@ title: A doc with tags ``` The top section between two lines of --- is the Front Matter section. -Here we define a couple of entries which tell Docusaurus how to handle the article: +These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document @@ -106,7 +106,7 @@ You can easily route to other places by adding any of the following links: You can use relative paths to index the corresponding files. - Link to pictures or other resources. If your article contains images, prefer storing them in `/static/img/docs/` and linking - with absolute paths. We use language-aware folders: + with absolute paths. Language-aware folders are used: - `/static/img/docs/common/` for shared images - `/static/img/docs/en/` for English-only images - `/static/img/docs/zh/` for Chinese-only images @@ -202,6 +202,6 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version +Newly supplemented documents of each version are synchronized to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +For errata found in the documentation, fixes are applied with every release.
diff --git a/versioned_docs/version-v2.8.0/contributor/contributing.md b/versioned_docs/version-v2.8.0/contributor/contributing.md index fab6d31c..0c88de85 100644 --- a/versioned_docs/version-v2.8.0/contributor/contributing.md +++ b/versioned_docs/version-v2.8.0/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -20,7 +20,7 @@ HAMi is a community project driven by its community which strives to promote a h ## Your First Contribution -We will help you to contribute in different areas like filing issues, developing features, fixing critical bugs and +Help is available for contributing in areas like filing issues, developing features, fixing critical bugs and getting your work reviewed and merged. If you have questions about the development process, @@ -28,7 +28,7 @@ feel free to [file an issue](https://github.com/Project-HAMi/HAMi/issues/new/cho ## Find something to work on -We are always in need of help, be it fixing documentation, reporting bugs or writing some code. +Help is always welcome - fixing documentation, reporting bugs, writing code. Look at places where you feel best coding practices aren't followed, code refactoring is needed or tests are missing. Here is how you get started. @@ -40,18 +40,18 @@ For example, [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi) has [help wanted](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) and [good first issue](https://github.com/Project-HAMi/HAMi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) labels for issues that should not need deep knowledge of the system. -We can help new contributors who wish to work on such issues. 
+Maintainers can help new contributors who wish to work on such issues. Another good way to contribute is to find a documentation improvement, such as a missing/broken link. Please see [Contributor Workflow](#contributor-workflow) below for the workflow. #### Work on an issue -When you are willing to take on an issue, just reply on the issue. The maintainer will assign it to you. +When you are willing to take on an issue, reply on the issue. The maintainer will assign it to you. ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). diff --git a/versioned_docs/version-v2.8.0/contributor/github-workflow.md b/versioned_docs/version-v2.8.0/contributor/github-workflow.md index 8582a392..a362a3b5 100644 --- a/versioned_docs/version-v2.8.0/contributor/github-workflow.md +++ b/versioned_docs/version-v2.8.0/contributor/github-workflow.md @@ -110,7 +110,7 @@ in a few cycles. ## Push -When ready to review (or just to establish an offsite backup of your work), +When ready to review (or to establish an offsite backup of your work), push your branch to your fork on `github.com`: ```sh diff --git a/versioned_docs/version-v2.8.0/contributor/governance.md b/versioned_docs/version-v2.8.0/contributor/governance.md index f49b23b7..aaf1e568 100644 --- a/versioned_docs/version-v2.8.0/contributor/governance.md +++ b/versioned_docs/version-v2.8.0/contributor/governance.md @@ -19,7 +19,7 @@ The HAMi and its leadership embrace the following values: priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. 
-* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. * Participation: Responsibilities within the project are earned through diff --git a/versioned_docs/version-v2.8.0/contributor/ladder.md b/versioned_docs/version-v2.8.0/contributor/ladder.md index b9f51fbc..9b329c66 100644 --- a/versioned_docs/version-v2.8.0/contributor/ladder.md +++ b/versioned_docs/version-v2.8.0/contributor/ladder.md @@ -4,7 +4,7 @@ title: Contributor Ladder This doc describes different ways to get involved and level up within the project. You can see different roles within the project in the contributor roles. -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them.
@@ -45,7 +45,7 @@ Description: A Contributor contributes directly to the project and adds value to * Invitations to contributor events * Eligible to become an Organization Member -A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. We wouldn't be where we are today without your contributions. Thank you! 💖 +A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. Thanks to everyone who contributed and helped maintain the project. As long as you contribute to HAMi, your name will be added [here](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. @@ -126,7 +126,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github ### An active maintainer should -* Actively participate in reviewing pull requests and incoming issues. Note that there are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. +* Actively participate in reviewing pull requests and incoming issues. There are no hard rules on what is “active enough” and this is left up to the judgement of the current group of maintainers. * Actively participate in discussions about design and the future of the project. @@ -140,7 +140,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. 
-When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. ### Removing Maintainers diff --git a/versioned_docs/version-v2.8.0/developers/dynamic-mig.md b/versioned_docs/version-v2.8.0/developers/dynamic-mig.md index ebe587f2..0d257953 100644 --- a/versioned_docs/version-v2.8.0/developers/dynamic-mig.md +++ b/versioned_docs/version-v2.8.0/developers/dynamic-mig.md @@ -9,8 +9,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS, and MIG. The context switch for time-slice sharing wastes time, so MPS and MIG are preferred. The GPU MIG profile is variable: the user could acquire the MIG device in the profile definition, but the current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG.
The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world. A unified API for dynamic-mig and hami-core is needed. ## Targets @@ -149,7 +149,7 @@ The Procedure of a vGPU task which uses dynamic-mig is shown below: <img src="https://github.com/Project-HAMi/HAMi/blob/master/docs/develop/imgs/hami-dynamic-mig-procedure.png?raw=true" width="800" alt="HAMi dynamic MIG procedure flowchart showing task scheduling process" /> -Note that after submitted a task, deviceshare plugin will iterate over templates defined in configMap `hami-scheduler-device`, and find the first available template to fit. You can always change the content of that configMap, and restart vc-scheduler to customize. +After a task is submitted, the deviceshare plugin iterates over the templates defined in configMap `hami-scheduler-device` and picks the first available template that fits. You can always change the content of that configMap, and restart vc-scheduler to customize. If you submit the example on an empty A100-PCIE-40GB node, then it will select a GPU and choose MIG template below: diff --git a/versioned_docs/version-v2.8.0/developers/kunlunxin-topology.md b/versioned_docs/version-v2.8.0/developers/kunlunxin-topology.md index eddcf225..c48e25a8 100644 --- a/versioned_docs/version-v2.8.0/developers/kunlunxin-topology.md +++ b/versioned_docs/version-v2.8.0/developers/kunlunxin-topology.md @@ -30,7 +30,7 @@ The selection process is shown below: ## Score In the scoring phase, all filtered nodes are evaluated and scored to select the optimal one -for scheduling. We introduce a metric called **MTF** (Minimized Tasks to Fill), +for scheduling.
The metric used is **MTF** (Minimized Tasks to Fill), which quantifies how well a node can accommodate future tasks after allocation. The table below shows examples of XPU occupation and proper MTF values: diff --git a/versioned_docs/version-v2.8.0/developers/protocol.md b/versioned_docs/version-v2.8.0/developers/protocol.md index d6aead03..af093054 100644 --- a/versioned_docs/version-v2.8.0/developers/protocol.md +++ b/versioned_docs/version-v2.8.0/developers/protocol.md @@ -31,7 +31,7 @@ hami.io/node-nvidia-register: GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec,10,32768, In this example, this node has two different AI devices, 2 Nvidia-V100 GPUs, and 2 Cambircon 370-X4 MLUs -Note that a device node may become unavailable due to hardware or network failure, if a node hasn't registered in last 5 minutes, scheduler will mark that node as 'unavailable'. +A device node may become unavailable due to hardware or network failure; if a node hasn't registered in the last 5 minutes, the scheduler will mark that node as 'unavailable'. Since system clock on scheduler node and 'device' node may not align properly, scheduler node will patch the following device node annotations every 30s diff --git a/versioned_docs/version-v2.8.0/developers/scheduling.md b/versioned_docs/version-v2.8.0/developers/scheduling.md index 02f1cc9c..db9d5662 100644 --- a/versioned_docs/version-v2.8.0/developers/scheduling.md +++ b/versioned_docs/version-v2.8.0/developers/scheduling.md @@ -8,7 +8,7 @@ Current in a cluster with many GPU nodes, nodes are not `binpack` or `spread` wh ## Proposal -We add a `node-scheduler-policy` and `gpu-scheduler-policy` to config, then scheduler to use this policy can impl node `binpack` or `spread` or GPU `binpack` or `spread`. and +The scheduler adds `node-scheduler-policy` and `gpu-scheduler-policy` options to its config, which implement node-level `binpack` or `spread` and GPU-level `binpack` or `spread`.
Users can set the Pod annotations `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` to override this default policy in the scheduler config. ### User Stories @@ -105,7 +105,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -127,7 +127,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -153,7 +153,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -175,4 +175,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/versioned_docs/version-v2.8.0/faq/faq.md b/versioned_docs/version-v2.8.0/faq/faq.md index f5ee5c04..f661e687 100644 --- a/versioned_docs/version-v2.8.0/faq/faq.md +++ b/versioned_docs/version-v2.8.0/faq/faq.md @@ -41,7 +41,7 @@ A vGPU is a logical instance of a physical GPU created using virtualization, all 4. **Design Intent** The design of vGPU aims to **allow one GPU to be shared by multiple tasks**, rather than letting one task occupy multiple vGPUs on the same GPU. The purpose of vGPU overcommitment is to improve GPU utilization, not to increase resource allocation for individual tasks. -## HAMi's `nvidia.com/priority` field only supports two levels. How can we implement multi-level, user-defined priority-based scheduling for a queue of jobs, especially when cluster resources are limited? +## HAMi's `nvidia.com/priority` field only supports two levels.
How can multi-level, user-defined priority-based scheduling be implemented for a queue of jobs, especially when cluster resources are limited? **TL;DR** @@ -62,7 +62,7 @@ However, achieving multi-level priority scheduling **is feasible**. The recommen 1. HAMi integrates with Volcano via the [volcano-vgpu-device-plugin](https://github.com/Project-HAMi/volcano-vgpu-device-plugin). 2. It continues to manage the vGPU sharing and its own two-level runtime priority for tasks contending on the *same physical GPU*, as described earlier. -In summary, while HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption. +While HAMi's own priority serves a different, device-specific purpose (runtime preemption on a single card), implementing multi-level job scheduling priority is achievable by using **Volcano in conjunction with HAMi**. Volcano would handle which job from the queue is prioritized for resource allocation based on multiple priority levels, and HAMi would manage the GPU sharing and its specific on-device preemption.
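The node and GPU scoring used by the `binpack`/`spread` policies in the scheduling.md hunk above can be sketched as follows. This is an illustrative sketch only; the function names are assumptions for this example, not the actual HAMi scheduler code:

```python
def node_score(requested, used, allocatable):
    # Fraction of the node's capacity occupied after placement, scaled to [0, 10].
    return (requested + used) / allocatable * 10

def gpu_score(core_req, core_used, mem_req, mem_used, total_mem):
    # Core occupancy (out of 100%) plus memory occupancy, each scaled to 10.
    return ((core_req + core_used) / 100 + (mem_req + mem_used) / total_mem) * 10

def pick(scores, policy):
    # `binpack` prefers the fullest candidate; `spread` prefers the emptiest.
    choose = max if policy == "binpack" else min
    return choose(scores, key=scores.get)

# Reproduces the worked examples in the hunk:
# Node1 = 10, Node2 = 7.5, GPU1 = 6.75, GPU2 = 17.75.
nodes = {"Node1": node_score(1, 3, 4), "Node2": node_score(1, 2, 4)}
gpus = {"GPU1": gpu_score(20, 10, 1000, 2000, 8000),
        "GPU2": gpu_score(20, 70, 1000, 6000, 8000)}
```

With these inputs, `pick(nodes, "binpack")` returns `Node1` and `pick(gpus, "spread")` returns `GPU1`, matching the conclusions in the hunk.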
## Integration with Other Open-Source Tools diff --git a/versioned_docs/version-v2.8.0/key-features/device-sharing.md b/versioned_docs/version-v2.8.0/key-features/device-sharing.md index 29bed9d2..19797b9e 100644 --- a/versioned_docs/version-v2.8.0/key-features/device-sharing.md +++ b/versioned_docs/version-v2.8.0/key-features/device-sharing.md @@ -2,7 +2,7 @@ title: Device sharing --- -HAMi offers robust device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, +HAMi provides device-sharing capabilities, enabling multiple tasks to share the same GPU, MLU, or NPU device, maximizing the utilization of heterogeneous AI computing resources. ## Device Sharing {#device-sharing} diff --git a/versioned_docs/version-v2.8.0/userguide/device-supported.md b/versioned_docs/version-v2.8.0/userguide/device-supported.md index c76adc36..446a8e39 100644 --- a/versioned_docs/version-v2.8.0/userguide/device-supported.md +++ b/versioned_docs/version-v2.8.0/userguide/device-supported.md @@ -6,13 +6,13 @@ The table below lists the devices supported by HAMi: | Type | Manufactor | Models | MemoryIsolation | CoreIsolation | MultiCard Support | |------|------------|------|-----------------|---------------|-------------------| -| GPU | NVIDIA | All | ✅ | ✅ | ✅ | -| MLU | Cambricon | 370, 590 | ✅ | ✅ | ❌ | -| DCU | Hygon | Z100, Z100L | ✅ | ✅ | ❌ | -| NPU | Huawei Ascend | 910B, 910B3, 310P | ✅ | ✅ | ❌ | -| GPU | Iluvatar | All | ✅ | ✅ | ❌ | -| GPU | Mthreads | MTT S4000 | ✅ | ✅ | ❌ | -| GPU | Metax | MXC500 | ✅ | ✅ | ❌ | -| GCU | Enflame | S60 | ✅ | ✅ | ❌ | -| XPU | Kunlunxin | P800 | ✅ | ✅ | ❌ | -| DPU | Teco | Checking | In progress | In progress | ❌ | +| GPU | NVIDIA | All | Yes | Yes | Yes | +| MLU | Cambricon | 370, 590 | Yes | Yes | No | +| DCU | Hygon | Z100, Z100L | Yes | Yes | No | +| NPU | Huawei Ascend | 910B, 910B3, 310P | Yes | Yes | No | +| GPU | Iluvatar | All | Yes | Yes | No | +| GPU | Mthreads | MTT S4000 | Yes | Yes | No | +| GPU | 
Metax | MXC500 | Yes | Yes | No | +| GCU | Enflame | S60 | Yes | Yes | No | +| XPU | Kunlunxin | P800 | Yes | Yes | No | +| DPU | Teco | Checking | In progress | In progress | No | diff --git a/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index c95ca130..0e04d9a4 100644 --- a/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -5,19 +5,19 @@ title: Enable Enflame GPU Sharing ## Introduction -**We now support sharing on enflame.com/gcu(i.e S60) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports sharing on enflame.com/gcu (i.e. S60) by implementing most device-sharing features as nvidia-GPU**, including: -***GCU sharing***: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. +**GCU sharing**: Each task can allocate a portion of GCU instead of a whole GCU card, thus GCU can be shared among multiple tasks. -***Device Memory and Core Control***: GCUs can be allocated with certain percentage of device memory and core, we make sure that it does not exceed the boundary. +**Device Memory and Core Control**: GCUs can be allocated with a certain percentage of device memory and cores; HAMi ensures the allocation does not exceed the boundary. -***Device UUID Selection***: You can specify which GCU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler.
All your GPU jobs will be automatically supported after installation. ## Prerequisites -* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider, gcushare has two components: gcushare-scheduler-plugin and gcushare-device-plugin, we only need gcushare-device-plugin here ) +* Enflame gcushare-device-plugin >= 2.1.6 (please consult your device provider; gcushare has two components, gcushare-scheduler-plugin and gcushare-device-plugin, and only gcushare-device-plugin is needed here) * driver version >= 1.2.3.14 * kubernetes >= 1.24 * enflame-container-toolkit >=2.0.50 diff --git a/versioned_docs/version-v2.8.0/userguide/hygon-device/enable-hygon-dcu-sharing.md b/versioned_docs/version-v2.8.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index a721ecde..401dae8f 100644 --- a/versioned_docs/version-v2.8.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,15 +4,15 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: -***DCU sharing***: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. +**DCU sharing**: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. -***Device Memory Control***: DCUs can be allocated with certain device memory size on certain type(i.e Z100) and have made it that it does not exceed the boundary. +**Device Memory Control**: DCUs can be allocated with a certain device memory size on a certain type (i.e. Z100); HAMi ensures the allocation does not exceed the boundary.
-***Device compute core limitation***: DCUs can be allocated with certain percentage of device core(i.e hygon.com/dcucores:60 indicate this container uses 60% compute cores of this device) +**Device compute core limitation**: DCUs can be allocated with a certain percentage of device cores (i.e. hygon.com/dcucores:60 indicates this container uses 60% of the compute cores of this device) -***DCU Type Specification***: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. +**DCU Type Specification**: You can specify which type of DCU to use or to avoid for a certain task, by setting "hygon.com/use-dcutype" or "hygon.com/nouse-dcutype" annotations. ## Prerequisites diff --git a/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 3462da64..27b48aa8 100644 --- a/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -4,17 +4,17 @@ title: Enable Illuvatar GPU Sharing ## Introduction -**We now support iluvatar.ai/gpu(i.e MR-V100, BI-V150, BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports iluvatar.ai/gpu (i.e. MR-V100, BI-V150, BI-V100) by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size and have made it that it does not exceed the boundary.
+**Device Memory Control**: GPUs can be allocated with a certain device memory size; HAMi ensures the allocation does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores; HAMi ensures the allocation does not exceed the boundary. -***Device UUID Selection***: You can specify which GPU devices to use or exclude using annotations. +**Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -***Very Easy to use***: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.8.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md b/versioned_docs/version-v2.8.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md index 217313d5..13870bb2 100644 --- a/versioned_docs/version-v2.8.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md +++ b/versioned_docs/version-v2.8.0/userguide/kunlunxin-device/enable-kunlunxin-vxpu.md @@ -6,11 +6,11 @@ title: Enable Kunlunxin VXPU This component supports multiplexing Kunlunxin XPU devices (P800-OAM) and provides the following vGPU-like multiplexing capabilities, Special thanks for rise-union and kunlunxin for contributing: -***XPU Sharing***: Each task can occupy only a portion of the device, allowing multiple tasks to share a single XPU +**XPU Sharing**: Each task can occupy only a portion of the device, allowing multiple tasks to share a single XPU -***Memory Allocation Limits***: You can now allocate XPUs using memory values (e.g., 24576M), and the component ensures that tasks do not exceed the allocated memory limit +**Memory Allocation Limits**: You
can now allocate XPUs using memory values (e.g., 24576M), and the component ensures that tasks do not exceed the allocated memory limit -***Device UUID Selection***: You can specify to use or exclude specific XPU devices through annotations +**Device UUID Selection**: You can specify to use or exclude specific XPU devices through annotations ## Prerequisites diff --git a/versioned_docs/version-v2.8.0/userguide/monitoring/device-allocation.md b/versioned_docs/version-v2.8.0/userguide/monitoring/device-allocation.md index 8fb12d60..3dee7c8b 100644 --- a/versioned_docs/version-v2.8.0/userguide/monitoring/device-allocation.md +++ b/versioned_docs/version-v2.8.0/userguide/monitoring/device-allocation.md @@ -23,4 +23,4 @@ It contains the following metrics: | QuotaUsed | resourcequota usage for a certain device | `{quotaName="nvidia.com/gpucores", quotanamespace="default",limit="200",zone="vGPU"}` 100 | | vGPUPodsDeviceAllocated | vGPU Allocated from pods (This metric will be deprecated in v2.8.0, use vGPUMemoryAllocated and vGPUCoreAllocated instead.)| `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 | -> **Note** Please note that, this is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage. +> **Note** This is an overview of device allocation; it is NOT real-time device usage metrics. For that part, see real-time device usage.
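The memory-limit guarantee described in these device guides (a task gets a fixed device-memory budget, e.g. 24576M, and allocations past it are rejected) can be illustrated with a minimal sketch. The class name and method are assumptions for this example, not the actual HAMi-core or vendor-plugin implementation:

```python
class DeviceMemoryLimiter:
    """Tracks a task's device-memory use against a fixed cap."""

    def __init__(self, limit_mb):
        self.limit_mb = limit_mb
        self.used_mb = 0

    def alloc(self, size_mb):
        # Reject any allocation that would push the task past its cap.
        if self.used_mb + size_mb > self.limit_mb:
            return False
        self.used_mb += size_mb
        return True

limiter = DeviceMemoryLimiter(24576)  # the 24576M example from the XPU guide
```

With this cap, an allocation of 20000M succeeds, a further 5000M is rejected because it would exceed 24576M, and a 4576M allocation then fills the budget exactly.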
diff --git a/versioned_docs/version-v2.8.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/versioned_docs/version-v2.8.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 4dbe403f..0b29873e 100644 --- a/versioned_docs/version-v2.8.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,13 +4,13 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: -***GPU sharing***: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. +**GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. -***Device Memory Control***: GPUs can be allocated with certain device memory size on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Memory Control**: GPUs can be allocated with a certain device memory size on a certain type (i.e. MTT S4000); HAMi ensures the allocation does not exceed the boundary. -***Device Core Control***: GPUs can be allocated with limited compute cores on certain type(i.e MTT S4000) and have made it that it does not exceed the boundary. +**Device Core Control**: GPUs can be allocated with limited compute cores on a certain type (i.e. MTT S4000); HAMi ensures the allocation does not exceed the boundary.
## Important Notes diff --git a/versioned_docs/version-v2.8.0/userguide/nvidia-device/dynamic-mig-support.md b/versioned_docs/version-v2.8.0/userguide/nvidia-device/dynamic-mig-support.md index bea5435b..954348ee 100644 --- a/versioned_docs/version-v2.8.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/versioned_docs/version-v2.8.0/userguide/nvidia-device/dynamic-mig-support.md @@ -130,7 +130,7 @@ nvidia: :::note Helm installations and updates will follow the configuration specified in this file, overriding the default Helm settings. -Please note that HAMi will identify and use the first MIG template that matches the job, in the order defined in this configMap. +HAMi identifies and uses the first MIG template that matches the job, in the order defined in this configMap. ::: ## Running MIG jobs diff --git a/versioned_docs/version-v2.8.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/versioned_docs/version-v2.8.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index a2f0072a..0cda86a7 100644 --- a/versioned_docs/version-v2.8.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/versioned_docs/version-v2.8.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -22,4 +22,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** *You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* +> **NOTICE:** *You can assign this task to multiple GPU types; use commas to separate them. In this example, the job targets A100 or V100* From d682dee52435a8b472347b330d167e3159b176d1 Mon Sep 17 00:00:00 2001 From: mesutoezdil <mesudozdil@gmail.com> Date: Thu, 7 May 2026 22:17:31 +0200 Subject: [PATCH 2/3] docs: fix remaining emoji in i18n/zh and first-person in userguide and contributor docs Signed-off-by: mesutoezdil <mesudozdil@gmail.com> --- .../current/contributor/adopters.md | 2 +- .../current/contributor/ladder.md | 2 +-
.../version-v2.5.0/contributor/adopters.md | 2 +- .../version-v2.5.0/contributor/ladder.md | 2 +- .../version-v2.5.1/contributor/adopters.md | 2 +- .../version-v2.5.1/contributor/ladder.md | 2 +- .../version-v2.6.0/contributor/adopters.md | 2 +- .../version-v2.6.0/contributor/ladder.md | 2 +- .../version-v2.7.0/contributor/adopters.md | 2 +- .../version-v2.7.0/contributor/ladder.md | 2 +- .../version-v2.8.0/contributor/adopters.md | 2 +- .../version-v2.8.0/contributor/ladder.md | 2 +- versioned_docs/version-v1.3.0/contributor/governance.md | 2 +- versioned_docs/version-v1.3.0/get-started/nginx-example.md | 4 ++-- versioned_docs/version-v1.3.0/installation/prerequisites.md | 2 +- .../cambricon-device/enable-cambricon-mlu-sharing.md | 2 +- versioned_docs/version-v2.4.1/contributor/cherry-picks.md | 2 +- .../version-v2.4.1/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.4.1/contributor/governance.md | 2 +- versioned_docs/version-v2.4.1/get-started/nginx-example.md | 4 ++-- versioned_docs/version-v2.4.1/installation/prerequisites.md | 2 +- .../cambricon-device/enable-cambricon-mlu-sharing.md | 2 +- versioned_docs/version-v2.5.0/contributor/cherry-picks.md | 2 +- .../version-v2.5.0/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.5.0/contributor/governance.md | 2 +- versioned_docs/version-v2.5.0/get-started/nginx-example.md | 4 ++-- versioned_docs/version-v2.5.0/installation/prerequisites.md | 2 +- .../cambricon-device/enable-cambricon-mlu-sharing.md | 2 +- .../userguide/enflame-device/enable-enflame-gpu-sharing.md | 2 +- .../iluvatar-device/enable-iluvatar-gpu-sharing.md | 2 +- versioned_docs/version-v2.5.1/contributor/cherry-picks.md | 2 +- .../version-v2.5.1/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.5.1/contributor/governance.md | 2 +- versioned_docs/version-v2.5.1/get-started/nginx-example.md | 4 ++-- versioned_docs/version-v2.5.1/installation/prerequisites.md | 2 +- 
.../cambricon-device/enable-cambricon-mlu-sharing.md | 2 +- versioned_docs/version-v2.6.0/contributor/cherry-picks.md | 2 +- .../version-v2.6.0/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.6.0/contributor/governance.md | 2 +- versioned_docs/version-v2.6.0/get-started/nginx-example.md | 4 ++-- versioned_docs/version-v2.6.0/installation/prerequisites.md | 2 +- .../userguide/enflame-device/enable-enflame-gcu-sharing.md | 2 +- .../iluvatar-device/enable-illuvatar-gpu-sharing.md | 2 +- versioned_docs/version-v2.7.0/contributor/cherry-picks.md | 2 +- .../version-v2.7.0/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.7.0/contributor/governance.md | 2 +- .../version-v2.7.0/get-started/deploy-with-helm.md | 4 ++-- versioned_docs/version-v2.7.0/installation/prerequisites.md | 2 +- .../userguide/enflame-device/enable-enflame-gcu-sharing.md | 2 +- .../iluvatar-device/enable-illuvatar-gpu-sharing.md | 2 +- versioned_docs/version-v2.8.0/contributor/cherry-picks.md | 2 +- .../version-v2.8.0/contributor/contribute-docs.md | 6 +++--- versioned_docs/version-v2.8.0/contributor/governance.md | 2 +- .../version-v2.8.0/get-started/deploy-with-helm.md | 4 ++-- versioned_docs/version-v2.8.0/get-started/verify-hami.md | 2 +- versioned_docs/version-v2.8.0/installation/prerequisites.md | 2 +- .../userguide/enflame-device/enable-enflame-gcu-sharing.md | 2 +- .../iluvatar-device/enable-illuvatar-gpu-sharing.md | 2 +- 58 files changed, 77 insertions(+), 77 deletions(-) diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/current/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git 
a/i18n/zh/docusaurus-plugin-content-docs/current/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/current/contributor/ladder.md index 064432c8..de986dd7 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/contributor/ladder.md @@ -48,7 +48,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/ladder.md index d7b754ce..a5d5af4a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.0/contributor/ladder.md @@ -46,7 +46,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 
只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/ladder.md index 590d4244..d0a336d6 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.5.1/contributor/ladder.md @@ -46,7 +46,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 
只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/ladder.md index 73fffe0b..87e6c71f 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.6.0/contributor/ladder.md @@ -47,7 +47,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 
只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/ladder.md index 590d4244..d0a336d6 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.7.0/contributor/ladder.md @@ -46,7 +46,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 
只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/adopters.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/adopters.md index d60446a0..3b14125a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/adopters.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/adopters.md @@ -4,7 +4,7 @@ title: HAMi 采用者 # HAMi 采用者 -你和你的组织正在使用 HAMi?太棒了!我们很乐意听到你的使用反馈!💖 +你和你的组织正在使用 HAMi?太棒了!请通过 GitHub 提交使用信息。 ## 添加你的信息 diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/ladder.md index f5ff776c..7cce6bc0 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.8.0/contributor/ladder.md @@ -44,7 +44,7 @@ translated: true * 受邀参加贡献者活动 * 有资格成为组织成员 -特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢!💖 +特别感谢[长长的名单](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)中那些为项目做出贡献并帮助维护项目的人。没有你们的贡献,我们不会有今天的成就。谢谢! 只要你为 HAMi 做出贡献,你的名字将被添加到[这里](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md)。如果你没有找到你的名字,联系我们添加。 diff --git a/versioned_docs/version-v1.3.0/contributor/governance.md b/versioned_docs/version-v1.3.0/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v1.3.0/contributor/governance.md +++ b/versioned_docs/version-v1.3.0/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. 
-* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v1.3.0/get-started/nginx-example.md b/versioned_docs/version-v1.3.0/get-started/nginx-example.md index 8bf574bc..b82231ce 100644 --- a/versioned_docs/version-v1.3.0/get-started/nginx-example.md +++ b/versioned_docs/version-v1.3.0/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ``` kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ``` helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v1.3.0/installation/prerequisites.md b/versioned_docs/version-v1.3.0/installation/prerequisites.md index 13666bb8..4e52d10e 100644 --- a/versioned_docs/version-v1.3.0/installation/prerequisites.md +++ b/versioned_docs/version-v1.3.0/installation/prerequisites.md @@ -82,7 +82,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
``` kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index 92b3f2fb..51980675 100644 --- a/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.4.1/contributor/cherry-picks.md b/versioned_docs/version-v2.4.1/contributor/cherry-picks.md index 80a58e56..c3edd9eb 100644 --- a/versioned_docs/version-v2.4.1/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.4.1/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. 
Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.4.1/contributor/contribute-docs.md b/versioned_docs/version-v2.4.1/contributor/contribute-docs.md index 7ebbb69c..a997275b 100644 --- a/versioned_docs/version-v2.4.1/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.4.1/contributor/contribute-docs.md @@ -18,7 +18,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -120,7 +120,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). +The document organization can be found in [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js module.exports = { @@ -168,7 +168,7 @@ If you add a document, you must add it to `sidebars.js` to make it display prope There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). +- You want to translate the existing English docs to Chinese. 
In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer. `current.json` holds translations for the documentation directory. You can edit it if you want to translate the name of directory. - You want to contribute Chinese docs without English version. Any articles of any kind are welcomed. In this case, you can add articles and titles to the main directory first. Article content can be TBD first, like this. Then add the corresponding Chinese content to the Chinese directory. diff --git a/versioned_docs/version-v2.4.1/contributor/governance.md b/versioned_docs/version-v2.4.1/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.4.1/contributor/governance.md +++ b/versioned_docs/version-v2.4.1/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v2.4.1/get-started/nginx-example.md b/versioned_docs/version-v2.4.1/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/versioned_docs/version-v2.4.1/get-started/nginx-example.md +++ b/versioned_docs/version-v2.4.1/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. 
Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.4.1/installation/prerequisites.md b/versioned_docs/version-v2.4.1/installation/prerequisites.md index 13666bb8..4e52d10e 100644 --- a/versioned_docs/version-v2.4.1/installation/prerequisites.md +++ b/versioned_docs/version-v2.4.1/installation/prerequisites.md @@ -82,7 +82,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ``` kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index 92b3f2fb..51980675 100644 --- a/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. 
-**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/contributor/cherry-picks.md b/versioned_docs/version-v2.5.0/contributor/cherry-picks.md index ca4c976b..a7a96275 100644 --- a/versioned_docs/version-v2.5.0/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.5.0/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.5.0/contributor/contribute-docs.md b/versioned_docs/version-v2.5.0/contributor/contribute-docs.md index 6b2ba2d5..de6979e5 100644 --- a/versioned_docs/version-v2.5.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.5.0/contributor/contribute-docs.md @@ -20,7 +20,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. 
```shell git clone https://github.com/Project-HAMi/website.git @@ -125,7 +125,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from +The document organization can be found in [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js @@ -175,7 +175,7 @@ If you're not sure where your docs are located, you can ask community members in There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer. diff --git a/versioned_docs/version-v2.5.0/contributor/governance.md b/versioned_docs/version-v2.5.0/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.5.0/contributor/governance.md +++ b/versioned_docs/version-v2.5.0/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. 
diff --git a/versioned_docs/version-v2.5.0/get-started/nginx-example.md b/versioned_docs/version-v2.5.0/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/versioned_docs/version-v2.5.0/get-started/nginx-example.md +++ b/versioned_docs/version-v2.5.0/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.5.0/installation/prerequisites.md b/versioned_docs/version-v2.5.0/installation/prerequisites.md index 13666bb8..4e52d10e 100644 --- a/versioned_docs/version-v2.5.0/installation/prerequisites.md +++ b/versioned_docs/version-v2.5.0/installation/prerequisites.md @@ -82,7 +82,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
``` kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index 92b3f2fb..51980675 100644 --- a/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md index ae019334..64bb24cc 100644 --- a/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/enflame-device/enable-enflame-gpu-sharing.md @@ -12,7 +12,7 @@ title: Enable Enflame GCU sharing **Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. 
## Prerequisites diff --git a/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md b/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md index d37fcefe..69b5e941 100644 --- a/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.5.0/userguide/iluvatar-device/enable-iluvatar-gpu-sharing.md @@ -14,7 +14,7 @@ title: Enable Iluvatar GCU sharing **Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.5.1/contributor/cherry-picks.md b/versioned_docs/version-v2.5.1/contributor/cherry-picks.md index ca4c976b..a7a96275 100644 --- a/versioned_docs/version-v2.5.1/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.5.1/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. 
Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.5.1/contributor/contribute-docs.md b/versioned_docs/version-v2.5.1/contributor/contribute-docs.md index 6b2ba2d5..de6979e5 100644 --- a/versioned_docs/version-v2.5.1/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.5.1/contributor/contribute-docs.md @@ -20,7 +20,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -125,7 +125,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from +The document organization can be found in [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js @@ -175,7 +175,7 @@ If you're not sure where your docs are located, you can ask community members in There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer. 
diff --git a/versioned_docs/version-v2.5.1/contributor/governance.md b/versioned_docs/version-v2.5.1/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.5.1/contributor/governance.md +++ b/versioned_docs/version-v2.5.1/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v2.5.1/get-started/nginx-example.md b/versioned_docs/version-v2.5.1/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/versioned_docs/version-v2.5.1/get-started/nginx-example.md +++ b/versioned_docs/version-v2.5.1/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.5.1/installation/prerequisites.md b/versioned_docs/version-v2.5.1/installation/prerequisites.md index 13666bb8..4e52d10e 100644 --- a/versioned_docs/version-v2.5.1/installation/prerequisites.md +++ b/versioned_docs/version-v2.5.1/installation/prerequisites.md @@ -82,7 +82,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ``` kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index 92b3f2fb..51980675 100644 --- a/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/versioned_docs/version-v2.5.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. 
All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. ## Prerequisites diff --git a/versioned_docs/version-v2.6.0/contributor/cherry-picks.md b/versioned_docs/version-v2.6.0/contributor/cherry-picks.md index ca4c976b..a7a96275 100644 --- a/versioned_docs/version-v2.6.0/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.6.0/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.6.0/contributor/contribute-docs.md b/versioned_docs/version-v2.6.0/contributor/contribute-docs.md index 6b2ba2d5..de6979e5 100644 --- a/versioned_docs/version-v2.6.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.6.0/contributor/contribute-docs.md @@ -20,7 +20,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -125,7 +125,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from +The document organization can be found in [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). 
```js @@ -175,7 +175,7 @@ If you're not sure where your docs are located, you can ask community members in There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer. diff --git a/versioned_docs/version-v2.6.0/contributor/governance.md b/versioned_docs/version-v2.6.0/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.6.0/contributor/governance.md +++ b/versioned_docs/version-v2.6.0/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v2.6.0/get-started/nginx-example.md b/versioned_docs/version-v2.6.0/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/versioned_docs/version-v2.6.0/get-started/nginx-example.md +++ b/versioned_docs/version-v2.6.0/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. 
+Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.6.0/installation/prerequisites.md b/versioned_docs/version-v2.6.0/installation/prerequisites.md index f08ec526..acced9ec 100644 --- a/versioned_docs/version-v2.6.0/installation/prerequisites.md +++ b/versioned_docs/version-v2.6.0/installation/prerequisites.md @@ -60,7 +60,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ``` kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index 7a1bfec9..aedc43c6 100644 --- a/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -13,7 +13,7 @@ title: Enable Enflame GPU Sharing **Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. 
All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 8fb912f3..a6f28a13 100644 --- a/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.6.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -15,7 +15,7 @@ title: Enable Illuvatar GPU Sharing **Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.7.0/contributor/cherry-picks.md b/versioned_docs/version-v2.7.0/contributor/cherry-picks.md index ca4c976b..a7a96275 100644 --- a/versioned_docs/version-v2.7.0/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.7.0/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. 
Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.7.0/contributor/contribute-docs.md b/versioned_docs/version-v2.7.0/contributor/contribute-docs.md index b463684c..612cd955 100644 --- a/versioned_docs/version-v2.7.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.7.0/contributor/contribute-docs.md @@ -20,7 +20,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -125,7 +125,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from +The document organization can be found at [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js @@ -175,7 +175,7 @@ If you're not sure where your docs are located, you can ask community members in There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer.
diff --git a/versioned_docs/version-v2.7.0/contributor/governance.md b/versioned_docs/version-v2.7.0/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.7.0/contributor/governance.md +++ b/versioned_docs/version-v2.7.0/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v2.7.0/get-started/deploy-with-helm.md b/versioned_docs/version-v2.7.0/get-started/deploy-with-helm.md index 80781bbd..16d2b3e7 100644 --- a/versioned_docs/version-v2.7.0/get-started/deploy-with-helm.md +++ b/versioned_docs/version-v2.7.0/get-started/deploy-with-helm.md @@ -99,7 +99,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes {#label-your-nodes} Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". -Without this label, the nodes cannot be managed by our scheduler. +Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on @@ -113,7 +113,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.7.0/installation/prerequisites.md b/versioned_docs/version-v2.7.0/installation/prerequisites.md index 41bfa6d4..c283d5bd 100644 --- a/versioned_docs/version-v2.7.0/installation/prerequisites.md +++ b/versioned_docs/version-v2.7.0/installation/prerequisites.md @@ -59,7 +59,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ```bash kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index 73820996..38ea7d92 100644 --- a/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -13,7 +13,7 @@ title: Enable Enflame GPU Sharing **Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. 
## Prerequisites diff --git a/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 833a3ecc..f0c44339 100644 --- a/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.7.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -15,7 +15,7 @@ title: Enable Illuvatar GPU Sharing **Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.8.0/contributor/cherry-picks.md b/versioned_docs/version-v2.8.0/contributor/cherry-picks.md index ca4c976b..a7a96275 100644 --- a/versioned_docs/version-v2.8.0/contributor/cherry-picks.md +++ b/versioned_docs/version-v2.8.0/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases. 
Such enabling will not be backported to the diff --git a/versioned_docs/version-v2.8.0/contributor/contribute-docs.md b/versioned_docs/version-v2.8.0/contributor/contribute-docs.md index c084d8f6..c41ee7a2 100644 --- a/versioned_docs/version-v2.8.0/contributor/contribute-docs.md +++ b/versioned_docs/version-v2.8.0/contributor/contribute-docs.md @@ -20,7 +20,7 @@ the `Project-HAMi/website` repository. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -125,7 +125,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from +The document organization can be found at [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js @@ -175,7 +175,7 @@ If you're not sure where your docs are located, you can ask community members in There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer.
diff --git a/versioned_docs/version-v2.8.0/contributor/governance.md b/versioned_docs/version-v2.8.0/contributor/governance.md index aaf1e568..95984570 100644 --- a/versioned_docs/version-v2.8.0/contributor/governance.md +++ b/versioned_docs/version-v2.8.0/contributor/governance.md @@ -15,7 +15,7 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. diff --git a/versioned_docs/version-v2.8.0/get-started/deploy-with-helm.md b/versioned_docs/version-v2.8.0/get-started/deploy-with-helm.md index 80781bbd..16d2b3e7 100644 --- a/versioned_docs/version-v2.8.0/get-started/deploy-with-helm.md +++ b/versioned_docs/version-v2.8.0/get-started/deploy-with-helm.md @@ -99,7 +99,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes {#label-your-nodes} Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". -Without this label, the nodes cannot be managed by our scheduler. +Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on @@ -113,7 +113,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/versioned_docs/version-v2.8.0/get-started/verify-hami.md b/versioned_docs/version-v2.8.0/get-started/verify-hami.md index d9807166..98c038cb 100644 --- a/versioned_docs/version-v2.8.0/get-started/verify-hami.md +++ b/versioned_docs/version-v2.8.0/get-started/verify-hami.md @@ -111,7 +111,7 @@ Expected: Both `hami-scheduler` and `vgpu-device-plugin` pods should be in the ` ## Step 3: Launch and Verify a vGPU Task -Let's prove HAMi is enforcing fractional resource limits (vGPU). +HAMi enforces fractional resource limits (vGPU): ### 1. Submit a vGPU demo task diff --git a/versioned_docs/version-v2.8.0/installation/prerequisites.md b/versioned_docs/version-v2.8.0/installation/prerequisites.md index 8a90cbd0..c4670ca9 100644 --- a/versioned_docs/version-v2.8.0/installation/prerequisites.md +++ b/versioned_docs/version-v2.8.0/installation/prerequisites.md @@ -62,7 +62,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on diff --git a/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md b/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md index 0e04d9a4..164b5a11 100644 --- a/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/enflame-device/enable-enflame-gcu-sharing.md @@ -13,7 +13,7 @@ title: Enable Enflame GPU Sharing **Device UUID Selection**: You can specify which GCU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. ## Prerequisites diff --git a/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md b/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md index 27b48aa8..92bed1c7 100644 --- a/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md +++ b/versioned_docs/version-v2.8.0/userguide/iluvatar-device/enable-illuvatar-gpu-sharing.md @@ -14,7 +14,7 @@ title: Enable Illuvatar GPU Sharing **Device UUID Selection**: You can specify which GPU devices to use or exclude using annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your GPU jobs will be automatically supported after installation. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your GPU jobs will be automatically supported after installation. 
## Prerequisites From 4b60680d0f3e8d58282ceb783bba3787199d66ff Mon Sep 17 00:00:00 2001 From: mesutoezdil <mesudozdil@gmail.com> Date: Thu, 7 May 2026 22:20:25 +0200 Subject: [PATCH 3/3] docs: fix first-person language in i18n/zh versioned doc copies Signed-off-by: mesutoezdil <mesudozdil@gmail.com> --- docs/contributor/ladder.md | 2 +- .../version-v1.3.0/contributor/contributing.md | 4 ++-- .../version-v1.3.0/contributor/governance.md | 4 ++-- .../version-v1.3.0/contributor/ladder.md | 4 ++-- .../version-v1.3.0/developers/dynamic-mig.md | 4 ++-- .../version-v1.3.0/developers/scheduling.md | 8 ++++---- .../version-v1.3.0/get-started/nginx-example.md | 4 ++-- .../version-v1.3.0/installation/prerequisites.md | 2 +- .../enable-cambricon-mlu-sharing.md | 4 ++-- .../version-v1.3.0/userguide/configure.md | 2 +- .../hygon-device/enable-hygon-dcu-sharing.md | 2 +- .../metax-device/enable-metax-gpu-schedule.md | 2 +- .../enable-mthreads-gpu-sharing.md | 2 +- .../nvidia-device/dynamic-mig-support.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- .../version-v2.4.1/contributor/cherry-picks.md | 2 +- .../contributor/contribute-docs.md | 16 ++++++++-------- .../version-v2.4.1/get-started/nginx-example.md | 4 ++-- .../version-v2.4.1/installation/prerequisites.md | 2 +- .../enable-cambricon-mlu-sharing.md | 4 ++-- .../version-v2.4.1/userguide/configure.md | 2 +- .../examples/specify-card-type-to-use.md | 2 +- 22 files changed, 40 insertions(+), 40 deletions(-) diff --git a/docs/contributor/ladder.md b/docs/contributor/ladder.md index d8b725e0..916b0aa7 100644 --- a/docs/contributor/ladder.md +++ b/docs/contributor/ladder.md @@ -48,7 +48,7 @@ Description: A Contributor contributes directly to the project and adds value to A very special thanks to the [long list of people](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md) who have contributed to and helped maintain the project. The project wouldn't be where it is today without your contributions. Thank you! 
-As long as you contribute to HAMi, your name will be added to the [AUTHORS.md file](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please contact us to add it. +As long as you contribute to HAMi, your name will be added to the [AUTHORS.md file](https://github.com/Project-HAMi/HAMi/blob/master/AUTHORS.md). If you don't find your name, please open an issue to have it added. ### Organization Member diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/contributing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/contributing.md index 348e5b10..693b6bf2 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/contributing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/contributing.md @@ -6,7 +6,7 @@ Welcome to HAMi! ## Code of Conduct -Please make sure to read and observe our [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) +Please make sure to read and observe the [Code of Conduct](https://github.com/cncf/foundation/blob/main/code-of-conduct.md) ## Community Expectations @@ -51,7 +51,7 @@ When you are willing to take on an issue, just reply on the issue. The maintaine ### File an Issue -While we encourage everyone to contribute code, it is also appreciated when someone reports an issue. +Code contributions are welcome, and bug reports are equally appreciated. Issues should be filed under the appropriate HAMi sub-repository. *Example:* a HAMi issue should be opened to [Project-HAMi/HAMi](https://github.com/Project-HAMi/HAMi/issues). 
diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/governance.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/governance.md index f49b23b7..95984570 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/governance.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/governance.md @@ -15,11 +15,11 @@ The HAMi and its leadership embrace the following values: * Fairness: All stakeholders have the opportunity to provide feedback and submit contributions, which will be considered on their merits. -* Community over Product or Company: Sustaining and growing our community takes +* Community over Product or Company: Sustaining and growing the community takes priority over shipping code or sponsors' organizational goals. Each contributor participates in the project as an individual. -* Inclusivity: We innovate through different perspectives and skill sets, which +* Inclusivity: Innovation comes from different perspectives and skill sets, and this can only be accomplished in a welcoming and respectful environment. * Participation: Responsibilities within the project are earned through diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md index 7ace3e75..77336320 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/contributor/ladder.md @@ -6,7 +6,7 @@ This docs different ways to get involved and level up within the project. You ca ## Contributor Ladder -Hello! We are excited that you want to learn more about our project contributor ladder! This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. 
Community members generally start at the first levels of the "ladder" and advance up it as their involvement in the project grows. Our project members are happy to help you advance along the contributor ladder. +This contributor ladder outlines the different contributor roles within the project, along with the responsibilities and privileges that come with them. Each of the contributor roles below is organized into lists of three types of things. "Responsibilities" are things that a contributor is expected to do. "Requirements" are qualifications a person needs to meet to be in that role, and "Privileges" are things contributors on that level are entitled to. @@ -142,7 +142,7 @@ The current list of maintainers can be found in the [MAINTAINERS](https://github New maintainers are added by consensus among the current group of maintainers. This can be done via a private discussion via Slack or email. A majority of maintainers should support the addition of the new person, and no single maintainer should object to adding the new maintainer. -When adding a new maintainer, we should file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. +When adding a new maintainer, file a PR to [HAMi](https://github.com/Project-HAMi/HAMi) and update [MAINTAINERS](https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md). Once this PR is merged, you will become a maintainer of HAMi. 
### Removing Maintainers diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/dynamic-mig.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/dynamic-mig.md index a91bc837..f1b5c618 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/dynamic-mig.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/dynamic-mig.md @@ -9,8 +9,8 @@ This feature will not be implemented without the help of @sailorvii. ## Introduction -The NVIDIA GPU build-in sharing method includes: time-slice, MPS and MIG. The context switch for time slice sharing would waste some time, so we chose the MPS and MIG. The GPU MIG profile is variable, the user could acquire the MIG device in the profile definition, but current implementation only defines the dedicated profile before the user requirement. That limits the usage of MIG. We want to develop an automatic slice plugin and create the slice when the user require it. -For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, we consider the CPU, Mem, GPU memory and other user-defined resource. +The NVIDIA GPU built-in sharing methods include time-slice, MPS, and MIG. Because the context switch for time-slice sharing wastes time, MPS and MIG are preferred. The GPU MIG profile is variable: the user could acquire the MIG device in the profile definition, but the current implementation only defines the dedicated profile before the user requirement, which limits the usage of MIG. The goal is an automatic slice plugin that creates slices on demand. +For the scheduling method, node-level binpack and spread will be supported. Referring to the binpack plugin, the scheduler considers CPU, memory, GPU memory, and other user-defined resources. HAMi is done by using [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a cuda-hacking library. But mig is also widely used across the world.
A unified API for dynamic-mig and hami-core is needed. ## Targets diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/scheduling.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/scheduling.md index 02270146..81b1275c 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/scheduling.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/developers/scheduling.md @@ -104,7 +104,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Binpack` policy we can select `Node1`. +So, in `Binpack` policy, the selected node is `Node1`. #### Spread @@ -124,7 +124,7 @@ Node1 score: ((1+3)/4) * 10= 10 Node2 score: ((1+2)/4) * 10= 7.5 ``` -So, in `Spread` policy we can select `Node2`. +So, in `Spread` policy, the selected node is `Node2`. ### GPU-scheduler-policy @@ -147,7 +147,7 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Binpack` policy we can select `GPU2`. +So, in `Binpack` policy, the selected GPU is `GPU2`. #### Spread @@ -166,4 +166,4 @@ GPU1 Score: ((20+10)/100 + (1000+2000)/8000)) * 10 = 6.75 GPU2 Score: ((20+70)/100 + (1000+6000)/8000)) * 10 = 17.75 ``` -So, in `Spread` policy we can select `GPU1`. +So, in `Spread` policy, the selected GPU is `GPU1`. diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/get-started/nginx-example.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/get-started/nginx-example.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler.
+Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. ```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/installation/prerequisites.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/installation/prerequisites.md index 12cc0489..5021a3d6 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/installation/prerequisites.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/installation/prerequisites.md @@ -81,7 +81,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index b524e6d0..1ecc3aa8 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,7 +4,7 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: **MLU sharing**: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. 
## Prerequisites diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/configure.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/configure.md index 35edcc0d..b7a8c9e9 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/configure.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/configure.md @@ -18,7 +18,7 @@ You can update these configurations using one of the following methods: 2. Modify Helm Chart: Update the corresponding values in the [ConfigMap](https://raw.githubusercontent.com/archlitchi/HAMi/refs/heads/master/charts/hami/templates/scheduler/device-configmap.yaml), then reapply the Helm Chart to regenerate the ConfigMap. * `nvidia.deviceMemoryScaling:` - Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if we set `nvidia.deviceMemoryScaling` argument to *S*, vGPUs split by this GPU will totally get `S * M` memory in Kubernetes with our device plugin. + Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if `nvidia.deviceMemoryScaling` is set to *S*, vGPUs split from this GPU will get a total of `S * M` memory in Kubernetes with the HAMi device plugin. * `nvidia.deviceSplitCount:` Integer type, by default: equals 10. Maximum tasks assigned to a simple GPU device.
* `nvidia.migstrategy:` diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md index d0a896c7..64fd849b 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/hygon-device/enable-hygon-dcu-sharing.md @@ -4,7 +4,7 @@ title: Enable Hygon DCU sharing ## Introduction -**We now support hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports hygon.com/dcu by implementing most device-sharing features as nvidia-GPU**, including: **DCU sharing**: Each task can allocate a portion of DCU instead of a whole DCU card, thus DCU can be shared among multiple tasks. diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md index 0611d980..ee0ac149 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/metax-device/enable-metax-gpu-schedule.md @@ -2,7 +2,7 @@ title: Enable Metax GPU topology-aware scheduling --- -**We now support metax.com/gpu by implementing topo-awareness among metax GPUs**: +**HAMi now supports metax.com/gpu by implementing topo-awareness among metax GPUs**: When multiple GPUs are configured on a single server, the GPU cards are connected to the same PCIe Switch or MetaXLink depending on whether they are connected , there is a near-far relationship. 
This forms a topology among all the cards on the server, as shown in the following figure: diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md index 5e349e37..4ad7aaf5 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/mthreads-device/enable-mthreads-gpu-sharing.md @@ -4,7 +4,7 @@ title: Enable Mthreads GPU sharing ## Introduction -**We now support mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports mthreads.com/vgpu by implementing most device-sharing features as nvidia-GPU**, including: **GPU sharing**: Each task can allocate a portion of GPU instead of a whole GPU card, thus GPU can be shared among multiple tasks. diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md index b996b1a8..96b987c8 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/dynamic-mig-support.md @@ -4,7 +4,7 @@ title: Enable dynamic-mig feature ## Introduction -**We now support dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: +**HAMi now supports dynamic-mig by using mig-parted to adjust mig-devices dynamically**, including: **Dynamic MIG instance management**: User don't need to operate on GPU node, using 'nvidia-smi -i 0 -mig 1' or other command to manage MIG instance, all will be done by HAMi-device-plugin. 
diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md index dd7239fd..a05bf560 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v1.3.0/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* \ No newline at end of file +> **NOTICE:** *You can assign this task to multiple GPU types; use commas to separate them. In this example, the job targets A100 or V100.* \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/cherry-picks.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/cherry-picks.md index 80a58e56..c3edd9eb 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/cherry-picks.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/cherry-picks.md @@ -62,7 +62,7 @@ your case by supplementing your PR with e.g., - Key stakeholder reviewers/approvers attesting to their confidence in the change being a required backport -It is critical that our full community is actively engaged on enhancements in +It is critical that the full community is actively engaged on enhancements in the project. If a released feature was not enabled on a particular provider's platform, this is a community miss that needs to be resolved in the `master` branch for subsequent releases.
Such enabling will not be backported to the diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/contribute-docs.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/contribute-docs.md index 9c02b725..f214a7ed 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/contribute-docs.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/contributor/contribute-docs.md @@ -9,16 +9,16 @@ the `Project-HAMi/website` repository. ## Prerequisites - Docs, like codes, are also categorized and stored by version. - 1.3 is the first version we have archived. + 1.3 is the first archived version. - Docs need to be translated into multiple languages for readers from different regions. The community now supports both Chinese and English. English is the official language of documentation. -- For our docs we use markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. +- The docs use Markdown. If you are unfamiliar with Markdown, please see [https://guides.github.com/features/mastering-markdown/](https://guides.github.com/features/mastering-markdown/) or [https://www.markdownguide.org/](https://www.markdownguide.org/) if you are looking for something more substantial. - We get some additions through [Docusaurus 2](https://docusaurus.io/), a model static website generator. ## Setup -You can set up your local environment by cloning our website repository. +You can set up your local environment by cloning the website repository. ```shell git clone https://github.com/Project-HAMi/website.git @@ -85,7 +85,7 @@ title: A doc with tags ## secondary title ``` -The top section between two lines of --- is the Front Matter section. 
Here we define a couple of entries which tell Docusaurus how to handle the article: +The top section between two lines of --- is the Front Matter section. These entries tell Docusaurus how to handle the article: - Title is the equivalent of the `<h1>` in a HTML document or `# <title>` in a Markdown article. - Each document has a unique ID. By default, a document ID is the name of the document (without the extension) related to the root docs directory. @@ -118,7 +118,7 @@ Creating a sidebar is useful to: - Display a sidebar on each of those documents - Provide paginated navigation, with next/previous button -For our docs, you can know how our documents are organized from [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). +The document organization can be found in [https://github.com/Project-HAMi/website/blob/main/sidebars.js](https://github.com/Project-HAMi/website/blob/main/sidebars.js). ```js module.exports = { @@ -166,7 +166,7 @@ If you add a document, you must add it to `sidebars.js` to make it display prope There are two situations about the Chinese version of the document: -- You want to translate our existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). +- You want to translate the existing English docs to Chinese. In this case, you need to modify the corresponding file content from [https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current](https://github.com/Project-HAMi/website/tree/main/i18n/zh/docusaurus-plugin-content-docs/current). The organization of this directory is exactly the same as the outer layer. `current.json` holds translations for the documentation directory. 
You can edit it if you want to translate the name of directory. - You want to contribute Chinese docs without English version. Any articles of any kind are welcomed. In this case, you can add articles and titles to the main directory first. Article content can be TBD first, like this. Then add the corresponding Chinese content to the Chinese directory. @@ -185,5 +185,5 @@ If the previewed page is not what you expected, please check your docs again. ### Versioning -For the newly supplemented documents of each version, we will synchronize to the latest version on the release date of each version, and the documents of the old version will not be modified. -For errata found in the documentation, we will fix it with every release. +Newly supplemented documents for each version are synchronized to the latest version on that version's release date; documents for older versions are not modified. +For errata found in the documentation, fixes are applied with every release. diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/get-started/nginx-example.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/get-started/nginx-example.md index 3e5aa4cd..b1d16917 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/get-started/nginx-example.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/get-started/nginx-example.md @@ -92,7 +92,7 @@ sudo systemctl daemon-reload && systemctl restart containerd #### 2. Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
```bash kubectl label nodes {nodeid} gpu=on @@ -106,7 +106,7 @@ First, you need to check your Kubernetes version by using the following command: kubectl version ``` -Then, add our repo in helm +Then, add the HAMi repo in helm ```bash helm repo add hami-charts https://project-hami.github.io/HAMi/ diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/installation/prerequisites.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/installation/prerequisites.md index 13666bb8..4e52d10e 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/installation/prerequisites.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/installation/prerequisites.md @@ -82,7 +82,7 @@ sudo systemctl daemon-reload && systemctl restart containerd ### Label your nodes -Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler. +Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by the HAMi scheduler. 
``` kubectl label nodes {nodeid} gpu=on diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md index b626a845..8cfa7172 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/cambricon-device/enable-cambricon-mlu-sharing.md @@ -4,7 +4,7 @@ title: Enable cambricon MLU sharing ## Introduction -**We now support cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: +**HAMi now supports cambricon.com/mlu by implementing most device-sharing features as nvidia-GPU**, including: **MLU sharing**: Each task can allocate a portion of MLU instead of a whole MLU card, thus MLU can be shared among multiple tasks. @@ -12,7 +12,7 @@ title: Enable cambricon MLU sharing **MLU Type Specification**: You can specify which type of MLU to use or to avoid for a certain task, by setting "cambricon.com/use-mlutype" or "cambricon.com/nouse-mlutype" annotations. -**Very Easy to use**: You don't need to modify your task yaml to use our scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. +**Very Easy to use**: You don't need to modify your task yaml to use the HAMi scheduler. All your MLU jobs will be automatically supported after installation. The only thing you need to do is tag the MLU node. 
## Prerequisites diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/configure.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/configure.md index 0fcda964..60ea88ee 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/configure.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/configure.md @@ -13,7 +13,7 @@ helm install vgpu-charts/vgpu vgpu --set devicePlugin.deviceMemoryScaling=5 ... * `devicePlugin.service.schedulerPort:` Integer type, by default: 31998, scheduler webhook service nodePort. * `devicePlugin.deviceMemoryScaling:` - Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For NVIDIA GPU with *M* memory, if we set `devicePlugin.deviceMemoryScaling` argument to *S*, vGPUs split by this GPU will totally get `S * M` memory in Kubernetes with our device plugin. + Float type, by default: 1. The ratio for NVIDIA device memory scaling, can be greater than 1 (enable virtual device memory, experimental feature). For an NVIDIA GPU with *M* memory, if `devicePlugin.deviceMemoryScaling` is set to *S*, vGPUs split from this GPU will get `S * M` memory in total in Kubernetes with the HAMi device plugin. * `devicePlugin.deviceSplitCount:` Integer type, by default: equals 10. Maximum tasks assigned to a simple GPU device.
* `devicePlugin.migstrategy:` diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md index 287b2f51..e65edfdd 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md +++ b/i18n/zh/docusaurus-plugin-content-docs/version-v2.4.1/userguide/nvidia-device/examples/specify-card-type-to-use.md @@ -24,4 +24,4 @@ spec: nvidia.com/gpu: 2 # requesting 2 vGPUs ``` -> **NOTICE:** * You can assign this task to multiple GPU types, use comma to separate,In this example, we want to run this job on A100 or V100* \ No newline at end of file +> **NOTICE:** *You can assign this task to multiple GPU types; use commas to separate them. In this example, the job targets A100 or V100.* \ No newline at end of file