diff --git a/content/blog/cloud-native-ai-model-management.md b/content/blog/cloud-native-ai-model-management.md
new file mode 100644
index 000000000..f60bfbfce
--- /dev/null
+++ b/content/blog/cloud-native-ai-model-management.md
@@ -0,0 +1,506 @@
+---
+title: "Cloud-Native AI Model Management and Distribution for Inference Workloads"
+date: 2026-03-11T12:00:00+04:00
+description: "Cloud-native AI model management and distribution for scalable inference"
+showPageInfo: true
+---
+
+_Author:_
+
+- _Wenbo Qi(Gaius), Dragonfly/ModelPack Maintainer_
+- _Chenyu Zhang(Chlins), Harbor/ModelPack Maintainer_
+- _Feynman Zhou, ORAS Maintainer, CNCF Ambassador_
+
+_Reviewer:_
+
+- _Sascha Grunert, CRI-O Maintainer_
+- _Wei Fu, containerd Maintainer_
+
+## The weight of AI models: Why infrastructure always arrives slowly
+
+As AI adoption accelerates across industries, organizations face a critical bottleneck that is often overlooked until it
+becomes a serious obstacle: reliably managing and distributing large model weight files at scale.
+A model's weights serve as the central artifact that bridges both training and inference pipelines — yet the
+infrastructure surrounding this artifact is frequently an afterthought.
+
+This article addresses the operational challenges of managing AI model artifacts at enterprise scale,
+and introduces a cloud-native solution that brings software delivery best practices — versioning,
+immutability, and GitOps, to the world of large model files.
+
+### The gap nobody talks about — until it breaks production
+
+_**The cloud native gap**_: Most existing ML model storage approaches were not designed with
+Kubernetes-native delivery in mind, leaving a critical gap between how software artifacts
+are managed and how model artifacts are managed.
+
+Today, enterprises operate AI infrastructure on Kubernetes yet their model artifact management lags behind.
+Software containers are pulled from OCI registries with full versioning, security scanning, and rollback support.
+Model weights, by contrast, are often downloaded via ad hoc scripts, copied manually between storage buckets,
+or distributed through unsecured shared filesystems. This gap creates deployment fragility, security risks, and
+operational overhead at scale.
+
+### When your model weighs more than your entire app
+
+Modern foundation models are not small. A single model checkpoint can range from tens of gigabytes to several terabytes.
+For reference, a quantized LLaMA-3 70B model weighs approximately 140 GB, while frontier multimodal models can easily
+exceed 1 TB. These are not files you version-control with standard Git — they demand dedicated storage strategies,
+efficient transfer protocols, and careful access control.
+
+The core challenges are: _**storage at scale, distribution speed, and reproducibility**_. Teams need to store
+multiple model versions, rapidly distribute them to GPU inference nodes across regions, and guarantee that
+any deployment can be traced back to an exact, immutable artifact.
+
+### Three paths forward — and why none of them are enough
+
+|          | Git LFS (Hugging Face Hub)                                                                                                                       | Object Storage (S3, MinIO)                                                          | Distributed Filesystem (NFS, CephFS)                                                                                     |
+| :------- | :----------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------- |
+| **Pros** | Native version control (branches, tags, commits, history).                                                                                       | Standard offering from cloud providers. Native support in engines like vLLM/SGLang. | POSIX compatible. Low integration cost.                                                                                  |
+| **Cons** | Poor protocol adaptation for cloud-native environments. Inherits Git's transport inefficiencies, lacks optimizations for huge file distribution. | Lacks structured metadata. Weak version management capabilities.                    | Lacks structured metadata. Weak version management capabilities. High operational complexity for distributed filesystem. |
+
+## Rethinking the delivery pipeline: Models deserve better than a shell script
+
+The approach described here treats AI model weights as first-class OCI (Open Container Initiative) artifacts,
+packaging them in the same container registries used for application images. This enables model delivery to
+leverage the full ecosystem of container tooling: security scanning, signed provenance, GitOps-driven
+deployment, and Kubernetes-native pulling.
+
+### What If we shipped models the same way we ship code?
+
+In the cloud-native era, developers have long established a mature and efficient paradigm for software delivery.
+
+![p1](../img/cloud-native-ai-model-management/p1.png)
+
+**The software delivery:**
+
+1. **Develop:** Developers commit code to a Git repository, manage code changes through branches, and define versions
+   using tags at key milestones.
+2. **Build:** CI/CD pipelines compile and test, packaging the output into an immutable Container Image.
+3. **Manage and deliver:** Images are stored in a Container Registry. Supply chain
+   security (scanning/signing), RBAC, and P2P distribution ensure safe delivery.
+4. **Deploy:** DevOps engineers use declarative Kubernetes YAML to define the desired state.
+   The Container's lifecycle is managed by Kubernetes.
+
+![p2](../img/cloud-native-ai-model-management/p2.png)
+
+**The cloud native AI model delivery:**
+
+1. **Develop:** Algorithm engineers push weights and configs to the Hugging Face Hub, treating it as the Git Repository.
+2. **Build:** CI/CD pipelines package weights, runtime configurations, and metadata into an immutable Model Artifact.
+3. **Manage and deliver:** The Model Artifact is managed by an Artifact Registry, reusing the existing container
+   infrastructure and toolchain.
+4. **Deploy:** Engineers use Kubernetes OCI Volumes or a Model CSI Driver. Models are mounted into the inference
+   Container as Volumes via declarative semantics, decoupling the AI model from the inference engine (vLLM, SGLang, etc.).
+
+By applying _**software delivery paradigms**_ and _**supply chain**_ thinking to model lifecycle management, we constructed
+a granular, efficient system that resolves the challenges of managing and distributing AI models in production.
+
+### Walking the pipeline: A build story in four steps
+
+#### Build
+
+modctl is a CLI tool designed to package AI models into OCI artifacts. It standardizes
+versioning, storage, distribution and deployment, ensuring integration with the cloud-native ecosystem.
+
+![p3](../img/cloud-native-ai-model-management/p3.png)
+
+##### Step 1: Auto-generate Modelfile
+
+Run the following in the model directory to generate a definition file.
+
+```shell
+modctl modelfile generate .
+```
+
+##### Step 2: Customize Modelfile
+
+You can also customize the content of the Modelfile.
+
+```dockerfile
+# Model name (string), such as llama3-8b-instruct, gpt2-xl, qwen2-vl-72b-instruct, etc.
+NAME qwen2.5-0.5b
+
+# Model architecture (string), such as transformer, cnn, rnn, etc.
+ARCH transformer
+
+# Model family (string), such as llama3, gpt2, qwen2, etc.
+FAMILY qwen2
+
+# Model format (string), such as onnx, tensorflow, pytorch, etc.
+FORMAT safetensors
+
+# Specify model configuration file, support glob path pattern.
+CONFIG config.json
+
+# Specify model configuration file, support glob path pattern.
+CONFIG generation_config.json
+
+# Model weight, support glob path pattern.
+MODEL *.safetensors
+
+# Specify code, support glob path pattern.
+CODE *.py
+```
+
+##### Step 3: Login to Artifact Registry (Harbor)
+
+```shell
+modctl login -u username -p password harbor.registry.com
+```
+
+##### Step 4: Build OCI Artifact
+
+```shell
+modctl build -t harbor.registry.com/models/qwen2.5-0.5b:v1 -f Modelfile .
+```
+
+A Model Manifest is generated after the build. Descriptive information such as ARCH, FAMILY, and FORMAT is
+stored in a file with the media type _application/vnd.cncf.model.config.v1+json_.
+
+```json
+{
+  "schemaVersion": 2,
+  "mediaType": "application/vnd.oci.image.manifest.v1+json",
+  "artifactType": "application/vnd.cncf.model.manifest.v1+json",
+  "config": {
+    "mediaType": "application/vnd.cncf.model.config.v1+json",
+    "digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763",
+    "size": 301
+  },
+  "layers": [
+    {
+      "mediaType": "application/vnd.cncf.model.weight.v1.raw",
+      "digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808",
+      "size": 4294967296
+    },
+    {
+      "mediaType": "application/vnd.cncf.model.weight.v1.raw",
+      "digest": "sha256:6d923539c5c208de77146335584252c0b1b81e35c122dd696fe6e04ed03d7411",
+      "size": 5018536960
+    },
+    {
+      "mediaType": "application/vnd.cncf.model.weight.config.v1.raw",
+      "digest": "sha256:a5378e569c625f7643952fcab30c74f2a84ece52335c292e630f740ac4694146",
+      "size": 106
+    },
+    {
+      "mediaType": "application/vnd.cncf.model.weight.code.v1.raw",
+      "digest": "sha256:15da0921e8d8f25871e95b8b1fac958fc9caf453bad6f48c881b3d76785b9f9d",
+      "size": 394
+    },
+    {
+      "mediaType": "application/vnd.cncf.model.doc.v1.raw",
+      "digest": "sha256:5e236ec37438b02c01c83d134203a646cb354766ac294e533a308dd8caa3a11e",
+      "size": 23040
+    }
+  ]
+}
+```
+
+##### Step 5: Push
+
+```shell
+modctl push harbor.registry.com/models/qwen2.5-0.5b:v1
+```
+
+#### Management
+
+Current AI infrastructure workflows focus heavily on model distribution performance, often ignoring
+model management standards. Manual copying works for experiments, but in large-scale production, lacking
+unified versioning, metadata specs, and lifecycle management is poor practice. As the standard cloud-native
+Artifact Registry, Harbor is ideally suited for model storage, treating models as inference artifacts.
+
+Harbor standardizes AI model management through:
+
+- **Versioning:** Models are OCI Artifacts with immutable Tags and Sha256 Digests.
+  This guarantees deterministic inference environments. Meanwhile, it visually presents the model's basic
+  attributes, parameter configurations, display information, and the file list, which not only reduce
+  the risks of unknown versions but also achieves full transparency of the model.
+
+![p4](../img/cloud-native-ai-model-management/p4.png)
+
+- **RBAC:** Fine-grained access control. Control who can PUSH (e.g., Algorithm Engineers), who can
+  only PULL (e.g., Inference Services), and who has administrative privileges.
+
+![p5](../img/cloud-native-ai-model-management/p5.png)
+
+- **Lifecycle management:** Tag retention policies automatically purge non-release versions while
+  locking active versions, balancing storage costs with stability.
+
+![p6](../img/cloud-native-ai-model-management/p6.png)
+
+- **Supply chain security:** Integration with Cosign/Notation for signing. Harbor enforces signature
+  verification before distribution, preventing model poisoning attacks.
+
+![p7](../img/cloud-native-ai-model-management/p7.png)
+
+- **Replication:** Automated, incremental synchronization between central and edge registries or active-standby clusters.
+
+![p8](../img/cloud-native-ai-model-management/p8.png)
+
+- **Audit:** Comprehensive logging of all artifact operations (pull/push/delete) for security compliance and traceability.
+
+![p9](../img/cloud-native-ai-model-management/p9.png)
+
+#### Delivery
+
+Downloading terabyte-sized model weights directly from the origin introduces bandwidth bottlenecks. We utilize
+Dragonfly for P2P-based distribution, integrated with Harbor for preheating.
+
+![p10](../img/cloud-native-ai-model-management/p10.png)
+
+##### Dragonfly P2P-based distribution
+
+For large-scale distribution scenarios, Dragonfly has been deeply optimized based on P2P technology.
+Taking the example of 500 nodes downloading a 1TB model, the system distributes the initial download tasks
+of different layers across nodes to maximize downstream bandwidth utilization and avoid single-point congestion.
+Combined with a secondary bandwidth-aware scheduling algorithm, it dynamically adjusts download paths to
+eliminate network hotspots and long-tail latency. For individual model weight, Dragonfly splits individual
+model weights into pieces and fetches them concurrently from the origin. This enables streaming-based
+downloading, allowing users to share models without waiting for the complete file. This solution has been proven
+in high-performance AI clusters, utilizing _**70%-80%**_ of each node's bandwidth and improving model deployment efficiency.
+
+![p11](../img/cloud-native-ai-model-management/p11.png)
+
+##### Preheating
+
+For latency-sensitive inference services, Harbor triggers Dragonfly to distribute and cache data on target nodes
+before service scaling. When the instance starts, the model loads from the local disk, achieving zero network latency.
+
+![p12](../img/cloud-native-ai-model-management/p12.png)
+
+#### Deployment
+
+Deployment focuses on decoupling the Model (Data) from the Inference Engine (Compute). By leveraging Kubernetes
+declarative primitives, the Engine runs as a Container, while the Model is mounted as a Volume. This
+native approach not only enables multiple Pods on the same node to share and reuse the model, saving disk
+space, but also leverages the preheating and P2P capabilities of Harbor & Dragonfly to eliminate the latency of
+pulling large model weights, significantly improving startup speed.
+
+![p13](../img/cloud-native-ai-model-management/p13.png)
+
+##### OCI Volumes (Kubernetes 1.31+)
+
+Native support for mounting OCI artifacts as volumes via CRI-O/containerd. This feature was introduced
+as **alpha** in Kubernetes 1.31 (requires enabling the ImageVolume feature gate) and promoted
+to **beta** in Kubernetes 1.33 (enabled by default, no feature gate configuration needed). CRI-O specifically
+enhances this for LLMs by avoiding decompression overhead at mount time by storing layers uncompressed, resulting
+in superior performance when mounting large model files.
+
+###### Step 1: Build YAML {#oci-build-yaml}
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: vllm-cpu-inference
+  labels:
+    app: vllm
+spec:
+  containers:
+    - name: vllm
+      image: openeuler/vllm-cpu:latest
+      command:
+        - 'python3'
+        - '-m'
+        - 'vllm.entrypoints.openai.api_server'
+      args:
+        - '--model'
+        - '/models'
+        - '--dtype'
+        - 'float32'
+        - '--host'
+        - '0.0.0.0'
+        - '--port'
+        - '8000'
+        - '--max-model-len'
+        - '1024'
+        - '--disable-log-requests'
+      env:
+        - name: VLLM_CPU_KVCACHE_SPACE
+          value: '1'
+        - name: VLLM_WORKER_MULTIPROC_METHOD
+          value: 'spawn'
+      resources:
+        requests:
+          memory: '2Gi'
+          cpu: '1'
+        limits:
+          memory: '16Gi'
+          cpu: '8'
+      volumeMounts:
+        - name: model-volume
+          mountPath: /models
+          readOnly: true
+      ports:
+        - containerPort: 8000
+          protocol: TCP
+          name: http
+      livenessProbe:
+        httpGet:
+          path: /health
+          port: 8000
+        initialDelaySeconds: 60
+        periodSeconds: 10
+        timeoutSeconds: 5
+      readinessProbe:
+        httpGet:
+          path: /health
+          port: 8000
+        initialDelaySeconds: 30
+        periodSeconds: 5
+  volumes:
+    - name: model-volume
+      image:
+        reference: ghcr.io/chlins/qwen2.5-0.5b:v1
+        pullPolicy: IfNotPresent
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: vllm-service
+spec:
+  selector:
+    app: vllm
+  ports:
+    - port: 8000
+      targetPort: 8000
+      protocol: TCP
+      name: http
+  type: ClusterIP
+```
+
+###### Step 2: Deploy inference Workload {#oci-deploy-inference-workload}
+
+![p14](../img/cloud-native-ai-model-management/p14.png)
+
+###### Step 3: Call Inference Workload {#oci-call-inference-workload}
+
+![p15](../img/cloud-native-ai-model-management/p15.png)
+
+##### Model CSI Driver
+
+For compatibility with Kubernetes 1.31 and older, we offer the Model CSI Driver as an interim solution to
+mount and deploy models as volumes. As OCI Volumes are slated for **GA** in Kubernetes 1.36, shifting to
+native OCI Volumes is recommended for the long term.
+
+###### Step 1: Build YAML {#csi-build-yaml}
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: vllm-cpu-inference
+  labels:
+    app: vllm
+spec:
+  containers:
+    - name: vllm
+      image: openeuler/vllm-cpu:latest
+      command:
+        - 'python3'
+        - '-m'
+        - 'vllm.entrypoints.openai.api_server'
+      args:
+        - '--model'
+        - '/models'
+        - '--dtype'
+        - 'float32'
+        - '--host'
+        - '0.0.0.0'
+        - '--port'
+        - '8000'
+        - '--max-model-len'
+        - '1024'
+        - '--disable-log-requests'
+      env:
+        - name: VLLM_CPU_KVCACHE_SPACE
+          value: '1'
+        - name: VLLM_WORKER_MULTIPROC_METHOD
+          value: 'spawn'
+      resources:
+        requests:
+          memory: '2Gi'
+          cpu: '1'
+        limits:
+          memory: '16Gi'
+          cpu: '8'
+      volumeMounts:
+        - name: model-volume
+          mountPath: /models
+          readOnly: true
+      ports:
+        - containerPort: 8000
+          protocol: TCP
+          name: http
+      livenessProbe:
+        httpGet:
+          path: /health
+          port: 8000
+        initialDelaySeconds: 60
+        periodSeconds: 10
+        timeoutSeconds: 5
+      readinessProbe:
+        httpGet:
+          path: /health
+          port: 8000
+        initialDelaySeconds: 30
+        periodSeconds: 5
+  volumes:
+    - name: model-volume
+      csi:
+        driver: model.csi.modelpack.org
+        volumeAttributes:
+          model.csi.modelpack.org/reference: ghcr.io/chlins/qwen2.5-0.5b:v1
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: vllm-service
+spec:
+  selector:
+    app: vllm
+  ports:
+    - port: 8000
+      targetPort: 8000
+      protocol: TCP
+      name: http
+  type: ClusterIP
+```
+
+###### Step 2: Deploy Inference Workload {#csi-deploy-inference-workload}
+
+![p14](../img/cloud-native-ai-model-management/p14.png)
+
+###### Step 3: Call Inference Workload {#csi-call-inference-workload}
+
+![p15](../img/cloud-native-ai-model-management/p15.png)
+
+## Future
+
+- **Enhanced Preheating:** Allow models to be preheated to specified nodes and querying cache distribution across nodes
+  for model-aware pod scheduling.
+- **Dragonfly RDMA Acceleration:** Enable Dragonfly to utilize InfiniBand or RoCE to improve the speed of distribution.
+- **Lazy Loading:** Implement on-demand downloading of model weights to reduce startup latency.
+- **containerd Optimization:** Enhance the OCI Volumes implementation to reduce decompression overhead for large layers.
+- **Model Security Scanning:** Introduce deep scanning capabilities specifically designed for model weights to
+  detect embedded malicious payloads.
+
+## Collaborative Projects
+
+- Kubernetes: [https://github.com/kubernetes/kubernetes](https://github.com/kubernetes/kubernetes)
+- Harbor: [https://github.com/goharbor/harbor](https://github.com/goharbor/harbor)
+- Dragonfly: [https://github.com/dragonflyoss/dragonfly](https://github.com/dragonflyoss/dragonfly)
+- CRI-O: [https://github.com/cri-o/cri-o](https://github.com/cri-o/cri-o)
+- containerd: [https://github.com/containerd/containerd](https://github.com/containerd/containerd)
+- modctl: [https://github.com/modelpack/modctl](https://github.com/modelpack/modctl)
+- Model CSI Driver: [https://github.com/modelpack/model-csi-driver](https://github.com/modelpack/model-csi-driver)
+- Model Spec: [https://github.com/modelpack/model-spec](https://github.com/modelpack/model-spec)
+- ORAS: [https://github.com/oras-project/oras](https://github.com/oras-project/oras)
+
+## References
+
+- Kubernetes - Read Only Volumes Based On OCI Artifacts: [https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/)
+- Harbor - AI Model Processor: [https://github.com/goharbor/community/blob/main/proposals/new/AI-model-processor.md](https://github.com/goharbor/community/blob/main/proposals/new/AI-model-processor.md)
+- Dragonfly - Load-Aware Scheduling Algorithm: [https://d7y.io/docs/next/operations/deployment/applications/scheduler/#bandwidth-aware-scheduling-algorithm](https://d7y.io/docs/next/operations/deployment/applications/scheduler/#bandwidth-aware-scheduling-algorithm)
+- CRI-O - Add OCI Volume/Image Source Support: [https://github.com/cri-o/cri-o/pull/8317](https://github.com/cri-o/cri-o/pull/8317)
+- containerd - Add OCI/Image Volume Source support: [https://github.com/containerd/containerd/pull/10579](https://github.com/containerd/containerd/pull/10579)
diff --git a/content/blog/img/cloud-native-ai-model-management/p1.png b/content/blog/img/cloud-native-ai-model-management/p1.png
new file mode 100644
index 000000000..a794a4f5c
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p1.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p10.png b/content/blog/img/cloud-native-ai-model-management/p10.png
new file mode 100644
index 000000000..e0a8effd1
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p10.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p11.png b/content/blog/img/cloud-native-ai-model-management/p11.png
new file mode 100644
index 000000000..a4462643b
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p11.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p12.png b/content/blog/img/cloud-native-ai-model-management/p12.png
new file mode 100644
index 000000000..00a9fd6f9
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p12.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p13.png b/content/blog/img/cloud-native-ai-model-management/p13.png
new file mode 100644
index 000000000..788a3edd4
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p13.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p14.png b/content/blog/img/cloud-native-ai-model-management/p14.png
new file mode 100644
index 000000000..509994567
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p14.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p15.png b/content/blog/img/cloud-native-ai-model-management/p15.png
new file mode 100644
index 000000000..4e0ed412c
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p15.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p2.png b/content/blog/img/cloud-native-ai-model-management/p2.png
new file mode 100644
index 000000000..00a40e592
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p2.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p3.png b/content/blog/img/cloud-native-ai-model-management/p3.png
new file mode 100644
index 000000000..d3545f20b
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p3.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p4.png b/content/blog/img/cloud-native-ai-model-management/p4.png
new file mode 100644
index 000000000..8e7e53440
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p4.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p5.png b/content/blog/img/cloud-native-ai-model-management/p5.png
new file mode 100644
index 000000000..1a5368db2
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p5.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p6.png b/content/blog/img/cloud-native-ai-model-management/p6.png
new file mode 100644
index 000000000..04b948ab8
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p6.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p7.png b/content/blog/img/cloud-native-ai-model-management/p7.png
new file mode 100644
index 000000000..66d90ab43
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p7.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p8.png b/content/blog/img/cloud-native-ai-model-management/p8.png
new file mode 100644
index 000000000..8bd1bb9a2
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p8.png differ
diff --git a/content/blog/img/cloud-native-ai-model-management/p9.png b/content/blog/img/cloud-native-ai-model-management/p9.png
new file mode 100644
index 000000000..18b9f35ed
Binary files /dev/null and b/content/blog/img/cloud-native-ai-model-management/p9.png differ