❄️🐳 4th — Cloud‑Native AI/ML on Kubernetes

Curated by Fourth Industrial Systems (4th.is), this guide highlights open‑source tools and patterns for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, and Analytics designed to run natively on Kubernetes and Docker. We work across languages — Python, R, Scala, Java, C#, Go, Julia, C++ — with practical emphasis on Kubeflow, Seldon Core, Pachyderm, Banzai Pipeline, H2O, TensorFlow, CNTK, XGBoost, MXNet, PyTorch, ONNX, Argo, Airflow, Apache Beam, Apache Spark, Intel BigDL, Rook, and Ambassador.

“The winds and the waves are always on the side of the ablest navigators.” — Edward Gibbon


Introduction

Across industry, Kubernetes has become the standard for orchestrating distributed systems — whether on‑prem, in a single cloud, or spanning many. In contrast, many ML and data workflows still begin on laptops or ad‑hoc notebook servers. This repository shows how to elevate those experiments into reliable, scalable, reproducible, and portable Kubernetes deployments.

Why Kubernetes for ML

  • Elastic scale for CPU/GPU resources with automated orchestration.
  • Portability: the same workloads run across all major clouds and on‑prem.
  • Ecosystem momentum: wide adoption, reflected in the CNCF membership.
  • Self‑healing: immutable containers plus controllers enable resilient apps.
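
As a rough illustration of the GPU scheduling and controller-driven retry points above, the sketch below builds a Kubernetes Job manifest as a plain Python dictionary. The function name, job name, and image are hypothetical, and the `nvidia.com/gpu` resource name assumes the NVIDIA device plugin is installed on the cluster. The printed JSON can be piped to `kubectl apply -f -`.

```python
import json


def gpu_training_job(name: str, image: str, gpus: int = 1) -> dict:
    """Build a batch/v1 Job manifest that requests GPUs for one training pod."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # The Job controller retries a failed pod up to this many times.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "trainer",
                            "image": image,
                            "resources": {
                                # Extended resources such as GPUs are requested
                                # through limits. The resource name is an
                                # assumption (NVIDIA device plugin).
                                "limits": {"nvidia.com/gpu": gpus},
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image reference, for illustration only.
    job = gpu_training_job("train-demo", "registry.example.com/train:1.0.0")
    print(json.dumps(job, indent=2))
```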

Practical Considerations

  • Containerization: package apps as small, efficient images for rapid scale‑out.
  • Immutability: predictable rollouts, easy rollbacks, and reproducibility.
  • Persistent storage: plan PVCs/CSI drivers early; projects like Rook are popular.
  • Service connectivity: ephemeral pods change the way services talk; a service mesh helps (see http://layer5.io/service-meshes/).
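
To make the persistent storage point concrete, here is a minimal sketch of a PersistentVolumeClaim built in Python. The claim name, size, and storage class are assumptions: `rook-ceph-block` is only an example of a class a Rook install might expose, so substitute whatever your cluster provides.

```python
import json


def dataset_pvc(name: str, size_gi: int, storage_class: str) -> dict:
    """Build a v1 PersistentVolumeClaim for a training dataset volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # Hypothetical class name; it must match a StorageClass in the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(dataset_pvc("datasets", 50, "rook-ceph-block"), indent=2))
```

Deciding the access mode and storage class up front matters because a bound claim is awkward to change later without recreating it and migrating the data.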

We focus on AI/ML/Data Science OSS that thrives in horizontally scalable Kubernetes environments. For a broader view of orchestration and operations, see Awesome Machine Learning Operations.

You may also want domain‑specific “awesome” lists:

Kubernetes

Spark

AI/ML

Other Data/ETL/Analytics


Community & Attribution

Open‑source projects are maintained by people and teams of all sizes. If you find value here, please star upstream repos, file issues/PRs, and thank maintainers. If something’s missing, open a discussion and we’ll add it.


Kubernetes — Name & Heritage

Kubernetes translates roughly to “helmsman” and draws design lineage from Google’s Borg. The internal codename Project Seven nods to Seven of Nine from Star Trek; the logo’s seven spokes reference that origin. More background: https://en.wikipedia.org/wiki/Kubernetes.

“The duties of the ruler are like those of the helmsman of a great ship…” — Han Fei


ML Built for Kubernetes (Native Kube)

“If you want to build a ship… teach them to yearn for the vast and endless sea.” — Antoine de Saint‑Exupéry

Kubeflow

http://kubeflow.org/ — Cloud‑native ML platform.

Seldon Core

https://www.seldon.io/ — Kubernetes‑native model serving: https://github.com/SeldonIO/seldon-core.

Pachyderm

http://pachyderm.io/ — Versioned data pipelines for production ML: https://github.com/pachyderm/pachyderm.

Fabric for Deep Learning (FfDL)

Multi‑framework deep learning on Kubernetes (TensorFlow, Caffe, PyTorch).
Docs: https://developer.ibm.com/patterns/deploy-and-use-a-multi-framework-deep-learning-platform-on-kubernetes/
Code: https://github.com/IBM/FfDL

Polyaxon

Platform for building, training, and monitoring large‑scale DL apps.
https://polyaxon.com/ • https://github.com/polyaxon/polyaxon

Datalayer

Big Data Science on Kubernetes.
https://github.com/datalayer/datalayer • https://datalayer.io • https://docs.datalayer.io

IntelAI Machine Learning Container Templates

Accelerators for building ML containers and K8s objects.
https://github.com/IntelAI/mlt


ML Adapted to Kubernetes

“Impossible is a word humans use far too often.” — Seven of Nine

Pipeline.AI

Real‑time enterprise AI platform with K8s quickstart:
https://github.com/PipelineAI/pipeline • https://pipeline.ai

Dask & Friends

Kafka on K8s (Helm)

https://github.com/Landoop/kafka-helm-charts • Connectors: https://github.com/Landoop/stream-reactor

Big Data Playground

End‑to‑end sample stack (K8s, Spark/Flink/Beam, Kafka, etc.):
https://github.com/Chabane/bigdata-playground


Pipeline & Data Flow

Banzai Pipeline

From commit to scale on Kubernetes (CI/CD, logging, monitoring, autoscaling):
https://github.com/banzaicloud/pipeline

Argo

Container‑native workflows; cloud‑agnostic; runs on any Kubernetes cluster:
https://argoproj.github.io/ • https://github.com/argoproj/argo
Events: https://github.com/argoproj/argo-events
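
For a feel of what a container-native workflow looks like, the sketch below assembles a two-step Argo Workflow manifest in Python. The step names, image, and the `cmd` parameter are hypothetical. In Argo's `steps` syntax the outer list runs sequentially and each inner list runs in parallel, so `prepare` finishes before `train` starts.

```python
import json


def _step(name: str, cmd: str) -> dict:
    """One workflow step that invokes the shared 'run' template with a command."""
    return {
        "name": name,
        "template": "run",
        "arguments": {"parameters": [{"name": "cmd", "value": cmd}]},
    }


def two_step_workflow(image: str) -> dict:
    """Build an Argo Workflow that runs 'prepare' and then 'train'."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        # generateName lets the cluster append a unique suffix per run.
        "metadata": {"generateName": "train-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {
                    "name": "main",
                    # Outer list: sequential groups. Inner list: parallel steps.
                    "steps": [[_step("prepare", "prepare")], [_step("train", "train")]],
                },
                {
                    "name": "run",
                    "inputs": {"parameters": [{"name": "cmd"}]},
                    # Each step is just a container; the image is an assumption.
                    "container": {
                        "image": image,
                        "args": ["{{inputs.parameters.cmd}}"],
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(two_step_workflow("registry.example.com/ml:1.0.0"), indent=2))
```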

Apache Airflow

Author, schedule, and monitor DAGs for ETL/ML: https://airflow.apache.org/
Best practices: https://gtoonstra.github.io/etl-with-airflow/
K8s tools: https://github.com/mumoshu/kube-airflow • Operator: https://github.com/GoogleCloudPlatform/airflow-operator
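
If the term DAG is new, the following standard-library sketch shows the core idea a scheduler builds on: tasks declare what they depend on, and a topological sort yields a valid run order. This is not Airflow's API. The task names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each key maps a task to the set of tasks it depends on (hypothetical names).
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "train": {"transform"},
    "report": {"transform"},
}


def execution_order(dag: dict) -> list:
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    print(execution_order(PIPELINE))
```

`train` and `report` both depend only on `transform`, so either may come first. A real scheduler would be free to run them in parallel.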

Apache Beam / Dataflow


Storage for Kubernetes

Rook

Cloud‑native storage orchestration: https://rook.io/ • https://github.com/rook/rook

OpenEBS

Container‑attached block storage (Go), with SLAs, tiering, and multi‑AZ replica policies:
https://www.openebs.io/ • https://github.com/openebs/openebs
Maya orchestration: https://github.com/openebs/maya • Helm: https://github.com/openebs/charts


Spark at Sea (on K8s)

“Only those who brave its dangers comprehend its mystery.” — Longfellow
T.S. Eliot, The Waste Land (for perspective).

Note: Native K8s support arrived in Spark 2.3 as an experimental scheduler backend and was declared generally available in Spark 3.1. Always check your target version’s capabilities.
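
As a sketch of what native mode looks like in practice, the helper below assembles a `spark-submit` command line that targets a Kubernetes API server in cluster deploy mode. The API server URL, image, and jar path are placeholders to replace with your own values, and the helper function itself is hypothetical. The flags and `spark.*` settings are the ones documented for Spark on Kubernetes.

```python
import shlex


def spark_submit_args(api_server: str, image: str, app_jar: str,
                      main_class: str, executors: int = 2) -> list:
    """Assemble spark-submit arguments for Spark's native Kubernetes scheduler."""
    return [
        "spark-submit",
        # The k8s:// prefix selects the Kubernetes scheduler backend.
        "--master", f"k8s://{api_server}",
        # Cluster mode runs the driver in a pod inside the cluster.
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        # Image used for the driver and executor pods.
        "--conf", f"spark.kubernetes.container.image={image}",
        # A local:// URI refers to a path inside the container image.
        app_jar,
    ]


if __name__ == "__main__":
    args = spark_submit_args(
        api_server="https://kubernetes.example.com:6443",  # placeholder
        image="registry.example.com/spark:latest",         # placeholder
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",  # placeholder path
        main_class="org.apache.spark.examples.SparkPi",
    )
    print(shlex.join(args))
```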

Intel BigDL & Analytics Zoo


Spark on OKD / OpenShift


Utilities & Accessories


Odds & Ends

The podder‑ai ecosystem offers related components:


About Fourth Industrial Systems

Fourth Industrial Systems builds scalable, ethical AI solutions and agentic workflows that move seamlessly from prototype to production on Kubernetes.
Contact: freeman@4th.is • Learn: learn.4th.is • News: news.4th.is

Trademarks: Kubernetes®, Apache®, NVIDIA®, and other names are the property of their respective owners; references are for identification only and imply no endorsement.