feat: switch AI model volume to StatefulSet + Retain and add EBS VolumeSnapshot backup (#59) #60
Merged
- terraform/storage.tf: add gp3-retain StorageClass (reclaimPolicy=Retain)
- terraform/snapshots.tf: install snapshot-controller EKS add-on (VolumeSnapshot CRD) + snapscheduler Helm chart (3.5.0)
- k8s base/ai: Deployment + standalone PVC → StatefulSet + volumeClaimTemplates (model-storage, label app=ai)
- k8s eks overlay: set gp3-retain in volumeClaimTemplates; patch target Deployment → StatefulSet
- k8s kind overlay: patch target Deployment → StatefulSet (keeps the default SC)
- k8s overlays/eks/ai-backup: VolumeSnapshotClass (ebs.csi.aws.com) + SnapshotSchedule (daily, keep 7)
- k8s/apps/ai-backup.yaml: ArgoCD App
- README: add EBS snapshot backup guide

Local (kind) gets neither snapshots nor Retain. To restore, create a PVC from a snapshot via dataSource.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
closes #59
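Overview
Switch the AI model volume to a StatefulSet on `gp3-retain` (Retain) and add periodic EBS VolumeSnapshot backups (snapshot-controller + snapscheduler). Data is now preserved when the PVC is deleted or a node is replaced, and regular snapshots are taken. (Code review #43: no backups; #44: reclaim=Delete risk.)
Changes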
terraform
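- `storage.tf`: `gp3-retain` StorageClass (reclaimPolicy=Retain), so the EBS volume is kept even if the PVC is deleted
- `snapshots.tf` (new)
  - `snapshot-controller` EKS managed add-on: VolumeSnapshot/Class/Content CRDs + controller
  - `snapscheduler` Helm chart (pinned to `3.5.0`): periodic snapshots and retention via the SnapshotSchedule CRD
k8s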
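As a rough sketch of how the terraform and k8s halves fit together, the manifests below show a plain-YAML equivalent of the `gp3-retain` StorageClass (the PR itself defines it in `storage.tf`) and a StatefulSet whose `volumeClaimTemplates` requests it. Only `gp3-retain`, `Retain`, `model-storage`, the `app=ai` label, and `10Gi` come from this PR; the provisioner, binding mode, StatefulSet and Service names, image, and mount path are assumptions for illustration.

```yaml
# StorageClass equivalent to what storage.tf is described as creating.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain
provisioner: ebs.csi.aws.com              # assumption: EBS CSI driver
parameters:
  type: gp3
reclaimPolicy: Retain                     # keep the EBS volume when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer   # assumption, not stated in the PR
---
# StatefulSet that owns its PVC through volumeClaimTemplates.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ai                                # hypothetical name
spec:
  serviceName: ai                         # hypothetical headless Service
  replicas: 1
  selector:
    matchLabels:
      app: ai
  template:
    metadata:
      labels:
        app: ai
    spec:
      containers:
        - name: ai
          image: example.com/ai:latest    # placeholder image
          volumeMounts:
            - name: model-storage
              mountPath: /models          # hypothetical mount path
  volumeClaimTemplates:
    - metadata:
        name: model-storage
        labels:
          app: ai                         # matched by the backup claimSelector
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp3-retain      # set by the eks overlay; kind keeps its default SC
        resources:
          requests:
            storage: 10Gi
```

With `Retain`, deleting the PVC leaves the PersistentVolume in the `Released` state and the EBS volume intact, so reclaiming or removing it is a manual step.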
- `base/ai`: `Deployment` + standalone PVC → `StatefulSet` + `volumeClaimTemplates` (`model-storage`, label `app=ai`, 10Gi). `deployment.yaml`/`pvc.yaml` deleted
- `overlays/eks/ai`: set `gp3-retain` in volumeClaimTemplates; patch target `Deployment` → `StatefulSet`
- `overlays/kind/ai`: patch target `Deployment` → `StatefulSet` (keeps the default SC)
- `overlays/eks/ai-backup/` (new): VolumeSnapshotClass (`ebs.csi.aws.com`) + SnapshotSchedule (daily at 02:00 KST, keep the latest 7, `claimSelector app=ai`)
- `apps/ai-backup.yaml` (new): ArgoCD App
docs
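The backup resources in `overlays/eks/ai-backup/` might look roughly like the following. This is a sketch, not the PR's actual manifests: the resource names and `deletionPolicy` are hypothetical, and the cron expression assumes the scheduler evaluates it in UTC (02:00 KST is 17:00 UTC on the previous day).

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass            # hypothetical name
driver: ebs.csi.aws.com
deletionPolicy: Delete           # assumption: remove the EBS snapshot with the VolumeSnapshot
---
apiVersion: snapscheduler.backube/v1
kind: SnapshotSchedule
metadata:
  name: ai-model-daily           # hypothetical name
spec:
  claimSelector:
    matchLabels:
      app: ai                    # selects PVCs created from the StatefulSet template
  schedule: "0 17 * * *"         # assumption: UTC, i.e. 02:00 KST daily
  retention:
    maxCount: 7                  # keep the latest 7 snapshots
  snapshotTemplate:
    snapshotClassName: ebs-snapclass
```

A SnapshotSchedule only considers PVCs in its own namespace, so it would need to live alongside the `ai` StatefulSet's claims.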
Behavior
Notes / Scope
- … `standard` SC).
- Create a new PVC with `dataSource: {kind: VolumeSnapshot}`.
- `terraform fmt` (snapshots.tf/storage.tf) passes and the manifest YAML has been validated. A real `kustomize build` check is best run in a cluster environment.
🤖 Generated with Claude Code
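For reference, restoring from a snapshot through `dataSource` could be sketched as below. The snapshot name is hypothetical, and the PVC name assumes a StatefulSet called `ai`, following the `<template>-<statefulset>-<ordinal>` convention so that pod `ai-0` would bind to it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage-ai-0               # assumption: StatefulSet is named "ai"
  labels:
    app: ai
spec:
  storageClassName: gp3-retain
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: ai-model-snapshot-example      # hypothetical snapshot name
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi                      # must be at least the snapshot's source size
```

The VolumeSnapshot has to exist in the same namespace as the new PVC, and the claim should be created before the StatefulSet pod is rescheduled so that the controller reuses it instead of provisioning an empty volume.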