devdoyen/rogic.io

rogic.io: Project Portfolio & Infrastructure

0.1. Engineering Constraints & Principles

This project targets ultra-lightweight, ultra-low-cost infrastructure. To keep system stability high in that environment, it was designed around the following three engineering constraints and the principles used to overcome them.

rogic.io Engineering Principles & Constraints

0.2. Game Concept

rogic.io is a nemologic (nonogram) game in which puzzles are solved on a traditional rectangular grid. The twist is that each puzzle is presented rotated by an arbitrary angle. The moment the puzzle is solved, the board automatically rotates back to its original orientation and shows the completed pattern the right way up.
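
The rotate-and-restore mechanic can be pictured as a plain 2D rotation about the board center: to find the cell under the pointer, the display rotation is undone first. The sketch below is only an illustration in Java (the backend language listed in the stack). The class name `RotatedBoard`, its parameters, and the square-board assumption are hypothetical and not taken from the repository, whose frontend is Vue/TypeScript.

```java
final class RotatedBoard {
    /** Rotates point (x, y) about (cx, cy) using the standard 2D rotation matrix. */
    static double[] rotate(double x, double y, double cx, double cy, double degrees) {
        double r = Math.toRadians(degrees);
        double dx = x - cx;
        double dy = y - cy;
        return new double[] {
            cx + dx * Math.cos(r) - dy * Math.sin(r),
            cy + dx * Math.sin(r) + dy * Math.cos(r)
        };
    }

    /**
     * Maps a pointer position on the rotated board back to {row, col},
     * or returns null when the pointer falls outside the grid.
     * Assumes a square board of size x size cells, each cellPx wide.
     */
    static int[] cellAt(double px, double py, int size, double cellPx, double degrees) {
        double center = size * cellPx / 2.0;
        // Undo the display rotation by rotating with the negated angle.
        double[] p = rotate(px, py, center, center, -degrees);
        int col = (int) Math.floor(p[0] / cellPx);
        int row = (int) Math.floor(p[1] / cellPx);
        boolean outside = row < 0 || col < 0 || row >= size || col >= size;
        return outside ? null : new int[] {row, col};
    }
}
```

For a 5x5 board of 10-pixel cells displayed at 90 degrees, a pointer at (45, 45) resolves to row 0, column 4. Restoring the board on completion is the same inverse rotation applied to the whole canvas.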

rogic.io Gameplay Demo

0.3. Service Environments

| Service Environment | Live URL | Deployment Status |
| --- | --- | --- |
| 🚀 Production | rogic.io | Active |
| 🧪 Staging | stage.rogic.io | Idle / On-Demand |

0.4. Technology Stack

| Category | Technologies |
| --- | --- |
| Frontend | Vue 3, TypeScript, Canvas API |
| Backend | Java 17, Spring Boot |
| Database | PostgreSQL |
| Infra & IaC | AWS, Terraform, Ansible, Docker |
| CI/CD | GitHub Actions, Vitest, Playwright |
| Telemetry | Prometheus, Grafana, CloudWatch |

1. Infrastructure

1.1. System Architecture

1.1.1. High-Level Diagram

C4Context
    title System Context Diagram for rogic.io (Level 1: System Context)

    Person(player, "Player / User", "Accesses the puzzle game through a web browser.")
    
    System_Boundary(dns_cdn, "Global Edge Delivery") {
        System_Ext(route53, "Route 53", "DNS management mapping domains to CloudFront & EC2.")
        System_Ext(cloudfront, "Amazon CloudFront", "CDN distributing static web assets globally.")
        System(s3, "Amazon S3 Bucket", "Stores Vite-built Vue static compilation files.")
    }

    System_Boundary(backend, "Core API Server") {
        System(api, "rogic.io REST API (EC2)", "Spring Boot backend handling gameplay, XP levels, and leaderboard stats.")
        SystemDb(postgres, "PostgreSQL DB", "Relational database storing user logs, statistics, and stage metadata.")
    }

    Rel(player, route53, "Queries DNS for rogic.io / api.rogic.io", "DNS Protocol")
    Rel(player, cloudfront, "Requests static assets", "HTTPS / Port 443")
    Rel(cloudfront, s3, "Pulls origin static files", "S3 Protocol")
    Rel(player, api, "Calls REST API services (via DNS mapped to EC2 EIP)", "HTTPS / Port 443")
    Rel(api, postgres, "Reads/Writes game state", "JDBC & JPA / Port 5432")

1.1.2. Component Specification

  • Global Edge Delivery (Route 53 / CloudFront / S3)
    The Vite build output is deployed to an Amazon S3 bucket whose public access is fully blocked through OAC, then cached at global edge locations by the Amazon CloudFront CDN. This minimizes latency and eliminates charges for direct S3 requests.
  • Core API Server & Database (EC2 / PostgreSQL)
    A single EC2 instance runs three containers on separate virtual Docker bridge networks: an Nginx proxy that handles SSL/TLS termination and port forwarding, a Spring Boot container that serves the REST API, and a PostgreSQL container that persists game data.

1.2. Cost Optimization

1.2.1. Compute

  • Optimization
    • Adopted an ultra-light t3a.nano instance (512MB RAM) at roughly $3.5 per month
    • Adopted GraalVM Native Image builds to bring the Spring Boot runtime memory footprint below 30MB
    • Declared reflection hints explicitly in NemologicRuntimeHints.java to prevent Jackson deserialization errors
    • Runs a scheduled Docker GC prune crontab every day at 3 AM to manage host disk space
  • Trade-off
    • The 512MB memory limit makes GraalVM compilation on the server itself impossible, and a native build takes more than 10 times as long as a JVM build
  • Mitigation
    • The CI/CD pipeline offloads compilation to the external build infrastructure provided by GitHub Actions (2 cores, 7GB RAM), so the host server only runs the resulting binary, which needs about 30MB

1.2.2. Network & Delivery

  • Optimization
    • Dropped the AWS ALB (Application Load Balancer), which costs about $20 per month, and mapped the Route 53 domain directly to a fixed Elastic IP
    • A single Nginx proxy running as a Docker container handles SSL/TLS termination and forwarding to the backend API port (8080)
    • Static assets built by Vite are deployed to an S3 bucket secured with OAC (Origin Access Control) and served through the CloudFront CDN with global edge caching, which shortens latency and eliminates charges for direct S3 requests
  • Trade-off
    • Multi-AZ redundancy and zero-downtime rolling deployment are not achievable, so a physical host failure exposes the whole service to an outage (SPOF)
  • Mitigation
    • AWS CloudWatch Status Check metric alarms trigger Auto Recovery, which moves the instance to a healthy physical host and rebinds the EIP within 1 minute of a hardware fault

1.2.3. Database & Storage

  • Optimization
    • Runs the PostgreSQL container directly under Docker Compose on the EC2 instance to avoid RDS costs of $15 to $20 or more per month
  • Trade-off
    • The fully managed failover and point-in-time recovery (PITR) conveniences of AWS RDS are lost. Disaster recovery requires a manual restore from a backup dump, so the recovery time objective (RTO) is about 20 minutes and the recovery point objective (RPO) is 6 hours
  • Mitigation
    • A shell script and cron job ship a DB dump to a separate S3 backup bucket every 6 hours, combined with a policy that automatically deletes backups older than 30 days
    • Because the infrastructure is codified in Terraform/Ansible, a procedure exists to reinstall the infrastructure within 5 minutes and then restore the data manually, even after a total loss (ROA)

1.2.4. Staging Environment

  • Optimization
    • The Staging EC2 instance used for development and verification is normally kept in the Stopped state to avoid paying for idle compute
  • Workflow / Mitigation
    • When the GitHub Actions deploy-staging workflow runs, it starts the instance through the AWS CLI, deploys, and then runs the Playwright browser E2E tests
    • After verification, a scheduled stop workflow (staging-cleanup.yml) runs nightly at 2 AM KST to keep costs down

1.3. Security Infrastructure

To protect service integrity and the host system, this project implements three security control policies that follow the design guidelines of the Security Pillar of the AWS Well-Architected Framework.

1.3.1. Identity & Access Management

  • SSM Session Manager
    • The host SSH port (22), which carries a high risk of brute-force attacks and SSH key leakage, is fully blocked in the inbound security group
    • Only sessions that go through AWS Systems Manager Session Manager, authenticated with IAM credentials, are allowed
  • Ansible over an SSM tunnel
    • Port 22 on the host is never opened remotely. On local machines and deployment runners, the aws ssm start-session command is used as the SSH ProxyCommand so that SSH is encapsulated in the SSM tunnel
    • Inside that tunnel, a second authentication step with the existing SSH key (PEM) must pass before an Ansible playbook can run, giving two layers of defense (see Appendix 5.1.3. AWS SSM Session Manager Setup for the detailed hosts.ini configuration)
  • OIDC Keyless Authentication
    • Hard-coded AWS API access keys are not used at all when GitHub Actions runners deploy
    • GitHub OIDC (OpenID Connect) federation is set up so that the runner obtains one-time, short-lived credentials (AssumeRole) from AWS STS at deployment time, removing the leakage path at the source
  • Service-level least privilege policy (Least Privilege)
    • Service-level least-privilege policies that match the exact scope of the Terraform and Ansible deployments are bound to the roles (custom IAM policies per Staging/Production)
    • Management of any service outside the allowed resources (for example RDS, Lambda, KMS) is excluded outright, limiting the blast radius

1.3.1.1. IAM Least Privilege Design

The IAM permissions actually applied to each principal (the EC2 host and the CI/CD pipeline) and the resource-control architecture based on short-lived OIDC credentials are as follows.

C4Component
    title Component Diagram for Identity & Access Management (Level 3: Security & IAM)

    Container(runner, "GitHub Actions Runner", "GitHub Cloud", "Deploys infra/app using temporal credentials.")
    Container(ec2, "EC2 App Server", "AWS EC2", "Runs application stack and background helpers.")

    System_Boundary(iam, "AWS IAM (Identity & Access Management)") {
        Component(oidc, "OIDC Provider", "token.actions.githubusercontent.com", "Verifies GitHub Actions runner token.")
        Component(run_role, "CI/CD Runner IAM Role", "IAM Role", "Assumed via OIDC federation.")
        Component(host_role, "EC2 Host IAM Role", "IAM Role (Instance Profile)", "Attached to EC2 hosting profile.")
        
        Component(tf_policy, "Terraform & Deploy Policy", "Customer Managed Policy", "Allows EC2, VPC, S3, DynamoDB, Route 53, CloudFront management.")
        Component(ssm_policy, "SSM Managed Policy", "AWS Managed Policy", "Allows SSM Systems Manager connectivity.")
        Component(cw_policy, "CloudWatch Log Policy", "Customer Managed Policy", "Allows log groups/streams push operations.")
        Component(s3_back_policy, "S3 Backup Write Policy", "Customer Managed Policy", "Allows database dump upload.")
    }

    System_Boundary(aws_resources, "AWS Resources Boundary") {
        System(s3_tf, "S3 tfstate & deploy Bucket", "Object Storage")
        System(ddb_lock, "DynamoDB tfstate lock Table", "NoSQL Database")
        System(cf_cdn, "CloudFront CDN / Route 53", "Edge Routing")
        System(cw_logs, "CloudWatch Logs", "Telemetry Store")
        System(s3_back, "S3 Backup Bucket", "Object Storage")
    }

    Rel(runner, oidc, "1. Authenticates", "OIDC Web Identity Token")
    Rel(oidc, run_role, "2. Issues short-term session", "AssumeRoleWithWebIdentity")
    Rel(run_role, tf_policy, "3. Binds permissions")
    
    Rel_D(tf_policy, ec2, "Manage VPC & Host", "AWS API")
    Rel_D(tf_policy, s3_tf, "Read/Write tfstate & deploy site", "AWS API")
    Rel_D(tf_policy, ddb_lock, "Acquire/Release Lock", "AWS API")
    Rel_D(tf_policy, cf_cdn, "Invalidate cache / Update DNS", "AWS API")

    Rel(ec2, host_role, "4. Obtains profile context", "Instance Metadata Service (IMDS)")
    Rel(host_role, ssm_policy, "5. Binds permissions")
    Rel(host_role, cw_policy, "5. Binds permissions")
    Rel(host_role, s3_back_policy, "5. Binds permissions")

    Rel_D(ssm_policy, ec2, "Establish secure tunnel", "SSM Tunnel")
    Rel_D(cw_policy, cw_logs, "Push application stdout", "CloudWatch API")
    Rel_D(s3_back_policy, s3_back, "Upload daily DB dump", "S3 API")
| Principal | Auth Type | IAM Policies | Key Role |
| --- | --- | --- | --- |
| EC2 Host Role | Instance Profile | AmazonSSMManagedInstanceCore; Staging: CloudWatchAgentServerPolicy (managed); Production: nemologic-cloudwatch-log-policy (custom); s3_backup_policy (custom) | Enables SSM tunneling, forwards logs to CloudWatch in real time (different policies for Staging and Production), and controls permission to upload DB backups to S3 |
| CI/CD Runner (GitHub) | AWS OIDC (Keyless) | nemologic-staging-github-policy; nemologic-production-github-policy (custom) | Obtains one-time, short-lived credentials through sts:AssumeRoleWithWebIdentity using the GitHub Actions OIDC token, then runs Terraform and the deployment (no hard-coded secret keys, least privilege) |

1.3.2. Infrastructure Protection

C4Container
    title Container Diagram for rogic.io (Level 2: Network & Containers)

    Person(player, "Player / User", "Accesses the puzzle game through a web browser.")
    Person(sre, "SRE / QA (CI/CD)", "Deploys and tests the staging application.")

    System_Boundary(aws, "AWS Cloud (ap-northeast-2)") {
        
        System_Boundary(vpc_prod, "Production VPC (10.0.0.0/16)") {
            System_Boundary(fnet_prod, "frontend-net (Docker Bridge)") {
                Container(nginx, "Nginx Reverse Proxy", "Docker Container", "SSL/TLS termination, API routing, and Bearer token auth validation.")
            }
            
            System_Boundary(bnet_prod, "backend-net (Docker Bridge)") {
                ContainerDb(postgres, "PostgreSQL Database", "Docker Container", "Persists puzzle templates, user logs, clear history, and user stats.")
            }
            
            Container(spring, "Spring Boot App", "Docker Container (GraalVM) [frontend-net & backend-net]", "Handles business logic, daily puzzle scheduling, rating, and XP leaderboard.")
            
            Rel(nginx, spring, "Proxy API requests", "HTTP / Port 8080 [frontend-net]")
            Rel(spring, postgres, "Reads/Writes state", "JPA & JDBC / Port 5432 [backend-net]")
        }
        
        System_Boundary(vpc_stage, "Staging VPC (10.1.0.0/16)") {
            System_Boundary(fnet_stg, "frontend-net (Stage Bridge)") {
                Container(nginx_stg, "Nginx Reverse Proxy (Stage)", "Docker Container", "Staging SSL/TLS termination and API routing.")
            }
            
            System_Boundary(bnet_stg, "backend-net (Stage Bridge)") {
                ContainerDb(postgres_stg, "PostgreSQL Database (Stage)", "Docker Container", "Persists isolated staging state.")
            }
            
            Container(spring_stg, "Spring Boot App (Stage)", "Docker Container (JVM) [frontend-net & backend-net]", "Staging application runtime environment.")
            
            Rel(nginx_stg, spring_stg, "Proxy API requests", "HTTP / Port 8080 [frontend-net]")
            Rel(spring_stg, postgres_stg, "Reads/Writes state", "JPA & JDBC / Port 5432 [backend-net]")
        }

        Container(cloudfront, "Amazon CloudFront", "AWS CDN", "Distributes static web assets with low latency.")
        Container(s3, "Amazon S3", "AWS Bucket Storage", "Hosts Vite/Vue built static files (HTML, JS, CSS).")
    }

    Rel(player, cloudfront, "Fetches static web pages", "HTTPS / Port 443")
    Rel(cloudfront, s3, "Refreshes cache from origin", "S3 Protocol")
    Rel(player, nginx, "Calls API endpoints", "HTTPS / Port 443")
    Rel(sre, nginx_stg, "Calls API endpoints (Stage) during tests", "HTTPS / Port 443")
  • Physically isolated VPCs
    • The Staging VPC (10.1.0.0/16) and the Production VPC (10.0.0.0/16) are provisioned separately, each with its own subnet range and independent network
    • Cross-network access is blocked at the source so that instability in the test environment cannot spread to production
  • Multi-tier Docker bridge network isolation
    • For container traffic inside the single EC2 instance, the networks are split so that the internet-facing Nginx proxy (frontend-net) cannot reach the DB (backend-net) directly
    • Only the backend API container belongs to both bridge networks and acts as the bridge between them, which limits lateral movement
  • Database outbound traffic fully blocked
    • The backend-net bridge network, where the database container lives, is declared with the internal: true option
    • Blocking the database's outbound access to the internet prevents reverse connections and data exfiltration attempts in the event of an RCE (remote code execution) intrusion
  • Minimal inbound ports
    • In both Staging and Production, only the Nginx ports (80, 443) are open to inbound traffic, for the public service and for monitoring integration
    • The SSH (22), Spring API (8080), and Vite frontend development (5173) ports are left out of the security group rules entirely, so they cannot be reached from outside
  • Proxy-mediated remote metric collection
    • When the remote Prometheus collector of Grafana Cloud Mimir pulls metrics, direct external calls to the Actuator port (8080) are blocked
    • Scrape requests arrive on the Nginx HTTPS (443) interface, and Nginx forwards only requests that pass Bearer token validation to /actuator/prometheus on the local loopback network
  • Container security roadmap
    • Planned: run the applications inside containers as a non-root user and restrict them to a read-only root filesystem
    • DB backups are driven from the host through the Docker API standard output pipeline (docker exec pg_dump) to preserve backup integrity

1.3.2.1. Security Group Configuration

  • Security group traffic control (Security Group Ingress/Egress Rule)
    Controls the ports at the boundary with the public internet and pins down the rules for outbound traffic.

    | Allowed Port | Protocol | Source | Purpose and target service |
    | --- | --- | --- | --- |
    | 80 | TCP | 0.0.0.0/0 | Nginx HTTP web server (for the 301 redirect to HTTPS) |
    | 443 | TCP | 0.0.0.0/0 | Nginx HTTPS web service and API traffic (including monitoring scrapes) |

    | Allowed Port | Protocol | Destination | Notes |
    | --- | --- | --- | --- |
    | All | All | 0.0.0.0/0 | Package updates, external API calls, and DB backup uploads to S3 |

1.3.3. Data Protection

  • Automated certificate renewal
    • A free Let's Encrypt SSL certificate is issued to serve HTTPS (443) and to enforce a 301 redirect from HTTP (80)
    • Pre/post shell script hooks are attached to the Certbot daemon so that the certificate is renewed automatically before its 3-month expiry, preventing expiry-related downtime
  • Remote state security
    • An AWS S3 bucket and a DynamoDB table (LockID) are configured as the Terraform backend, preventing concurrent-modification conflicts and corruption of the state file during collaboration and deployment
    • A state file encryption policy protects the infrastructure configuration data

1.4. Observability

To maintain availability while keeping the load of metric collection to a minimum, this project builds its monitoring architecture around the industry-standard core areas: Metrics, Logs, and Alerting & SLO.

1.4.1. Metrics & Telemetry

C4Container
    title Telemetry Diagram for rogic.io (Level 3: Observability & Alerting)

    System_Boundary(host, "AWS EC2 Instance (Target Host)") {
        Container(nginx, "Nginx Reverse Proxy", "Docker", "Bearer Token Authentication Endpoint.")
        Container(spring, "Spring Boot Backend", "Docker (GraalVM)", "Exposes Prometheus Actuator Metrics.")
        Rel(nginx, spring, "Forwards prometheus scraping requests", "Port 8080")
    }

    System_Boundary(grafana_cloud, "Grafana Cloud Platform") {
        Container(grafana, "Grafana Dashboards", "SaaS Dashboard", "Visualizes SLA metrics, CPU, Memory, and log groups.")
        Container(prometheus, "Prometheus / Mimir", "SaaS TSDB", "Scrapes metrics via Agentless Pull architecture.")
        Rel(grafana, prometheus, "Queries metrics data")
    }

    System_Boundary(observability, "AWS Management & Alerting") {
        Container(cw, "Amazon CloudWatch", "AWS Logging", "Collects application stdout log streams via awslogs driver.")
        Container(sns, "AWS SNS Topic", "AWS Alerting", "Triggers notifications based on metric filter threshold alarms.")
        Person(sre, "SRE Developer", "Receives real-time incident warning emails.")
        
        Rel(cw, sns, "Metric Filter Threshold Alarmed")
        Rel(sns, sre, "Sends warning email notification")
    }

    Rel(prometheus, nginx, "Scrapes metrics (Agentless Pull)", "HTTPS Bearer Auth / Port 443")
    Rel(spring, cw, "Streams application logs", "awslogs driver")
  • Agentless Pull architecture
    • No separate collection agent (such as Grafana Alloy) runs on the host, so none of the host's CPU or memory is spent on one
    • The Nginx reverse proxy exposes a virtual route that always validates the token in the Authorization: Bearer header
    • The external Prometheus/Mimir server in Grafana Cloud scrapes the metrics directly at regular intervals, which brings the agent's runtime load on the host to zero
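
The token check on that route amounts to a simple predicate over the request header. In this project the check is performed by Nginx, whose configuration is not shown here; the Java sketch below only illustrates the logic. The class name `BearerGate` and the way the expected token is supplied are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

final class BearerGate {
    private static final String PREFIX = "Bearer ";

    /** Returns true only when the header is "Bearer <token>" and the token equals the expected secret. */
    static boolean isAuthorized(String authorizationHeader, String expectedToken) {
        if (authorizationHeader == null || !authorizationHeader.startsWith(PREFIX)) {
            return false;
        }
        byte[] presented = authorizationHeader.substring(PREFIX.length())
                .getBytes(StandardCharsets.UTF_8);
        byte[] expected = expectedToken.getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual compares the arrays without exiting at the first
        // mismatching byte, which avoids leaking the match position through timing.
        return MessageDigest.isEqual(presented, expected);
    }
}
```

Only requests for which the predicate holds would be forwarded to /actuator/prometheus; everything else is rejected at the proxy and never reaches port 8080.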

1.4.2. Log Aggregation & Storage

  • Real-time streaming with the awslogs Docker driver
    • Console output from each container is forwarded straight to AWS CloudWatch Logs (/aws/ec2/nemologic) instead of being written to files on disk
    • No raw logs accumulate on the host, which removes the risk of disk exhaustion and I/O bottlenecks in advance
  • Access log filtering
    • Nginx access logging is turned off (access_log off;) for the health check path and the periodically called Prometheus metrics path
    • This keeps monitoring traffic from wasting storage and CPU

1.4.3. Alerting & SLO Visualization

  • Three-region availability monitoring (Singapore / Sydney / Tokyo)
    • Grafana Cloud Synthetic Monitoring probes check /actuator/health at 1-minute intervals from several global regions (Singapore, Sydney, Tokyo)
    • Probing from multiple locations prevents false positives caused by a failure at a single probe site
  • Alert emails through AWS SNS
    • When a CloudWatch Logs Metric Filter crosses its threshold, an AWS SNS topic is triggered and the incident is emailed to the SRE immediately
  • Unified SLA dashboard (current_dashboard.json)
    • The key availability indicators (Uptime SLA, Incident Count, MTTR, MTBF) are wired up dynamically as a single row of four KPI cards at the top of the Grafana dashboard
    • Example link for the layout: Grafana Live Public Dashboard (demo configuration with sensitive metrics excluded)
    • For the detailed PromQL expressions and queries, see Appendix 5.2. PromQL Query Formulations (SLO Metrics)
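
As a rough illustration of how the four KPI values relate to each other, the sketch below derives them from an observation window and a list of incident durations. It uses the common textbook definitions (MTTR as downtime per incident, MTBF as uptime per incident). These are assumptions for the example and may differ from the PromQL formulations in the appendix; the class name `SloMetrics` is hypothetical.

```java
import java.util.List;

final class SloMetrics {
    final int incidentCount;
    final double uptimePercent;
    final double mttrMinutes;
    final double mtbfMinutes;

    /** Derives the KPIs from the window length and the duration of each incident, all in minutes. */
    SloMetrics(double windowMinutes, List<Double> incidentMinutes) {
        double downtime = 0.0;
        for (double d : incidentMinutes) {
            downtime += d;
        }
        incidentCount = incidentMinutes.size();
        uptimePercent = 100.0 * (windowMinutes - downtime) / windowMinutes;
        // With no incidents there is nothing to repair, and the whole window counts as failure-free.
        mttrMinutes = incidentCount == 0 ? 0.0 : downtime / incidentCount;
        mtbfMinutes = incidentCount == 0 ? windowMinutes : (windowMinutes - downtime) / incidentCount;
    }
}
```

For a 30-day window (43,200 minutes) with two incidents of 10 and 20 minutes, this gives an uptime of about 99.93%, an MTTR of 15 minutes, and an MTBF of 21,585 minutes.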

1.5. Troubleshooting

1.5.1. Host Memory Exhaustion Incident

  • Background
    • A t3a.nano instance (512MB RAM) was chosen to push infrastructure cost to the minimum ($11.45 per month in total). However, the monitoring agent (Grafana Alloy) occupied more than 100MB of memory, and during blue/green deployments two Spring Boot containers briefly ran at the same time. Together they exceeded physical memory, causing frequent OOM and CPU thrashing failures.
    • The first deployment (cold start) was the worst case. With no locally cached images, downloading and extracting a base image of several hundred MB overlapped with everything else, creating a disk I/O bottleneck. The deployment pipeline hung for more than an hour and then aborted.
  • Resolution
    • Resource diagnosis and new metrics: While SSH was lagging, the Linux top and vmstat commands showed CPU idle converging to 0% and I/O wait (wa) rising sharply, which identified the condition as disk/CPU thrashing. Based on that diagnosis, the wa and idle metrics were added to the Grafana Cloud monitoring dashboard.
    • Removing the collection agent: The resource-hungry Alloy daemon was removed and replaced with an Agentless Pull design in which Grafana Mimir scrapes metrics directly through the Nginx proxy.
    • Lighter runtime and swap: GraalVM Native Image compilation was adopted to bring the Spring Boot footprint below 30MB, and a 2GB swap partition was enabled to absorb the temporary memory peak while containers are being replaced.
    • Docker image cache: After the first cold deployment, the local image cache and layer reuse greatly reduce the pull/extract load, so the I/O thrashing bottleneck no longer occurs. In addition, a periodic GC crontab cleans up unused Docker images and volumes.
  • Lessons learned and decisions (Retrospective)
    • Even in an extreme 512MB RAM environment, the bottleneck was identified methodically with low-level Linux tools (top, vmstat), real-time resource inspection, and newly added metrics, and then resolved.
    • Instead of scaling up, the application runtime itself was made native and combined with Docker's local cache and swap. This showed that high availability and recovery-oriented operation (ROA) are possible even on the cheapest server.

2. CI/CD

2.1. Pipeline Workflow

This project uses a four-stage GitOps deployment workflow to control the lifecycle from code integration through to production deployment.

2.1.1. GitOps Flowchart

stateDiagram-v2
    direction LR
    [*] --> CI : Git Push to main
    
    state "1. Continuous Integration (CI)" as CI {
        direction TB
        state "Backend: Gradle Tests" as UnitB
        state "Frontend: Vitest Tests" as UnitF
        state "Infra: Ansible Lint" as Lint
        
        [*] --> UnitB
        [*] --> UnitF
        [*] --> Lint
    }

    state "2. Continuous Delivery: Staging" as Staging {
        direction TB
        state "Build Backend (GHCR)" as BuildB
        state "Build & S3 Sync Frontend" as BuildF
        state "Terraform Apply Staging" as TFA_S
        state "Deploy Backend via Ansible" as Deploy_S
        state "Run Playwright E2E Tests" as E2E
 
        [*] --> BuildB
        [*] --> BuildF
        BuildB --> TFA_S
        BuildF --> TFA_S
        TFA_S --> Deploy_S
        Deploy_S --> E2E
    }

    state "3. Approval Gate" as Gate {
        state "Pause for Admin Manual Approval" as Approve
        [*] --> Approve
    }

    state "4. Continuous Deployment: Production" as Production {
        direction TB
        state "Terraform Apply Production" as TFA_P
        state "Deploy Production via Ansible" as Deploy_P
        state "Auto-SemVer Tag & Release" as Release
 
        [*] --> TFA_P
        TFA_P --> Deploy_P
        Deploy_P --> Release
    }

    CI --> Staging : Validations Pass
    Staging --> Gate : Playwright E2E Pass
    Gate --> Production : Approved
    Production --> [*] : Production Release Complete

2.1.2. Pipeline Trigger Optimization

  • Path-based build skipping (Path Filtering)
    • Commits that only touch Markdown documents (*.md) or local settings skip the build/compile stages, saving Actions compute and speeding up deployment
  • Automatic cancellation of competing and queued deployments (Concurrency)
    • If a new commit arrives while a Staging deployment is still running, the earlier deployment is cancelled immediately (cancel-in-progress: true), so deployments cannot become entangled

2.2. Artifact & Release Management

This is the system for managing build outputs and asset delivery: it produces stable build artifacts, keeps deployments available, and controls the regular release cycle.

2.2.1. Compute Offloading

  • Compile offloading to the Actions runner
    • Build and packaging work is fully offloaded to the GitHub Actions cloud environment so that compilation never exhausts the production server, which has only 512MB of RAM (see the Compute part of 1.2. Cost Optimization for details of the mitigation)

2.2.2. Static Asset Delivery

  • Direct sync of Vite static assets
    • Instead of packaging the frontend in a heavy Docker image, the compiled static bundle (index.html, JS/CSS) is synced directly to the AWS S3 bucket (aws s3 sync) and a CloudFront edge invalidation is triggered, giving a very lightweight CDN edge delivery path

2.2.3. Release Versioning Automation

  • Automated SemVer and release notes
    • Commit message header tokens (feat:, fix:) are parsed mechanically to bump the Semantic Versioning version automatically
    • Changelog generation and release publishing are fully automated, which improves deployment reliability and makes changes easier to trace
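
The version bump can be sketched as a small function over commit subject lines. The mapping below (feat: bumps the minor version, fix: bumps the patch version, and a `!` before the colon bumps the major version) follows the usual Conventional Commits convention. The source only mentions feat: and fix:, so the major-version rule, the class name `SemVerBump`, and the exact parsing are assumptions rather than the project's actual release tooling.

```java
import java.util.List;

final class SemVerBump {
    /** Returns the next "MAJOR.MINOR.PATCH" version for the given commit subject lines. */
    static String next(String current, List<String> commitSubjects) {
        String[] parts = current.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);

        int level = 0; // 0 = no release, 1 = patch, 2 = minor, 3 = major
        for (String subject : commitSubjects) {
            String s = subject.trim();
            int colon = s.indexOf(':');
            if (colon < 0) {
                continue; // not a "type: description" header
            }
            String header = s.substring(0, colon);             // e.g. "feat(ui)!"
            boolean breaking = header.endsWith("!");
            String type = header.replace("!", "").replaceAll("\\(.*\\)$", ""); // drop "!" and "(scope)"
            if (breaking) {
                level = Math.max(level, 3);
            } else if (type.equals("feat")) {
                level = Math.max(level, 2);
            } else if (type.equals("fix")) {
                level = Math.max(level, 1);
            }
        }
        switch (level) {
            case 3: return (major + 1) + ".0.0";
            case 2: return major + "." + (minor + 1) + ".0";
            case 1: return major + "." + minor + "." + (patch + 1);
            default: return current;
        }
    }
}
```

The highest-ranked change in the batch decides the bump, so a release containing both fix: and feat: commits moves 1.2.3 to 1.3.0, while a batch of only docs: or chore: commits leaves the version unchanged.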

2.3. Continuous Validation

This is the quality verification and approval gate system that continuously demonstrates the safety and consistency of the system from code integration through to the completion of a production deployment.

2.3.1. Verification Gates

  • Multi-stage code verification
    • Compile & unit tests: Spring Boot unit tests (Gradle) and Vue unit tests (Vitest) run automatically when a PR is created and again in the Actions pipeline
    • Ansible Lint static checks: when infrastructure changes, playbook syntax violations are diagnosed automatically before anything is built, catching configuration defects early
  • Playwright E2E browser tests
    • As soon as the Staging deployment finishes, a real headless browser (frontend/e2e/staging.spec.ts) checks home screen loading, clicking and filling the nonogram canvas, and the anonymous sign-up flow from the user's point of view, keeping quality defects from slipping in

2.3.2. Delivery Gates

  • ์ˆ˜๋™ ์Šน์ธ ๋ฐฐํฌ ํ†ต์ œ (Manual Gate)
    • Staging ํ™˜๊ฒฝ์—์„œ ์œ ๋‹›/E2E ํ…Œ์ŠคํŠธ๊ฐ€ 100% ํ•ฉ๊ฒฉํ•˜๋ฉด ๋ฐฐํฌ ์›Œํฌํ”Œ๋กœ์šฐ๋ฅผ ์ผ์‹œ ์ •์ง€์‹œํ‚ค๊ณ , ๊ด€๋ฆฌ์ž๊ฐ€ ์ง์ ‘ GitHub Environment ์Šน์ธ ์ฝ˜์†”์—์„œ ๋ฆด๋ฆฌ์ฆˆ ์•ˆ์ •์„ฑ์„ ๊ฒ€ํ† /์Šน์ธํ•ด์•ผ๋งŒ Production ํ™˜๊ฒฝ์œผ๋กœ ์Šน๊ฒฉ ๋ฐฐํฌ๋˜๋„๋ก ์„ค๊ณ„ํ•˜์—ฌ ์˜ค๋ฐฐํฌ ๋ฆฌ์Šคํฌ ์ฐจ๋‹จ

2.4. Troubleshooting

2.4.1. Deployment Pipeline Conflict

  • Background
    • Staging and Production infrastructure settings were bundled in the same Terraform code and applied together. The DNS A record was switched to CloudFront/S3 before static assets had been seeded into the production S3 bucket, which blocked all access to production with AccessDenied (Access Failure Report).
    • During the hotfix, the GitHub Actions cancel-in-progress: true setting let a follow-up commit cancel the previous build in the middle of the Nginx certificate issuance process. The production SSL certificate was lost and HTTPS API traffic failed (Handshake Failure Report).
  • Resolution
    • Physical separation of infrastructure environments
      Terraform workspaces and the directory structure were split strictly into Staging and Production so that a single run cannot immediately affect production.
    • Removal of concurrent cancellation for critical deployments
      The critical server configuration deployment stage (deploy-production) now sets cancel-in-progress: false explicitly, so a running job can no longer be aborted halfway and leave the configuration inconsistent.
    • Loose coupling of deployment stages
      A Manual Approval Gate was introduced so that interdependent operations, such as CloudFront TLS and SSL certificate replacement, happen only after the server's actual readiness has been verified, improving the way infrastructure is promoted.


3. AI Engineering

3.1. LLM Generation Pipeline

A lightweight generative LLM and an asynchronous batch scheduler work together so that players always have fresh stages to play.

  • Gemini asynchronous generation scheduler
    • Every day at 04:17, a cron scheduler asynchronously triggers the gemini-3.1-flash-lite LLM API to generate new puzzle layouts (grid and solution map) automatically
    • To guard against API rate-limit errors, the backend combines a 5-second delay interval (Delay Interval) with 3 exponential-backoff retries (Exponential Retry)
  • Reserve buffering of puzzle candidates (FIFO Buffer)
    • At every build and at production server startup, at least 5 reserve puzzles per size (5x5, 10x10, 15x15, 20x20, and so on) are generated in advance and kept in a buffer table in first-in, first-out (FIFO) order
    • This safeguard keeps the daily puzzle update running without downtime even when the API is unavailable
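The retry behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the project's Java implementation: `RateLimitError` and `call_with_backoff` are hypothetical names, and the way the 5-second interval and the 3 retries combine (5 s, 10 s, 20 s) is an assumption, since the text does not spell out the schedule.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error type standing in for a rate-limit response from the LLM API."""


def call_with_backoff(
    request: Callable[[], T],
    base_delay_s: float = 5.0,  # the 5-second delay interval from the text
    max_retries: int = 3,       # the 3 retries from the text
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on RateLimitError with a doubling delay."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; the caller can fall back to buffered puzzles
            # Assumed schedule: base * 2**attempt, i.e. 5 s, 10 s, 20 s.
            sleep(base_delay_s * (2 ** attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in production the default `time.sleep` would apply.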

3.2. Automated Quality Guardrails

A real-time verification system that mechanically filters out defective puzzles, those lacking logical integrity, from the random patterns the LLM produces.

  • DFS backtracking-based logic verification engine (isLogicalOnly)
    • Row and column clue numbers are derived back from the generated solution grid, and the high-performance nonogram solver in the Java backend (NonogramSolver) is then run against those clues
    • Puzzles that require guessing (Guessing) or admit multiple solutions (Multiple Solutions) are rejected mechanically; only puzzles with a mathematically unique solution (Unique Solution) pass verification
  • Data filtering guardrail
    • Generated data that fails solver verification, or whose solution uniqueness cannot be proven, is discarded (Discard) immediately, before it is loaded into the buffer
    • This closes off the path by which LLM hallucinations (malformed or syntactically invalid output) could reach the end-user experience
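The verification idea above, deriving clues from a solution grid and then checking that those clues admit exactly one solution, can be sketched as below. This is a simplified Python illustration under stated assumptions, not the project's Java NonogramSolver: all function names are hypothetical, cells are assumed to be 0/1, and the sketch only counts solutions by DFS backtracking. It does not model the "no guessing" check, and its exhaustive search is practical for small grids only.

```python
from itertools import groupby
from typing import List, Tuple

Grid = List[List[int]]
Clues = List[List[int]]


def line_clues(line: List[int]) -> List[int]:
    """Lengths of consecutive runs of filled cells (1s) in one row or column."""
    return [sum(1 for _ in run) for filled, run in groupby(line) if filled]


def derive_clues(grid: Grid) -> Tuple[Clues, Clues]:
    """Row and column clues computed back from a solution grid."""
    rows = [line_clues(r) for r in grid]
    cols = [line_clues(list(c)) for c in zip(*grid)]
    return rows, cols


def prefix_ok(line: List[int], clue: List[int], complete: bool) -> bool:
    """Check a (possibly partial) line against its clue."""
    runs = line_clues(line)
    if complete:
        return runs == clue
    # Partial line: finished runs must match; a trailing open run may still grow.
    open_run = bool(line) and line[-1] == 1
    done = runs[:-1] if open_run else runs
    if len(runs) > len(clue) or done != clue[:len(done)]:
        return False
    return not open_run or runs[-1] <= clue[len(runs) - 1]


def count_solutions(rows: Clues, cols: Clues, limit: int = 2) -> int:
    """DFS over cells in row-major order, stopping once `limit` solutions are found."""
    h, w = len(rows), len(cols)
    grid = [[0] * w for _ in range(h)]
    found = 0

    def dfs(pos: int) -> None:
        nonlocal found
        if found >= limit:
            return
        if pos == h * w:
            found += 1
            return
        r, c = divmod(pos, w)
        for value in (0, 1):
            grid[r][c] = value
            row_ok = prefix_ok(grid[r][:c + 1], rows[r], c == w - 1)
            col_ok = prefix_ok([grid[i][c] for i in range(r + 1)], cols[c], r == h - 1)
            if row_ok and col_ok:
                dfs(pos + 1)
        grid[r][c] = 0  # backtrack

    dfs(0)
    return found


def has_unique_solution(grid: Grid) -> bool:
    """True if the clues derived from `grid` admit exactly one solution."""
    rows, cols = derive_clues(grid)
    return count_solutions(rows, cols) == 1
```

For example, the 2x2 diagonal grid `[[1, 0], [0, 1]]` is rejected because the anti-diagonal grid yields the same clues, while `[[1, 1], [1, 0]]` passes.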

3.3. AI Governance & Human-in-the-Loop

A control mechanism in which humans stay in the loop (HITL) to collect ratings of AI-generated stages and oversee quality over time.

  • User feedback loop (HITL Feedback)

    • The thumbs-up/thumbs-down (👍/👎) ratings that players give on the rating card after clearing a puzzle are aggregated in real time in the backend DB stages table
    • The rating ratio is surfaced visibly and used as the basis for tuning LLM prompt parameters and puzzle-generation tendencies
  • Back-office cascading hard delete

    • When an administrator manually identifies a stage with poor ratings or a malformed pattern, it can be hard-deleted (Hard Delete) from the DB immediately with a single action on the back-office admin dashboard
  • AI-assisted engineering stack (AI-assisted Engineering Stack)
    This project follows a modern AI-agent collaboration workflow: development productivity was raised through close pair programming (Pair Programming) and automated refactoring with the AI coding agents available in the Antigravity IDE.

    • Gemini 3.5 Flash (Medium):
      • In the IDE: drives the main development loop and is used continuously for latency-sensitive work such as repetitive code (boilerplate) and Markdown normalization
      • With Chrome integration: through the browser-connected assistant, used for real-time infrastructure log analysis in the AWS CloudWatch/Grafana web consoles, Chrome DevTools-assisted checks of the frontend Canvas grid design and CSS layout, and web console debugging
    • Claude Sonnet 4.6 (Thinking): used for dense review work, including semantic analysis of written source code, responding to compile, unit-test, and static-diagnostic feedback, and cross-checking anchor-link integrity
    • Claude Opus 4.6 (Thinking): given token usage and cost trade-offs (Token Cost Trade-off), routine use during ordinary development is restricted; it is run selectively and sparingly as the top-level review gate, for final checks of overall infrastructure architecture consistency and integrity review of the whole portfolio
  • Agent governance rules (.agents/rules/)
    The files below define the rules for working with AI coding agents, covering ongoing risk management and high-reliability coding conventions. Development consistency is maintained under a control structure in which a human developer performs the final review and approval:

    | Rule file | Purpose and policy summary | Version-control status |
    | --- | --- | --- |
    | architecture-and-tech-stack.md | Block simultaneous edits across the frontend/backend/infrastructure layers, prevent Vue Reactivity logic leakage, follow sequential deployment | Git Tracked |
    | documentation-guidelines.md | Use relative paths (no file://), follow the Markdown line-break rules, require a table (Table) when presenting comparative numeric data | Git Tracked |
    | git-and-commit-guidelines.md | Follow Conventional Commits; local commits are preserved automatically and remote push is left to the developer | Git Tracked (Force Added) |
    | workflow-and-tdd.md | Require TDD (Test-Driven Development) first when writing core logic, and keep progress_state.md in sync | Git Tracked |
    | safety-and-communication.md | When requirements are ambiguous, stop instead of implementing arbitrarily (No Guessing) and wait for developer approval | Git Tracked |
    | incident-reporting.md | Incident reports follow the 3W1H approach, with specific cause-and-effect figures and a structured postmortem | Git Tracked |

3.4. Troubleshooting

3.4.1. AI Puzzle Generation Parsing Incident

  • Background
    • When generating large 30x30 grids, the ultra-lightweight LLM tried to shorten its response by returning JavaScript syntax such as Array(30).fill(0) instead of JSON. This caused a Jackson deserialization error (JsonParseException) in the backend and halted the batch scheduler (Daily Puzzle Failure Report).
  • Resolution
    • A "MUST be a literal 2D JSON array" constraint was added to the AI prompt as a guardrail, and, to keep output within a safe token budget for large puzzles, the number of candidates (Candidate) was reduced from 5 to 2, bringing parsing reliability to 100%.
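To make the failure mode concrete, the sketch below shows why a JavaScript expression is rejected by a JSON parser and what "a literal 2D JSON array" amounts to. It is a hypothetical Python illustration, not the backend's Jackson code: `parse_grid` is an invented name, and the assumption that every cell is the integer 0 or 1 is mine.

```python
import json
from typing import List, Optional


def parse_grid(raw: str, size: int) -> Optional[List[List[int]]]:
    """Return the grid if `raw` is a literal size x size JSON array of 0/1, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # e.g. JavaScript such as Array(30).fill(0) is not JSON
    if not isinstance(data, list) or len(data) != size:
        return None
    for row in data:
        if not isinstance(row, list) or len(row) != size:
            return None
        # type() rather than isinstance(): bool is a subclass of int in Python.
        if any(type(v) is not int or v not in (0, 1) for v in row):
            return None
    return data
```

Returning `None` instead of raising lets a caller discard one bad candidate and continue, rather than stopping the whole batch.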


4. Performance & Cost Analysis

This section compares the cost efficiency, system reliability (Reliability), and measured user metrics of a service built on ultra-lightweight infrastructure, to check whether the technical decisions hold up.

4.1. Operational Cost Comparison

  • Monthly infrastructure operating cost analysis (Monthly Billing Summary)
    By multiplexing resources and avoiding a managed database, among other measures, operating costs stay about 80% below the original estimate.

    | Category | Baseline configuration, estimated cost (Estimated) | Optimized configuration, actual cost (June 2026) | Key notes |
    | --- | --- | --- | --- |
    | Compute and storage | $20.00 / month (t3.micro) | $5.50 / month (t3a.nano + EBS) | GraalVM native containerization overcomes memory thrashing |
    | Load balancer | $20.00 / month (AWS ALB) | $0.00 / month (Self-hosted Nginx) | ALB removed; Route 53 maps directly to a static EIP |
    | Database | $15.00 / month (RDS PostgreSQL) | $0.00 / month (PostgreSQL Container) | Runs under Docker Compose on the EC2 host |
    | Network & domain | N/A *1 | $4.74 / month (IP address + Route 53) | Public IPv4 charge ($3.70) + hosted zone ($1.04) |
    | Other (data transfer, etc.) | N/A *1 | $1.21 / month | Data transfer and utility resource costs |
    | Total | Approx. $55.00 / month | $11.45 / month | About 80% lower than the baseline (actual billed amount, after tax) |

    *1: Fixed network and domain costs that were not estimated for the baseline configuration.
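The "about 80%" figure can be checked directly from the table rows. The arithmetic below uses only the amounts listed above; note that the baseline total excludes the network and domain costs marked N/A, so the comparison is approximate.

$$\text{Baseline} = 20.00 + 20.00 + 15.00 = 55.00$$

$$\text{Optimized} = 5.50 + 0.00 + 0.00 + 4.74 + 1.21 = 11.45$$

$$\text{Savings} = \frac{55.00 - 11.45}{55.00} \approx 0.792 = 79.2\%$$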

4.2. SLO Targets vs Actual Performance

  • Service level and reliability comparison (Reliability Performance Dashboard)
    This comparison uses Grafana Cloud measurements from the most recent 7 days (June 25 to July 2), after optimization tuning was complete and the service had stabilized. Despite the extreme 512MB RAM constraint, availability comes close to the 99.0% target (98.63% on average); MTBF and MTTR remain short of their targets, while RPO and RTO are met.

    | Service level indicator (SLI) | SLO target | Measured (7-day average) | Analysis and design rationale (Design Rationale) |
    | --- | --- | --- | --- |
    | Availability | 99.0% | 98.6% to 98.7% (average 98.63%) | After the early resource exhaustion (OOM) problem was resolved, stability is close to the production-grade level (99.0% threshold) |
    | MTBF (mean time between failures) | 720 hours (30 days) or more | 10.4 to 18.4 hours (average 14.2 hours) | Slimmer containers and enabled SWAP stabilize the deployment cycle and prevent abnormal stops |
    | MTTR (mean time to recovery) | Within 10 minutes | 9.0 to 14.9 minutes (average 11.76 minutes) | Fast container startup with GraalVM Native Image, plus alert integration, shortens recovery response |
    | RPO (recovery point objective) | At most 6 hours | At most 6 hours (0 actual data-loss incidents) | DB dump files are shipped off-site to Amazon S3 four times a day |
    | RTO (recovery time objective) | At most 20 minutes | Within 3 minutes (automated restore test result) | One-click rebuild through Terraform/Ansible code with automatic dump loading |
  • Technical retrospective on the measured metrics (Operational Metrics Retrospective)

    • Causes of reduced availability: early in the project, under the extreme resource constraint of a t3a.nano (512MB RAM), the main recorded causes of failure were OOM (Out of Memory) events from running Nginx, Spring, and PostgreSQL together, and disk exhaustion caused by Docker layers.
    • Recovery time (MTTR) delays: before the alert channels (Slack/Email SNS) and the recovery automation built on SSM Session Manager were fully in place, manual SSH access and daemon analysis consumed a lot of time.
    • Stabilization results: after the troubleshooting measures (1.5.1. Host Memory Exhaustion Incident) were completed, namely switching to Agentless Pull, deploying a GraalVM Native Image of 30MB or less, configuring swap, and improving the Docker GC script and SSM tunneling, the most recent 7 days of operation show an average availability of 98.63% and an average MTTR of 11.76 minutes, which confirms the improvement.
  • Grafana Live Service SLA Dashboard
    A snapshot capture of the Grafana Live Service SLA Dashboard, documenting the detailed availability metric and the MTTR and MTBF trends over time.

    Grafana SLA Dashboard
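To put the availability gap in concrete terms, the averages above can be converted into downtime minutes. This is a rough worked example that assumes a window of exactly 7 days and treats the 98.63% average as applying to the whole window.

$$\text{Window} = 7 \times 24 \times 60 = 10080 \text{ min}$$

$$\text{Allowed downtime at } 99.0\% = 0.010 \times 10080 = 100.8 \text{ min}$$

$$\text{Measured downtime at } 98.63\% = 0.0137 \times 10080 \approx 138.1 \text{ min}$$

Under these assumptions the service exceeded its weekly downtime allowance by about 37 minutes. Dividing the measured downtime by the average MTTR of 11.76 minutes suggests on the order of a dozen outages in the window, which is an estimate from averages and not a figure reported by the dashboard.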

4.3. User & System Traffic Metrics

  • Cumulative measured service metrics since launch (Google Analytics 4 / Actuator)

    • Active users (Active Users): 39 (Google Analytics 4, most recent 7 days)
    • Total events (Total Events): 535 (user interaction and gameplay logs)
    • Average engagement time per user (Average Engagement Time): 2 min 53 s (indicates improved engagement)
    • AI-generated puzzles (Daily Generated): 60+ (cumulative count of puzzles that passed the daily generator and the integrity solver verification)
  • Google Analytics 4 User Report
    An original capture of the Google Analytics 4 acquisition report documenting real-user statistics for rogic.io over the most recent 7 days (39 active users, 37 new users, average engagement time 2 min 53 s).

    Google Analytics 4 User Report

5.1. Local Development Setup

To run rogic.io on your local workstation, select one of the options below:

5.1.1. Docker Compose Stack Deployment

Choose this option to build and start the entire application stack (Database, Backend, Frontend) in one step.

# In the project root, compile, build and start all container services
docker compose up --build
  • Frontend Web Client: http://localhost:5173
  • Backend REST API: http://localhost:8080
  • Prerequisites: Docker and Docker Compose must be installed.

5.1.2. Local and Container Hybrid Run

If you want code changes reflected immediately with hot reloading (Vite dev server), start the services step by step as follows.

  • Step 1: Start the PostgreSQL database

    # Start only the database container in the background
    docker compose up -d db
  • Step 2: Run the backend API server

    cd backend
    ./gradlew bootRun
    • API server address: http://localhost:8080
    • Prerequisites: Java 17 JDK must be installed.
  • Step 3: Run the frontend client

    cd frontend
    npm install
    npm run dev
    • Frontend client address: http://localhost:5173
    • Prerequisites: Node.js 20+ must be installed.
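Once the services are started, a quick way to confirm that both are accepting connections is a plain TCP check against the two ports listed above. The helper below is a hypothetical convenience script, not part of the repository; it only verifies that something is listening, not that the application is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the steps above: Vite dev server and backend REST API.
    for name, port in (("frontend", 5173), ("backend", 8080)):
        print(name, "up" if is_listening("localhost", port) else "down")
```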

5.1.3. AWS SSM Session Manager Setup

How to reach the remote EC2 instance terminal, or set up an Ansible tunnel, when port 22 is closed in the security group.

  • Install the AWS CLI and Session Manager Plugin
    Keep the AWS CLI up to date on your local machine and install the official AWS session-manager-plugin, which is required for SSH tunneling.

  • Local SSH config (~/.ssh/config)
    Add the settings below to your local SSH config file so that an SSH tunnel can be established through the host's SSM agent as a proxy, even though the SSH port (22) is closed in the security group.

    # SSH over SSM Tunnel Configuration
    Host i-* mi-*
        ProxyCommand aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters portNumber=%p
    
  • EC2 Host Connection Command
    Using the instance ID and your existing SSH key, open a shell session securely without needing port 22 open in the firewall.

    ssh -i ~/.ssh/nemologic-key.pem ubuntu@i-xxxxxxxxxxxxxxxxx
  • Ansible SSM SSH Tunneling Configuration (hosts.ini)
    To run Ansible playbooks while port 22 is blocked, configure hosts.ini as shown below so that the SSH connection is wrapped in a proxy tunnel through the host's SSM agent.

[nemologic_servers]
nemologic-app-server ansible_host=<EC2_Instance_ID> ansible_user=ubuntu ansible_ssh_private_key_file=<PEM_File_Path> ansible_ssh_common_args='-o ProxyCommand="aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters portNumber=%p"'

5.2. PromQL Query Formulations (SLO Metrics)

Note

Symbol definitions: $P_t \in \{0, 1\}$ denotes whether the API health check succeeded (probe_success) at measurement time $t$. If the series begins in state 0 (down), the first change (0 → 1) is a recovery, so the number of changes is odd and dividing it by 2 leaves a fractional part; the query therefore applies integer division (floor).

  • API Health Status

$$\text{API Health} = \sum P_t$$

sum(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"})
  • Dynamic Service Availability

$$\text{Availability (\%)} = \text{avg}_{t \in \text{range}}(P_t) \times 100$$

avg_over_time(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) * 100
  • Dynamic Incident Count

$$\text{Incident Count} = \left\lfloor \frac{\text{changes}(P_t)}{2} \right\rfloor$$

floor(changes(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) / 2)
  • Dynamic MTTR (Mean Time To Recovery)

$$\text{MTTR (sec)} = \frac{\left(\text{count}_{t \in \text{range}}(P_t) - \sum_{t \in \text{range}} P_t\right) \times 60}{\text{clamp}_{\text{min}}\left(\frac{\text{changes}(P_t)}{2}, 1\right)}$$

((count_over_time(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) - sum_over_time(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range])) * 60) / clamp_min(changes(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) / 2, 1)
  • Dynamic MTBF (Mean Time Between Failures)

$$\text{MTBF (sec)} = \frac{\sum_{t \in \text{range}} P_t \times 60}{\text{clamp}_{\text{min}}\left(\frac{\text{changes}(P_t)}{2}, 1\right)}$$

(sum_over_time(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) * 60) / clamp_min(changes(probe_success{job="nemologic-api-health", instance="https://rogic.io/actuator/health"}[$__range]) / 2, 1)
  • $\text{clamp}_{\text{min}}(x, d) = \max(x, d)$. This PromQL function guards against a zero denominator (Zero-division) when no failure/recovery transitions occur within the measured range.
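The formulas above can be traced by hand on a short series of probe results. The sketch below mirrors the queries in plain Python; it is an illustration only, with hypothetical function names, and it assumes one sample every 60 seconds, which is what the factor of 60 in the MTTR and MTBF queries implies.

```python
import math
from typing import Dict, List

SCRAPE_INTERVAL_S = 60  # assumed probe interval, matching the "* 60" in the queries


def changes(samples: List[int]) -> int:
    """Number of times a sample differs from the previous one, like PromQL changes()."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def slo_metrics(samples: List[int]) -> Dict[str, float]:
    """Availability, incident count, MTTR and MTBF for one range of 0/1 probe samples."""
    up = sum(samples)                     # sum_over_time
    total = len(samples)                  # count_over_time
    half_changes = changes(samples) / 2
    denom = max(half_changes, 1)          # clamp_min(changes / 2, 1)
    return {
        "availability_pct": up * 100 / total,
        "incidents": math.floor(half_changes),
        "mttr_s": (total - up) * SCRAPE_INTERVAL_S / denom,
        "mtbf_s": up * SCRAPE_INTERVAL_S / denom,
    }
```

For the ten samples `[1, 1, 1, 0, 0, 1, 1, 1, 1, 1]` there are two changes, so this gives 80% availability, one incident, an MTTR of 120 seconds and an MTBF of 480 seconds.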
