pip3 install -r requirements.txt
python3 -m playwright install chromium

Copy .env.example to create one environment file per environment:
cp .env.example .env.local # local development
cp .env.example .env.production # production
When starting the server, select the environment with APP_ENV (default: local):
APP_ENV=production uvicorn main:app --host 0.0.0.0 --port 8000

main.py: FastAPI server + APScheduler (weekly scheduler)
scraper.py: scraping core + CLI entry point
models.py: Product data class
parsers.py: parsing functions (name/quantity, price, rating)
storage.py: storage functions (JSON / CSV / SQL)
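
The APP_ENV selection described above can be pictured with a small loader. This is a minimal sketch, not the project's actual code: the helper names `env_file_for` and `parse_env` are hypothetical, and the keys in the usage lines are made up, since the contents of .env.example are not shown here.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def env_file_for(app_env: Optional[str] = None) -> Path:
    """Map APP_ENV to its env file; falls back to 'local' when unset (hypothetical helper)."""
    name = app_env or os.environ.get("APP_ENV", "local")
    return Path(f".env.{name}")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Usage with made-up keys (the real keys live in .env.example):
print(env_file_for("production"))             # .env.production
print(parse_env("# comment\nEXAMPLE_KEY=1"))  # {'EXAMPLE_KEY': '1'}
```

In practice a library such as python-dotenv could replace `parse_env`; the stdlib version is used here only to keep the sketch self-contained.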
uvicorn main:app --host 0.0.0.0 --port 8000

GET /: scheduler status and the result of the last run
POST /run: trigger a scrape manually, right away
Schedule: runs automatically once a week at midnight (KST)
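
To make the weekly schedule concrete, the sketch below computes the next "midnight KST on a given weekday" from an arbitrary timestamp. It is illustrative only: in the project the scheduling is handled by APScheduler, and both the function name `next_weekly_midnight` and the `weekday` parameter are assumptions introduced for this example.

```python
from datetime import datetime, timedelta, timezone

# Korea Standard Time has no daylight saving, so a fixed UTC+9 offset is enough.
KST = timezone(timedelta(hours=9), name="KST")


def next_weekly_midnight(now: datetime, weekday: int) -> datetime:
    """Return the next midnight (KST) on `weekday` (0=Monday ... 6=Sunday), strictly after `now`."""
    local = now.astimezone(KST)
    midnight_today = local.replace(hour=0, minute=0, second=0, microsecond=0)
    days_ahead = (weekday - midnight_today.weekday()) % 7
    candidate = midnight_today + timedelta(days=days_ahead)
    if candidate <= local:
        # Today's slot has already passed (or is exactly now): move one week ahead.
        candidate += timedelta(days=7)
    return candidate


# Usage: the weekday value here is an arbitrary choice for the example.
print(next_weekly_midnight(datetime.now(timezone.utc), weekday=0))
```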
# Default: save to products/<YYYY-MM-DD>/<category>.json + SQL
python3 scraper.py "https://www.costco.co.kr/Foods/RiceGrains/c/cos_10.1"
# Specify the output path directly
python3 scraper.py "https://www.costco.co.kr/Foods/RiceGrains/c/cos_10.1" --output products.csv

Data flow:
scrape_url(): launches the browser and iterates over all pages
_page_url(): builds the page URL (page 1 → base URL, page N → ?page=N-1)
scrape_page(): scrolls to trigger lazy loading, then parses li.product-list-item
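
The pagination rule above (page 1 is the base URL, page N adds ?page=N-1) can be sketched as follows. The function `page_url` is a stand-in written for illustration; the real `_page_url()` in scraper.py may differ, for example in how it treats an existing query string.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def page_url(base_url: str, page: int) -> str:
    """Build the URL for a 1-based page number: page 1 is the base URL, page N uses ?page=N-1."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        return base_url
    parts = urlparse(base_url)
    # Keep any other query parameters, but replace an existing "page" value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "page"]
    query.append(("page", str(page - 1)))
    return urlunparse(parts._replace(query=urlencode(query)))


base = "https://www.costco.co.kr/Foods/RiceGrains/c/cos_10.1"
print(page_url(base, 1))  # https://www.costco.co.kr/Foods/RiceGrains/c/cos_10.1
print(page_url(base, 3))  # https://www.costco.co.kr/Foods/RiceGrains/c/cos_10.1?page=2
```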
CSS selectors (costco.co.kr Angular app):

| Field | Selector | Fallback |
|---|---|---|
| Product container | li.product-list-item | li[class*='product'] |
| Product name | a.lister-name .notranslate | a.lister-name |
| Price | .original-price .product-price-amount | none |
| Rating | .star-ratings-css[aria-label] | none |
| Image | picture source[type='image/webp'] | picture img |
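
One way to apply the primary and fallback selectors from the table is a small lookup helper. This is a sketch under assumptions: `SELECTORS`, `select_with_fallback`, and the `query` callable are hypothetical names, and `query` stands for whatever performs the lookup (for instance a thin wrapper around Playwright's `query_selector`), returning `None` when nothing matches.

```python
from typing import Callable, Dict, Optional, Tuple, TypeVar

T = TypeVar("T")

# (primary selector, fallback selector or None), copied from the table above.
SELECTORS: Dict[str, Tuple[str, Optional[str]]] = {
    "container": ("li.product-list-item", "li[class*='product']"),
    "name": ("a.lister-name .notranslate", "a.lister-name"),
    "price": (".original-price .product-price-amount", None),
    "rating": (".star-ratings-css[aria-label]", None),
    "image": ("picture source[type='image/webp']", "picture img"),
}


def select_with_fallback(query: Callable[[str], Optional[T]], field: str) -> Optional[T]:
    """Try the primary selector first; use the fallback only if the primary finds nothing."""
    primary, fallback = SELECTORS[field]
    result = query(primary)
    if result is None and fallback is not None:
        result = query(fallback)
    return result


# Usage with a fake lookup that only knows the fallback name selector.
fake_dom = {"a.lister-name": "Sample Rice 10kg"}
print(select_with_fallback(fake_dom.get, "name"))   # Sample Rice 10kg
print(select_with_fallback(fake_dom.get, "price"))  # None
```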
Product schema:
name, quantity, price, rating, review_count, image_url, product_url
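
As a rough picture of this schema, a data class with these seven fields might look like the sketch below. Only the field names come from the list above; the types and the `None` defaults are assumptions, and the actual class in models.py may differ.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Product:
    # Field names follow the schema above; types and defaults are assumed.
    name: str
    quantity: Optional[str] = None
    price: Optional[int] = None
    rating: Optional[float] = None
    review_count: Optional[int] = None
    image_url: Optional[str] = None
    product_url: Optional[str] = None


# Usage with made-up values; asdict() yields a plain dict suitable for JSON or CSV output.
sample = Product(name="Sample Rice", quantity="10kg", price=30000)
print(asdict(sample))
```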
SQL schema (products table):
name, quantity, price, rating, review_count, image_url, product_url, created_at
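
The table above could be created and filled as in the following sketch. It assumes SQLite via the standard `sqlite3` module purely for illustration; the database engine, the column types, and the `save_product` helper are not specified in this document and are assumptions.

```python
import sqlite3

# Column names come from the schema above; the types and the created_at default are assumed.
CREATE_PRODUCTS = """
CREATE TABLE IF NOT EXISTS products (
    name TEXT NOT NULL,
    quantity TEXT,
    price INTEGER,
    rating REAL,
    review_count INTEGER,
    image_url TEXT,
    product_url TEXT,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def save_product(conn: sqlite3.Connection, row: dict) -> None:
    """Insert one product; created_at is filled in by the column default."""
    conn.execute(
        "INSERT INTO products "
        "(name, quantity, price, rating, review_count, image_url, product_url) "
        "VALUES (:name, :quantity, :price, :rating, :review_count, :image_url, :product_url)",
        row,
    )


# Usage with an in-memory database and made-up values.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_PRODUCTS)
save_product(conn, {
    "name": "Sample Rice", "quantity": "10kg", "price": 30000, "rating": 4.5,
    "review_count": 12, "image_url": None, "product_url": None,
})
conn.commit()
```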