Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 90 additions & 13 deletions .claude/rules/fault-tolerance.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,21 +13,41 @@ This rule covers retry, rate limiting, and timeout configuration for API command
All commands support configurable retry, rate limiting, and timeout via CLI flags:

```bash
# Light commands (colors, typography, download subcommands)
# Light commands (colors, typography, download colors/typography/icons/images)
exfig colors --max-retries 6 --rate-limit 15 --timeout 60

# Heavy commands (icons, images) also support fail-fast and concurrent downloads
# Heavy commands (icons, images, download all) also support fail-fast and
# concurrent downloads. `download all` shares one rate-limited client across
# its colors/typography/icons/images sub-flows.
exfig icons --max-retries 4 --rate-limit 15 --timeout 90 --fail-fast
exfig icons --concurrent-downloads 50 # Increase CDN parallelism (default: 20)
exfig download all --rate-limit 25 --concurrent-downloads 50

# Batch command with timeout (overrides all per-config timeouts)
# Batch command — `--timeout` is the resolved batch-level timeout.
# In batch mode `figma.*` rate-limiting fields (incl. `timeout`) are read ONLY from
# the first config; per-target `figma.timeout` in subsequent configs is ignored
# (warned under -v). Precedence: CLI > first config's figma.timeout > built-in default.
exfig batch ./configs/ --timeout 60 --rate-limit 20

# fetch command has its own --timeout in DownloadOptions
exfig fetch -f FILE_ID -r "Frame" -o ./out --timeout 45 --fail-fast
```

**Timeout precedence:** CLI `--timeout` > PKL `figma.timeout` > FigmaClient default (30s)
**Precedence (per knob):** CLI flag > PKL `figma.*` > built-in default. Same rule applies to:
`--timeout` / `figma.timeout`, `--rate-limit` / `figma.rateLimit`, `--max-retries` / `figma.maxRetries`,
`--concurrent-downloads` / `figma.concurrentDownloads`. Boolean flags (`--fail-fast`, `--resume`) and
batch settings (`--parallel`, `batch.parallel`/`failFast`/`resume`) follow OR semantics for booleans
and standard precedence for `parallel`.

`fetch` is config-free — it does not read `figma.*` PKL fields; only CLI flags and built-in defaults apply.

`colors` and `typography` make no CDN downloads, so `figma.concurrentDownloads` is silently ignored
(under `-v` a debug log records the skip).

`exfig batch` reads `batch:` and `figma.*` rate-limiting fields ONLY from the FIRST config in argv —
per-target `batch:` blocks in subsequent configs are ignored (logged under `-v`). The shared rate
limiter and download queue mean per-config `figma.rateLimit/maxRetries/concurrentDownloads` are
intentionally unused inside the batch run.

## Implementing Fault Tolerance in New Commands

Expand Down Expand Up @@ -64,18 +84,75 @@ let data = try await client.request(endpoint)

## Defaults

| Setting | Default | Description |
| ----------------- | ------- | ------------------------------------- |
| `maxRetries` | 4 | Number of retry attempts |
| `rateLimit` | 10 | Requests per minute |
| `timeout` | 30s | Request timeout |
| `failFast` | false | Stop on first error |
| `resume` | false | Resume from checkpoint |
| `checkpointExpiry`| 24h | Checkpoint file expiration |
| `concurrentDownloads` | 20 | Parallel CDN downloads |
| Setting | Default | PKL key | Description |
| ----------------- | ------- | -------------------------------- | ------------------------------------- |
| `maxRetries` | 4 | `figma.maxRetries` | Number of retry attempts |
| `rateLimit` | 10 | `figma.rateLimit` | Requests per minute |
| `timeout` | 30s | `figma.timeout` | Request timeout |
| `failFast` | false | `batch.failFast` (batch only) | Stop on first error |
| `resume` | false | `batch.resume` (batch only) | Resume from checkpoint |
| `checkpointExpiry`| 24h | (not configurable) | Checkpoint file expiration |
| `concurrentDownloads` | 20 | `figma.concurrentDownloads` | Parallel CDN downloads |
| `parallel` | 3 | `batch.parallel` (batch only) | Concurrent batch configs |

## PKL Config Alternative

Instead of repeating CLI flags across CI workflows, set the values in `exfig.pkl`:

```pkl
figma = new Figma.FigmaConfig {
lightFileId = "..."
rateLimit = 25 // was --rate-limit 25
maxRetries = 6 // was --max-retries 6
concurrentDownloads = 50 // was --concurrent-downloads 50
timeout = 60 // was --timeout 60
}

batch = new Batch.BatchConfig {
parallel = 8 // was exfig batch --parallel 8
failFast = true // was exfig batch --fail-fast
}
```

CLI flags still override these values per-run.

## Retry Behavior

- Exponential backoff: 2s -> 4s -> 8s -> 16s with jitter
- Respects `Retry-After` header from Figma API on 429 errors
- Checkpoint system saves progress for resumption on failure

## Batch Settings Architecture

`BatchSettingsResolver.resolve(...)` is the single point where CLI flags, the FIRST config's
`batch:`/`figma.*` blocks, and built-in defaults merge into `ResolvedBatchSettings`.
`BatchConfigRunner` consumes the resolved values and MUST NOT read per-config `figma.timeout`
(or any other rate-limiting field) again — doing so silently overrides the documented
"first-config-wins" rule and contradicts the `ignoredPerTargetFigmaRateLimiting` warning.

**Single source of truth pattern:**
- `FaultToleranceDefaults` — Swift defaults; parity with PKL schema asserted at runtime via
`BatchSettingsResolverExtendedTests.testPKLDefaultsMatchSwiftDefaults`.
- `FaultToleranceValidator.sanitized*` — clamps PKL values to ranges, falls back to default
with a `.invalidConfigValue` warning. Used both by the resolver and by per-command
`effective*` accessors.
- `FaultToleranceValidator.warnOnce(key:value:fallback:ui:)` — process-level dedup so the
same out-of-range PKL value doesn't produce duplicate warnings across many call-sites.
Has a `resetWarnedKeys()` test hook called from XCTest `setUp()` (`Lock<T>` pattern,
same shape as `WarningCollector` / `ManifestTracker`).

**PKL module reuse:**
- `PKLModuleCache` (actor) caches `ExFig.ModuleImpl` by URL across `BatchSettingsResolver`,
`logIgnoredPerTargetSettings`, and `BatchConfigRunner` to avoid re-evaluating the same
config 2-3 times in one batch run. URL keys go through `standardizedFileURL`.
- Consumers must call `try options.validateUsing(preloadedModule:)` (mirror of
`validate()` — same env+path checks minus PKL eval) to keep validation surfaces in sync.

**Slot product cap:** `concurrentDownloads * parallel` is capped at
`FaultToleranceDefaults.maxDownloadSlots = 1000` in `Batch.swift` to prevent EMFILE / CDN
throttling at extreme combinations (e.g. 200 × 50 = 10000). Out-of-range emits
`.excessiveDownloadSlots` warning.

**Batch RetryPolicy:** when constructing `RetryPolicy` in batch mode, honor `failFast`:
`RetryPolicy(maxRetries: resolved.failFast ? 0 : resolved.maxRetries)` — matches
`HeavyFaultToleranceOptions.createRetryPolicy()` outside batch.
39 changes: 39 additions & 0 deletions .claude/rules/gotchas.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,9 @@ Use `var` + `.append()` pattern or computed property returning the array.
- Use `Data("string".utf8)` not `"string".data(using: .utf8)!`
- Add `// swiftlint:disable:next force_try` before `try!` in tests
- Add `// swiftlint:disable file_length` for files > 400 lines
- For test files exceeding 400 lines: prefer splitting a second `final class` into its own
file (e.g. `PKLModuleCacheTests.swift` was extracted from
`BatchSettingsResolverExtendedTests.swift`) over `// swiftlint:disable file_length`

### swiftlint:disable with Doc Comments

Expand Down Expand Up @@ -218,6 +221,42 @@ Without the extension in `PKLEvaluator.swift`, `.localizedDescription` returns u
`"The operation couldn't be completed. (PklSwift.PklError error 1.)"` instead of the actual PKL error.
Fix: `extension PklError: @retroactive LocalizedError` in `Sources/ExFigConfig/PKL/PKLEvaluator.swift`.

## PKL Test Fixtures

### Number-typed fields require `.0` suffix

pkl-swift strictly distinguishes `Int` vs `Double` at decode time. PKL fields generated as
`Double?` (e.g. `figma.timeout: Number(isBetween(1, 600))? = 30.0`) must be written with
explicit decimal in test fixtures — otherwise decode throws
`DecodingError.typeMismatch: expected value of type Double`:

```pkl
# BAD — pkl-swift fails to decode
figma { timeout = 60 }

# GOOD
figma { timeout = 60.0 }
```

This is independent of how PKL itself parses the value; the breakage is in pkl-swift's
strict type matching, not in PKL semantics.

### PKL constraints validate at amends-time, not at sanitize-time

`Int(isBetween(1, 50))` constraints in `Schemas/*.pkl` fail during PKL evaluation, BEFORE
`FaultToleranceValidator.sanitized*` ever runs. Fixtures with `parallel = 99` will fail
to load entirely, not produce a clamped value:

```swift
// BAD — fixture never loads, test asserts wrong condition
let url = try BatchResolverFixture.make(batch: "parallel = 99")

// GOOD — to test sanitizer with out-of-range values, call it directly
XCTAssertEqual(FaultToleranceValidator.sanitizedParallel(99, ui: ui), 3)
```

If a test needs both an in-range PKL value AND an out-of-range case, split into two tests.

## Figma API Rate Limits

**Official docs:** https://developers.figma.com/docs/rest-api/rate-limits/
Expand Down
24 changes: 19 additions & 5 deletions Sources/ExFigCLI/Batch/BatchConfigRunner.swift
Original file line number Diff line number Diff line change
Expand Up @@ -222,10 +222,16 @@ struct BatchConfigRunner {
let cachePath: String?
let experimentalGranularCache: Bool
let concurrentDownloads: Int
/// CLI timeout override (in seconds). When set, overrides per-config timeout.
/// Resolved batch-level timeout (in seconds). Already merged via `BatchSettingsResolver`:
/// CLI flag > FIRST config's `figma.timeout` > built-in default. Per-config `figma.timeout`
/// values in subsequent configs are intentionally ignored — `BatchSettingsResolver` warns
/// the user under `--verbose` via `.ignoredPerTargetFigmaRateLimiting`.
let cliTimeout: Int?
/// Priority for this config's downloads (lower = higher priority, based on submission order).
let configPriority: Int
/// Optional pre-evaluated PKL module cache (avoids re-eval of configs already loaded by
/// `BatchSettingsResolver`).
let moduleCache: PKLModuleCache?
/// Test-only: injected exporter for unit testing.
private let _testExporter: (any ConfigExportPerforming)?

Expand All @@ -240,9 +246,10 @@ struct BatchConfigRunner {
force: Bool = false,
cachePath: String? = nil,
experimentalGranularCache: Bool = false,
concurrentDownloads: Int = FileDownloader.defaultMaxConcurrentDownloads,
concurrentDownloads: Int = FaultToleranceDefaults.concurrentDownloads,
cliTimeout: Int? = nil,
configPriority: Int = 0,
moduleCache: PKLModuleCache? = nil,
exporter: (any ConfigExportPerforming)? = nil
) {
self.rateLimiter = rateLimiter
Expand All @@ -258,6 +265,7 @@ struct BatchConfigRunner {
self.concurrentDownloads = concurrentDownloads
self.cliTimeout = cliTimeout
self.configPriority = configPriority
self.moduleCache = moduleCache
_testExporter = exporter
}

Expand Down Expand Up @@ -299,13 +307,19 @@ struct BatchConfigRunner {
do {
var options = ExFigOptions()
options.input = configFile.url.path
try options.validate()
if let cached = await moduleCache?.get(for: configFile.url) {
try options.validateUsing(preloadedModule: cached)
} else {
try options.validate()
}

let retryHandler = RetryLogger.createHandler(ui: ui, maxAttempts: maxRetries)

// CLI timeout takes precedence over per-config timeout
// Use ONLY the batch-level resolved timeout. Per-config `figma.timeout` is
// intentionally ignored — `BatchSettingsResolver` already merged CLI flag +
// FIRST config's value, and warns about ignored per-target values under -v.
// Honoring per-config timeout here would silently override that resolution.
let effectiveTimeout: TimeInterval? = cliTimeout.map { TimeInterval($0) }
?? options.params.figma?.timeout

let baseClient = try FigmaClient(
accessToken: options.requireFigmaToken(),
Expand Down
Loading
Loading