Commit cd044cc

Update
1 parent f78cd5e commit cd044cc

1 file changed: 35 additions & 8 deletions

deploy/configuration-as-code/services/autoscaling.mdx

@@ -41,9 +41,36 @@ For high availability, set `minInstances` to at least 3. See [High Availability
 
 When either CPU or memory usage exceeds your configured threshold, Porter automatically adds replicas. When usage drops, replicas are removed (down to your minimum).
 
-For example, with an 80% CPU threshold:
-- If average CPU across pods exceeds 80%, new replicas are added
-- If average CPU drops below 80%, excess replicas are removed
+### Example: Autoscaling in Action
+
+Consider an API service with this configuration:
+
+```yaml
+autoscaling:
+  enabled: true
+  minInstances: 2
+  maxInstances: 10
+  cpuThresholdPercent: 60
+  memoryThresholdPercent: 80
+```
+
+Here's how the autoscaler responds to changing load:
+
+| Time | Avg CPU | Avg Memory | Replicas | What Happens |
+|------|---------|------------|----------|--------------|
+| t=0 | 30% | 40% | 2 | Baseline: both metrics below thresholds |
+| t=1 | 75% | 50% | 4 | CPU (75%) exceeds 60% threshold → scale up |
+| t=2 | 90% | 60% | 6 | CPU still high → continue scaling up |
+| t=3 | 55% | 85% | 8 | CPU stabilized, but memory (85%) exceeds 80% → scale up |
+| t=4 | 45% | 70% | 8 | Both metrics below thresholds → no change (cooldown period) |
+| t=5 | 40% | 50% | 5 | Sustained low usage → scale down |
+| t=6 | 35% | 45% | 2 | Continue scaling down to minimum |
+
+Key behaviors:
+- **Either metric triggers scaling**: If CPU *or* memory exceeds its threshold, replicas are added
+- **Both must be low to scale down**: Replicas are only removed when both CPU and memory are below their thresholds
+- **Respects bounds**: Replicas never drop below `minInstances` (2) or exceed `maxInstances` (10)
+- **Gradual changes**: The autoscaler adjusts incrementally, not all at once, to avoid oscillation
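
The key behaviors above can be sketched as a simplified decision function. This is an illustrative model only, not Porter's actual implementation: the fixed `step` size and the absence of a cooldown timer are simplifying assumptions.

```python
def scale_decision(replicas, avg_cpu, avg_mem,
                   cpu_threshold, mem_threshold,
                   min_instances, max_instances, step=2):
    """Illustrative model of threshold autoscaling:
    scale up if EITHER metric exceeds its threshold,
    scale down only when BOTH are below, always within bounds.
    (Real autoscalers also apply a cooldown before scaling down.)"""
    if avg_cpu > cpu_threshold or avg_mem > mem_threshold:
        replicas += step          # either metric high → add replicas
    elif avg_cpu < cpu_threshold and avg_mem < mem_threshold:
        replicas -= step          # both low → remove replicas gradually
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, replicas))

# With the example config (60% CPU / 80% memory, bounds 2..10):
print(scale_decision(2, 75, 50, 60, 80, 2, 10))  # CPU high → 4
print(scale_decision(8, 45, 70, 60, 80, 2, 10))  # both low → 6
```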
 
 ## Custom Metrics Autoscaling (Prometheus)
 
@@ -78,11 +105,15 @@ Custom metrics autoscaling requires Prometheus to be accessible in your cluster.
 
 Scale Temporal workflow workers based on task queue depth. Porter monitors your Temporal task queues and automatically adjusts worker count.
 
+<Info>
+Temporal autoscaling requires a Temporal integration to be configured. See [Temporal Autoscaling](/configure/temporal-autoscaling) for setup details.
+</Info>
+
 | Field | Type | Description |
 |-------|------|-------------|
 | `temporalAutoscaling.temporalIntegrationId` | string | UUID of the Temporal integration |
 | `temporalAutoscaling.taskQueue` | string | Name of the Temporal task queue to monitor |
-| `temporalAutoscaling.targetQueueSize` | integer | Target number of tasks in queue per replica |
+| `temporalAutoscaling.targetQueueSize` | integer | How many queued tasks each replica should handle (e.g., set to 10 with 100 tasks queued → 10 replicas) |
 
 ```yaml
 services:
@@ -98,10 +129,6 @@ services:
     targetQueueSize: 10
 ```
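
The `targetQueueSize` semantics ("100 tasks queued with a target of 10 → 10 replicas") amount to a ceiling division clamped to the instance bounds. A minimal sketch, assuming that arithmetic; the helper name and the min/max parameters are illustrative, not Porter's API:

```python
import math

def desired_workers(queue_depth: int, target_queue_size: int,
                    min_instances: int, max_instances: int) -> int:
    """Sketch of queue-depth scaling: aim for roughly
    target_queue_size pending tasks per worker replica."""
    if target_queue_size <= 0:
        raise ValueError("targetQueueSize must be positive")
    raw = math.ceil(queue_depth / target_queue_size)
    # Clamp to the configured instance bounds.
    return max(min_instances, min(max_instances, raw))

# The doc's example: 100 queued tasks, targetQueueSize=10
print(desired_workers(100, 10, 1, 20))  # → 10
```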
 
-<Info>
-Temporal autoscaling requires a Temporal integration to be configured. See [Temporal Autoscaling](/configure/temporal-autoscaling) for setup details.
-</Info>
-
 ## Related Documentation
 
 - [Autoscaling Overview](/configure/autoscaling) - UI-based configuration and concepts
