Commit 65dfd34
Project Team
Raise default Ollama timeout from 60s to 120s
llama3.2-vision on a T4 can take 60-90s for inference on some images,
particularly during the image-encoding phase before the first token.
With a 60s timeout, the first request sometimes consumed the entire
budget, leaving queued requests with no remaining time to wait. 120s
gives enough headroom for worst-case inference while still bounding
truly hung requests.

1 parent 015127b
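The change described is a single timeout constant. As a hedged illustration only (the actual diff is not shown here, and the constant name, function, and endpoint below are hypothetical), the kind of change looks like this for a client calling Ollama's HTTP API:

```python
import json
import urllib.request

# Hypothetical sketch of the change described in the commit message:
# the per-request Ollama timeout raised from 60s to 120s, since
# llama3.2-vision on a T4 can take 60-90s before the first token.
OLLAMA_TIMEOUT_SECONDS = 120  # was 60

def generate(prompt, model="llama3.2-vision",
             base_url="http://localhost:11434"):
    """Call Ollama's /api/generate endpoint with a bounded timeout."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(
            {"model": model, "prompt": prompt, "stream": False}
        ).encode(),
        headers={"Content-Type": "application/json"},
    )
    # timeout= still bounds truly hung requests, just with more headroom
    with urllib.request.urlopen(req, timeout=OLLAMA_TIMEOUT_SECONDS) as resp:
        return json.loads(resp.read())["response"]
```

A 120s budget covers the observed 60-90s worst case while still failing fast on a wedged server.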
1 file changed
Lines changed: 1 addition & 1 deletion