Commit 9d6bbb8
Project Team
Create fresh ollama.Client per request to avoid stale connection hangs
A persistent httpx client reuses connections. After a streaming response
completes, the underlying HTTP/1.1 connection can be left in a state
where the server has closed it but the client hasn't detected that yet.
The next request then hangs silently until the read timeout fires,
holding the flock the entire time and starving every subsequent request.
Create a new ollama.Client (and therefore a fresh httpx connection) for
each inference call. The per-request overhead is negligible compared to
the 10s inference time.
1 parent 1ebec5c commit 9d6bbb8
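The per-request client described in the commit message could look roughly like the sketch below. The changed code itself is not shown here, so this is an illustration and not the actual patch: `run_inference`, `_default_client_factory`, the `client_factory` parameter, and the default host and timeout values are hypothetical names and assumptions. Only `ollama.Client(host=..., timeout=...)` and `Client.generate(model=..., prompt=...)` come from the `ollama` Python package.

```python
from typing import Any, Callable


def _default_client_factory(host: str, timeout: float) -> Any:
    # Imported lazily so this module loads even where ollama is not installed.
    import ollama

    # Extra keyword arguments such as timeout are forwarded to the
    # underlying httpx client.
    return ollama.Client(host=host, timeout=timeout)


def run_inference(
    prompt: str,
    model: str,
    host: str = "http://localhost:11434",  # assumed default Ollama address
    timeout: float = 60.0,  # assumed read timeout in seconds
    client_factory: Callable[[str, float], Any] = _default_client_factory,
) -> str:
    # Build a new client for every call so no pooled connection from a
    # previous (possibly streaming) request can be reused.
    client = client_factory(host, timeout)
    response = client.generate(model=model, prompt=prompt)
    return response["response"]
```

The factory parameter exists only so the sketch can be exercised without a running Ollama server. In real use, `run_inference("Why is the sky blue?", model="llama3")` would build and discard one client per call.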
1 file changed
Lines changed: 10 additions & 5 deletions
| Hunk | Removed (original file lines) | Added (new file lines) |
|---|---|---|
| 1 | 96-99 | 96-100 |
| 2 | 274 | 275-279 |