fix: add configurable HTTP timeout for LLM API calls#238
Open
zy84338719 wants to merge 2 commits into
Open
Conversation
The LLM HTTP client had no user-configurable request timeout. When the
LLM API becomes unresponsive (e.g., rate limiting, network issues), the
ocr process hangs indefinitely with 0% CPU, holding an ESTABLISHED TCP
connection that never completes.
Changes:
- Add Timeout field to ResolvedEndpoint struct
- Add OCR_LLM_TIMEOUT environment variable (value in seconds)
- Add timeout_sec field to config.json llm and provider sections
- Pass timeout from ResolvedEndpoint to ClientConfig in NewLLMClient
The default timeout remains 5 minutes when not explicitly configured.
Users can now set a shorter timeout via:
- Environment: OCR_LLM_TIMEOUT=120
- Config file: {"llm": {"timeout_sec": 120}}
Fixes alibaba#237
Contributor
|
✅ OpenCodeReview: No comments generated. Looks good to me. |
There was a problem hiding this comment.
Pull request overview
Adds user-configurable HTTP request timeouts for LLM API calls by extending endpoint resolution/config parsing and wiring the resolved timeout into the LLM client factory, aiming to prevent ocr scan from hanging indefinitely on stalled LLM requests.
Changes:
- Introduce
ResolvedEndpoint.Timeout(time.Duration) and parse timeout from env/config (OCR_LLM_TIMEOUT,timeout_sec). - Support
timeout_secin both legacyllmconfig and provider entries. - Propagate the resolved timeout into
ClientConfigviaNewLLMClient.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| internal/llm/resolver.go | Adds timeout fields and parsing from environment/config into ResolvedEndpoint. |
| internal/llm/client.go | Wires ResolvedEndpoint.Timeout into ClientConfig.Timeout so downstream clients can apply it. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+129
to
+134
| var timeout time.Duration | ||
| if timeoutStr := os.Getenv(envOCRLLMTimeout); timeoutStr != "" { | ||
| timeoutSec, err := strconv.Atoi(strings.TrimSpace(timeoutStr)) | ||
| if err != nil { | ||
| return ResolvedEndpoint{}, false, fmt.Errorf("OCR_LLM_TIMEOUT must be an integer (seconds): %w", err) | ||
| } |
Comment on lines
+129
to
+138
| var timeout time.Duration | ||
| if timeoutStr := os.Getenv(envOCRLLMTimeout); timeoutStr != "" { | ||
| timeoutSec, err := strconv.Atoi(strings.TrimSpace(timeoutStr)) | ||
| if err != nil { | ||
| return ResolvedEndpoint{}, false, fmt.Errorf("OCR_LLM_TIMEOUT must be an integer (seconds): %w", err) | ||
| } | ||
| if timeoutSec > 0 { | ||
| timeout = time.Duration(timeoutSec) * time.Second | ||
| } | ||
| } |
Comment on lines
+308
to
+312
| var timeout time.Duration | ||
| if entry.TimeoutSec > 0 { | ||
| timeout = time.Duration(entry.TimeoutSec) * time.Second | ||
| } | ||
|
|
Comment on lines
+362
to
+365
| var timeout time.Duration | ||
| if cfg.Llm.TimeoutSec > 0 { | ||
| timeout = time.Duration(cfg.Llm.TimeoutSec) * time.Second | ||
| } |
Comment on lines
+150
to
153
| TimeoutSec int `json:"timeout_sec,omitempty"` // per-request HTTP timeout in seconds | ||
| ExtraBody map[string]any `json:"extra_body,omitempty"` | ||
| ExtraHeaders map[string]string `json:"extra_headers,omitempty"` | ||
| } |
Comment on lines
196
to
204
| cfg := ClientConfig{ | ||
| URL: ep.URL, | ||
| APIKey: ep.Token, | ||
| Model: ep.Model, | ||
| AuthHeader: ep.AuthHeader, | ||
| Timeout: ep.Timeout, | ||
| ExtraBody: ep.ExtraBody, | ||
| ExtraHeaders: ep.ExtraHeaders, | ||
| } |
- Make OCR_LLM_TIMEOUT a global override (works with any endpoint strategy) - Add proper overflow detection for timeout values (check before multiply) - Add validateTimeoutSec helper with negative/overflow guards - Add comprehensive tests for timeout parsing, validation, and forwarding - Reject negative timeout_sec in config files with explicit error
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The LLM HTTP client has no user-configurable request timeout. When the LLM API becomes unresponsive (rate limiting, network issues, server overload), the
ocr scanprocess hangs indefinitely with 0% CPU, holding an ESTABLISHED TCP connection that never completes.Reproduction: Run
ocr scanagainst an LLM API that occasionally hangs. The process blocks forever until manually killed.Root cause in
internal/llm/client.go:195:NewLLMClientcreatesClientConfigwithout settingTimeout(defaults to 0). WhileNewOpenAIClient/NewAnthropicClienthave a 5-minute fallback, there is no way for users to configure this.Fixes #237
Changes
Timeoutfield toResolvedEndpointstructOCR_LLM_TIMEOUTenvironment variable (value in seconds)timeout_secfield toconfig.jsonllm and provider sectionsResolvedEndpointtoClientConfiginNewLLMClientUsage
The default remains 5 minutes when not explicitly configured. All existing behavior is preserved.