Skip to content

fix(login): add APIClient request timeout to stop login page infinite spinner#195

Open
yujiawei wants to merge 1 commit into
mainfrom
fix/login-stuck-spinner-request-timeout
Open

fix(login): add APIClient request timeout to stop login page infinite spinner#195
yujiawei wants to merge 1 commit into
mainfrom
fix/login-stuck-spinner-request-timeout

Conversation

@yujiawei
Copy link
Copy Markdown
Contributor

@yujiawei yujiawei commented Jun 1, 2026

问题

登录页「一直转圈」无法恢复(YUJ-2628,群里反馈、紧急)。

根因

packages/dmworkbase/src/Service/APIClient.ts 里的 axios 客户端没有任何请求超时。一旦后端/网关迟迟不返回(连接 hang、网关 504 前的长挂起、移动弱网),user/login / user/loginuuid / space/my 这些请求的 Promise 永远不会 settle,于是 LoginVM.loginLoading 一直停在 true,登录按钮和二维码就一直转圈,且没有任何路径能复位。

OIDC 流程之前单独加过 10s 超时(#1061),但主登录路径走的普通 axios 没有兜底超时。

改动

  • 给 axios 配置 20s 全局默认超时。上传等长耗时请求走的是直接的 axios.post/put(各自带 timeout),不经过 APIClient 的 get/post/put/delete 封装,因此不受影响。
  • normalizeApiError 里把「无响应」错误(超时 ECONNABORTED / 网络中断 ERR_NETWORK)归类成本地化提示(新增 i18n key api.error.timeout / api.error.network,中英文都加),这样请求超时被 reject 后前端 .catch/finally 能复位 loading 并提示用户重试,而不是落到「未知错误」。

测试

  • apiError.test.ts:新增超时 / 网络错误归类用例 + 不误判普通 HTTP 错误的回归守卫。
  • APIClient.test.ts:断言全局默认 timeout 已配置。
  • dmworkbase Service + i18n 全套测试通过(132 passed;唯一失败的 UploadCredentials.test.ts 是 lottie-web/jsdom 预存在的导入问题,与本改动无关,clean tree 上同样失败)。

Closes #2628

… spinner

The axios-based APIClient had no request timeout, so a hung backend/gateway
response left user/login, user/loginuuid and space/my Promises unsettled.
LoginVM.loginLoading then stayed true forever and the login button / QR code
spun indefinitely with no way to recover (YUJ-2628).

- Set a 20s global axios default timeout (uploads use direct axios.post/put
  with their own timeouts, so they are unaffected).
- Classify no-response errors (timeout ECONNABORTED / network ERR_NETWORK) in
  normalizeApiError into localized messages (api.error.timeout / .network) so
  .catch/finally resets loading and the user is told to retry, instead of the
  generic unknown-error fallback.

Closes #2628
@yujiawei yujiawei requested a review from a team as a code owner June 1, 2026 12:26
@github-actions github-actions Bot added the size/M PR size: M label Jun 1, 2026
Copy link
Copy Markdown
Contributor

@lml2468 lml2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Fixes login page infinite spinner (YUJ-2628) by adding a 20s global default timeout to the axios instance in APIClient.initAxios(), and classifying timeout/network errors in normalizeApiError with user-facing i18n messages.

Findings

No blocking issues.

  • P2 axios.defaults.timeout is a global setting on the default axios instance, affecting ALL requests through it — not just APIClient wrappers. The comment says file uploads use direct axios.post/put with their own timeout configs so they are unaffected, but this relies on every upload callsite explicitly setting timeout. If any upload path omits config.timeout, it will inherit the 20s default and potentially fail on large files. Worth auditing upload callsites to confirm, but not blocking since per-request timeout always takes precedence.

Verification

  • DEFAULT_REQUEST_TIMEOUT_MS = 20_000 (20s) — reasonable for API calls; long enough for slow networks, short enough to unblock UI. ✅
  • normalizeApiError guard httpStatus === undefined && !isRecord(data) correctly identifies no-response scenarios (timeout / network errors have no HTTP response). ✅
  • Timeout detection: ECONNABORTED code OR /timeout/i message match. Network detection: ERR_NETWORK code OR /network\s*error/i match. Both are standard axios error signatures. ✅
  • Misclassification guard tested: normal HTTP 500 error not caught by timeout/network branch. ✅
  • i18n strings added in both en-US.json and zh-CN.json. ✅
  • 5 new test cases covering timeout, network, and non-misclassification scenarios. ✅
  • Build CI passed. ✅

Verdict

APPROVED — clean, well-tested fix for a real UX bug.

Copy link
Copy Markdown
Contributor Author

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #195 (octo-web)

Verdict: APPROVED

This is a correct, well-scoped, and well-tested fix for the login-page infinite-spinner bug. No blocking (P0/P1) issues. Two minor documentation/comment-accuracy nits (P2) noted below — neither blocks merge.

1. What the change does

  • Sets a global default request timeout axios.defaults.timeout = 20_000 in APIClient.initAxios() (packages/dmworkbase/src/Service/APIClient.ts:78).
  • Classifies no-response errors (request timeout / network failure) into localized messages in normalizeApiError (packages/dmworkbase/src/Service/apiError.ts:95-108), adding i18n keys api.error.timeout / api.error.network in both en-US.json and zh-CN.json.

2. Correctness — confirmed

  • Root-cause fix is valid. Before this change axios had no timeout, so a hung user/login / user/loginuuid / space/my request never settled and LoginVM.loginLoading / qrcodeLoading stayed true forever. With a 20s timeout the promise now rejects, which fires the existing .finally() / .catch() blocks in login_vm.tsx (e.g. requestLoginWithUsernameAndPwd at :231-232, requestUUID at :429-430) that reset the loading flags. The spinner can now recover. The fix targets the actual root cause, not just a symptom.
  • No-response classification guard is sound. The branch only runs when input.httpStatus === undefined && !isRecord(data) (apiError.ts:95). On a real HTTP error (e.g. 500), error.response.status is defined, so the new branch is correctly skipped and the existing status-based handling applies. The regression test does not misclassify a normal HTTP error as timeout/network locks this in.
  • Interceptor wiring is correct. The response error interceptor passes raw: error (APIClient.ts:104-108), so the AxiosError (code, message) reaches normalizeApiError and the ECONNABORTED / message-timeout detection fires.
  • Robust against the axios version in use. This repo pins axios@^0.25.0, where the ERR_NETWORK code does not yet exist (introduced in 0.27) — network failures surface only as message: "Network Error" with no code. The message-regex fallback (/network\s*error/i, /timeout/i) correctly covers this. Good defensive design.
  • i18n keys resolve. t("base.api.error.timeout") uses the registered base namespace; keys are stored as api.error.* and flattened under base. at registration. Consistent with all sibling calls (base.api.error.unknown, etc.).

3. Tests — verified locally

Ran the two affected suites with vitest:

Test Files  2 passed (2)
Tests       20 passed (20)

Both apiError.test.ts (timeout/network classification + non-misclassification guard) and APIClient.test.ts (global default timeout registered) pass.

4. Minor findings (P2 — non-blocking)

P2-1 — Inaccurate claim that direct-axios uploads are unaffected.
The PR description and the code comment (APIClient.ts:61) state that file/avatar uploads "carry their own timeout, so they are unaffected." This is imprecise. The three avatar uploads — BotDetailModal/index.tsx:246, MeInfo/vm.tsx:280, ChannelAvatar/index.tsx:29 — call axios.post(...) on the default axios instance with no per-request timeout, so they now inherit the 20s default. In practice this is harmless (avatars are small images well under 20s), and large file/voice transfers use XMLHttpRequest / task-based transport that bypasses axios entirely. Suggest tightening the wording to "small avatar uploads inherit the 20s default, which is sufficient; large transfers use XHR and are unaffected."

P2-2 — config.timeout override is not actually supported by the wrapper methods.
The comment at APIClient.ts:77 says individual requests "can still override via config.timeout." However RequestConfig exposes only param / data / resp (APIClient.ts:207-211), and get/post/put/delete never forward a timeout into the axios call. So a per-request override is not possible through the APIClient wrapper today — only by calling axios directly. Either drop the comment or add timeout?: number to RequestConfig and thread it through. Not needed for this fix, since no current APIClient route is a legitimate long-poll exceeding 20s (polling endpoints like loginstatus use repeated short requests).

5. Additional observations

  • The 20s value is reasonable for interactive auth/space calls. No APIClient-routed long-poll or streaming endpoint was found that would be broken by it.
  • Request cancellation (axios cancel) is not matched by the new branch and would fall through to "unknown error," but this is a pre-existing edge unrelated to this fix and not user-facing on the login path.

Recommendation

Approve and merge. The P2 nits are comment/description wording only and can be addressed in a follow-up or a quick amend; they do not affect behavior or correctness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/M PR size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants