fix(pool): add connect timeout — prevent 20-second stall on Deluge reconnect #34

Merged: wcypierre merged 1 commit into master from fix/pool-connect-timeout on May 2, 2026
@wcypierre
Owner

Problem

The torrent list was taking 20+ seconds to load. Root cause: the pool worker goroutine was blocking inside conn.Connect(), which has no timeout of its own, so the worker waited for as long as the OS took to give up on the dial.

pool.getConn() called conn.Connect() synchronously. The pool has a single worker goroutine — while it is inside getConn(), the entire select loop in worker() is frozen. All concurrent HTTP requests waiting for a pool connection are also stuck (they cannot send on pool.get until the pool worker is ready to receive).

conn.Connect() in go-libdeluge uses new(net.Dialer) with no dial timeout. When the TCP connection hangs (no RST, just silence), Go waits for the OS SYN timeout — ~20 seconds on Windows.

Why it regressed now

The April 26 commit changed go.mod from go 1.17 to go 1.24, then the telemetry commit raised it to go 1.25.0. From Go 1.21 onward, the go directive activates the pure Go DNS resolver, which returns both ::1 and 127.0.0.1 for localhost. If the Deluge daemon listens only on IPv4 and the system drops (rather than refuses) IPv6 packets, the SYN to ::1:PORT silently times out after ~20 s before Go falls back to IPv4.

With IdleConnectionTime = 30s, connections expire if idle for 30 seconds. Each reconnect event then triggers the 20-second hang, blocking all concurrent requests.

Fix

Run conn.Connect() in a goroutine and select on the result against a configurable timeout (default 10 s, overridable via --connect-timeout / POOL_CONNECT_TIMEOUT). Orphaned Connect() goroutines on timeout are cleaned up asynchronously.

The pool worker is still serialised (one connection established at a time), but it can no longer be stalled for longer than ConnectTimeout.

…CP dial

pool.getConn() called conn.Connect() synchronously, blocking the single
pool worker goroutine for the full OS TCP timeout (~20 s on Windows) when
a connection attempt hung. This stalled ALL concurrent HTTP requests.

Root cause: go.mod upgraded from go 1.17 to go 1.25.0 activates Go 1.21+
DNS behaviour, where net.Dialer tries IPv6 (::1) before IPv4 for localhost.
If IPv6 packets are dropped rather than refused, the SYN times out silently
before falling back – blocking the pool worker for the entire duration.

Fix: run Connect() in a goroutine and select on result vs. a configurable
ConnectTimeout (default 10 s, overridable via --connect-timeout / POOL_CONNECT_TIMEOUT).
Orphaned Connect() goroutines are cleaned up asynchronously on timeout.
@wcypierre wcypierre merged commit 7618099 into master May 2, 2026
4 checks passed
@wcypierre wcypierre deleted the fix/pool-connect-timeout branch May 2, 2026 03:17