Prevent stream errors from being silently swallowed when servers misbehave #55

Merged: dcrockwell merged 1 commit into develop from fix/stream-error-decode on Mar 3, 2026

@dcrockwell
Contributor

Summary

When making streaming HTTP requests, transport-level failures (connection refused, socket dropped mid-stream, DNS errors) caused the HTTP client to lose the actual error details and replace them with unhelpful generic messages like "Unknown stream error". This meant developers couldn't tell why their streaming request failed -- the real reason was silently discarded.

This fix ensures that no matter what the server sends back -- malformed data, non-UTF-8 bytes, raw Erlang atoms, or abrupt disconnects -- the error reason always surfaces as a readable, meaningful string.

Why

  • Streaming errors from Erlang's httpc arrive as raw atoms or tuples (e.g., socket_closed_remotely, {failed_connect, ...}), not strings. The pull-based streaming path (stream_yielder) passed these through without formatting, so Gleam's string decoder failed and the error was replaced with "Unknown stream error".
  • The message-based streaming path (start_stream) did format errors, but io_lib:format can produce Latin-1 encoded bytes. Gleam enforces strict UTF-8, so it rejected these with a cryptic DecodeError("String", "String", []), again hiding the real error.
  • Previous tests only hit the mock server which always returns well-formed HTTP responses. Transport-level failures (connection refused, socket drops) were never exercised, so this gap went undetected.
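To make the first two points concrete, here is a small Erlang sketch of the shapes involved. The module and function names are hypothetical and are not part of this PR; it only uses standard library calls (`io_lib:format/2`, `iolist_to_binary/1`, `unicode:characters_to_binary/3`).

```erlang
-module(raw_reason_demo).
-export([shapes/0]).

%% Hypothetical demo module; none of these names come from the PR.
shapes() ->
    Reason = socket_closed_remotely,
    Formatted = io_lib:format("~p", [Reason]),
    #{
        %% The raw reason is an atom, so a string decoder has nothing to match.
        raw_is_binary => is_binary(Reason),
        %% io_lib:format/2 returns character data (a list), not a binary...
        formatted_is_binary => is_binary(Formatted),
        %% ...so it still has to be flattened before it looks like a string.
        flattened => iolist_to_binary(Formatted),
        %% A lone Latin-1 byte (16#E9) does not validate as UTF-8.
        latin1_is_utf8 =>
            is_binary(unicode:characters_to_binary(<<16#E9>>, utf8, utf8))
    }.
```

Every boolean in the returned map is `false`, and `flattened` is `<<"socket_closed_remotely">>`. A decoder that only accepts UTF-8 binaries would therefore reject both the raw reason and any Latin-1 output.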

What

  • Erlang layer: Added ensure_utf8_binary/1 that guarantees valid UTF-8 output from any Erlang term. Updated all 5 error formatting functions to use it. Fixed the pull-based path to format raw error reasons before passing them to Gleam.
  • Gleam layer: Added three-tier fallback decoders in both decode_error_reason and receive_next -- try string, try bit_array with UTF-8 conversion, fall back to string.inspect. Error reasons can never be silently lost.
  • Mock server: Added /stream/drop (sends chunks then crashes to simulate connection drop) and /non-utf8-error (returns HTTP 400 with invalid UTF-8 bytes in the body).
  • Tests: 9 new tests covering connection refused, mid-stream connection drops, and non-UTF-8 error bodies across all three execution modes (send, start_stream, stream_yielder). All 177 tests pass.
  • Version: Bumped to 5.1.1 with changelog and release notes.

How

The fix uses a "belt and suspenders" approach -- errors are sanitized to valid UTF-8 strings on the Erlang side and decoded with robust fallbacks on the Gleam side. This dual-layer defense ensures that even if a future Erlang OTP update changes the error format, the client will still surface a useful error message rather than crashing or swallowing the details.

Test plan

  • Connection refused errors surface meaningful messages across all streaming modes
  • Mid-stream connection drops are caught and reported correctly
  • Non-UTF-8 response bodies don't crash the client
  • Error strings contain useful information (not "Unknown stream error")
  • All 177 existing tests still pass (no regressions)

…ll Erlang term types

## Why This Change Was Made
- Erlang's `httpc` returns transport-level error reasons (e.g., `socket_closed_remotely`,
  `{failed_connect, ...}`, `econnrefused`) as raw atoms or tuples, not strings. The
  pull-based streaming path passed these through unformatted, causing `d.string` decode
  failures in Gleam. The message-based path formatted them via `io_lib:format("~p", ...)`,
  which can produce Latin-1 binaries that Gleam's UTF-8-strict string decoder rejects.
  In both cases, the actual error information was lost and replaced with generic messages.

## What Was Changed
- Added `ensure_utf8_binary/1` in `dream_httpc_shim.erl` that validates UTF-8, falls
  back to Latin-1 reinterpretation, then to `~w` (pure ASCII) as a last resort
- Updated all error formatting functions (`format_error`, `format_complete_response_error`,
  `format_exit_reason`, `to_binary`, `ref_to_string`) to use `ensure_utf8_binary`
- Fixed `stream_owner_wait` and `stream_owner_next_message` to call `format_error(Reason)`
  instead of passing raw Erlang terms through
- Added three-tier fallback decoders in `decode_error_reason` (client.gleam) and
  `receive_next` (internal.gleam): try d.string -> try d.bit_array -> string.inspect
- Added `/non-utf8-error` and `/stream/drop` mock server endpoints
- Added 9 new tests covering connection refused, mid-stream drops, and non-UTF-8 bodies
- Bumped version to 5.1.1 with changelog and release notes
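
The three-step strategy described for `ensure_utf8_binary/1` (validate UTF-8, reinterpret as Latin-1, fall back to `~w`) could look roughly like the sketch below. This is an illustration under assumptions, not the code in `dream_httpc_shim.erl`: the module name and the `format_fallback/1` helper are hypothetical, and the real function may order or structure its clauses differently.

```erlang
-module(utf8_sketch).
-export([ensure_utf8_binary/1]).

%% Illustrative sketch only; not the implementation from the PR.

%% Binaries: keep valid UTF-8 as-is, otherwise reinterpret the bytes as Latin-1.
ensure_utf8_binary(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Valid when is_binary(Valid) ->
            Valid;
        _ErrorOrIncomplete ->
            %% Every byte is a valid Latin-1 character, so this cannot fail.
            unicode:characters_to_binary(Bin, latin1, utf8)
    end;
%% Lists: io_lib:format/2 output and Erlang strings are iolists; flatten first.
ensure_utf8_binary(List) when is_list(List) ->
    try iolist_to_binary(List) of
        Bin -> ensure_utf8_binary(Bin)
    catch
        %% Not an iolist (for example, a list of atoms): print it as a term.
        error:badarg -> format_fallback(List)
    end;
%% Anything else (atoms, tuples, pids, refs): print it as a term.
ensure_utf8_binary(Term) ->
    format_fallback(Term).

%% Hypothetical helper: render any term with ~w, then run the result through
%% the binary clause so stray non-ASCII bytes are still converted to UTF-8.
format_fallback(Term) ->
    ensure_utf8_binary(iolist_to_binary(io_lib:format("~w", [Term]))).
```

With this sketch, `utf8_sketch:ensure_utf8_binary(econnrefused)` returns `<<"econnrefused">>` and `utf8_sketch:ensure_utf8_binary(<<16#E9>>)` returns the two-byte UTF-8 encoding `<<195,169>>`. In every case the result is a binary that a UTF-8-strict string decoder can accept.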

## Note to Future Engineer
- The "belt and suspenders" strategy (format on Erlang side AND fallback on Gleam side) is
  intentional -- Erlang's type system is about as strict as a speed limit sign at 3 AM, so
  we defend at both layers. If you're wondering why `ensure_utf8_binary` handles binaries,
  lists, AND arbitrary terms: congratulations, you've discovered that `io_lib:format` returns
  an iolist (a deeply nested list of integers and binaries), not a binary. Yes, really.
- The `/stream/drop` test endpoint literally panics on purpose. If you see "intentional crash"
  in the logs during tests, that's working as designed, not a cry for help.
dcrockwell self-assigned this on Mar 3, 2026
dcrockwell added the bug (Something isn't working) and module (Change to a dream module) labels on Mar 3, 2026
dcrockwell merged commit 8e330ac into develop on Mar 3, 2026
2 checks passed