feat: add cross-platform HTTP/1.1 parser and local proxy server for iOS and Android #367

Merged
dcalhoun merged 9 commits into trunk from jkmassel/swift-http-proxy-server on Mar 26, 2026

Contributor

@jkmassel jkmassel commented Mar 11, 2026

What?

Adds a hardened, RFC-conformant HTTP/1.1 request parser and local proxy server for both iOS (Swift) and Android (Kotlin), with shared cross-platform test fixtures to guarantee behavioral parity.

Why?

GutenbergKit's native integration embeds a web editor that communicates with native networking through an in-process HTTP server. Because the server is exposed to JavaScript running in the WebView, the parser must be hardened against malformed and adversarial input per RFC 7230/9110/9112 — a lenient parser could enable request smuggling, header injection, or denial-of-service via resource exhaustion.

How?

Swift (iOS)

  • GutenbergKitHTTP module — Incremental, stateful parser (HTTPRequestParser) that buffers to a temporary file on disk so memory stays flat regardless of body size. Strict RFC conformance: rejects obs-fold, whitespace before colon, conflicting Content-Length, Transfer-Encoding, invalid UTF-8 (round-trip validated), lone surrogates, overlong encodings.
  • HTTPServer — Local HTTP/1.1 server on Network.framework with async handler API, connection limits, read timeouts, and constant-time bearer token authentication.
  • Multipart parsing — RFC 7578 multipart/form-data support with lazy body references (file slices, not copies).
  • RequestBody — Abstracts over in-memory and file-backed storage with InputStream and async data access.
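The constant-time bearer token check mentioned above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: `bearerToken` and `isAuthorized` are hypothetical names, and the sketch relies on the JDK's `MessageDigest.isEqual`, whose running time is documented not to depend on the byte contents being compared. The token length can still be observable through timing.

```kotlin
import java.security.MessageDigest

// Hypothetical helper (not the PR's API): extracts the token from a
// "Bearer <token>" header value, or returns null if the value is malformed.
fun bearerToken(headerValue: String?): String? {
    val prefix = "Bearer "
    if (headerValue == null || !headerValue.startsWith(prefix, ignoreCase = true)) return null
    return headerValue.substring(prefix.length).trim().ifEmpty { null }
}

// Compares the presented token with the expected one without
// short-circuiting on the first mismatching byte.
fun isAuthorized(headerValue: String?, expectedToken: String): Boolean {
    val presented = bearerToken(headerValue) ?: return false
    return MessageDigest.isEqual(
        presented.toByteArray(Charsets.UTF_8),
        expectedToken.toByteArray(Charsets.UTF_8)
    )
}
```

A plain `==` on strings may return as soon as the first differing character is found, which is the timing signal a constant-time comparison is meant to remove.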

Kotlin (Android)

  • Pure-Kotlin HTTP parser (org.wordpress.gutenberg.http) — Feature-identical port of the Swift parser with no native dependencies. Includes HeaderValue, HTTPRequestSerializer, HTTPRequestParser (with disk-backed buffering via Buffer/TempFileOwner), ParsedHTTPRequest, MultipartPart, and RequestBody.
  • HttpServer — Local HTTP/1.1 server with connection limits, read timeouts, pure-Kotlin response serialization, and proper 400 responses on premature connection close.

Shared cross-platform test fixtures

  • 163 JSON test fixtures in test-fixtures/http/ covering header value extraction (20 cases), request parsing (58 basic + 42 error + 4 incremental), and multipart parsing (32 field-based + 7 error).
  • 8 dedicated UTF-8 edge cases — overlong encodings, lone surrogates, truncated sequences, code points above U+10FFFF.
  • 2 whitespace-before-colon cases — space and tab before colon, returning the specific whitespaceBeforeColon error (request smuggling vector per RFC 7230 §3.2.4).
  • Both platforms load the same fixture files, guaranteeing identical parse behavior.
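To make the whitespace-before-colon cases concrete, here is a small sketch of a header-line check. Only the `whitespaceBeforeColon` error name comes from this PR; the result type, the function names, and the other error names are assumptions made for illustration, and the real parser's structure may differ.

```kotlin
// Illustrative result type; only "whitespaceBeforeColon" is a name taken from the PR.
sealed class HeaderLineResult {
    data class Ok(val name: String, val value: String) : HeaderLineResult()
    data class Error(val reason: String) : HeaderLineResult()
}

// RFC 7230 token characters allowed in a field-name.
fun isTokenChar(c: Char): Boolean =
    c in 'a'..'z' || c in 'A'..'Z' || c in '0'..'9' || c in "!#\$%&'*+-.^_`|~"

fun parseHeaderLine(line: String): HeaderLineResult {
    // A line starting with space or tab is an obs-fold continuation, which is rejected.
    if (line.startsWith(' ') || line.startsWith('\t')) return HeaderLineResult.Error("obsFold")
    val colon = line.indexOf(':')
    if (colon <= 0) return HeaderLineResult.Error("invalidHeader")
    val name = line.substring(0, colon)
    // "Content-Length : 0" must not be silently normalized to "Content-Length".
    if (name.last() == ' ' || name.last() == '\t') {
        return HeaderLineResult.Error("whitespaceBeforeColon")
    }
    if (!name.all { isTokenChar(it) }) return HeaderLineResult.Error("invalidHeaderName")
    // Optional whitespace around the field value is allowed and trimmed.
    val value = line.substring(colon + 1).trim(' ', '\t')
    return HeaderLineResult.Ok(name, value)
}
```

Checking for trailing whitespace before the generic token check is what lets the parser report the specific error rather than a generic invalid-name error.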

Key design decisions

  • 64-bit Content-Length on both platforms (Swift Int64, Kotlin Long) with a 4 GB default max body size.
  • UTF-8 round-trip validation on both platforms to reject silently-accepted malformed sequences.
  • Disk-backed buffering with configurable inMemoryBodyThreshold (512 KB default) — bodies below threshold stay in memory, larger ones reference the temp file directly as a slice.
  • Multipart part bodies are lazy file-slice references for file-backed sources, avoiding copies during parsing.
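As a rough illustration of the UTF-8 round-trip idea (assuming the JVM's default behavior of substituting U+FFFD for malformed input when decoding), the check can be as small as the following. `isValidUtf8RoundTrip` is a hypothetical name, not necessarily what the library uses.

```kotlin
// Decodes the bytes as UTF-8 and re-encodes the result. On the JVM, malformed
// sequences are replaced with U+FFFD during decoding, so the re-encoded bytes
// differ from the input whenever the input was not well-formed UTF-8.
fun isValidUtf8RoundTrip(bytes: ByteArray): Boolean {
    val decoded = String(bytes, Charsets.UTF_8)
    return decoded.toByteArray(Charsets.UTF_8).contentEquals(bytes)
}
```

For example, the overlong encoding `C0 80`, the lone surrogate `ED A0 80`, and the out-of-range sequence `F4 90 80 80` all fail this check, while a literal U+FFFD (`EF BF BD`) in the input still passes because it is itself valid UTF-8.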

Testing Instructions

Automated (CI runs these on every push)

  1. swift test — runs 820+ tests including 163 fixture-based cross-platform tests
  2. cd android && ./gradlew :Gutenberg:test — runs all Android unit tests including fixture tests

On-device (manual)

  1. iOS: Open ios/Demo-iOS/Gutenberg.xcodeproj in Xcode, run on a device, navigate to the Media Proxy Server screen — it prints the server URL
  2. Android: Build and install the demo app, open the Media Proxy Server activity — it prints the server URL
  3. Send adversarial requests to verify hardening (all should return appropriate 4xx errors):
    # Missing Host header → 400
    printf "GET / HTTP/1.1\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Transfer-Encoding (rejected) → 400
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Oversized headers → 431
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nX-Long: $(python3 -c 'print("X"*70000)')\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Conflicting Content-Length → 400
    printf "POST / HTTP/1.1\r\nHost: localhost\r\nContent-Length: 5\r\nContent-Length: 10\r\n\r\nhello" | nc -w 3 <device-ip> <port>
    
    # Whitespace before colon (smuggling vector) → 400
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nContent-Length : 0\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Premature connection close → 400
    printf "\r\n\r\n" | nc -w 3 <device-ip> <port>
  4. Instrumented tests: cd android && ./gradlew :Gutenberg:connectedDebugAndroidTest -Pandroid.testInstrumentationRunnerArguments.class=org.wordpress.gutenberg.http.InstrumentedFixtureTests runs all 163 fixture cases on a connected device

🤖 Generated with Claude Code

@jkmassel jkmassel added the [Type] Enhancement A suggestion for improvement. label Mar 11, 2026
@jkmassel jkmassel force-pushed the jkmassel/swift-http-proxy-server branch 3 times, most recently from cce3ff6 to 57d1b7c March 12, 2026 22:36
@jkmassel jkmassel changed the title from "Add GutenbergKitHTTP: HTTP/1.1 parser and local proxy server" to "feat: add cross-platform HTTP/1.1 parser and local proxy server for iOS and Android" Mar 14, 2026
@jkmassel jkmassel force-pushed the jkmassel/swift-http-proxy-server branch 16 times, most recently from 29f9c27 to 0a09c12 March 17, 2026 17:49
jkmassel and others added 4 commits March 17, 2026 17:32
Add a hardened, RFC-conformant HTTP/1.1 request parser and local proxy
server for iOS. The parser is exposed to JavaScript running in the
WebView, so it enforces strict validation per RFC 7230/9110/9112 to
prevent request smuggling, header injection, and resource exhaustion.

Includes:
- HTTPRequestParser: Incremental parser with disk-backed buffering
- HTTPRequestSerializer: Stateless header parsing with full RFC validation
- HeaderValue: RFC 2045 parameter extraction with quoted string handling
- MultipartPart: RFC 7578 multipart/form-data with lazy file-slice refs
- RequestBody: In-memory and file-backed storage with InputStream access
- HTTPServer: Local server on Network.framework with async handler API,
  connection limits, read timeouts (408 per RFC 9110 §15.5.9), and
  constant-time bearer token auth
- HTTPResponse: Response serialization with header sanitization,
  Content-Length always derived from actual body size

All size-related types use Int64 to match Kotlin's Long and prevent
ambiguity on future platforms.

820+ tests covering RFC 7230, 7578, 8941, 9110, 9112, and 9651.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tures

Add a pure-Kotlin implementation of the HTTP/1.1 request parser that
mirrors the Swift GutenbergKitHTTP library, enabling HTTP parsing on
Android without native dependencies.

Includes HeaderValue, HTTPRequestSerializer, HTTPRequestParser (with
disk-backed buffering), ParsedHTTPRequest, MultipartPart, RequestBody,
and HttpServer with bearer token authentication, constant-time token
comparison, status code clamping, and Content-Length always derived
from actual body size — matching the Swift server's security model.

Both platforms are validated against 101 shared JSON test fixtures in
test-fixtures/http/ covering header value extraction, request parsing
(basic, error, incremental), multipart parsing (field-based, raw-body,
error), and 8 dedicated UTF-8 edge cases (overlong encodings, lone
surrogates, truncated sequences).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add demo app screens that start the local HTTP server and display its
address for manual testing with curl or a browser. Enables on-device
validation of the parser against adversarial inputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run the shared JSON test fixtures on an actual Android device via
connectedDebugAndroidTest, validating the pure-Kotlin HTTP parser
under ART in addition to the JVM unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jkmassel jkmassel requested a review from dcalhoun March 18, 2026 16:07
@jkmassel jkmassel self-assigned this Mar 18, 2026
@jkmassel jkmassel force-pushed the jkmassel/swift-http-proxy-server branch from 0a09c12 to e8cda3c March 18, 2026 16:15
@jkmassel jkmassel marked this pull request as ready for review March 18, 2026 16:38
Member

@dcalhoun dcalhoun left a comment

@jkmassel the testing instructions succeeded for me.

I also had Claude swiftly rebase #357 atop this work and update it to use the HTTP server library in this work. Good news is that it seems to work. The result of the work is in the feat/leverage-host-media-processing-stacked branch (here is the current diff).

I did encounter a couple of issues that I noted in inline comments below. I worked around them in the feat/leverage-host-media-processing-stacked branch, but we might address them in the library instead. WDYT?

jkmassel and others added 2 commits March 18, 2026 17:08
CORS preflight requests (OPTIONS) never include credentials per
Fetch spec §3.3.5, so the server now skips authentication for
OPTIONS requests. This allows browser-based clients to complete
the preflight handshake without a bearer token.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Browser fetch() silently strips Proxy-* headers (Fetch spec §2.2.2),
making Proxy-Authorization unusable from web contexts. Add
Relay-Authorization as a non-forbidden alternative that carries the
same bearer token. Proxy-Authorization remains supported and takes
precedence when both headers are present.

Relay-Authorization is treated as hop-by-hop and stripped before
forwarding to upstream servers, keeping Authorization free for
upstream credentials.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
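The header precedence described in the commit above could look something like this sketch. It assumes header names have already been lowercased into a map; `proxyBearerToken`, `headersForUpstream`, and `PROXY_AUTH_HEADERS` are hypothetical names, not the library's API.

```kotlin
// Proxy-Authorization wins when both headers are present; Relay-Authorization
// is the fallback for browser contexts where fetch() strips Proxy-* headers.
// Assumes header names in the map are already lowercased.
fun proxyBearerToken(headers: Map<String, String>): String? {
    val raw = headers["proxy-authorization"] ?: headers["relay-authorization"] ?: return null
    val prefix = "Bearer "
    if (!raw.startsWith(prefix, ignoreCase = true)) return null
    return raw.substring(prefix.length).trim().ifEmpty { null }
}

// Both proxy credentials are hop-by-hop: they are removed before forwarding,
// while Authorization is left intact for the upstream server.
val PROXY_AUTH_HEADERS = setOf("proxy-authorization", "relay-authorization")

fun headersForUpstream(headers: Map<String, String>): Map<String, String> =
    headers.filterKeys { it !in PROXY_AUTH_HEADERS }
```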
@dcalhoun
Member

@jkmassel the latest changes worked for the media upload implementation. 🎉

Noting two things:

First, it appears your two commits were pushed to a new pr-367 branch outside of this PR. Was that intentional?

Second, I'll share a few additional thoughts on the new OPTIONS change...

The OPTIONS exemption from auth is necessary for CORS preflight compatibility, but it's only half the solution. A successful CORS preflight also needs a valid response: 204 + Access-Control-Allow-* headers. The library handles the first requirement and leaves the second to the caller.

This creates a hidden obligation. Any caller serving a browser client has to independently discover that they need to handle OPTIONS in their handler, know the correct CORS headers to return, and remember to do so. A handler written naturally — routing on method/path, returning 404 for anything else — will silently break CORS preflight. That's exactly what happened when integrating this.

CORS policy (allowed origins, methods, headers) is application-level and not something the library can own without configuration that arguably falls outside its scope. Instead, should we clearly document this obligation alongside the OPTIONS exemption? For example, the docs could note that callers serving browser clients must handle OPTIONS in their handler and return the appropriate CORS headers.
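For reference, a handler that meets this obligation when serving local content might look like the sketch below. `Request`, `Response`, `handle`, the `/health` route, and the specific allowed methods and headers are all assumptions for illustration; the library's real handler types and an application's actual CORS policy will differ.

```kotlin
// Hypothetical request/response shapes, not the library's actual types.
data class Request(val method: String, val path: String, val headers: Map<String, String> = emptyMap())
data class Response(val status: Int, val headers: Map<String, String> = emptyMap(), val body: String = "")

fun handle(request: Request, allowedOrigin: String): Response {
    // Answer the preflight first, before any method/path routing can 404 it.
    if (request.method == "OPTIONS") {
        return Response(
            status = 204,
            headers = mapOf(
                "Access-Control-Allow-Origin" to allowedOrigin,
                "Access-Control-Allow-Methods" to "GET, POST, OPTIONS",
                "Access-Control-Allow-Headers" to "Relay-Authorization, Content-Type",
                "Access-Control-Max-Age" to "600"
            )
        )
    }
    return when {
        request.method == "GET" && request.path == "/health" ->
            Response(200, mapOf("Access-Control-Allow-Origin" to allowedOrigin), "ok")
        else -> Response(404)
    }
}
```

Without the OPTIONS branch, the `else -> Response(404)` fallthrough is exactly the silent preflight failure described above.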

The OPTIONS auth exemption is only half the CORS story — handlers
must also return appropriate Access-Control-Allow-* headers. When
proxying to a remote server, pass the upstream response through
unaltered. When serving local content, the handler must generate
the CORS headers itself. Document this in the class-level docs
to prevent silent preflight failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jkmassel
Contributor Author

a184a2c adds some documentation about the "handle your own CORS" requirement. I thought about adding a helper that makes it a one-liner for non-relay purposes, but I think that would be better in some future PR if it's really needed.

@jkmassel jkmassel requested a review from dcalhoun March 19, 2026 21:16
dcalhoun and others added 2 commits March 26, 2026 15:40
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member

@dcalhoun dcalhoun left a comment

I spent most of my time testing. The implementation tested well for me, as did existing editor functionality. I reviewed the implementation at a high level. The tests appear to provide good coverage of the targeted spec.

Claude noted potential issues to address, but none appeared critical. I capture them here for future consideration.

Claude's review notes

Potential Issues

Medium Priority

  1. CONNECT method target validation is lenient (HTTPRequestSerializer.swift:213-216, HTTPRequestSerializer.kt:141-153). The code validates that
    CONNECT targets contain : and don't start with /, but doesn't validate the port is numeric. CONNECT example.com:abc HTTP/1.1 would pass. RFC 9112
    §3.2.3 defines authority-form as uri-host ":" port where port is a number.

  2. Content-Length with leading zeros (HTTPRequestSerializer.kt:272-279). Both platforms accept Content-Length: 007 as 7. While RFC 7230 §3.3.3
    doesn't explicitly forbid leading zeros, some security research suggests rejecting them to avoid parsing differentials between servers (a request smuggling
    vector in multi-tier architectures).

  3. ConnectionTasks accumulates entries unboundedly (HTTPServer.swift:566-594). The comment explains the rationale (avoiding a race between
    track() and remove()), and entries are tiny (UUID + Task reference), but for a long-running server with many connections this could grow. Consider
    periodic cleanup on track() if the dictionary exceeds a threshold.

  4. SimpleDateFormat created per-response (HttpServer.kt:616). Thread-safe since each call creates a new instance, but wasteful. Consider a
    DateTimeFormatter (thread-safe, immutable) or ThreadLocal<SimpleDateFormat>.
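If the leading-zeros point (item 2) is picked up later, a stricter Content-Length check could be sketched as follows. This is an assumption about one possible policy, not the current behavior on either platform: `parseContentLength` is a hypothetical function, and it also treats any disagreement between repeated values as a rejection.

```kotlin
// Returns the parsed length, or null when the header values should be rejected.
// The caller is assumed to handle the "header absent" case separately.
fun parseContentLength(values: List<String>, rejectLeadingZeros: Boolean = true): Long? {
    // Repeated headers must all agree; "5" and "10" together are conflicting.
    val distinct = values.map { it.trim(' ', '\t') }.distinct()
    if (distinct.size != 1) return null
    val text = distinct.single()
    // ASCII digits only: no sign, no whitespace, no Unicode digits.
    if (text.isEmpty() || !text.all { it in '0'..'9' }) return null
    // "007" parses as 7 on a lenient server; rejecting it avoids parser differentials.
    if (rejectLeadingZeros && text.length > 1 && text[0] == '0') return null
    // toLongOrNull returns null on overflow instead of wrapping.
    return text.toLongOrNull()
}
```

Keeping the policy behind a flag such as `rejectLeadingZeros` would let the shared fixtures cover both behaviors.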

Low Priority

  1. Multipart header size limit is hardcoded at 8192 bytes in parseChunked() on both platforms. Part headers exceeding this size will fail with an
    error that doesn't clearly indicate the root cause. Worth documenting the limit or making it configurable.

  2. Unterminated quoted strings in HeaderValue (HeaderValue.swift:72-94, HeaderValue.kt:77-84). If a parameter value has an opening " but no
    closing one, the parser returns the partial value rather than nil. This is lenient but could mask malformed input. Consider returning nil for unclosed
    quotes.

  3. RequestBody.FileBacked.inputStream() (RequestBody.kt:220-245) creates a new RandomAccessFile per call with no documentation that the caller
    must close it. If called multiple times without closing, file descriptor leak potential.
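For the unterminated quoted-string note (item 2), the stricter behavior could be sketched like this. `unquote` is a hypothetical helper written for illustration and is not taken from `HeaderValue.swift` or `HeaderValue.kt`.

```kotlin
// Returns the unquoted parameter value, or null if the value opens a quoted
// string that is never closed. Values that do not start with a quote are
// returned unchanged (token form).
fun unquote(raw: String): String? {
    if (!raw.startsWith('"')) return raw
    val out = StringBuilder()
    var i = 1
    while (i < raw.length) {
        val c = raw[i]
        when {
            c == '\\' -> {
                // A backslash escapes the next character; a dangling one is malformed.
                if (i + 1 >= raw.length) return null
                out.append(raw[i + 1])
                i += 2
            }
            c == '"' -> return out.toString() // closing quote found
            else -> {
                out.append(c)
                i++
            }
        }
    }
    return null // reached the end without a closing quote
}
```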

Informational / Nitpicks

  1. Debug server prints token to stdout (main.swift:31). Acceptable for a debug tool but the README should note that log access = token access.

  2. Android MediaProxyServerActivity is exported="true" (AndroidManifest.xml:56-57). Comment says demo-only. If this is truly never shipped, it's
    fine; otherwise consider a debug build type.

  3. maxBodySize comment says "4 GB" (HTTPRequestParser.swift:34) but the value is 4 * 1024^3 = 4 GiB. Minor docs inaccuracy.
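The size of the GB versus GiB discrepancy in item 3 is easy to check with a few lines of arithmetic. The variable names here are illustrative only.

```kotlin
// Binary vs. decimal reading of "4 GB", using Long as the PR does for sizes.
val fourGiB: Long = 4L * 1024 * 1024 * 1024   // 4 * 1024^3 bytes
val fourGB: Long = 4L * 1000 * 1000 * 1000    // 4 * 10^9 bytes
val differenceBytes: Long = fourGiB - fourGB  // how far apart the two readings are
```

The two values are 4,294,967,296 and 4,000,000,000 bytes, so the documented limit understates the actual one by 294,967,296 bytes.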

I merged the latest trunk branch and resolved conflicts in Package.swift. The trunk branch now includes Detekt configuration. I opted to update the baseline rather than heavily modifying this work to address lint errors. We can address lint errors in future PRs if deemed worthwhile.

Approving so that we can merge and rebase #357 atop this work. 🚀

@dcalhoun dcalhoun merged commit 7bb38d6 into trunk Mar 26, 2026
15 checks passed
@dcalhoun dcalhoun deleted the jkmassel/swift-http-proxy-server branch March 26, 2026 23:28
Labels

[Type] Enhancement A suggestion for improvement.

2 participants