perf: profile and harden Linux socket path #92
M30 — Kernel/socket path profiling and Linux socket hardening.
Profiling:
- scripts/profile_gateway_io.sh (make profile-io, Linux-only): backgrounds the gateway,
reads rusage from /proc/<pid>/{stat,status} (user vs system CPU, context switches, page
faults, peak RSS) and attaches strace -f -c for the syscall mix. Skips with exit 2 on
non-Linux. Direct-PID stop path (no pkill/GNU-time dependency) with SIGKILL escalation
so it cannot hang.
- scripts/socket_stress.sh (make socket-stress, portable): UDP burst + receive-buffer
(SO_RCVBUF) experiment over loopback, multi-trial, with sequence-gap detection.
Hardening:
- UdpFeedClient gains an optional SO_RCVBUF request and reads back the kernel-granted
size; qsl-mdfeed subscribe gains [rcvbuf_bytes], publish gains [orders], and the
subscriber idle-breaks so burst experiments terminate promptly. Unit-tested.
Evidence (constrained, loopback; ADR 0008):
- results/socket_profile_loopback.txt (containerized Linux): per-order matching CPU below
the 10ms tick; measurable CPU is the kernel/socket path; syscall mix is
accept/read/sendto/close; ~1 voluntary context switch per connection.
- results/socket_stress_summary.txt (native): an undersized receive buffer drops datagrams
under burst (sequence gaps); default/large buffers do not. Loss is timing-dependent.
Docs: docs/socket_profiling.md, docs/socket_hardening.md, ADR 0008 (loopback-constrained;
epoll deferred to M34, multi-client pressure to M35, io_uring discuss-only). README,
CHANGELOG, PROGRESS, HANDOFF updated.
make check 187/187; make asan 187/187.
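As an illustration of the procfs pass described above, here is a minimal sketch of pulling the reported counters out of `/proc/<pid>/status` and `/proc/<pid>/stat`. The function names `read_status_field` and `read_cpu_ticks` are hypothetical and are not taken from `scripts/profile_gateway_io.sh`; the field names (`VmHWM`, `voluntary_ctxt_switches`) and the `stat` field positions (`utime` is field 14, `stime` is field 15) follow the proc(5) layout.

```bash
# Hypothetical helper: print the numeric value of one "Key:" line from a
# /proc/<pid>/status-style file (e.g. VmHWM, voluntary_ctxt_switches).
read_status_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "$2"
}

# Hypothetical helper: print user and system CPU time in clock ticks from a
# /proc/<pid>/stat-style file.
read_cpu_ticks() {
    # Drop "pid (comm) " first: comm may itself contain spaces or parentheses.
    # After that, utime and stime (fields 14 and 15 of the full line) are
    # fields 12 and 13 of the remainder.
    sed 's/^.*) //' "$1" | awk '{ print "utime_ticks=" $12, "stime_ticks=" $13 }'
}

# Example use against a live process (Linux only; "$GATEWAY_PID" is assumed):
#   read_status_field VmHWM "/proc/$GATEWAY_PID/status"
#   read_cpu_ticks "/proc/$GATEWAY_PID/stat"
```

Ticks convert to seconds by dividing by `getconf CLK_TCK`; with the common value of 100, one tick is 10 ms, which is why very short per-order CPU work can fall below the resolution of this measurement.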
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 41ecf2e3f0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses the Codex P2 review on PR #92. scripts/profile_gateway_io.sh previously did `wait "$STRACE_PID" ... || true`, discarding strace's exit status, then wrote the artifact unconditionally. If strace exists but cannot attach (ptrace blocked by container/Yama policy), STRACE_OUT is empty and the script wrote a successful-looking artifact with an empty syscall section. Now it captures strace's exit status and requires a real syscall summary (a "total" row) in STRACE_OUT; on a nonzero strace exit or a missing summary it prints the error plus strace's stderr and exits 3 without writing/overwriting the artifact. The readiness-failure paths for both passes also exit nonzero instead of emitting a half-empty artifact. Verified: the happy path regenerates the full artifact (rc=0); a stubbed failing strace makes the script exit nonzero and write no artifact.
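The check described in this commit can be sketched roughly as follows. This is a hedged illustration, not the script's actual code: the function name `check_strace_summary` is hypothetical, and it assumes the `strace -c` table ends with a row whose last column is `total`, which is how strace prints its summary.

```bash
# Hypothetical helper: accept a strace -c capture only if strace itself
# exited 0 AND the capture contains the summary's "total" row.
check_strace_summary() {
    local rc="$1" out_file="$2"
    if [ "$rc" -ne 0 ]; then
        echo "strace exited with status $rc" >&2
        return 3
    fi
    # An empty or truncated capture (e.g. ptrace blocked) has no "total" row.
    if ! grep -Eq '[[:space:]]total[[:space:]]*$' "$out_file"; then
        echo "no syscall summary (missing 'total' row) in $out_file" >&2
        return 3
    fi
    return 0
}

# Example use (variable names assumed):
#   wait "$STRACE_PID"; strace_rc=$?
#   check_strace_summary "$strace_rc" "$STRACE_OUT" || exit 3
```

Capturing the exit status into a variable before testing it is the key difference from `wait ... || true`, which always yields 0.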
Fixed the P2 in 551a9c5. @codex review
💡 Codex Review
Reviewed commit: 551a9c57be
Addresses the second Codex P2 review on PR #92. socket_stress.sh previously reported only the SequenceTracker gap total. That counter sees forward (interior) gaps only: a datagram dropped at the END of a burst has no later sequence number to reveal it, so total gaps could read 0 while received < published — exactly the burst-overflow scenario the experiment measures. The script now captures the subscriber's "received N" line and reports loss = published - received per trial (which includes tail drops), keeping the SequenceTracker count as a secondary interior-gaps signal. Regenerated results/socket_stress_summary.txt and updated docs/socket_profiling.md and docs/socket_hardening.md to match.
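A small sketch of why the two numbers differ, assuming the subscriber prints a line containing `received N` (the exact log format is an assumption here, as are the helper names `parse_received` and `report_trial`):

```bash
# Hypothetical helper: extract N from the last "received N" line in a log.
parse_received() {
    sed -n 's/^.*received \([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: report total loss next to the interior-gap count.
# loss = published - received includes tail drops; interior gaps do not.
report_trial() {
    local published="$1" received="$2" interior_gaps="$3"
    local loss=$((published - received))
    echo "published=$published received=$received loss=$loss interior_gaps=$interior_gaps"
}

# Example: 10 datagrams published, the last two dropped. No later sequence
# number arrives to expose the gap, so interior_gaps stays 0 while loss is 2:
#   report_trial 10 8 0
```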
Fixed the second P2 in cc49ea1. @codex review
💡 Codex Review
Reviewed commit: cc49ea10c7
Addresses the third Codex P2 review on PR #92. Pass 2 attached with `strace -p`, but on Linux hosts with Yama ptrace_scope=1 (a common default) a non-privileged tracer may only trace its descendants -- attaching to a sibling needs CAP_SYS_PTRACE -- so `make profile-io` could not regenerate the artifact on an otherwise valid Linux host (the earlier Docker run masked this with SYS_PTRACE/unconfined seccomp). Pass 2 now runs the gateway UNDER `strace -f -c` (launch form), making it strace's descendant. The pass is stopped by terminating the gateway (strace's child) with SIGTERM, escalating to SIGKILL on the child -- never strace itself -- so strace observes the exit and flushes its -c summary; signalling strace would lose the report. (Empirically, SIGINT does not reliably terminate a traced child; SIGTERM/SIGKILL do.) Verified in containerized Linux: the regenerated artifact carries the full syscall summary (per-request accept/read/sendto/close plus one-time process/socket setup) and the script does not hang. Docs updated to match.
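The stop sequence described here (SIGTERM first, SIGKILL only if the process outlives a grace period) might look like the sketch below. The function name `stop_pid` and the grace period are illustrative assumptions; how the script discovers the traced gateway's PID is not shown in this thread and is left out.

```bash
# Hypothetical helper: stop a process by PID. Sends SIGTERM, polls for exit,
# and escalates to SIGKILL after grace_tenths * 0.1 s so it cannot hang.
# Assumes `sleep` accepts fractional seconds (GNU, BSD and BusyBox do).
stop_pid() {
    local pid="$1" grace_tenths="${2:-20}" i=0
    kill -TERM "$pid" 2>/dev/null || return 0   # already gone
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$i" -ge "$grace_tenths" ]; then
            kill -KILL "$pid" 2>/dev/null
            break
        fi
        sleep 0.1
        i=$((i + 1))
    done
}

# Example use: signal the traced child, never strace itself, then wait for
# strace so it can flush its -c summary (variable names assumed):
#   stop_pid "$GATEWAY_PID"
#   wait "$STRACE_PID"; strace_rc=$?
```

Note that `kill -0` still succeeds for an exited but unreaped process, so the loop may run to the end of the grace period in that case; the late SIGKILL is then harmless.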
Fixed the third P2 in 62ed01a. @codex review
💡 Codex Review
Reviewed commit: 62ed01afb7
…rtifacts
Addresses the two Codex P2 reviews on PR #92 (round 4): hard-coded ports plus swallowed failures could produce successful-looking artifacts with empty/garbage measurements.
profile_gateway_io.sh: if QSL_PROFILE_PORT was already in use the launched gateway exited at bind, but wait_ready could still connect to the unrelated listener -- so the script profiled a dead PID and wrote an empty-rusage artifact. wait_ready now also checks the launched process is still alive (not-ready if it exited); Pass 1 re-checks the gateway is alive before reading its procfs; Pass 2 watches strace (which exits when its traced gateway's bind fails) and additionally requires the serving syscalls (accept + sendto/write) in the summary, since a bind-failure trace has only startup syscalls. Each case fails loudly (exit 3) and writes no artifact.
socket_stress.sh: a failed subscriber/publisher (e.g. trial port already in use) had its exit status swallowed and recorded '?' fields while still writing the artifact. Each trial now checks both helper exit codes and that the published/received/SO_RCVBUF fields parsed; a failure aborts the run (exit 1) with a clear message instead of emitting a misleading artifact.
Verified in containerized Linux and on macOS: happy paths regenerate valid artifacts; holding the trial port makes each script fail loudly with no artifact written.
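A readiness probe with the liveness check described above could be sketched like this. It is an illustration under assumptions, not the script's `wait_ready`: the argument order, retry count, and the bash-only `/dev/tcp` connect probe are all choices made for this example.

```bash
# Hypothetical sketch: report "ready" only while the launched process is
# still alive, so a port answered by an unrelated listener is not mistaken
# for our gateway after its bind failed.
wait_ready() {
    local pid="$1" host="$2" port="$3" tries="${4:-50}" i=0
    while [ "$i" -lt "$tries" ]; do
        # Launched process already exited (e.g. bind failure): not ready.
        kill -0 "$pid" 2>/dev/null || return 1
        # bash-only connect probe; the subshell closes fd 3 on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            # Re-check after the connect to narrow the race window.
            kill -0 "$pid" 2>/dev/null || return 1
            return 0
        fi
        sleep 0.1
        i=$((i + 1))
    done
    return 1
}

# Example use (variable names assumed):
#   wait_ready "$GATEWAY_PID" 127.0.0.1 "$QSL_PROFILE_PORT" || exit 3
```

The liveness check does not prove that the listener on the port is the launched process; it only rules out the case where the launched process has already exited.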
Fixed both P2s in 340579e (port-conflict robustness).
Verified both happy paths (valid artifacts) and port-held negative cases (loud failure, no artifact) in containerized Linux + macOS. @codex review
Codex Review: Didn't find any major issues. You're on a roll.
Milestone
M30 — Kernel/socket path profiling and Linux socket hardening
Summary
Profiles the Linux kernel/socket path of the TCP order gateway and the UDP market-data feed, and adds a measured socket-hardening knob — profiling/hardening the existing gateway rather than rewriting it.
- `UdpFeedClient` gains an optional `SO_RCVBUF` request that reads back the kernel-granted size via `getsockopt`. `qsl-mdfeed subscribe` gains `[rcvbuf_bytes]`, `publish` gains `[orders]`, and the subscriber idle-breaks so burst experiments terminate promptly.
- `scripts/profile_gateway_io.sh` (`make profile-io`, Linux-only): backgrounds the gateway, reads rusage from `/proc/<pid>/{stat,status}` (user vs system CPU, context switches, page faults, peak RSS) and attaches `strace -f -c` for the syscall mix. Skips with exit 2 on non-Linux. Direct-PID stop path (no `pkill`/GNU-time dependency) with `SIGKILL` escalation so it cannot hang.
- `scripts/socket_stress.sh` (`make socket-stress`, portable): UDP burst + receive-buffer experiment over loopback with sequence-gap detection, multi-trial.
- `docs/socket_profiling.md`, `docs/socket_hardening.md`, ADR 0008.
Definition of Done
- `strace -f -c`: `accept`/`read`/`sendto`/`close`).
- `VmHWM`).
- `epoll` adapter deferred to M34 (multi-client pressure → M35); `io_uring` discuss-only, with rationale in ADR 0008 (Linux-only API, cannot compile/test on the macOS dev host; no untested platform code committed).
- `make check` + relevant socket tests pass.
- `PROGRESS.md` updated.
Tests
Measured results (constrained, loopback only — not production/latency claims)
- `results/socket_profile_loopback.txt` (containerized Linux, Docker `--cap-add=SYS_PTRACE`): for the trivial-per-order loopback workload, user-space matching CPU fell below the 10 ms clock tick; the measurable CPU was the kernel/socket path, with ~1 voluntary context switch per connection; syscall mix exactly `accept`/`read`/`sendto`/`close`.
- `results/socket_stress_summary.txt` (native): a 2 KiB receive buffer intermittently dropped datagrams under a ~16.9k-datagram burst (gaps/trial `6,0,1,0`) while the OS default (~768 KiB) and an 8 MiB buffer lost nothing. UDP loss is timing/OS dependent and reported per-trial.
Limitations
- `strace` perturbs timing; the user/kernel CPU split comes from the procfs rusage pass, not strace.
- `epoll`) multi-client serving is M34/M35.
- `Dirty tree: yes`); regenerate on a clean Linux checkout for a clean-tree version. Full hardware PMU evidence remains issue M29 follow-up: generate full Linux hardware PMU perf artifacts #90.