netlink, nltest: add ReceiveIter to stream responses by nickgarlis · Pull Request #258 · mdlayher/netlink

nickgarlis · 2026-02-25T21:15:17Z

Add ReceiveIter() method that returns an iter.Seq2[Message, error] for iterating over netlink messages without collecting them into a slice. Refactor lockedReceive() to use the new lockedReceiveIter() iterator internally to eliminate code duplication.

I've tested this with google/nftables and saw a notable 33% reduction in memory pressure on large responses.

This is in an exploratory phase but is looking promising so far. I think that in order to evaluate this approach, the following should be addressed:

- Decide what the behavior should be when the user stops iterating. Should the socket buffer be drained, or should that be the user's responsibility? What happens in an async scenario?
- Make sure it doesn't disrupt packages using other netlink families. (I've tested this change with wgctrl, rtnetlink, and go-tc, and all tests pass.)
- Add tests and benchmarks.

cc @aojea

Currently, listing many rules or set elements generates a lot of intermediate slices, which increases memory usage unnecessarily. This change unmarshals elements during iteration, avoiding these intermediate allocations. Benchmarks show improved performance, particularly when reading rules. Depends on: mdlayher/netlink#258

mdlayher · 2026-02-27T03:46:02Z

Hi @nickgarlis, thanks for the contributions. I've added you as a collaborator since you're actively making improvements to the ecosystem. Thank you!

nickgarlis · 2026-02-27T06:14:42Z

> Hi @nickgarlis, thanks for the contributions. I've added you as a collaborator since you're actively making improvements to the ecosystem. Thank you!

Hi @mdlayher, thanks for the invite! I appreciate the trust. I’ll do my best to be helpful and keep things in good shape.

conn.go

Add ReceiveIter() method that returns an iter.Seq2[Message, error] for iterating over netlink messages without collecting them into a slice. Refactor lockedReceive() to use the new lockedReceiveIter() iterator internally to eliminate code duplication. Reduces memory usage for large responses, particularly when using ReceiveIter. If iteration is stopped early on multi-part responses, the remaining buffer is drained to keep the socket in a consistent state.

nickgarlis · 2026-03-19T19:59:02Z

I benchmarked this change using BenchmarkNftablesDump with the current nftables implementation (using Receive), and observed the following results:

goos: linux
goarch: amd64
pkg: github.com/mdlayher/netlink/internal/integration
cpu: 13th Gen Intel(R) Core(TM) i9-13900H
                      │    v1.txt    │               v2.txt               │
                      │    sec/op    │   sec/op     vs base               │
NftablesDump/1-20        65.18µ ± 7%   65.49µ ± 8%       ~ (p=0.796 n=10)
NftablesDump/8-20       103.67µ ± 3%   95.67µ ± 8%  -7.72% (p=0.002 n=10)
NftablesDump/64-20       297.5µ ± 2%   296.0µ ± 3%       ~ (p=0.353 n=10)
NftablesDump/512-20      1.835m ± 3%   1.819m ± 3%       ~ (p=0.436 n=10)
NftablesDump/4096-20     18.45m ± 3%   18.21m ± 3%  -1.29% (p=0.023 n=10)
NftablesDump/32768-20    225.7m ± 4%   225.3m ± 8%       ~ (p=0.971 n=10)
geomean                  1.577m        1.550m       -1.71%

                      │    v1.txt    │               v2.txt                │
                      │     B/op     │     B/op      vs base               │
NftablesDump/1-20       11.18Ki ± 0%   11.34Ki ± 0%  +1.43% (p=0.000 n=10)
NftablesDump/8-20       23.90Ki ± 0%   23.18Ki ± 0%  -3.00% (p=0.000 n=10)
NftablesDump/64-20      125.5Ki ± 0%   122.1Ki ± 0%  -2.69% (p=0.000 n=10)
NftablesDump/512-20     953.5Ki ± 0%   916.4Ki ± 0%  -3.89% (p=0.000 n=10)
NftablesDump/4096-20    7.974Mi ± 0%   7.435Mi ± 0%  -6.77% (p=0.000 n=10)
NftablesDump/32768-20   66.97Mi ± 0%   64.22Mi ± 0%  -4.10% (p=0.000 n=10)
geomean                 511.5Ki        495.1Ki       -3.20%

                      │   v1.txt    │               v2.txt               │
                      │  allocs/op  │  allocs/op   vs base               │
NftablesDump/1-20        146.0 ± 0%    156.0 ± 0%  +6.85% (p=0.000 n=10)
NftablesDump/8-20        407.0 ± 0%    417.0 ± 0%  +2.46% (p=0.000 n=10)
NftablesDump/64-20      2.486k ± 0%   2.494k ± 0%  +0.32% (p=0.000 n=10)
NftablesDump/512-20     19.06k ± 0%   19.02k ± 0%  -0.19% (p=0.000 n=10)
NftablesDump/4096-20    151.6k ± 0%   151.2k ± 0%  -0.26% (p=0.000 n=10)
NftablesDump/32768-20   1.212M ± 0%   1.208M ± 0%  -0.27% (p=0.000 n=10)
geomean                 8.959k        9.089k       +1.45%

I then updated the consumer to use ReceiveIter as shown in this PR google/nftables#357, and got the following results:

goos: linux
goarch: amd64
pkg: github.com/mdlayher/netlink/internal/integration
cpu: 13th Gen Intel(R) Core(TM) i9-13900H
                      │    v1.txt    │                v3.txt                │
                      │    sec/op    │    sec/op     vs base                │
NftablesDump/1-20        65.18µ ± 7%   63.21µ ±  7%        ~ (p=0.739 n=10)
NftablesDump/8-20       103.67µ ± 3%   90.83µ ± 12%  -12.38% (p=0.002 n=10)
NftablesDump/64-20       297.5µ ± 2%   302.2µ ±  5%        ~ (p=0.579 n=10)
NftablesDump/512-20      1.835m ± 3%   1.840m ±  3%        ~ (p=1.000 n=10)
NftablesDump/4096-20     18.45m ± 3%   16.40m ±  4%  -11.11% (p=0.000 n=10)
NftablesDump/32768-20    225.7m ± 4%   201.7m ±  2%  -10.62% (p=0.000 n=10)
geomean                  1.577m        1.481m         -6.04%

                      │    v1.txt    │                v3.txt                │
                      │     B/op     │     B/op      vs base                │
NftablesDump/1-20       11.18Ki ± 0%   11.32Ki ± 0%   +1.25% (p=0.000 n=10)
NftablesDump/8-20       23.90Ki ± 0%   22.06Ki ± 0%   -7.67% (p=0.000 n=10)
NftablesDump/64-20      125.5Ki ± 0%   110.9Ki ± 0%  -11.67% (p=0.000 n=10)
NftablesDump/512-20     953.5Ki ± 0%   820.9Ki ± 0%  -13.91% (p=0.000 n=10)
NftablesDump/4096-20    7.974Mi ± 0%   6.419Mi ± 0%  -19.50% (p=0.000 n=10)
NftablesDump/32768-20   66.97Mi ± 0%   51.34Mi ± 0%  -23.34% (p=0.000 n=10)
geomean                 511.5Ki        445.8Ki       -12.83%

                      │   v1.txt    │               v3.txt               │
                      │  allocs/op  │  allocs/op   vs base               │
NftablesDump/1-20        146.0 ± 0%    158.0 ± 0%  +8.22% (p=0.000 n=10)
NftablesDump/8-20        407.0 ± 0%    413.0 ± 0%  +1.47% (p=0.000 n=10)
NftablesDump/64-20      2.486k ± 0%   2.484k ± 0%  -0.08% (p=0.000 n=10)
NftablesDump/512-20     19.06k ± 0%   19.01k ± 0%  -0.27% (p=0.000 n=10)
NftablesDump/4096-20    151.6k ± 0%   151.2k ± 0%  -0.27% (p=0.000 n=10)
NftablesDump/32768-20   1.212M ± 0%   1.208M ± 0%  -0.27% (p=0.000 n=10)
geomean                 8.959k        9.086k       +1.42%


nickgarlis mentioned this pull request Feb 25, 2026

Unmarshal directly during iteration to reduce intermediate slices google/nftables#357

Draft

aojea reviewed Feb 27, 2026

aojea reviewed Feb 27, 2026

nickgarlis mentioned this pull request Feb 27, 2026

netlink: add ExecuteFunc method to process large responses with low mem overhead #214

Open

nickgarlis force-pushed the add-iterator branch from 6adc9ca to 9c6cd65 Compare March 19, 2026 17:55

nickgarlis changed the title ~~Add ReceiveSeq iterator for streaming message responses~~ netlink, nltest: add ReceiveIter to stream responses Mar 19, 2026

nickgarlis marked this pull request as ready for review March 19, 2026 19:59

nickgarlis merged commit 5af0e4f into mdlayher:main Mar 22, 2026
8 checks passed
