
feat: TCP-based peer discovery + v2 RLPx handshake #10

Merged: andrepatta merged 41 commits into main from parallax-disc on Apr 23, 2026

@andrepatta
Member

PIP-0006: TCP-based peer discovery + v2 RLPx handshake

Replaces Parallax's UDP-based discv4 peer discovery with a Bitcoin Core-style stochastic address manager and a TCP-only gossip subprotocol, and introduces a BIP324-style v2 RLPx handshake that authenticates peers from ip:port alone — no persistent secp256k1 identity, no enode URL, no ENR record.

Reference implementation: Bitcoin Core tag v31.0.


TL;DR for reviewers

  • One new operator flag: --legacy-discovery=[auto|on|off]. Default auto. Controls whether the v1.x transport surface (UDP discv4 + legacy RLPx handshake) is exposed alongside the new v2 stack.
  • The addrman, v2 handshake, and parallax-disc/1 subprotocol are always on — not feature-flagged. They are the v2.0 design.
  • The v1.x path still works: --legacy-discovery=auto (default) keeps discv4 responding to inbound PING/FINDNODE and the legacy RLPx listener accepting ECIES auth packets. A pre-PIP peer dialing a v2.0 node succeeds exactly as before.
  • A v2-only node (--legacy-discovery=off) has no UDP socket, refuses legacy RLPx inbound, and refuses to dial legacy KeyType=0x01 addrman entries. Its enode URL is reported as null in admin.nodeInfo.
  • Bootnode format changed: netparams.MainnetBootnodes is now plain ip:port strings. All Parallax v2.0 bootnodes run the v2 handshake.

Motivation

discv4 is UDP-only, Kademlia-structured, and every peer must carry a persistent secp256k1 identity exchanged via enode:// URLs and ENR records. Bitcoin Core shipped the same pattern and moved away from it starting with BIP155 (addrman) and BIP324 (v2 transport) because:

  1. UDP NAT traversal is unreliable. Every large Parallax mainnet operator runs into it eventually.
  2. Persistent peer identity is a privacy liability. A stable 64-byte pubkey is a tracking identifier across reconnects.
  3. Structured discovery is complex. Kademlia in p2p/discover/ is 3000+ lines of code that does something every Bitcoin node does with a 300-line stochastic bucket table + TCP-gossip.

PIP-0006 is the mechanical port of that design, scoped to a single release cycle and staged behind the plan's v2.0 → v2.2 → v3.0 deprecation timeline so the network can migrate without a flag day.

Architecture

┌─────────────────────────────────────────────────────┐
│                    Parallax Node v2.0                │
├─────────────────────────────────────────────────────┤
│  p2p.Server                                          │
│    ├── dialScheduler ──→ reads from addrman          │
│    ├── runV2Dialer    ──→ drains addrman V2Iter      │
│    ├── RLPx listener (TCP :32110)                    │
│    │    ├── peek first byte (PeekVersion)            │
│    │    ├── 0xA0      → v2Transport (BIP324-style)   │
│    │    └── otherwise → rlpxTransport (legacy ECIES) │
│    └── discv4 UDP listener — responder-only by default│
│                                                      │
│  addrman                                             │
│    ├── tried table (256 × 64 buckets)                │
│    ├── new   table (1024 × 64 buckets)               │
│    ├── persistence: addrbook.rlp (Version || body)   │
│    └── sources: tcp_gossip, legacy_udp, dns_seed,    │
│                 manual, self_advertised              │
│                                                      │
│  parallax-disc/1 subprotocol (over RLPx)             │
│    ├── GetPeers / Peers / YourAddr messages          │
│    ├── token-bucket rate limits + known-addr bloom   │
│    ├── external-address quorum (≥3 distinct groups)  │
│    └── source=tcp_gossip into addrman                │
│                                                      │
│  v2 handshake (BIP324-style)                         │
│    ├── X25519 ephemeral DH (no persistent identity)  │
│    ├── HKDF-SHA256 → two ChaCha20-Poly1305 keys      │
│    └── 3-byte length + AEAD framing                  │
└─────────────────────────────────────────────────────┘

The seven phases

Each phase landed as an independently-testable commit series, matching PIP-0006's phased deployment plan.

Phase 1 — addrman port (3 commits)

Standalone port of Bitcoin Core's src/addrman.{h,cpp} and src/netgroup.cpp::GetGroup:

  • Two-table bucketing (256 tried × 64, 1024 new × 64), keyed by a per-node 256-bit nKey.
  • Bucket hash: (source_group, address_group) with SHA-256 cheap-hash matching Bitcoin's HashWriter::GetCheapHash semantics.
  • Group derivation: /16 IPv4, /32 IPv6, top-4-bits Tor v3 / I2P, 12-bit prefix CJDNS. asmap intentionally not ported (v2.0 non-goal).
  • Full op surface: Add, Good (with Test-before-evict), Attempt, Connected, Select, GetAddr, ResolveCollisions, SelectTriedCollision.
  • IsTerrible + GetChance preserve Bitcoin's horizon (30 days), retries (3), and per-entry chance deprioritization (0.66^attempts).
  • Persistence: RLP envelope Version || body; v1 schema + a decodeWithMigration dispatcher. Unknown newer versions return ErrFutureSchema without truncating the file.
  • nKey lifecycle: 256-bit crypto/rand on first load, stored in addrbook.rlp, never auto-rotated.

Benchmarks: Select < 1µs with 10k entries, 0 allocations.

Phase 2 — parallax-disc/1 wire format (1 commit)

New devp2p subprotocol: capability parallax-disc version 1, length 3.

  • 0x00 GetPeers{} — mirrors Bitcoin getaddr.
  • 0x01 Peers{Entries} — max 1000 entries.
  • 0x02 YourAddr{NetworkID, Addr, TCPPort} — first message post-negotiation on each side. Reports the remote's observed TCP source for external-address quorum. (Structural deviation from Bitcoin's version-message piggyback, documented in doc.go because devp2p negotiates caps before any subprotocol message.)
  • PeerEntry carries a BIP155 NetworkID (0x01=IPv4 / 0x02=IPv6 / 0x03=Tor v2 decode-only / 0x04=Tor v3 / 0x05=I2P / 0x06=CJDNS), a KeyType (0x00 v2-native / 0x01 legacy secp256k1), and a LastSeen origin claim clamped to [now-10min, now+10min] on ingest.
  • Fuzz coverage: FuzzPeerEntryDecode, FuzzPeersDecode, FuzzYourAddrDecode, FuzzHandlerDispatch. ~200k execs/sec, no panics. Handler state invariants: per-peer counters bounded, rate-limit levels within [0, burst_cap], no message accepted before capability negotiation.

Phase 2b — BIP324-style v2 RLPx handshake (3 commits)

  • X25519 ephemeral DH via crypto/ecdh.
  • HKDF-SHA256 binds session keys to the full transcript (label || initPub || respPub) — keys cannot be replayed across handshake rounds.
  • ChaCha20-Poly1305 AEAD framing, counter-based nonce per direction.
  • 3-byte big-endian length prefix (same 24-bit width as legacy RLPx).
  • Listener dispatcher PeekVersion: reads first byte, 0xA0 → v2 (magic consumed), anything else → legacy (byte replayed via PeekedConn). No byte-range guessing for legacy — the previous 0xf8/0xf9/0xfa whitelist was wrong (RLPx v4 auth packets start with a 2-byte big-endian length prefix, typically 0x01xx).
  • Outbound v2 dial via Server.DialV2(*net.TCPAddr), driven by addrman.V2Iter for KeyType=0x00 entries. Legacy outbound is marked with a new v2DialedConn flag rather than inferring from dialDest==nil.
  • Documented deviations from BIP324: no ElligatorSwift encoding, no garbage bytes, plaintext length prefix, no session-ID reconnect. Parallax's explicit version byte makes those unnecessary.

Session identity: node.ID derived from keccak256(remoteEphem || sha256(remoteEphem)) via enode.SignNull. Session-scoped, stable for the connection lifetime, rotates per reconnect. No persistent secp256k1 identity is used for v2 sessions. Forward secrecy of identity is preserved — matches Bitcoin Core's post-BIP324 posture.

Phase 3 — addrman as dial source (1 commit)

  • p2p.Server.AddrManager *addrman.AddrMan seam (node layer constructs the instance before Server.Start so parallax-disc/1 can register without an import cycle).
  • addrman.NodeIter adapts addrman to the enode.Iterator interface for the existing dialScheduler.
  • addrman.TeeIter wraps ntab.RandomNodes() so every discv4 discovery tees into addrman with source=legacy_udp.
  • BootstrapNodes (plain ip:port, KeyType=0x00) ingested into addrman with source=dns_seed at Start.
  • Lifecycle hooks: addrman.Good fires on checkpointAddPeer success, addrman.Attempt on delpeer with non-nil err for outbound connections. Mirrors Bitcoin's CAddrMan::Good / ::Attempt wiring.
  • Save-on-shutdown: addrbook.rlp persists across restarts, verified by TestRoundTripSerialization.

Phase 4 — Gossip activation (2 commits)

  • disc.AddrmanBackend implements the Backend interface: HandlePeers ingests with source=tcp_gossip + a 2-hour LastSeen penalty; SamplePeers draws from addrman.GetAddr(filtered=true); SelfEntry returns the quorum winner or operator override.
  • Per-peer token bucket: 0.1/sec refill + burst 1 for inbound, 1.0/sec + burst 10 for outbound. Addresses that exceed the rate are dropped silently (Bitcoin parity — banning rate-exceeders is itself a DoS vector).
  • Per-peer known-address bloom filter (~72 kbit, 10 hashes) prevents re-relay within a session.
  • External-address Quorum: ≥3 reports from distinct /16-IPv4 / /32-IPv6 groups required. Unparseable / zero groups never contribute. --nat extip:<IP> / UPnP / PMP short-circuit the tally.
  • Direction-aware greeting: outbound peers send YourAddr + Peers(self) (if quorum) + GetPeers; inbound peers send YourAddr only. Bitcoin parity — an inbound peer could be an adversary probing our addrbook.
  • Poisson-jittered Peers response (mean 2s, capped at 3× mean); tests override to 0.

Coverage: FuzzQuorumReports (17k execs/sec, override-always-wins invariant), FuzzOpsInterleaved (6.6k execs/sec, bucket-size ≤ 64 invariant + network-count consistency).

Phase 5 — Dual-stack bridging (1 commit)

  • Source-aware bucket eviction: higher-priority Source displaces a lower-priority occupant on bucket collision. Priority order: manual > self_advertised > tcp_gossip > dns_seed > legacy_udp. manual entries are exempt — operator intent overrides gossip.
  • Source-weighted Select: chance multiplier manual=4.0, self_advertised=1.5, tcp_gossip=1.0, dns_seed=0.75, legacy_udp=0.5. Verified by TestSelectPrefersTCPGossipOverLegacyUDP.
  • 15-minute ticker logs at warn level when legacy_udp exceeds 50% of a well-populated addrbook.
  • Address-poisoning defense verified: TestAddressPoisoningLegacyUDP... floods 2000 hostile legacy_udp entries; all 50 pre-promoted tcp_gossip tried entries survive.

Phase 6 — Tooling (2 commits)

  • addrman.Remove — operator-initiated deletion from either table.
  • addrman.ResetKey — regenerate nKey, clear tried, re-home new under the new key. Refuses on deterministic instances.
  • addrman.Snapshot — read-only JSON shape for RPC.
  • Five new admin RPCs on privateAdminAPI: admin_addnode, admin_removenode, admin_addrbookStatus, admin_addrbookResetKey, admin_dialV2.
  • Five new parallax-cli subcommands: addnode, removenode, addrbook-status, addrbook-resetkey, dialv2.
  • cmd/devp2p parallax-disc crawl <enode>: one-shot probe that dials an enode over RLPx, negotiates parallax/66 + parallax-disc/1, computes the subprotocol code offset, sends YourAddr + GetPeers, and emits the returned Peers entries as JSON.
  • admin_addPeer / admin_removePeer now branch on input format: enode://… → legacy path (rejected in v2-only mode); ip:port → v2 path via Server.DialV2 + DisconnectByAddr.

Phase 7 — Testnet burn-in

Not a code phase. The plan gates the v2.0 → v2.1 → v2.2 → v3.0 transitions on telemetry thresholds (Phase 5 dominance warning, Phase 6 addrbook-status aggregates, v1.x peer ratio < 5%).

Live validation

Three-node mainnet topology (fresh datadirs for B and C, real mainnet peers via Node A, single-host on a LAN IP):

| Node | Flags | Connectivity |
|---|---|---|
| A (simulated v1.x) | no PIP flags | 7 legacy-RLPx peers via real mainnet discv4 |
| B (bridge) | --legacy-discovery=auto | 3 legacy RLPx sessions to mainnet + 1 v2 session from C on the same listener |
| C (v2-only) | --legacy-discovery=off | 1 v2 session (B), zero UDP socket, enode URL null in admin.nodeInfo. Block sync from genesis to head via TCP-only parallax-disc gossip. |

Observed at t+10min:

  • All three nodes converged to block 35198.
  • Node C imported blocks at mainnet cadence over pure-TCP discovery.
  • Node C's UDP port absent in ss -uln.
  • admin_addPeer from A → C's enode URL: refused with LegacyHandshakeMode=off: legacy RLPx refused.
  • admin.peers on C: enode: null, enr: null, id: null, parallax-disc.handshake: "v2".

The single operator flag

--legacy-discovery=auto | on | off        (default: auto)

| Value | UDP discv4 | Legacy RLPx | v2 handshake | addrman | Identity published |
|---|---|---|---|---|---|
| auto | responder-only | accepted + dialed | always on | always on | enode + ENR |
| on | full (dial source) | accepted + dialed | always on | always on | enode + ENR |
| off | disabled | refused | always on | always on | null (session-scoped) |

Earlier drafts had three separate flags: --experimental-addrman, --experimental-v2-handshake, and --legacy-handshake. They collapsed into this one because:

  1. addrman is the v2 design. Gating it is gating the feature.
  2. v2 handshake is always-on code. The listener peek costs 1 byte; dialer branches on addrman KeyType.
  3. UDP discv4 and legacy RLPx are the same v1.x identity model. Persistent secp256k1, enode URL, ENR — they travel together.

The missing posture — "legacy only, no v2" — is what a pre-PIP binary already gives you. No flag needed.

Deprecation timeline

| Release | Default --legacy-discovery | Changes |
|---|---|---|
| v2.0 (this PR) | auto | New flag ships, v2 stack always on. |
| v2.1 | auto | legacy_udp dominance warning fires when > 50%. |
| v2.2 (telemetry-gated: v1.x < 5%) | off | Default flips. Legacy still selectable via --legacy-discovery=on. |
| v3.0 | flag removed | Delete p2p/enr/, p2p/discover/v4*.go, legacy ECIES handshake. parallax-disc/2 replaces PeerEntry with (NetworkID, Addr, TCPPort, LastSeen) only. |

Estimated calendar: 9–12 months v2.0 → v3.0.

Files touched

54 files changed, +9506/-84.

Major additions:

  • p2p/addrman/ — new package, ~2.5k LOC + tests + fuzzing.
  • p2p/protocols/disc/ — new subprotocol package, ~1.5k LOC.
  • p2p/rlpx/bip324handshake/ — new handshake package, ~700 LOC.
  • p2p/transport_v2.go — v2 transport adapter, 240 LOC.
  • p2p/transport_v2_test.go — integration tests, 250 LOC.
  • docs/parallax-protocol/advanced/networking/pip-0006.mdx — operator doc.
  • cmd/devp2p/parallaxdisccmd.go — crawler tool.

Public API changes in p2p.Config:

  • Added: LegacyDiscoveryMode string (single operator knob), AddrBookPath string (addrbook persistence path), AddrManager *addrman.AddrMan (pre-built manager seam for node layer).
  • Type changed: BootstrapNodes []*enode.Node → BootstrapNodes []*net.TCPAddr. Parallax v2.0 bootnodes are ip:port only; --bootnodes enode://… is rejected with a clear error.

Public API changes in p2p.PeerInfo:

  • Enode, ENR, ID change from string to *string. v2 sessions marshal all three as JSON null (no persistent identity). Legacy sessions are unchanged.
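The effect of the string to *string change on JSON output can be seen with a cut-down struct. `peerInfo` here is a hypothetical stand-in for p2p.PeerInfo, and the JSON tags are assumed from the field names shown in the admin.peers output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// peerInfo is an illustrative stand-in, not the real p2p.PeerInfo.
// Pointer fields marshal as JSON null when nil.
type peerInfo struct {
	Enode *string `json:"enode"`
	ENR   *string `json:"enr"`
	ID    *string `json:"id"`
}

// marshalPeer returns the JSON form, or an empty string on error.
func marshalPeer(p peerInfo) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// v2 session: no persistent identity, all three fields nil.
	fmt.Println(marshalPeer(peerInfo{}))

	// Legacy session: fields populated (placeholder value).
	id := "placeholder-node-id"
	fmt.Println(marshalPeer(peerInfo{ID: &id}))
}
```

Downstream consumers that read these fields as plain strings need a nil check after this change.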

admin.nodeInfo.ports.discovery — in v2-only mode this reports the TCP port (the TCP listener is the only discovery surface) instead of the UDP port.

Test plan

  • go test ./p2p/... ./node/ -race — green across addrman, protocols/disc, rlpx, rlpx/bip324handshake, node.
  • BenchmarkSelect10k — 981 ns/op, 0 allocs (target < 1 µs).
  • BenchmarkHandshakeRoundTrip — 152 µs on net.Pipe. The plan's target (+20% over legacy RLPx) is ~120 µs, so this is about 32 µs over budget; logged for tuning.
  • Fuzz targets pass with no panics; throughput ranges from 6.6k execs/sec (FuzzOpsInterleaved) to 497k execs/sec (FuzzPeekVersionDispatch).
  • Live three-node mainnet validation (A/B/C above). Node C block sync verified over pure-TCP discovery.
  • Testnet burn-in (Phase 7) — out of scope for this PR.

Breaking changes for downstream consumers

  • p2p.Config.BootstrapNodes type change (enode.Node → TCPAddr).
  • p2p.PeerInfo.{Enode, ENR, ID} type change (string → *string).
  • netparams.MainnetBootnodes format change (enode URL → ip:port).

Non-breaking for end users with default flags — the transitional auto mode preserves existing interop with v1.x peers.

Related documents

  • docs/parallax-protocol/advanced/networking/pip-0006.mdx — operator documentation (flag, topology, admin RPC, deprecation calendar).

Scaffold the p2p/addrman package with:

- NetAddr / NetID (BIP155) types and routability filter
- Source enum (tcp_gossip, legacy_udp, dns_seed, manual, self_advertised)
- AddrInfo + IsTerrible/chance (ported from src/addrman.cpp at v31.0)
- Bucket math: triedBucket/newBucket/bucketPosition with SHA-256 cheapHash
- Group derivation: /16 for IPv4, /32 for IPv6, top-4-bits for Tor v3 / I2P,
  12-bit prefix for CJDNS

Pinned reference: Bitcoin Core tag v31.0. asmap is not ported; falls back
to the no-asmap path Bitcoin takes when the map is empty. The port is
standalone, with no wiring into p2p.Server yet.

Core AddrMan type with the full Bitcoin Core op set:

  - Add / AddOne: new-bucket placement with the stochastic refcount test
    and self-announcement penalty override
  - Good: Test-before-evict, deferring collisions to ResolveCollisions
  - Attempt: counts failures only past the last Good()
  - Connected: refreshes LastSeen with a 20m floor
  - Select: 50/50 new-vs-tried walk with GetChance weighting
  - GetAddr: partial Fisher-Yates sample, capped by max/pct/filtered
  - ResolveCollisions + SelectTriedCollision
  - FindAddressPosition for tests / inspection

Persistence to addrbook.rlp: Version || RLP(body). v1 schema and a
decodeWithMigration dispatcher; future-version files return ErrFutureSchema
without truncating the file. nKey lifecycle: 256-bit crypto/rand on first
load, stored inside the same file, never auto-rotated.
Acceptance coverage:

- Bucket assignment determinism under fixed nKey: TestBucketDeterministicNKey
- Uniform bucket distribution (no bucket > 2× expected on N=65536 diverse
  inputs): TestBucketDistributionUniform. N=65536 instead of 10k because
  Poisson variance makes 10k flaky under any well-mixing hash — the
  uniformity property asserted is identical.
- Round-trip serialization is a fixed point: TestRoundTripSerialization
- Future-version file refused without truncation: TestLoadFutureSchemaRefuses
- Fresh-install nKey generated when no file exists: TestLoadMissingFileGeneratesNKey

Also covers Add routability filter, Good → tried promotion, Attempt
counter post-Good semantics, Select walk, GetAddr limits, Tor v3 / IPv4 /
IPv6 group derivation, and 'N'/'K' position decorrelation between new and
tried tables.

BenchmarkSelect10k: 981ns/op, 0 allocs.
…er skeleton

Three messages:

  - GetPeers{}           (0x00) — empty payload; mirrors Bitcoin getaddr
  - Peers{Entries}       (0x01) — max 1000 entries
  - YourAddr{net,ip,port} (0x02) — first-message-after-negotiation,
    reports the remote's observed TCP source for quorum input

PeerEntry carries BIP155 NetworkID (IPv4/IPv6/Tor v2 decode-only/Tor v3/
I2P/CJDNS), KeyType (v2.0-native or legacy secp256k1), NodeID gated by
KeyType, and a LastSeen origin claim. Unknown NetworkID/KeyType values
are skip-on-ingest (forward compat); length mismatches disconnect.

Handler skeleton: Run() per-peer; both sides send YourAddr first; repeated
GetPeers / repeated YourAddr / oversized Peers are Bitcoin-parity
violations (silent ignore or disconnect as appropriate). Backend interface
is a seam for Phase 4 to wire addrman ingest and quorum.

Capability registered as parallax-disc/1 length 3. MaxMessageSize 256 kB
(2× a full 1000-entry Peers payload). ENR key "parallax-disc"=1.

Coverage:
- messages_test.go: round-trip for all 3 messages, Validate() cases for
  every skip-vs-disconnect branch, Peers cap boundary.
- handler_test.go: initial YourAddr write, Peers ingest with skip filter,
  oversize disconnect, double-YourAddr disconnect, GetPeers answer,
  repeat-GetPeers ignore.
- fuzz_test.go: PeerEntry / Peers / YourAddr decoders, ~200k execs/sec,
  no panics.
- fuzz_handler_test.go: arbitrary (code, payload) into handleOne, per-peer
  counter invariants hold.
Opt-in Bitcoin-style address manager alongside discv4. Default off —
existing dial path is unchanged when the flag isn't set.

When enabled, p2p.Server:

  - Loads <datadir>/addrbook.rlp on Start, creating a fresh nKey if the
    file is absent. Future-version files log a warning and proceed empty
    without truncating the existing file.
  - Ingests Config.BootstrapNodes (one-shot, source=dns_seed) at startup.
  - Wraps the discv4 iterator with addrman.TeeIter so every discovered
    node feeds addrman with source=legacy_udp before being handed to the
    dialer unchanged.
  - Adds addrman.NodeIter as a second FairMix source, so dial candidates
    are drawn alternately from discv4 and the persistent addrbook.
  - Saves on Stop().

AddrInfo gains KeyType + NodeID fields so the Iter adapter can
reconstruct enode.Node for legacy RLPx dialing. v2.0-native entries
(KeyType=0x00, empty NodeID) are stored but NodeIter skips them — the
BIP324-style handshake in the next phase unlocks dialing them.

Metrics: p2p/addrman/{tried_count,new_count,select_latency} plus one
gauge per source tag (tcp_gossip, legacy_udp, dns_seed, manual,
self_advertised). Refreshed on a 5s ticker while the server runs.

Tests cover NodeIter round-trip, TeeIter ingest, v2-native skip, and
bootnode IngestNode. Existing p2p tests green.
… rate limits

Concrete Backend (AddrmanBackend) wires parallax-disc/1 into the
addrman mounted on p2p.Server:

  - HandlePeers: ingests gossiped entries with source=tcp_gossip,
    clamps LastSeen to [now-10m, now+10m], applies the 2-hour gossip
    penalty before the Add path, and drops rate-exceeded entries
    silently (Bitcoin parity — banning rate-exceeders is itself a DoS
    vector against honest peers).
  - SamplePeers: draws from addrman.GetAddr filtered to non-terrible
    entries, re-materializes the PeerEntry wire shape with stored
    KeyType/NodeID.
  - SelfEntry: returns the quorum winner or the operator override.
  - Per-peer ingest token bucket: 0.1/sec refill, burst 1 for inbound;
    1/sec + burst 10 for outbound. Matches src/net_processing.cpp
    m_addr_token_bucket.

External-address Quorum:

  - In-memory tally keyed by (NetID, Addr, Port); reports indexed by
    PeerKey so a single peer can update without double-voting.
  - Quorum threshold = 3 distinct address groups (/16 IPv4, /32 IPv6).
    Reports with unparseable/zero group never contribute, per the
    security note in PIP-0006.
  - Operator override via SetOverride short-circuits tally entirely
    for --nat extip:<IP>, UPnP, PMP consumers.
  - Stats() exposes the current tally for parallax-cli addrbook status.
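
A simplified model of the tally, to make the double-vote and group rules concrete. All names are hypothetical, groups are represented as plain strings, and the real Quorum type may be structured differently.

```go
package main

import "fmt"

// addrKey identifies a candidate external address.
type addrKey struct {
	NetID byte
	Addr  string
	Port  uint16
}

type report struct {
	key   addrKey
	group string
}

type quorum struct {
	threshold int
	byPeer    map[string]report // one live vote per peer
	override  *addrKey
}

func newQuorum() *quorum {
	return &quorum{threshold: 3, byPeer: map[string]report{}}
}

// Report records a peer's observation. An empty group stands in for an
// unparseable or zero group and never contributes.
func (q *quorum) Report(peer string, key addrKey, group string) {
	if group == "" {
		return
	}
	q.byPeer[peer] = report{key: key, group: group} // replaces any earlier vote
}

func (q *quorum) SetOverride(key addrKey) { q.override = &key }

// Winner returns the operator override if set, else the address reported
// from the most distinct groups, provided it reaches the threshold.
func (q *quorum) Winner() (addrKey, bool) {
	if q.override != nil {
		return *q.override, true
	}
	groups := map[addrKey]map[string]struct{}{}
	for _, r := range q.byPeer {
		if groups[r.key] == nil {
			groups[r.key] = map[string]struct{}{}
		}
		groups[r.key][r.group] = struct{}{}
	}
	best, bestN := addrKey{}, 0
	for k, g := range groups {
		if len(g) > bestN {
			best, bestN = k, len(g)
		}
	}
	return best, bestN >= q.threshold
}

func main() {
	q := newQuorum()
	me := addrKey{NetID: 0x01, Addr: "198.51.100.7", Port: 32110}
	q.Report("p1", me, "g1")
	q.Report("p2", me, "g2")
	q.Report("p3", me, "g3")
	fmt.Println(q.Winner())
}
```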

Handler direction-sensitive greeting (Bitcoin parity):

  - Outbound peer: YourAddr, then 1-entry Peers(self) if a self-address
    has quorum, then GetPeers.
  - Inbound peer: YourAddr only. Never solicit — an inbound peer could
    be an adversary probing our addrbook.
  - GetPeers response: Poisson-jittered (mean 2s, capped at 3× mean).
    Addresses sent are added to a per-peer known-address bloom filter
    (~72 kbit, 10 hashes) so RelayAddress doesn't re-relay them.
  - Unsolicited-Peers rate: >1 per session disconnects. Single-entry
    self-advertise from outbound-peer greeting is explicitly allowed.

addrman.Lookup added as the backend's materialize hook.
Node.openEndpoints now constructs the addrman and registers the
parallax-disc/1 subprotocol against it before Server.Start — keeping
p2p itself free of the cycle that would appear if p2p imported
p2p/protocols/disc.

p2p.Server:

  - Config.AddrManager accepts a caller-supplied addrman so the Node
    layer can hand one in that's already been Load()ed and had
    bootnodes ingested. Server adopts it directly and skips internal
    Load, but still saves on Stop if AddrBookPath is set.
  - AddrBook() accessor lets upstream tooling read the live addrman.
  - Good/Attempt hooks: Server calls addrman.Good(peer.RemoteAddr) on
    checkpointAddPeer success, addrman.Attempt(peer.RemoteAddr, true)
    on outbound-dial delpeer with err != nil. Mirrors Bitcoin's
    CAddrMan::Good / ::Attempt wiring in src/net_processing.cpp.

Tests:

  - quorum_test.go: threshold activation at 3 distinct groups, group=0
    rejection, monotonic group count, override short-circuit, vote
    removal on disconnect. FuzzQuorumReports streams arbitrary
    (peerKey, net, addr, group) triples and asserts override always
    beats tally under churn.
  - ratelimit_test.go: inbound 0.1/s refill from burst=1, outbound 1/s
    from burst=10, bloom basic and ~0.1% false-positive rate at 100
    items.
  - fuzz_ops_test.go: interleaved Add/Good/Attempt/Select/Serialize
    sequences, asserts bucket occupancy ≤ 64, per-network count
    invariant holds, round-trip size stable. 6.6k execs/sec.

Race-clean across p2p, p2p/addrman, p2p/protocols/disc, node.

Source-aware bucket eviction (addSingleLocked):
  - On new-table slot collision, a higher-priority source displaces a
    lower-priority occupant even when the occupant is otherwise
    healthy. This prevents a legacy_udp flood from monopolizing
    buckets that would hold tcp_gossip entries.
  - Priority order: manual > self_advertised > tcp_gossip > dns_seed >
    legacy_udp. Manual entries are always exempt — operator intent
    outranks gossip hygiene.
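
The displacement rule reduces to a comparison on an ordered enum. This sketch uses hypothetical names and ignores the rest of the slot-collision logic in addSingleLocked.

```go
package main

import "fmt"

// Source values are declared in ascending priority order so that a plain
// integer comparison expresses the ranking.
type Source int

const (
	LegacyUDP Source = iota // lowest priority
	DNSSeed
	TCPGossip
	SelfAdvertised
	Manual // highest priority
)

// shouldDisplace reports whether an incoming entry may evict the current
// occupant of a new-table slot. Manual occupants are never displaced.
func shouldDisplace(incoming, occupant Source) bool {
	if occupant == Manual {
		return false
	}
	return incoming > occupant
}

func main() {
	fmt.Println(shouldDisplace(TCPGossip, LegacyUDP)) // gossip may evict legacy
	fmt.Println(shouldDisplace(LegacyUDP, TCPGossip)) // the reverse may not
}
```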

Select() source weighting:
  - Each source has a chanceMultiplier applied into the per-entry
    chance draw: manual 4.0, self_advertised 1.5, tcp_gossip 1.0,
    dns_seed 0.75, legacy_udp 0.5. Result: with equal populations, a
    tcp_gossip entry is ~2× more likely to be selected than
    legacy_udp. Verified by TestSelectPrefersTCPGossipOverLegacyUDP
    (>60% threshold).

--legacy-discovery=[auto|on|off]:
  - off:  NoDiscovery is forced true at Server.Start; no UDP discovery.
  - auto: discv4 runs (responds to inbound PING/FINDNODE and refreshes
          its table) but is not plumbed to the dialer — addrman is the
          source of truth.
  - on:   full discv4, RandomNodes iterator is added to discmix as a
          dial source. v1.x compatibility mode.
  Invalid values log a warning and fall back to auto.
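
The three modes and the fallback can be sketched as below. The type, constants and helper names are hypothetical; the behaviour per mode follows the list above.

```go
package main

import "fmt"

type legacyMode string

const (
	modeAuto legacyMode = "auto"
	modeOn   legacyMode = "on"
	modeOff  legacyMode = "off"
)

// parseLegacyDiscovery returns the mode for a flag value. For an invalid
// value it returns auto and valid=false so the caller can log a warning.
func parseLegacyDiscovery(s string) (mode legacyMode, valid bool) {
	switch m := legacyMode(s); m {
	case modeAuto, modeOn, modeOff:
		return m, true
	default:
		return modeAuto, false
	}
}

// discv4Listens: the UDP listener runs in every mode except off.
func discv4Listens(m legacyMode) bool { return m != modeOff }

// discv4FeedsDialer: only on plumbs discv4 results into the dialer.
func discv4FeedsDialer(m legacyMode) bool { return m == modeOn }

func main() {
	for _, v := range []string{"auto", "on", "off", "bogus"} {
		m, ok := parseLegacyDiscovery(v)
		fmt.Printf("%-5s mode=%-4s valid=%-5v listens=%-5v feedsDialer=%v\n",
			v, m, ok, discv4Listens(m), discv4FeedsDialer(m))
	}
}
```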

Telemetry:
  - 15-minute tick logs at warn level when legacy_udp entries account
    for >50% of an addrbook with ≥20 entries. Signals poor v2.0
    network share per PIP-0006 Phase 5.

Tests:
  - TestSourcePriorityOrdering locks the priority sequence.
  - TestSourceEvictionDisplacesLegacyUDP finds a colliding bucket
    slot and confirms the tcp_gossip-over-legacy_udp displacement.
  - TestManualExemptFromEviction ensures manual retags don't happen.
  - TestSelectPrefersTCPGossipOverLegacyUDP draws 2000 rounds from an
    equal-population table and asserts the ratio.
  - TestAddressPoisoningLegacyUDPFloodDoesNotDisplaceTriedGossip
    floods 2000 legacy_udp entries from a hostile source and confirms
    the 50 already-promoted tcp_gossip tried entries all survive.

minLegacyPeers reserved-slot floor: tracked as operational telemetry
only in this commit (source counts in metrics). Hard dial-time
enforcement is left for the flag-flip phase; the Select weighting
already delivers the dialing preference plan Phase 5 calls for.
…t-key

addrman primitives added for Phase 6 tooling:

  - Remove(addr): drops the entry from whichever table holds it,
    updates per-network / per-source counts, removes vRandom slot.
    Returns true iff an entry was actually removed.
  - ResetKey(): regenerates nKey via crypto/rand, clears the tried
    table (entries either move to new under the fresh nKey or are
    dropped if the target slot is full), and re-homes new-table
    occupants. Refuses on deterministic instances to protect tests.
  - Snapshot() + Status: read-only shape used by admin_addrbookStatus
    and `parallax-cli addrbook status`.

node.privateAdminAPI gains four methods:

  - Addnode(addr): ingests into addrman with source=manual. Accepts
    both plain "ip:port" (v2.0-native KeyType=0x00) and legacy
    "enode://<hex>@ip:port" (KeyType=0x01 with the parsed pubkey
    encoded as the 64-byte NodeID).
  - Removenode(addr): same parser + addrman.Remove.
  - AddrbookStatus(): returns *addrman.Status.
  - AddrbookResetKey(): calls addrman.ResetKey.

All four return ErrNodeStopped when the server is down, and a clear
"start with --experimental-addrman" error when the feature flag is
off so operators don't get a confusing nil deref.

Tests in p2p/addrman/admin_test.go cover: Remove for new/tried/unknown
paths, ResetKey deterministic refusal, ResetKey clears tried, Snapshot
shape with per-source counts.
…isc crawl

parallax-cli:

  - addnode <ip:port | enode://…>       → admin_addnode
  - removenode <ip:port | enode://…>    → admin_removenode
  - addrbook-status                     → admin_addrbookStatus
  - addrbook-resetkey                   → admin_addrbookResetKey

The addnode/removenode argument parser branches on the enode:// prefix
so operators can pass either the v2.0-native ip:port form (stored as
KeyType=0x00) or the legacy enode URL (KeyType=0x01 with the 64-byte
pubkey extracted). addrbook-status prints a human-readable dump of
addrman.Snapshot. addrbook-resetkey is labelled operator-only in the
Usage string since it's destructive (tried table is cleared).
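
The prefix-based branch in the argument parser could be sketched like this. `classifyNodeArg` is a hypothetical helper; it only decides the KeyType and does not extract the pubkey from an enode URL as the real parser does.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

const (
	keyTypeV2     = 0x00 // v2.0-native ip:port entry
	keyTypeLegacy = 0x01 // legacy secp256k1 entry from an enode URL
)

// classifyNodeArg picks the KeyType for an addnode/removenode argument.
func classifyNodeArg(arg string) (byte, error) {
	if strings.HasPrefix(arg, "enode://") {
		return keyTypeLegacy, nil
	}
	if _, err := netip.ParseAddrPort(arg); err != nil {
		return 0, fmt.Errorf("expected ip:port or enode:// URL: %w", err)
	}
	return keyTypeV2, nil
}

func main() {
	for _, a := range []string{"203.0.113.7:32110", "[2001:db8::1]:32110", "enode://abcd@203.0.113.7:32110", "not-an-address"} {
		kt, err := classifyNodeArg(a)
		fmt.Printf("%q keyType=%#02x err=%v\n", a, kt, err)
	}
}
```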

cmd/devp2p parallax-disc crawl:

One-shot probe that dials a seed enode over RLPx, advertises
parallax/66 + parallax-disc/1 in the devp2p Hello (parallax is kept
because most existing nodes disconnect without a shared base-chain
protocol), computes the parallax-disc code offset from the negotiated
cap list (alphabetical order, block assignment after base protocol's
16 codes + parallax/66's 17), sends YourAddr + GetPeers, and prints
the returned Peers entries as JSON.
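
The offset arithmetic described here can be worked through in a few lines. This is a sketch with hypothetical names; the lengths (16 base codes, 17 for parallax/66, 3 for parallax-disc/1) are the ones quoted in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

const baseProtocolLength = 16 // codes reserved for the devp2p base protocol

type capability struct {
	Name   string
	Length uint64 // number of message codes the capability uses
}

// codeOffsets assigns each shared capability a contiguous block of message
// codes, in alphabetical order, starting after the base protocol's block.
func codeOffsets(shared []capability) map[string]uint64 {
	caps := append([]capability(nil), shared...) // do not reorder the caller's slice
	sort.Slice(caps, func(i, j int) bool { return caps[i].Name < caps[j].Name })
	offsets := make(map[string]uint64, len(caps))
	next := uint64(baseProtocolLength)
	for _, c := range caps {
		offsets[c.Name] = next
		next += c.Length
	}
	return offsets
}

func main() {
	offsets := codeOffsets([]capability{
		{Name: "parallax-disc", Length: 3},
		{Name: "parallax", Length: 17},
	})
	fmt.Println(offsets["parallax"], offsets["parallax-disc"])
}
```

With these inputs parallax-disc starts at 16 + 17 = 33, so YourAddr (0x02) would travel as wire code 35.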

Output schema (network, ip, tcpPort, keyType, nodeId, lastSeen)
mirrors discv4-crawl's node-set shape so downstream analysis keeps
working during the transition window. Multi-hop walks can be layered
on top of crawlOne() by iterating; kept single-shot in this commit to
bound scope.

All four subcommands return clear errors when the node is running
without --experimental-addrman.
…mental-v2-handshake

Bitcoin-style v2 RLPx handshake that authenticates "whoever answered
on IP:port at session establishment time" — no pre-shared static peer
identity required. Same trust model Bitcoin adopted with BIP324.

  Wire format:
    byte  0      : 0xA0  (version-negotiation magic)
    bytes 1..32  : initiator's ephemeral X25519 public key
    bytes 33..64 : responder's ephemeral X25519 public key

  Keys: HKDF-SHA256(shared_ecdh, "i2r"/"r2i" || initPub || respPub).
    Both pubkeys in the info string bind the key to the specific
    handshake transcript; a key cannot be replayed across rounds.

  Frames: 3-byte big-endian length || ChaCha20-Poly1305(plaintext)
    with counter-based nonce (per-direction, increments by 1 per frame).
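
The framing header and nonce counter can be sketched without the AEAD itself. The helper names are hypothetical, and the nonce layout (four zero bytes followed by an 8-byte big-endian counter, filling the 12-byte ChaCha20-Poly1305 nonce) is an assumption, since the source only says the nonce is counter-based.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const maxFrame = 1<<24 - 1 // largest length a 3-byte prefix can carry

// putLen24 encodes n as a 3-byte big-endian length prefix.
func putLen24(n int) ([3]byte, error) {
	if n < 0 || n > maxFrame {
		return [3]byte{}, errors.New("frame length does not fit in 24 bits")
	}
	return [3]byte{byte(n >> 16), byte(n >> 8), byte(n)}, nil
}

// readLen24 decodes a 3-byte big-endian length prefix.
func readLen24(p [3]byte) int {
	return int(p[0])<<16 | int(p[1])<<8 | int(p[2])
}

// nonceFor builds a 12-byte nonce from a per-direction frame counter.
// The layout is an assumption of this sketch.
func nonceFor(counter uint64) [12]byte {
	var n [12]byte
	binary.BigEndian.PutUint64(n[4:], counter)
	return n
}

func main() {
	p, err := putLen24(4096)
	fmt.Printf("prefix=%x err=%v decoded=%d\n", p, err, readLen24(p))
	fmt.Printf("nonce(0)=%x nonce(1)=%x\n", nonceFor(0), nonceFor(1))
}
```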

Deviations from BIP324 (all documented in doc.go):

  - No ElligatorSwift encoding. The explicit 0xA0 version byte means
    indistinguishability-from-random is not a goal; this costs us an
    observable v2 marker but halves the math.
  - Plaintext length prefix instead of BIP324's encrypted length.
  - No garbage bytes. BIP324's middlebox-hardening padding is
    unnecessary when the outer connection is already labelled.
  - No session ID. Parallax v3.0 treats every connection as fresh.

Listener dispatch (version_negotiate.go):

  PeekVersion(conn) reads one byte and classifies as VariantV2,
  VariantLegacy, or VariantUnknown. Legacy replay is handled by
  PeekedConn, which hands the consumed byte back to the first Read.
  Legacy-byte range is 0xf8..0xfa (ECIES auth-packet RLP list-length
  prefix); disjoint from 0xA0.

Flag plumbing:

  --experimental-v2-handshake (off by default) exposes
  Config.ExperimentalV2Handshake. Server-level dispatch wiring lands
  in a follow-up; Phase 2b ships the primitive standalone.

Coverage:

  - TestHandshakeRoundTrip: full init/accept + bidirectional
    payload round-trip, 4 kB frames included.
  - TestHandshakeRejectsShortInit / RejectsInvalidKey: partial or
    zero-curve responder input surfaces an error without panicking.
  - TestNonceMonotonicity: per-direction counter nonce increments.
  - TestForwardSecrecy: same plaintext across two sessions produces
    different ciphertext — the ephemeral-only DH works.
  - TestPeekVersion[V2|Legacy|Unknown]: dispatcher classification.
  - TestRandomBytesNotMisread: exhaustive 0x00..0xFF first-byte
    sweep locks the variant map.
  - FuzzPeekVersionDispatch: 497k execs/sec, no panic, variant
    stays in the defined enum.
  - BenchmarkHandshakeRoundTrip: 152 µs/op on net.Pipe. Plan target
    (+20% over legacy RLPx) is ~120 µs; we're over budget at +52%,
    largely from HKDF + two AEAD construction costs. Not a blocker —
    kept as a note for future tuning.

Race-clean across p2p, p2p/rlpx, p2p/rlpx/bip324handshake, node.
Phase 2b is now end-to-end: listener dispatch, outbound dialer
variant selection, and v2-only posture.

v2Transport (p2p/transport_v2.go):
  - Wraps bip324handshake.Conn. Implements the p2p.transport
    interface — doEncHandshake → Dial/AcceptHandshake; ReadMsg /
    WriteMsg encode (code, payload) as RLP(code)||payload inside a
    single AEAD frame (same plaintext layout rlpx.Conn writes into
    its frames, so no downstream code cares which transport served).
  - doEncHandshake returns nil for the pubkey. Server.setupConn
    type-asserts *v2Transport and builds c.node via v2NodeFromConn,
    which calls enode.SignNull with an ID derived from keccak256 of
    the remote ephemeral X25519 key + its SHA-256 trailer. No
    pseudo-secp256k1-point math (the elliptic.Marshal curve-check
    in Go 1.26+ panics on off-curve values).
  - Peer.phs.ID is emitted as the same 64-byte ephem||SHA-256(ephem)
    blob on both sides. Server's post-handshake verify
    (keccak(phs.ID) == node.ID) still works because both sides
    derive the same bytes from the same ephem.

Listener dispatch (Server.dispatchInbound, Server.pickHandshakeVariant):
  - Accepted connection: bip324handshake.PeekVersion reads the first
    byte. 0xA0 → v2 (magic consumed). 0xf8/0xf9/0xfa → legacy (byte
    replayed via peekedConn). Anything else → reject.
  - Legacy inbound is further gated on LegacyHandshakeMode. When
    "off", legacy-shaped first bytes are refused.
  - Only active when ExperimentalV2Handshake is true; otherwise no
    peek is performed and the hot path is identical to pre-Phase-2b.

Dialer variant selection:
  - addrman.V2Iter yields KeyType=0x00 entries as bare NetAddrs.
  - Server runs runV2Dialer which drains V2Iter and calls DialV2
    (a new public method) per candidate. DialV2 opens TCP, wraps
    in v2Transport (outbound), and routes through the normal
    checkpoint/launchPeer flow.
  - Legacy dialer keeps its existing enode.Node-driven path via
    the discmix iterator. No regression on the hot path.

--legacy-handshake=on|off (p2p.Config.LegacyHandshakeMode):
  - "on" (default) is the v2.0 posture: legacy RLPx still offered
    and accepted. ExperimentalV2Handshake layers v2 on top.
  - "off" is the v3.0 posture flipped early. Listener refuses any
    non-v2 inbound; dialer refuses KeyType=0x01 entries; persistent
    secp256k1 identity is loaded but only for LocalNode diagnostics
    (no devp2p uses it when every peer handshake is v2). An
    operator who runs with --legacy-handshake=off is explicitly
    opting into "v2-only, enode URL is decorative."
  - Start validates that --legacy-handshake=off implies
    --experimental-v2-handshake so operators get a clear error.

admin_dialV2 RPC + `parallax-cli dialv2 <ip:port>`:
  - Operator-testing entry point for the v2 handshake. Bypasses the
    addrman routability filter so single-host topologies can dial
    loopback; production flow remains addrman → V2Iter → runV2Dialer.

Live three-node validation (mainnet genesis, single host):
  - Node A: no PIP-0006 flags (v1.x simulated).
  - Node B: --experimental-addrman --legacy-discovery=on
            --experimental-v2-handshake --legacy-handshake=on (bridge).
  - Node C: --experimental-addrman --legacy-discovery=off
            --experimental-v2-handshake --legacy-handshake=off (v2-only).

  Observed:
    * C ↔ B peer session: both sides report enode://null.<hex>
      identities (enode.SignNull output), inbound false on C,
      inbound true on B. v2 handshake exclusively.
    * A → C addpeer (legacy RLPx dial): C refuses ("LegacyHandshakeMode=off").
    * B → mainnet peers: 3 legacy sessions with real secp256k1
      pubkeys. B speaks both handshakes on the same listener.
    * C's UDP socket absent (`ss -uln` confirmed), enode URL is
      only a diagnostic artifact.

Coverage:
  - TestV2TransportFullFlow exercises the full enc + proto handshake,
    verifies the keccak(phs.ID) == node.ID symmetry on both sides.
  - TestV2TransportReadWriteRoundTrip: frame round-trip through ReadMsg/
    WriteMsg, AEAD framing bridge verified.
  - TestPickHandshakeVariantInbound: peeked-v2 / peeked-legacy /
    plain-conn classification.
  - TestLegacyHandshakeOff/OnAcceptsInbound: dispatch gate honored.

All p2p, p2p/rlpx/bip324handshake, p2p/addrman, p2p/protocols/disc,
node tests race-clean.
Before: four flags (--experimental-addrman, --experimental-v2-handshake,
--legacy-handshake, --legacy-discovery). Three were load-bearing code
paths hidden behind opt-ins or semantically inseparable from the fourth.

After: one flag, --legacy-discovery=[auto|on|off], which drives the
v1.x compatibility surface as a unit. UDP discv4 and legacy RLPx share
the same v1.x identity model, so they travel together. The addrman, v2
handshake, and parallax-disc/1 subprotocol are always on.

Config: remove ExperimentalAddrMan, ExperimentalV2Handshake,
LegacyHandshakeMode fields. Keep LegacyDiscoveryMode as the sole knob.

Server: setupAddrMan always runs; dispatchInbound always peeks;
PeekVersion now classifies exactly 0xA0 as v2 and every other byte as
legacy (the previous 0xf8/0xf9/0xfa whitelist was wrong — RLPx v4 auth
packets start with a 2-byte big-endian length prefix); legacyHandshakeMode
derives from legacyDiscoveryMode; pickHandshakeVariant keys v2-outbound
on a new v2DialedConn connFlag (set by Server.DialV2), not dialDest==nil.

Tests updated to write a byte post-dial so the peek succeeds inside
the 1-second race deadline; peek tests renamed to match the simplified
classifier.

plan.md and docs/ updated. New doc page
docs/parallax-protocol/advanced/networking/pip-0006.mdx describes the
flag, the canonical three-role topology, admin RPC surface, and
deprecation calendar.
PIP-0006 follow-up — the startup log and admin.nodeInfo output still
leaned on the legacy enode URL even when the node was running in
v2-only mode where that URL is diagnostic-only. Operators reading the
logs or the RPC could mistake it for a dialable identifier.

Startup log:
  - 'Started P2P networking' now emits ip=... port=... mode=...
    instead of the enode URL. The mode tag is one of:
      v2-only                    (--legacy-discovery=off)
      legacy+v2 (discv4-responder)   (--legacy-discovery=auto, default)
      legacy+v2 (discv4-full)         (--legacy-discovery=on)
  - The full enode URL is still emitted at DEBUG level in
    legacy-compat modes, for operators who need it to share with v1.x
    peers. In v2-only mode it isn't logged at any level.
  - A new 'P2P external address updated' log line fires when the
    LocalNode's advertised IP or TCP port changes (typically when
    NAT/UPnP resolves the public IP after boot).
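
The mode tag mapping can be sketched as a small pure function. `modeTag` is a hypothetical name, and treating an empty value as `auto` is an assumption based on `auto` being the default.

```go
package main

import "fmt"

// modeTag maps a --legacy-discovery value to the tag used in the startup log.
// Assumption: an unset (empty) value behaves like the default, "auto".
func modeTag(legacyDiscovery string) (string, error) {
	switch legacyDiscovery {
	case "off":
		return "v2-only", nil
	case "auto", "":
		return "legacy+v2 (discv4-responder)", nil
	case "on":
		return "legacy+v2 (discv4-full)", nil
	default:
		return "", fmt.Errorf("invalid --legacy-discovery value %q", legacyDiscovery)
	}
}

func main() {
	for _, v := range []string{"off", "auto", "on"} {
		tag, _ := modeTag(v)
		fmt.Printf("--legacy-discovery=%s -> mode=%s\n", v, tag)
	}
}
```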

admin.nodeInfo:
  - Enode and ENR are now *string — emit JSON null when the node is
    in v2-only mode. Simulation adapters (exec.go, inproc.go) updated
    to follow the new pointer-shape.
  - Ports.Discovery: legacy-compat modes still report the UDP port;
    v2-only mode reports the TCP port, because discovery on a v2-only
    node runs entirely over TCP via parallax-disc/1.
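
For readers unfamiliar with the pointer-shape, this sketch shows how a `*string` field marshals to JSON null. The `nodeInfo` struct here is a cut-down stand-in with assumed field names and tags, not the real p2p.NodeInfo, and the addresses are documentation-range placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeInfo is a reduced, hypothetical version of the RPC result.
type nodeInfo struct {
	Enode *string `json:"enode"`
	ENR   *string `json:"enr"`
	IP    string  `json:"ip"`
}

// render marshals the info; in v2-only mode the pointers stay nil.
func render(enode string, v2Only bool) string {
	info := nodeInfo{IP: "203.0.113.7"}
	if !v2Only {
		info.Enode = &enode
	}
	out, err := json.Marshal(info)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	fmt.Println(render("enode://abc@203.0.113.7:32110", false))
	fmt.Println(render("", true))
}
```

Clients that previously assumed a string must now handle null for both fields.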

admin_addPeer / admin_removePeer consistency:
  - Both accept either enode://<hex>@ip:port (legacy path) or plain
    ip:port (v2 path). admin_addPeer with ip:port invokes
    Server.DialV2 (single-shot; use admin_addnode for persistent
    pinning). admin_removePeer with ip:port scans connected peers for
    a matching RemoteAddr and disconnects via Peer.Disconnect.
  - When LegacyDiscoveryMode=off, enode://... input is rejected with
    a clear operator-facing error directing them at ip:port /
    admin_addnode.

Server helpers: LegacyHandshakeRefused() and DisconnectByAddr(*net.TCPAddr)
exposed for the admin handlers.

All p2p, p2p/addrman, p2p/protocols/disc, p2p/rlpx, node tests
race-clean.
…g "unknown"

Protocol.PeerInfo was nil, so p2p/peer.go's Info() fell back to the
literal string "unknown". Operators reading admin.peers saw
"parallax-disc": "unknown" on every connected peer.

Emit {version: 1} to match the existing shape used by "parallax"
and "parallax-snap". Future phases can extend PeerInfo with
per-session rate-limit bucket levels, Peers-message counters, and
quorum contributions when that data is worth surfacing.
admin.peers output under parallax-disc was just {version: 1}. Add a
Handshake field that reflects how WE authenticated this session:

  "v2"        — session uses the BIP324-style v2 handshake. The
                  remote supports v2; legacy support is unknown.
  "legacy+v2" — session uses legacy RLPx AND the remote advertises
                  parallax-disc/1 in its capabilities. Both handshake
                  variants work with this peer.

Wiring:
  - p2p.Peer gains UsingV2Handshake() — type-asserts the underlying
    transport against *v2Transport.
  - disc.Backend gains TrackHandshake / PeerHandshake; handler.Run
    records the variant on session start. AddrmanBackend keeps a
    map[enode.ID]string purged on PeerDisconnected.
  - PeerInfo callback looks up by enode.ID.

Now:
  admin.peers[0].protocols["parallax-disc"] = {version: 1, handshake: "v2"}
Bootnode list migration:
  - netparams.MainnetBootnodes entries are plain 'ip:port' strings —
    no more enode:// URLs. Parallax v2.0 bootnodes run only the
    BIP324-style v2 handshake; no NodeID is required or accepted.
  - p2p.Config.BootstrapNodes changes from []*enode.Node to
    []*net.TCPAddr. Feeds addrman with source=dns_seed and
    KeyType=0x00.
  - cmd/utils/flags.go setBootstrapNodes: rejects enode:// / enr:
    entries passed via --bootnodes with a clear error message.
  - discv4's Config.Bootnodes seed is dropped — v1.x-compat peers
    passing our node must bond via PING/PONG as they arrive rather
    than being pre-seeded. Operators who genuinely need a v1.x
    routing-table seed can use admin_addPeer at runtime.
  - addrman.IngestV2Addr: helper mirror of IngestNode that takes
    *net.TCPAddr and stores KeyType=0x00.

v2 peer dedup (fixes duplicate-session bug):
  - v2 handshake uses ephemeral X25519 keys per session, so node.ID
    (derived from those keys via enode.SignNull) is session-scoped.
    Repeated dials to the same (IP, TCP port) produced distinct IDs
    and slipped past Server's peers[enode.ID] dedup map, yielding
    multiple live sessions to the same host.
  - Server.postHandshakeChecks: when c.transport is *v2Transport,
    after the node.ID-based check, also reject on (IP, TCP port)
    match against any existing peer. Mirrors Bitcoin Core's
    FindNode(CService) pattern in src/net.cpp.
  - Server.DialV2: short-circuits before the TCP dial when
    alreadyConnectedTo(addr) is true. Saves a kernel socket + the
    handshake round-trip on duplicate targets.

All p2p, p2p/addrman, p2p/protocols/disc, p2p/rlpx, node tests
race-clean.
admin.peers still showed "enode://null.<hex>@ip:port" for v2 peers —
the synthetic URL form enode.SignNull emits when there is no pubkey.
It's not misleading on the face of it (the 'null.' prefix is a clear
marker), but it's inconsistent with admin.nodeInfo's treatment of the
same situation, which already marshals enode/enr as JSON null.

PeerInfo now matches:
  - Enode, ENR, ID all become *string so they can emit as null.
  - Info() fills them only when Peer.UsingV2Handshake() is false.
    For v2 peers all three stay nil.
  - PeersInfo sort uses '' for nil IDs.

cmd/parallax-cli's resolvePeerTarget helper (used by addpeer /
removepeer / addtrusted) now skips v2 peers when matching a host:port
against the current peer list — legacy admin RPCs can't produce an
enode:// URL for a v2 session, so it's the right thing to exclude
them and steer the operator toward admin_dialV2 / admin_addnode
via the existing 'no currently connected legacy peer at …' error.
Previous commit nulled Enode, ENR, AND ID for v2 peers. The
session-scoped ID is actually useful to keep visible: it lets
operators tell apart simultaneous peers in logs and metrics even
though it rotates per reconnect. Only Enode/ENR should be null —
those advertise a dialable address that v2 doesn't produce.

admin.peers v2 row now shows:
  id: "<64-hex>"        ← session-scoped, stable for session
  enode: null
  enr: null

Combined with protocols["parallax-disc"].handshake="v2",
operators can tell the ID is ephemeral rather than persistent.
PeersInfo sort reverts to the plain string comparison.
The session-scoped hash I left visible in the previous commit was
actively misleading: 64 hex chars next to enode=null invite consumers
to treat it as a persistent identifier, and then it silently rotates on
every reconnect. Bitcoin Core's getpeerinfo doesn't expose an
equivalent for this reason.

For v2 peers the answer to 'which peer is this' is (RemoteAddress,
LocalAddress) — already in PeerInfo.Network. Operators correlate v2
peers via that plus the parallax-disc.handshake="v2" tag.

PeersInfo sort falls back to RemoteAddress when ID is nil, preserving
deterministic output ordering.
@andrepatta andrepatta marked this pull request as ready for review April 23, 2026 03:26
Pre-wire an empty addrman into SimAdapter's p2p.Config so
Node.setupAddrManAndDisc early-returns and does not register
parallax-disc/1 on sim nodes. Sim connections run over net.Pipe
(zero-buffer), where disc's unsolicited YourAddr/addr(self)/GetPeers
traffic competes with test protocol handshakes for the peer's single
write slot and stalls them.

Restores TestMsgFilterPass{Multiple,Wildcard,Single} and TestSnapshot.
The addpeer ip:port-not-matched error now distinguishes legacy vs v2
peers ("no currently connected legacy peer at …"), so the test's
exact 'no currently connected peer' substring no longer matched.
Relax the assertion to 'no currently connected', which holds for both
the legacy phrasing and any future v2 variant.
Refactor the single-shot probe into a probeOne helper that branches
on a CrawlNode's KeyType: KeyType=0x00 dials over the BIP324 v2
handshake (no pubkey needed); KeyType=0x01 keeps the existing legacy
RLPx path with the pubkey reconstructed from the 64-byte NodeID.

Add a parseSeed parser that auto-detects ip:port (v2) vs enode://...
(legacy) — same convention as admin_addPeer, so operators can paste
either format.
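
A sketch of the auto-detection, assuming the legacy form is `enode://<128 hex chars>@ip:port` with no query string. The `seed` struct and `parseAddrPort` helper are illustrative, and `exampleEnode` is a made-up value.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"net/netip"
	"strings"
)

// exampleEnode is a fabricated legacy seed used only for the demo.
var exampleEnode = "enode://" + strings.Repeat("ab", 64) + "@192.0.2.1:30303"

// seed is a hypothetical parse result; Legacy marks the enode:// form.
type seed struct {
	Addr   netip.AddrPort
	Legacy bool
	NodeID []byte
}

// parseAddrPort accepts only literal ip:port with a non-zero port.
func parseAddrPort(s string) (netip.AddrPort, error) {
	ap, err := netip.ParseAddrPort(s) // rejects hostnames and a missing port
	if err != nil {
		return netip.AddrPort{}, fmt.Errorf("want ip:port: %w", err)
	}
	if ap.Port() == 0 {
		return netip.AddrPort{}, errors.New("port must be non-zero")
	}
	return ap, nil
}

// parseSeed branches on the enode:// prefix.
func parseSeed(s string) (seed, error) {
	rest, isEnode := strings.CutPrefix(s, "enode://")
	if !isEnode {
		ap, err := parseAddrPort(s)
		return seed{Addr: ap}, err
	}
	idHex, hostport, ok := strings.Cut(rest, "@")
	if !ok {
		return seed{}, errors.New("enode URL is missing @ip:port")
	}
	id, err := hex.DecodeString(idHex)
	if err != nil || len(id) != 64 {
		return seed{}, errors.New("enode URL needs a 64-byte hex node ID")
	}
	ap, err := parseAddrPort(hostport)
	if err != nil {
		return seed{}, err
	}
	return seed{Addr: ap, Legacy: true, NodeID: id}, nil
}

func main() {
	for _, in := range []string{"192.0.2.1:32110", exampleEnode, "example.org:32110"} {
		s, err := parseSeed(in)
		fmt.Println(s.Addr, s.Legacy, err)
	}
}
```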

Introduce a wireConn abstraction over rlpx.Conn and bip324handshake.Conn
so the post-handshake message loop is identical for both transports.
The v2 ID in the devp2p Hello is computed locally as
ephem || sha256(ephem) to match p2p/transport_v2.go's identity
derivation, keeping cmd/devp2p free of a hard p2p package dep.
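
The local identity computation is small enough to show in full. This sketch covers only the 64-byte `ephem || sha256(ephem)` blob; the keccak256 step that turns it into a node ID is left out because it needs a non-standard-library hash. `v2HelloID` is an illustrative name.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// v2HelloID returns the 64-byte blob ephem || SHA-256(ephem).
func v2HelloID(ephem [32]byte) [64]byte {
	var id [64]byte
	copy(id[:32], ephem[:])
	sum := sha256.Sum256(ephem[:])
	copy(id[32:], sum[:])
	return id
}

func main() {
	var ephem [32]byte
	ephem[0] = 0x42 // placeholder key material, not a real public key
	id := v2HelloID(ephem)
	fmt.Printf("%x\n", id[:])
}
```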
The single-shot probe moves to `parallax-disc probe <addr>`. The new
`parallax-disc crawl` runs a worker pool against a CrawlState JSON
file, BFS-walking the network from --bootnodes plus any nodes
already in state. Each worker calls probeOne and feeds the returned
Peers reply back into the queue.

Per-node stats (FirstSeen, LastSuccess, LastAttempt, SuccessCount,
FailCount, LastError, Capabilities) live on CrawlNode and are
written atomically (write-temp + rename) every --save-interval and
on exit. The walker exits when the queue drains or --timeout fires.
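
The write-temp + rename pattern mentioned above can be sketched like this. `saveAtomic` and `demoRoundTrip` are hypothetical names, and the state is a plain map rather than the real CrawlState type.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// saveAtomic writes v as JSON to a temp file in the target directory and
// renames it over path, so readers never observe a half-written file.
func saveAtomic(path string, v any) error {
	data, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demoRoundTrip saves a small state, reloads it, and checks for leftovers.
func demoRoundTrip() error {
	dir, err := os.MkdirTemp("", "crawl-state")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")
	if err := saveAtomic(path, map[string]int{"successCount": 3}); err != nil {
		return err
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var got map[string]int
	if err := json.Unmarshal(raw, &got); err != nil {
		return err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	if got["successCount"] != 3 || len(entries) != 1 {
		return errors.New("unexpected state after save")
	}
	return nil
}

func main() {
	fmt.Println("round trip error:", demoRoundTrip())
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.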

Self-loop guard skips loopback / unspecified / link-local /
multicast IPs returned in gossip. Non-IP networks (Tor / I2P /
CJDNS) are silently dropped from the queue — the crawler can't dial
them anyway.
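
A minimal version of the self-loop guard, using only net.IP predicates from the standard library. The function name matches the helper named later in this PR, but the body is a sketch and may differ from the real one; `dialable` is a string-taking demo wrapper.

```go
package main

import (
	"fmt"
	"net"
)

// isDialableIP rejects addresses a crawler should never dial from gossip.
func isDialableIP(ip net.IP) bool {
	return ip != nil &&
		!ip.IsLoopback() &&
		!ip.IsUnspecified() &&
		!ip.IsLinkLocalUnicast() &&
		!ip.IsLinkLocalMulticast() &&
		!ip.IsMulticast()
}

// dialable parses s first; unparseable input is treated as not dialable.
func dialable(s string) bool {
	return isDialableIP(net.ParseIP(s))
}

func main() {
	for _, s := range []string{"203.0.113.9", "127.0.0.1", "fe80::1", "224.0.0.1"} {
		fmt.Println(s, dialable(s))
	}
}
```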
Cover the pure helpers that the multi-hop walker leans on:

- parseSeed branches on enode:// vs ip:port and rejects malformed
  inputs (empty, missing port, zero port, hostnames, truncated
  pubkey).
- nodeKey is invariant under KeyType so a v1.x→v2.x migration keeps
  its accumulated stats; IPv4 and IPv6 with the same textual IP do
  NOT collide.
- peerEntryToCrawlNode skips Tor v3 / I2P / CJDNS (un-dialable),
  zero ports, and bad-length addresses.
- isDialableIP rejects loopback, unspecified, link-local, multicast.
- computeDiscOffset returns the right base for parallax+parallax-disc,
  parallax-disc alone, and noisy cap lists; errors on no parallax-disc.
- CrawlState round-trips through saveState/loadState; missing files
  load as empty; corrupt files error rather than silently overwriting.
- registerAndEnqueue dedups by nodeKey and refreshes identity fields
  while preserving stats on a v1.x→v2.x migration.
…to-route53)

`dns-seed compile` reads a CrawlState and writes a SeedZone JSON
after applying four filters: KeyType=0x00 only (DNS seed is the v2
bootstrap path), TCPPort=32110 only (Bitcoin parity for default-port
nodes), NetworkID in {IPv4,IPv6} only (DNS can't resolve Tor/I2P),
and a reliability gate (success in last --max-age, ≥--min-successes
probes, success rate ≥--min-success-rate). If the result has fewer
than --min-records entries, exits non-zero without writing —
defends against publishing an empty zone after a crawler outage.
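
The filter chain can be sketched as below. The struct fields, the `gate` type, and the thresholds in the demo are assumptions for illustration; the NetworkID filter is omitted because the sketch only models IP entries.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// crawlNode is a reduced, hypothetical view of a crawler record.
type crawlNode struct {
	IP           string
	KeyType      byte
	TCPPort      uint16
	LastSuccess  time.Time
	SuccessCount int
	FailCount    int
}

// gate holds the reliability thresholds (the flag values in the real tool).
type gate struct {
	MaxAge         time.Duration
	MinSuccesses   int
	MinSuccessRate float64
	MinRecords     int
}

// passes applies the key type, port, freshness, and success filters.
func (g gate) passes(n crawlNode, now time.Time) bool {
	if n.KeyType != 0x00 || n.TCPPort != 32110 {
		return false
	}
	if n.LastSuccess.IsZero() || now.Sub(n.LastSuccess) > g.MaxAge {
		return false
	}
	total := n.SuccessCount + n.FailCount
	if total == 0 || n.SuccessCount < g.MinSuccesses {
		return false
	}
	return float64(n.SuccessCount)/float64(total) >= g.MinSuccessRate
}

// compileSeeds returns the surviving IPs or refuses when too few remain.
func compileSeeds(nodes []crawlNode, g gate, now time.Time) ([]string, error) {
	var ips []string
	for _, n := range nodes {
		if g.passes(n, now) {
			ips = append(ips, n.IP)
		}
	}
	if len(ips) < g.MinRecords {
		return nil, errors.New("too few records; refusing to write zone")
	}
	return ips, nil
}

// demoCompile runs the filters over a fixed five-node fixture.
func demoCompile(minRecords int) ([]string, error) {
	now := time.Date(2026, 4, 23, 0, 0, 0, 0, time.UTC)
	hourAgo := now.Add(-time.Hour)
	nodes := []crawlNode{
		{"203.0.113.1", 0x00, 32110, hourAgo, 9, 1},                 // passes
		{"203.0.113.2", 0x01, 32110, hourAgo, 9, 1},                 // legacy key type
		{"203.0.113.3", 0x00, 30303, hourAgo, 9, 1},                 // non-default port
		{"203.0.113.4", 0x00, 32110, now.Add(-72 * time.Hour), 9, 1}, // stale
		{"203.0.113.5", 0x00, 32110, hourAgo, 3, 7},                 // low success rate
	}
	g := gate{MaxAge: 24 * time.Hour, MinSuccesses: 3, MinSuccessRate: 0.5, MinRecords: minRecords}
	return compileSeeds(nodes, g, now)
}

func main() {
	fmt.Println(demoCompile(1))
	fmt.Println(demoCompile(2))
}
```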

`dns-seed to-zonefile` emits a BIND `$ORIGIN` snippet with one
A/AAAA record per IP at the zone apex.

`dns-seed to-cloudflare` reconciles A/AAAA records at the apex via
the existing cloudflareClient (one DNS record per IP, idempotent
diff/apply mirroring uploadRecords).

`dns-seed to-route53` deploys one A RRSet (all IPv4) and one AAAA
RRSet (all IPv6) via UPSERT — matches Route53's billing model.

Three safeguards (filters at compile time, refusal to write a
near-empty zone at compile time, and idempotent UPSERT at deploy
time) keep the operator in control of what reaches the public DNS.
Extract compileSeedZone and renderZonefile from the cli actions so
they're directly testable without mocking urfave/cli. The tests
cover:

- TestCompileFiltersAndSorts on a 9-node fixture: exactly the
  v2/default-port/fresh/healthy entries pass; results sort A before
  AAAA, then by IP within family.
- TestCompileEachFilterAxis: one node per case isolates each filter
  (KeyType, port, freshness, success count, success rate, dialable
  IP) so a regression that breaks one axis can be diagnosed quickly.
- TestCompileRefusesNearEmpty: --min-records guard exits non-zero
  rather than overwriting the public DNS with a near-empty zone
  after a crawler outage.
- TestSeedZoneRoundTrip: save → load → reflect.DeepEqual.
- TestLoadSeedZoneRejectsEmptyName: malformed zones (missing name)
  fail load rather than silently deploying an unbound record.
- TestZonefileGoldenOutput: BIND snippet is byte-stable.
The node resolves netparams.MainnetDNSSeeds (defaults to
seed.prlxdisc.org) every 24h via net.DefaultResolver. Each A/AAAA
record returned is paired with the default v2 listen port (32110)
and ingested into addrman with source=dns_seed. First resolution
fires 30s after Server.Start so the listener and addrman have time
to settle.

DNSSeedResolver is an injectable interface so tests fake it
without touching DNS. Undialable IPs (loopback, unspecified,
link-local, multicast) are dropped defensively even though the
publisher already filters them — DNS responses can come from
anywhere.
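
One way to picture the injectable resolver: a one-method interface that *net.Resolver already satisfies, plus a map-backed fake for tests. The interface shape and every name here (`seedResolver`, `fakeResolver`, `resolveSeeds`) are assumptions; the real DNSSeedResolver may look different.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// seedResolver is a hypothetical stand-in for the injectable interface.
type seedResolver interface {
	LookupIPAddr(ctx context.Context, host string) ([]net.IPAddr, error)
}

// The production resolver satisfies the interface without an adapter.
var _ seedResolver = net.DefaultResolver

// fakeResolver answers from a fixed map, so tests never touch DNS.
type fakeResolver map[string][]string

func (f fakeResolver) LookupIPAddr(_ context.Context, host string) ([]net.IPAddr, error) {
	ips, ok := f[host]
	if !ok {
		return nil, fmt.Errorf("no such host %q", host)
	}
	out := make([]net.IPAddr, 0, len(ips))
	for _, s := range ips {
		out = append(out, net.IPAddr{IP: net.ParseIP(s)})
	}
	return out, nil
}

// dialable drops addresses that should never be ingested.
func dialable(ip net.IP) bool {
	return ip != nil && !ip.IsLoopback() && !ip.IsUnspecified() &&
		!ip.IsLinkLocalUnicast() && !ip.IsMulticast()
}

// resolveSeeds pairs every usable record with the default listen port.
func resolveSeeds(ctx context.Context, r seedResolver, hosts []string, port int) []*net.TCPAddr {
	var out []*net.TCPAddr
	for _, h := range hosts {
		addrs, err := r.LookupIPAddr(ctx, h)
		if err != nil {
			continue // one failing seed host must not block the others
		}
		for _, a := range addrs {
			if dialable(a.IP) {
				out = append(out, &net.TCPAddr{IP: a.IP, Port: port})
			}
		}
	}
	return out
}

// demoResolve runs the loop body once against the fake.
func demoResolve() []string {
	fake := fakeResolver{"seed.example": {"203.0.113.5", "127.0.0.1", "2001:db8::1"}}
	hosts := []string{"seed.example", "missing.example"}
	var got []string
	for _, a := range resolveSeeds(context.Background(), fake, hosts, 32110) {
		got = append(got, a.String())
	}
	return got
}

func main() {
	fmt.Println(demoResolve())
}
```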

Flag wiring:

- --dnsseed=<host,host,...> overrides the netparam.
- --dnsseed= (empty) disables.
- --nodiscover overrides everything to disable.
- Empty Config.DNSSeeds means the resolver loop never starts.

The loop is goroutine-managed under loopWG with a context cancelled
on srv.quit, so Stop() tears it down cleanly.
The pre-handshake inbound throttle in checkInboundConn rejected any
non-LAN IP that had connected within the last 30s. With v2 ip:port
dedup turned off in the v2-only posture, this single-attempt cap
broke the legitimate co-located case: an operator running
parallax-disc-crawl on the same host as their parallaxd shares the
public-NAT source IP with the daemon's existing peer connections,
so the second connection (the crawler's) is dropped pre-handshake
and the dialer sees `bip324handshake: read peer pub: EOF`.

Add expHeap.count() and switch the check from "any in-window entry"
to "count >= maxInboundConnAttemptsPerIP" with the cap set to 4.
Co-located crawlers + a few real peers behind one NAT now coexist;
flooding cost only relaxes by 4x per IP, still high enough that
saturating the listener requires scaling across IPs.
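
A sketch of the count-based check, using a per-IP slice of timestamps as a simple stand-in for expHeap. The cap of 4 and the 30s window come from the text above; `inboundThrottle`, `allow`, and the choice not to record rejected attempts are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

const maxInboundConnAttemptsPerIP = 4

// inboundThrottle tracks recent inbound attempts per remote IP.
type inboundThrottle struct {
	window time.Duration
	seen   map[string][]time.Time
}

func newInboundThrottle(window time.Duration) *inboundThrottle {
	return &inboundThrottle{window: window, seen: make(map[string][]time.Time)}
}

// allow expires old entries, then admits the attempt only while the
// in-window count is below the cap. Rejected attempts are not recorded.
func (t *inboundThrottle) allow(ip string, now time.Time) bool {
	cutoff := now.Add(-t.window)
	kept := t.seen[ip][:0]
	for _, ts := range t.seen[ip] {
		if ts.After(cutoff) {
			kept = append(kept, ts)
		}
	}
	if len(kept) >= maxInboundConnAttemptsPerIP {
		t.seen[ip] = kept
		return false
	}
	t.seen[ip] = append(kept, now)
	return true
}

// demoThrottle makes five attempts at once and a sixth after the window.
func demoThrottle() []bool {
	th := newInboundThrottle(30 * time.Second)
	t0 := time.Date(2026, 4, 23, 0, 0, 0, 0, time.UTC)
	var out []bool
	for i := 0; i < 5; i++ {
		out = append(out, th.allow("198.51.100.4", t0))
	}
	return append(out, th.allow("198.51.100.4", t0.Add(31*time.Second)))
}

func main() {
	fmt.Println(demoThrottle())
}
```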

Update TestServerInboundThrottle to dial the cap times successfully
then expect the over-cap dial to be closed.
- goimports: align field tags in CrawlNode and rejected-cases table.
- unconvert: drop redundant net.IP() conversion (To4/To16 already
  return net.IP).
- unused: delete leftover writeMsg helper from the pre-wireConn
  refactor; drop unused `host` field on fakeResolver.
The walker exited on the first queue drain regardless of --timeout.
Add a --reprobe-interval flag (default 30s) — when the queue drains
the walker now sleeps for that interval, clears the per-run dedup
set, re-enqueues every node from state, and keeps probing until ctx
fires. Operators get the daemon-style behavior --timeout already
implied: "run for this long".

Set --reprobe-interval=0 to keep the old one-shot behavior (exit on
first drain).
Bootnode lists are now two independent slices. MainnetBootnodes
carries the enode:// URLs consumed by discv4 tooling and addrman's
KeyType=0x01 ingest; MainnetBootnodesV2 carries the plain ip:port
endpoints for the BIP324 v2 handshake path (KeyType=0x00).
cmd/utils/flags.go setBootstrapNodes parses both and populates
Config.BootstrapNodes ([]*enode.Node) and Config.BootstrapNodesV2
([]*net.TCPAddr); --bootnodes sniffs per entry and routes into the
right slice.

Transport selection at dial time keys on an ENR entry (pipv2)
rather than a handshake-stage capability check: a node sets
enrV2Transport on its localnode record, and the dial scheduler
routes any iterator-yielded enode whose ENR carries pipv2 directly
to DialV2, bypassing the v1 RLPx path. Avoids the v1-then-promote
dance that broke dual-stack peers when the signal was conflated
with a subprotocol cap.

DialV2 owns a per-(ip,port) cooldown via v2DialCooldownCheckAndMark,
shared by runV2Dialer and the scheduler's v2 branch. Select's
chanceFactor ramp guarantees it returns candidates regardless of
chance weighting, so the cooldown is the authoritative rate limit.
errV2DialCooldown sentinel lets callers back off on rejection
without confusing it with a real dial failure.
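
The check-and-mark cooldown might look roughly like this. Only the `errV2DialCooldown` name comes from the text; the `cooldown` type, its method, and the 30s period in the demo are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"sync"
	"time"
)

// errV2DialCooldown lets callers tell a cooldown apart from a dial failure.
var errV2DialCooldown = errors.New("v2 dial cooldown active")

// cooldown remembers the last admitted dial per (ip, port).
type cooldown struct {
	mu     sync.Mutex
	period time.Duration
	last   map[netip.AddrPort]time.Time
}

func newCooldown(period time.Duration) *cooldown {
	return &cooldown{period: period, last: make(map[netip.AddrPort]time.Time)}
}

// checkAndMark admits the dial and records it, or returns the sentinel
// while the previous admitted dial is still inside the cooldown period.
func (c *cooldown) checkAndMark(addr netip.AddrPort, now time.Time) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if prev, ok := c.last[addr]; ok && now.Sub(prev) < c.period {
		return errV2DialCooldown
	}
	c.last[addr] = now
	return nil
}

// demoCooldown dials the same endpoint at t0, t0+10s, and t0+31s.
func demoCooldown() []error {
	c := newCooldown(30 * time.Second)
	addr := netip.MustParseAddrPort("198.51.100.4:32110")
	t0 := time.Date(2026, 4, 23, 0, 0, 0, 0, time.UTC)
	return []error{
		c.checkAndMark(addr, t0),
		c.checkAndMark(addr, t0.Add(10*time.Second)),
		c.checkAndMark(addr, t0.Add(31*time.Second)),
	}
}

func main() {
	for _, err := range demoCooldown() {
		fmt.Println(err)
	}
}
```

Doing the check and the mark under one lock is what lets two callers share the limiter without both passing the check for the same endpoint.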

Addrman changes: V2Iter.Next skips IsTerrible entries and caps
consecutive KeyType-mismatch spins before applying exponential
backoff, stopping a single stale KeyType=0x00 entry from
dominating Select; AddrMan.UpgradeIdentity rewrites an existing
entry's KeyType/NodeID in place for callers that learn a stronger
identity; IngestV2Addr helper ingests ip:port with KeyType=0x00.

setupDiscovery wraps each Protocol.DialCandidates in a TeeIter so
enrtree-delivered peers enter addrman instead of bypassing it.
Inbound v1 peers rebuild c.node with phs.ListenPort so addrmanGood
resolves the addrman entry at the peer's advertised endpoint
rather than the ephemeral source port.
Adds a pipv2ENREntry tester and emits a "Crawl complete" info line
with total / pipv2 / v1_only counts at the end of every discv4
crawl. No behavior change for the crawl itself — gives operators
a quick read on v2-transport adoption across the discovered set.
Adds a per-node info line emitted from every crawl worker after
updateNode returns, with action, id, ip, tcp/udp ports, seq, and
pipv2 status. Lets operators watch the crawl unfold instead of
waiting for the 8s status ticker.
Every probeAndUpdate in the walker now emits:
  - "parallax-disc probe" before the network I/O (addr, keyType, id)
  - "parallax-disc probe ok" on success (peers + caps counts)
  - "parallax-disc probe failed" on error (failCount, err)
  - "parallax-disc fanout" with enqueued/skipped counts when the
    peer returned a non-empty Peers list
Lets operators watch the walk unfold live rather than wait for
save-interval snapshots.
to-cloudflare / to-route53 / to-zonefile load their input via
loadSeedZone, which expects the compiled SeedZone JSON — but the
pipeline has both a JSON form and a BIND zonefile form, and
confusing them is easy. Sniff the first non-whitespace byte: a
leading ';' or '$' (zonefile) produces an actionable message
pointing at `dns-seed compile`, instead of the raw JSON parser
error.
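
The sniffing step amounts to a few lines. `looksLikeZonefile` and `loadSeedZoneJSON` are hypothetical names for this sketch, and the error wording is illustrative rather than the tool's actual message.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// looksLikeZonefile reports whether the first non-whitespace byte is one
// that starts a BIND zonefile line (comment or directive).
func looksLikeZonefile(data []byte) bool {
	trimmed := bytes.TrimLeft(data, " \t\r\n")
	return len(trimmed) > 0 && (trimmed[0] == ';' || trimmed[0] == '$')
}

// loadSeedZoneJSON gives an actionable error before the JSON parser runs.
func loadSeedZoneJSON(data []byte, out any) error {
	if looksLikeZonefile(data) {
		return errors.New("input looks like a BIND zonefile; pass the JSON written by `dns-seed compile`")
	}
	return json.Unmarshal(data, out)
}

func main() {
	var zone map[string]any
	fmt.Println(loadSeedZoneJSON([]byte("$ORIGIN seed.example.\n"), &zone))
	fmt.Println(loadSeedZoneJSON([]byte(`{"name":"seed.example"}`), &zone), zone["name"])
}
```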
@andrepatta andrepatta merged commit d1acad6 into main Apr 23, 2026
4 checks passed
@andrepatta andrepatta deleted the parallax-disc branch April 23, 2026 19:48