-
Notifications
You must be signed in to change notification settings - Fork 41
UDP search traverse NAT #116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
acb19d9 to
3c792ad
Compare
|
A possible resolution to epics-base/pvAccessCPP#197 This protocol change would involve a protocol version increment (3). I believe that behavior added is backwards compatible since the replyPort field will always have a non-zero value, which older servers will continue to use verbatim. Because the forwarder behavior changes, NAT traversal can only work reliable if all peers on the server/IOC host understand the new @gilesknap This PR implements that idea I mentioned. This should allow NAT traversal (into a container w/ slirp4netns) if both client (outside) and server (inside) are updated. @kasemir Since this would be a protocol change, which I would want |
|
As discussed on epics-base/pvAccessCPP#197 the For the Instead this PR make use of one of the unused bits in the search flags field. If this new To work correctly with forwarding (aka. the local multicast hack) it is necessary that the forwarder should test By extension, |
|
Looks good, but I'm unsure why only the client version is incremented to 3: Don't both server and client need to be aware of the new |
I don't claim to understand the protocol, will this change permit search packets to pass through multiple forwarders? The "forwarder should … clear the flag" wording above makes it seem like it would only work through one, which if true could be short-sighted — won't searches from a client in a container on one host have to pass through 2 forwarders to find PVs on a server inside a container on a different host? |
|
The forwarding is local. Within one host, assume you have N servers. The one started last happens to receive the unicast UDP traffic. It forwards it to all the other N-1 servers on the host, and if one of them recognizes the searched channel, it can then reply. For that it should be sufficient for the forwarder to once put the correct reply port into the message. |
Ah, so the forwarder is only used when copying the packet to the localhost interface for other servers running on the same host (which presumably means running in the same container when containers are involved). Thanks! |
What @kasemir says is entirely correct so far. @anjohnson you do bring up a point which might become relevant in future. In the context of a container with user-mode networking, there are actually two loopback interfaces for a SEARCH to potentially pass through. The host loopback, and the container/guest loopback. At present this is not relevant since Thus, so far running a single container with user-mode networking bound to UDP/5076 prevents any other container, and any PVA peer (eg. So what would happen if some clever person figures out a way around this limitation?
As I think about it, clearing this bit is not necessary. And allowing it to pass through could enable this future possibility. |
I say "potentially" here because of some oddities with how As I think about it, this PR may be better combined with #101, which avoids testing the unicast flag. @gilesknap tells me that podman 5 replaces |
Because I can? Which is not the same as should...
The loopback UDP forwarding behavior blurs the client/server distinction since either type of peer may do the forwarding. To achieve forward compatibility it is necessary for a peer to forward messages with later versions it does not understand. fyi. With PVXS each process gets only one forwarder per port. For example, a process with one client and one server bound to the same port will only have one forwarder. Also, so far (I think) no implementation tries to forward unicast BEACON, which I suppose is a surprise waiting for some site. |
|
@mdavidsaver I have tried this version of PVXS in the same test environment we used at the end of the EPICS meeting. It does not appear to be working for me, it just times out as before. If I run the IOC container with network host it does work, but just binding 5075/tcp 5076/udp is still timing out. I enclose minimal steps to reproduce my result so you can tell me what I am doing wrong or potentially do some diagnosis. testing pvxs in containersServer Machine
Client Machine
Manual edits to the example IOC.(To get the epics-containers framework out of the way) After the first execution of |
|
@EmilioPeJu @coretl FYI you may be interested in this PR. |
Since we always bind() the wildcards, the OS specific oddifies wrt. bind() to interface addresses/bcast address do not apply. So always register interest in associated broadcast addresses.
wrapper sendmsg() and WSASendMsg() Linux and Windows support IPv4 IP_PKTINFO. BSD, Linux, and Windows support IPV6_PKTINFO So far RTEMS and OSX, the extra sendto() overrides will be ignored.
Switch to periodic poll on dedicated worker thread instead of opportunistic poll on use.
Better test of whether received packet was forwarded, based on OS provided meta-data instead of peer provided unicast flag. Also use ORIGIN_TAG (original destination) address as UDP source address if a local interface address.
src/udp_collector.cpp
Outdated
| return; | ||
| } | ||
| } | ||
| if((origin==Remote) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please remove the origin==Remote check here (if it's not vital)?
This check is what is making podman fail, the reason is that the network driver(via slirp4netns) uses a tap interface in which the packets coming from outside are NAT'ed to use the same source IP as the tap interface in the container's network namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm that this change also works for me.
To be clear, this original PR does fix NAT traversal for Docker and for Podman 5.0 or greater (podman 5.0 switched to pasta network stack).
For podman 4.9.4 the original PR does not work but Emilio's patch fixes that. This is because the slirp4netns network stack makes it look like the source and destination address are the same when the IOC receives the search request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor Correction. Podman 5.0 upwards works because there is no NAT by default.
However I tried launching the IOC using the default bridge network and it no longer works, unless I apply Emilio's patch.
[podman run --security-opt label=disable \
-p5064:5064/udp -p5064:5064/tcp -p5075:5075/tcp --network podman \
-itv $(pwd)/pvxs-dev:/epics/support/pvxs \
-v $(pwd)/example-services/services/bl01t-ea-test-01/config/:/epics/ioc/config \
--rm ghcr.io/epics-containers/ioc-generic-developer:2025.4.2b1 \
/epics/ioc/start.shThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please remove the
origin==Remotecheck here (if it's not vital)?
#101 makes this test more precise. I will need to think some more if it can be omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition is now:
+ if((origin==Broadcast || origin==Forwarding)
+ && peerVersion >= 3
+ && (flags&pva_search_flags::ReplySrcPort))
+ {
+ // peer requests reply to (apparent) sender port
+ port = src.port();
+ }The origin check is important as the ReplySrcPort logic should only be applied to packets received directly from a remote peer. For a packet forwarded by a local peer, the source port is not relevant.
So I think this condition is needed. Especially when the forwarder does not clear ReplySrcPort.
Change the CMD_SEARCH message, adding a reply-to-sender-port flag to allow for replies to traverse a NAT. The meaning of that flag is that the recipient should ignore the replyPort field, and instead send a reply to the apparent port which sent the request. A forwarder should overwrite the replyPort field, and clear this new MustReply flag.
3c792ad to
1aa3fe0
Compare
|
This PR now includes the contents of #101 as well as the I also found a way to test with a single host. With EPICS_PVA_AUTO_ADDR_LIST=NO \
EPICS_PVA_ADDR_LIST=localhost:5076 \
EPICS_PVA_BROADCAST_PORT=15076 \
./bin/linux-x86_64/pvxget pv:nameNote that |
|
@mdavidsaver I've tried the latest changes against:
and it is working. |
Also @EmilioPeJu confirms that it is working fully for docker too (I had some issues with docker not related to this PR) |
|
@mdavidsaver is this ready to release? |
The code is, but the protocol change needs to be communicated properly. Since @kasemir is onboard, the next step is to reconcile the spec. documents (yes, plural). @kasemir @ttkorhonen Do you have thoughts on what to do with the copies in the pvAccessCPP wiki and/or the epics-docs repository? Are there any other copies I am not thinking of? My current thinking is to make https://github.com/epics-docs/epics-docs the authoritative copy going forward, and replace the wiki content with stub links to it. |
@mdavidsaver , I agree with your proposal. |
Remove unused code and enable optional SSL key logging
Change the CMD_SEARCH message, adding a reply-to-sender-port flag (
ReplyPort) to allow for replies to traverse a NAT.The meaning of that flag is that the recipient should ignore the replyPort field, and instead send a reply to the apparent port which sent the request.
A forwarder should overwrite the replyPort field,
and clear this new.ReplyPortflagThis behavior is enabled by PVA protocol header version
3.Potential TODO items for the whole change process:
core.pvaUpdate pvAccessJava, forwarder onlyKay thinks not worthwhile