Skip to content

Conversation

@mdavidsaver
Copy link
Member

@mdavidsaver mdavidsaver commented Apr 20, 2025

Change the CMD_SEARCH message, adding a reply-to-sender-port flag (ReplyPort) to allow for replies to traverse a NAT.

The meaning of that flag is that the recipient should ignore the replyPort field, and instead send a reply to the apparent port which sent the request.

A forwarder should overwrite the replyPort field, and clear this new ReplyPort flag.

This behavior is enabled by PVA protocol header version 3.

Potential TODO items for the whole change process:

@mdavidsaver mdavidsaver added the enhancement New feature or request label Apr 20, 2025
@mdavidsaver mdavidsaver self-assigned this Apr 20, 2025
@mdavidsaver
Copy link
Member Author

mdavidsaver commented Apr 20, 2025

A possible resolution to epics-base/pvAccessCPP#197

This protocol change would involve a protocol version increment (3). I believe that behavior added is backwards compatible since the replyPort field will always have a non-zero value, which older servers will continue to use verbatim.

Because the forwarder behavior changes, NAT traversal can only work reliable if all peers on the server/IOC host understand the new ReplyPort flag.

@gilesknap This PR implements that idea I mentioned. This should allow NAT traversal (into a container w/ slirp4netns) if both client (outside) and server (inside) are updated.

@kasemir Since this would be a protocol change, which I would want core.pva to implement as well, I would like to hear from you. As you may see from this PR, the actual code change can be small.

@mdavidsaver
Copy link
Member Author

As discussed on epics-base/pvAccessCPP#197 the CMD_SEARCH message replyAddr and especially replyPort do not allow NAT traversal since the recipient would send a reply to the wrong port.

For the replyAddr field, the sender may set a special value (0.0.0.0 or ::0) to indicate reply to sender address. However, there is no analogous special value for replyPort. It is not possible to begin using 0 now as existing servers would blindly try to send to 0, which would break in non-NAT situations.

Instead this PR make use of one of the unused bits in the search flags field. If this new ReplyPort bit is set, then a recipient should ignore the replyPort field, and instead use to (apparent) sender/source port.

To work correctly with forwarding (aka. the local multicast hack) it is necessary that the forwarder should test ReplyPort. If set, then the forwarder should mangle the forwarded packet to clear this flag and overwrite replyPort with the apparent sender/source port.

By extension, ReplyPort must be ignored in a forwarded message as older peers will blindly forward.

@kasemir
Copy link

kasemir commented Apr 21, 2025

Looks good, but I'm unsure why only the client version is incremented to 3:

struct pva_version {
     enum {
         client = 3,
         server = 2,

Don't both server and client need to be aware of the new ReplyPort flag?

@anjohnson
Copy link
Member

To work correctly with forwarding (aka. the local multicast hack) it is necessary that the forwarder should test ReplyPort. If set, then the forwarder should mangle the forwarded packet to clear this flag and overwrite replyPort with the apparent sender/source port.

I don't claim to understand the protocol, will this change permit search packets to pass through multiple forwarders? The "forwarder should … clear the flag" wording above makes it seem like it would only work through one, which if true could be short-sighted — won't searches from a client in a container on one host have to pass through 2 forwarders to find PVs on a server inside a container on a different host?

@kasemir
Copy link

kasemir commented Apr 21, 2025

The forwarding is local. Within one host, assume you have N servers. The one started last happens to receive the unicast UDP traffic. It forwards it to all the other N-1 servers on the host, and if one of them recognizes the searched channel, it can then reply. For that it should be sufficient for the forwarder to once put the correct reply port into the message.

@anjohnson
Copy link
Member

The forwarding is local.

Ah, so the forwarder is only used when copying the packet to the localhost interface for other servers running on the same host (which presumably means running in the same container when containers are involved). Thanks!

@mdavidsaver
Copy link
Member Author

The forwarding is local.

What @kasemir says is entirely correct so far. @anjohnson you do bring up a point which might become relevant in future.

In the context of a container with user-mode networking, there are actually two loopback interfaces for a SEARCH to potentially pass through. The host loopback, and the container/guest loopback. At present this is not relevant since slirp4netns and its ilk are not aware of SO_REUSEPORT, which permits multiple UDP sockets to bind to one port.

Thus, so far running a single container with user-mode networking bound to UDP/5076 prevents any other container, and any PVA peer (eg. pvget), from running on that host.

So what would happen if some clever person figures out a way around this limitation?

"forwarder should … clear the flag"

As I think about it, clearing this bit is not necessary. And allowing it to pass through could enable this future possibility.

@mdavidsaver
Copy link
Member Author

two loopback interfaces for a SEARCH to potentially pass through.

I say "potentially" here because of some oddities with how slirp4netns works at present. For example, a broadcast received by the host socket magically becomes a unicast inside ...without setting the PVA SEARCH unicast flag. Another example of why this flag can not be trusted. (cf. epics-base/pvAccessCPP#200)

As I think about it, this PR may be better combined with #101, which avoids testing the unicast flag.

@gilesknap tells me that podman 5 replaces slirp4netns with something else, which does not appear to NAT. It remains to be seen if/how it mangles the destination IP address.

@mdavidsaver
Copy link
Member Author

why only the client version is incremented to 3:

Because I can? Which is not the same as should...

Don't both server and client need to be aware of the new ReplyPort flag?

The loopback UDP forwarding behavior blurs the client/server distinction since either type of peer may do the forwarding. To achieve forward compatibility it is necessary for a peer to forward messages with later versions it does not understand.

fyi. With PVXS each process gets only one forwarder per port. For example, a process with one client and one server bound to the same port will only have one forwarder.

Also, so far (I think) no implementation tries to forward unicast BEACON, which I suppose is a surprise waiting for some site.

@gilesknap
Copy link

@mdavidsaver I have tried this version of PVXS in the same test environment we used at the end of the EPICS meeting.

It does not appear to be working for me, it just times out as before. If I run the IOC container with network host it does work, but just binding 5075/tcp 5076/udp is still timing out.

I enclose minimal steps to reproduce my result so you can tell me what I am doing wrong or potentially do some diagnosis.

testing pvxs in containers

Server Machine

  • clone example IOC instances next to pvxs-dev folder
    • git clone ssh://git@github.com/epics-containers/example-services
  • launch an ioc-generic container and mount in the PR version of pvxs over the version that was compiled at build time
$ podman run --security-opt label=disable \
    -p5064:5064/udp -p5064:5064/tcp -p5075:5075/tcp -p5076:5076/udp \
    -itv $(pwd)/pvxs-dev:/epics/support/pvxs \
    -v $(pwd)/example-services/services/bl01t-ea-test-01/config/:/epics/ioc/config \
    --rm ghcr.io/epics-containers/ioc-generic-developer:2025.4.2b1
  • Launch the example ioc
/epics/ioc/start.sh

Client Machine

  • launch ioc-generic with PR version of pvxs and network host
$ podman run --security-opt label=disable \
    --network host \
    -itv $(pwd)/pvxs-dev:/epics/support/pvxs \
    --rm ghcr.io/epics-containers/ioc-generic-developer:2025.4.2b1
  • try to get the pv
/epics/support/pvxs/bin/linux-x86_64/pvxget bl01t:SUM
  • note that caget works - also starting the Server container with network host works.

Manual edits to the example IOC. 

(To get the epics-containers framework out of the way)

After the first execution of start.sh  a completely standard startup script can be found and edited using vim at /epics/runtime/st.cmd and executed using:

/epics/ioc/bin/linux-x86_64/ioc /epics/runtime/st.cmd 

@gilesknap
Copy link

@EmilioPeJu @coretl FYI you may be interested in this PR.

Since we always bind() the wildcards, the OS specific
oddifies wrt. bind() to interface addresses/bcast address
do not apply.  So always register interest in associated
broadcast addresses.
wrapper sendmsg() and WSASendMsg()

Linux and Windows support IPv4 IP_PKTINFO.
BSD, Linux, and Windows support IPV6_PKTINFO

So far RTEMS and OSX, the extra sendto() overrides
will be ignored.
Switch to periodic poll on dedicated worker thread
instead of opportunistic poll on use.
Better test of whether received packet was forwarded,
based on OS provided meta-data instead of peer provided
unicast flag.

Also use ORIGIN_TAG (original destination) address as
UDP source address if a local interface address.
return;
}
}
if((origin==Remote)
Copy link

@EmilioPeJu EmilioPeJu Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please remove the origin==Remote check here (if it's not vital)?
This check is what is making podman fail, the reason is that the network driver(via slirp4netns) uses a tap interface in which the packets coming from outside are NAT'ed to use the same source IP as the tap interface in the container's network namespace.

Copy link

@gilesknap gilesknap Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that this change also works for me.

To be clear, this original PR does fix NAT traversal for Docker and for Podman 5.0 or greater (podman 5.0 switched to pasta network stack).

For podman 4.9.4 the original PR does not work but Emilio's patch fixes that. This is because the slirp4netns network stack makes it look like the source and destination address are the same when the IOC receives the search request.

Copy link

@gilesknap gilesknap Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Correction. Podman 5.0 upwards works because there is no NAT by default.

However I tried launching the IOC using the default bridge network and it no longer works, unless I apply Emilio's patch.

[podman run --security-opt label=disable \
    -p5064:5064/udp -p5064:5064/tcp -p5075:5075/tcp --network podman \
    -itv $(pwd)/pvxs-dev:/epics/support/pvxs \
    -v $(pwd)/example-services/services/bl01t-ea-test-01/config/:/epics/ioc/config \
    --rm ghcr.io/epics-containers/ioc-generic-developer:2025.4.2b1 \
   /epics/ioc/start.sh

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please remove the origin==Remote check here (if it's not vital)?

#101 makes this test more precise. I will need to think some more if it can be omitted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition is now:

+        if((origin==Broadcast || origin==Forwarding)
+                && peerVersion >= 3
+                && (flags&pva_search_flags::ReplySrcPort))
+        {
+            // peer requests reply to (apparent) sender port
+            port = src.port();
+        }

The origin check is important as the ReplySrcPort logic should only be applied to packets received directly from a remote peer. For a packet forwarded by a local peer, the source port is not relevant.

So I think this condition is needed. Especially when the forwarder does not clear ReplySrcPort.

Change the CMD_SEARCH message, adding a reply-to-sender-port flag
to allow for replies to traverse a NAT.

The meaning of that flag is that the recipient should ignore the
replyPort field, and instead send a reply to the apparent port
which sent the request.

A forwarder should overwrite the replyPort field, and clear this
new MustReply flag.
@mdavidsaver
Copy link
Member Author

This PR now includes the contents of #101 as well as the ReplySrcPort flag (renamed from ReplyPort).

I also found a way to test with a single host. With softIocPVX running in a container with user mode networking and -p5075:5075/tcp -p5076:5076/udp, then run on the host:

EPICS_PVA_AUTO_ADDR_LIST=NO \
EPICS_PVA_ADDR_LIST=localhost:5076 \
EPICS_PVA_BROADCAST_PORT=15076 \
./bin/linux-x86_64/pvxget pv:name

Note that localhost:5076 can be replaced with 127.255.255.255:5076 or some <iface-addr>:5076 as #101 is able to distinguish a localhost multicast from a unicast or bcast.

@gilesknap
Copy link

@mdavidsaver I've tried the latest changes against:

  • podman 4.9.4
  • podman 5.0.3 with bridge network

and it is working.

@gilesknap
Copy link

gilesknap commented Apr 28, 2025

@mdavidsaver I've tried the latest changes against:

  • podman 4.9.4
  • podman 5.0.3 with bridge network

and it is working.

Also @EmilioPeJu confirms that it is working fully for docker too (I had some issues with docker not related to this PR)

@gilesknap
Copy link

@mdavidsaver is this ready to release?

@mdavidsaver
Copy link
Member Author

@mdavidsaver is this ready to release?

The code is, but the protocol change needs to be communicated properly. Since @kasemir is onboard, the next step is to reconcile the spec. documents (yes, plural).

@kasemir @ttkorhonen Do you have thoughts on what to do with the copies in the pvAccessCPP wiki and/or the epics-docs repository? Are there any other copies I am not thinking of?

My current thinking is to make https://github.com/epics-docs/epics-docs the authoritative copy going forward, and replace the wiki content with stub links to it.

@ttkorhonen
Copy link

@mdavidsaver is this ready to release?

The code is, but the protocol change needs to be communicated properly. Since @kasemir is onboard, the next step is to reconcile the spec. documents (yes, plural).

@kasemir @ttkorhonen Do you have thoughts on what to do with the copies in the pvAccessCPP wiki and/or the epics-docs repository? Are there any other copies I am not thinking of?

My current thinking is to make https://github.com/epics-docs/epics-docs the authoritative copy going forward, and replace the wiki content with stub links to it.

@mdavidsaver , I agree with your proposal.
I am obviously biased because I imported the document to epics-docs, after some discussions. I think that making it the authoritative copy would be the right decision. Version control is easier than in wiki.
I am not aware of more copies, or at least do not remember any on top of my head.

JJL772 pushed a commit to slac-epics/pvxs-tls that referenced this pull request Jan 5, 2026
Remove unused code and enable optional SSL key logging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants