Understood; however, that is not the behavior I am seeing. That is likely to
be the case for EPGM, since those are UDP packets. From my understanding,
though, regardless of whether incoming data is multicast or unicast, PGM binds
to the wildcard (any) address and a specific port. The kernel will pass all
data received on an interface to any listening socket as long as the
destination port matches that of the socket binding.
Now let’s put UDP aside for a moment: what about when using the pgm transport?
These are raw sockets, and any unicast NAKs are sent from the remote SUB to the
PUB’s unicast address and source port (which is randomly selected when the raw
PGM PUB socket is created). At that point the PUB socket should be the only one
listening on its own unicast address and source port. Correct?
This is a snapshot of what my netstat -ln looks like at the moment, with both
PUB and SUB created and running on the same host.
Proto  Recv-Q  Send-Q  Local Address                                 Foreign Address  State

Sockets associated with PUB:
raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113   ::%622984:*      113
raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113   ::%623304:*      113
raw    0       0       ::%2147479552:113                             ::%622984:*      113

Sockets associated with SUB:
raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113   ::%622984:*      113
raw    164672  0       2001:db8::2b0:19ff:fe73:d890%2147479552:113   ::%623304:*      113
raw    0       0       ::%2147479552:113                             ::%622984:*      113
Notice that the Recv-Q is full on both the PUB-related and the SUB-related
send and router-alert send sockets.
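
To keep an eye on this without reading the listing by hand, something like the following could flag the sockets with a backlog. It is only a sketch: backlog_sockets and SAMPLE are names made up for illustration, and it assumes the whitespace-separated column order shown in the netstat header above (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), which may differ between netstat versions.

```python
# Sketch: flag raw sockets whose Recv-Q is non-zero in `netstat -ln` output.
# SAMPLE copies the three distinct rows from the listing in this message;
# backlog_sockets is a hypothetical helper, not part of any library.

SAMPLE = """\
raw 164672 0 2001:db8::2b0:19ff:fe73:d890%2147479552:113 ::%622984:* 113
raw 164672 0 2001:db8::2b0:19ff:fe73:d890%2147479552:113 ::%623304:* 113
raw 0 0 ::%2147479552:113 ::%622984:* 113
"""


def backlog_sockets(netstat_text, proto="raw"):
    """Return (local_address, recv_q) for sockets with unread data queued."""
    flagged = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: Proto Recv-Q Send-Q Local-Address Foreign-Address [State]
        if len(fields) < 5 or not fields[0].startswith(proto):
            continue
        if not fields[1].isdigit():
            continue  # header or wrapped line, not a socket row
        recv_q = int(fields[1])
        if recv_q > 0:
            flagged.append((fields[3], recv_q))
    return flagged


if __name__ == "__main__":
    for local, recv_q in backlog_sockets(SAMPLE):
        print(f"{local}: {recv_q} bytes waiting in Recv-Q")
```

Running it periodically against live netstat output would show whether the queue ever drains, which is the thing I am trying to establish for the PUB side.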
These are my thoughts on why they are full; the reason differs between the two
cases:
For the SUB-associated sockets (the ones bound to the 2001: address, used to
send NAKs to the remote PUB):
These fill up as soon as a remote PUB starts sending multicast data. I think
the SUB send socket is connecting with the destination port used to send
multicast traffic: whenever a SUB sends a NAK to the PUB, I can see that the
source port on the unicast packet matches the destination port of the
multicast group. However, this is not really an issue, since the SUB socket is
configured in PGM as receive-only, so any ODATA/SPM data received on its send
socket is not processed. The SUB socket is also getting the multicast data via
the local binding ::%2147479552:113, which, as seen above, is emptying its
queue fine, and I could verify the node is receiving data at the application
level.
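
The port observation above is consistent with RFC 3208, where upstream packets such as NAKs carry the data-destination port in the source port field. As a rough way to check what is actually sitting in a queue, the fixed 16-byte PGM header can be decoded as sketched below. Assumptions: parse_pgm_header is a hypothetical helper, the bytes start at the PGM header (no IP header in front), and the ports 7500 and 40000 in the usage part are made-up values, not taken from my setup.

```python
import struct

# PGM packet types from RFC 3208 (low-order nibble of the Type field).
PGM_TYPES = {
    0x00: "SPM", 0x01: "POLL", 0x02: "POLR", 0x04: "ODATA", 0x05: "RDATA",
    0x08: "NAK", 0x09: "NNAK", 0x0A: "NCF", 0x0C: "SPMR",
}
# Types that travel from receivers back toward the source.
UPSTREAM = {"NAK", "NNAK", "SPMR", "POLR"}


def parse_pgm_header(packet):
    """Decode the fixed 16-byte PGM header into a dict (hypothetical helper)."""
    if len(packet) < 16:
        raise ValueError("PGM header needs at least 16 bytes")
    sport, dport, ptype, options, checksum = struct.unpack_from("!HHBBH", packet, 0)
    gsi = packet[8:14]  # 6-byte Global Source ID
    (tsdu_len,) = struct.unpack_from("!H", packet, 14)
    name = PGM_TYPES.get(ptype & 0x0F, "UNKNOWN")
    return {
        "source_port": sport,
        "dest_port": dport,
        "type": name,
        "upstream": name in UPSTREAM,
        "options": options,
        "checksum": checksum,
        "gsi": gsi.hex(),
        "tsdu_length": tsdu_len,
    }


if __name__ == "__main__":
    # Hand-built NAK header with illustrative ports: the source port field
    # holds the data-destination port, the destination port field holds the
    # data-source port.
    nak = struct.pack("!HHBBH6sH", 7500, 40000, 0x08, 0, 0, bytes(range(1, 7)), 0)
    print(parse_pgm_header(nak))
```

Feeding captured payloads through something like this would confirm whether the backlog on a given socket is ODATA/SPM (harmless on a receive-only SUB) or NAKs (the case that matters on the PUB).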
For the PUB-associated sockets (the ones bound to the 2001: address, used to
send ODATA/SPM/RDATA/NCF to the remote SUB):
Even though its local binding ::%2147479552:113 is also receiving the
multicast data sent by the remote PUB, that data is discarded because the PUB
socket is configured as send-only at the PGM level.
However, its send sockets do receive unicast NAKs from the remote SUB and, as
seen above, these are placed on the socket’s Recv-Q. The queue is full because
the NAKs are not being processed by the PUB socket.
Note: The exact same behavior is seen with EPGM. The only difference is that
none of the sockets’ Recv-Qs fill up, because they are emptied at the UDP layer
upon arrival. However, I suspect that once the packets are forwarded to the PGM
layer, the PGM socket buffers would show the same thing as the netstat -ln
output above.
I think it is redundant, and probably not a good idea, to run the same code
when creating either a ZMQ PUB or SUB socket, since those socket types are
restricted to send-only or receive-only operation. Still, that does not appear
to be the cause of the issue here. I have read in some of the OpenPGM docs that
the application must call pgm_recv frequently, as that drives the PGM state
machine. My issue is how to accomplish that from the ZMQ API layer, which is
the whole point of using ZMQ in my case in the first place.
Any thoughts? And thanks for the comments.
From: zeromq-dev [mailto:[email protected]] On Behalf Of
Steven McCoy
Sent: Friday, March 23, 2018 12:53 PM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] [External] Re: A PGM/EPGM question
The problem is that the kernel will not multicast UDP unicast packets to each
socket listening so it is probable the wrong socket is hearing the NAK.
On Fri, Mar 23, 2018 at 12:07 Montero, Antonio UTC CCS
<[email protected]> wrote:
ZMQ’s implementation of the PUB socket type does not allow receive calls to be
made (zmq_recv is disabled), so how does one trigger ZMQ to call “pgm_recv” on
the PUB socket to get it to process NAKs received from a remote SUB socket?
I have tried querying the PUB socket state via ZMQ_EVENTS to trigger the
processing of any commands available for the socket; however, that does not
seem to move the PGM state machine in terms of processing NAKs.
I am running both a PUB and a SUB in the same application on the same host. I
see the same set of sockets being created at the PGM level for both the PUB and
SUB ZMQ sockets, including multiple sockets bound to the same port, but this
does not appear to cause any issues: my SUB socket is able to receive multicast
messages from a remote PUB and respond with unicast NAKs when data loss is
detected.
Any ideas as to how a user should get the ZMQ library to trigger NAK processing
for a PUB socket using either the pgm or epgm transport?
Thanks,
Antonio Montero.
From: zeromq-dev
[mailto:[email protected]]
On Behalf Of Steven McCoy
Sent: Friday, March 23, 2018 9:55 AM
To: ZeroMQ development list
Subject: [External] Re: [zeromq-dev] A PGM/EPGM question
You should check the PUB socket has a loop that is processing the incoming NAK
requests, this is usually recv call based. The symptoms indicate that the
protocol is operating TX-only.
—
Steve-o
On Wed, Mar 21, 2018 at 19:50 Montero, Antonio UTC CCS
<[email protected]> wrote:
Hello,
I am having a hard time getting a ZMQ PUB socket to react to PGM NAKs, which
means that at this point I am not able to recover lost packets.
I have tried both protocols (pgm and epgm) and get the same result.
I have a setup where I create both a PUB and a SUB socket, in that order, in
the same ZMQ context, running on the same host and connected to the same IPv6
multicast address and port.
I have N nodes and each node has a PUB and SUB. All N nodes send messages
asynchronously and all N nodes receive all messages. My multicast network is
working fine whether I use pgm or epgm and all N nodes communicate with each
other over IPv6 multicast.
The issue I am having is that when packet loss occurs, a remote SUB sends a
unicast NAK back to the source PUB, but I am not seeing any NCF or RDATA being
sent by the source PUB. I have verified that the packets in question are in
fact still in the Tx window as reported by the SPMs being sent by the source
PUB. I have ongoing periodic traffic which triggers a send and a receive on the
PUB and SUB sockets respectively, and I am clearing out the ZMQ_EVENTS after
every send and/or receive. I also have a polling thread running every 150 ms
to check for ZMQ_EVENTS on both PUB and SUB.
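
For reference, the Tx window check mentioned above amounts to comparing the NAKed sequence number against the trailing and leading edges advertised in the SPM, using 32-bit wrap-around arithmetic. A minimal sketch, assuming the window spans less than half of the sequence space; in_tx_window is a hypothetical helper and the numbers in the usage part are made up, not taken from my capture:

```python
SEQ_MOD = 1 << 32  # PGM sequence numbers are 32-bit and wrap around


def in_tx_window(sqn, trail, lead):
    """True if sqn lies within [trail, lead] modulo 2**32.

    Assumes the window covers less than half the sequence space, so a
    "width" of 2**31 or more is treated as an empty or invalid window
    (this includes the empty case lead == trail - 1).
    """
    width = (lead - trail) % SEQ_MOD
    if width >= SEQ_MOD // 2:
        return False
    return (sqn - trail) % SEQ_MOD <= width


if __name__ == "__main__":
    # Illustrative values only.
    print(in_tx_window(105, trail=100, lead=110))
    print(in_tx_window(0x5, trail=0xFFFFFFF0, lead=0x10))
```

If the NAKed sequence number passes this check against the latest SPM, the PUB should in principle still be able to answer with an NCF and RDATA, which is why the missing response points at NAK processing rather than at the window.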
Nothing seems to work in terms of triggering the PUB to react and process the
NAKs received from the remote SUB. Looking at the code a bit, I see the
function zmq::pgm_socket_t::process_upstream but can’t tell if and how it is
being triggered. From my perspective, it does not appear to be.
Any help or direction would be appreciated. Thanks.
--
Antonio
_______________________________________________
zeromq-dev mailing list
[email protected]
https://lists.zeromq.org/mailman/listinfo/zeromq-dev