Maybe you have to use zmq_poll with both in and out events? Ultimately, in_event() needs to fire on the pgm_sender, which calls process_upstream(), which processes the NAK.
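For what it's worth, here is a rough sketch of what I mean by polling for both events. It uses the pyzmq binding (an assumption on my part; its Poller.poll wraps zmq_poll), the helper name poll_in_and_out is made up for illustration, and the inproc endpoint is only a placeholder so the sketch runs without a multicast network. The 150 ms timeout mirrors your polling thread. I have not verified that this makes in_event() fire on the pgm_sender; that is the open question.

```python
import zmq


def poll_in_and_out(sock, timeout_ms=150):
    """Poll one socket for both POLLIN and POLLOUT; return the ready-event mask.

    Hypothetical helper for illustration, not part of any library.
    """
    poller = zmq.Poller()
    poller.register(sock, zmq.POLLIN | zmq.POLLOUT)
    # poll() returns a list of (socket, events) pairs for sockets that are ready.
    ready = dict(poller.poll(timeout_ms))
    return ready.get(sock, 0)


if __name__ == "__main__":
    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    # Placeholder endpoint so the sketch runs anywhere; a real setup would
    # use the pgm or epgm endpoint here instead.
    pub.bind("inproc://poll-sketch")
    mask = poll_in_and_out(pub)
    print("POLLIN set:", bool(mask & zmq.POLLIN))
    print("POLLOUT set:", bool(mask & zmq.POLLOUT))
    pub.close(linger=0)
    ctx.term()
```

Calling this in the same loop that does the periodic sends would at least show whether the PUB socket ever reports an in event when NAKs arrive.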

On 23 March 2018 at 13:43, Montero, Antonio UTC CCS <[email protected]> wrote:
> Understood, however that is not the behavior I am seeing. That is likely to be the case for EPGM, since those are UDP packets, although from my understanding, regardless of whether incoming data is multicast or unicast, PGM is binding to any address and a specific port. The kernel will pass all data received on an interface to any listening socket as long as the destination port matches that of the socket binding.
>
> Now let’s put aside UDP for a sec: what about when using the pgm transport? These are raw sockets, and any unicast NAKs are actually sent from the remote SUB to the PUB unicast address and source port (which is randomly selected at the time of creating the raw PGM PUB socket). At that point the PUB socket should be the only one listening on its own unicast address and source port. Correct?
>
> *This is a snapshot of what my netstat -ln looks like at the moment. This is with both PUB and SUB created and running on the same host.*
>
> Proto Recv-Q Send-Q Local Address                                Foreign Address  State
>
> *Sockets associated with PUB:*
>
> raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> raw   0      0      ::%2147479552:113                            ::%622984:*      113
>
> *Sockets associated with SUB:*
>
> raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*      113
> raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*      113
> raw   0      0      ::%2147479552:113                            ::%622984:*      113
>
> You will notice that the Recv-Q is full on both the PUB and SUB related send/router alert send sockets.
>
> These are my thoughts as to why they are full, and it is not for the same reason in each case:
>
> For the SUB associated sockets (the 2001 address ones, basically used to send NAKs to the remote PUB):
>
> These get full as soon as a remote PUB starts sending multicast data. I think the SUB send socket is connecting with the destination port used to send multicast traffic. Whenever a SUB sends a NAK to the PUB, I can see that the source port on the unicast packet matches the destination port of the multicast group. However, this is not really an issue, since the SUB socket is configured in PGM as receive only and therefore any ODATA/SPM data received on its send socket is not processed. The SUB socket is also getting the multicast data via the local binding ::%2147479552:113, which as seen is emptying out its queue fine, and I could verify the node is receiving data at the application level.
>
> For the PUB associated sockets (the 2001 address ones, basically used to send ODATA/SPM/RDATA/NCF to the remote SUB):
>
> Even though its local binding ::%2147479552:113 is also receiving the multicast data sent by the remote PUB, that data is thrown out, since the PUB socket is configured as send only at the PGM level.
>
> However, its send associated sockets do receive unicast NAKs from the remote SUB, and as seen above they are being put on the socket’s Recv-Q. The queue is full because the NAKs are not being processed by the PUB socket.
>
> Note: The exact same behavior is seen with EPGM. The only difference is that none of the sockets’ Recv-Qs get full, because they are being emptied out at the UDP layer upon arrival. However, I suspect that once forwarded to the PGM layer, the PGM socket buffers would show the same thing as netstat -ln above.
>
> Even though I think it is redundant, and probably not a good idea, to run the same code when creating either a ZMQ PUB or SUB socket, since those socket types are essentially restricted to specific things like send only or receive only, that does not appear to be the cause of the issue here. I have read in some of the openpgm docs that it is necessary for the application to call pgm_recv frequently, as that somehow moves the pgm state machine to do things; *however, my issue here is how to accomplish that from the ZMQ API layer*. That is the whole point of using ZMQ in my case in the first place.
>
> Any thoughts? And thanks for the comments.
>
> *From:* zeromq-dev [mailto:[email protected]] *On Behalf Of *Steven McCoy
> *Sent:* Friday, March 23, 2018 12:53 PM
> *To:* ZeroMQ development list
> *Subject:* Re: [zeromq-dev] [External] Re: A PGM/EPGM question
>
> The problem is that the kernel will not multicast UDP unicast packets to each socket listening, so it is probable the wrong socket is hearing the NAK.
>
> On Fri, Mar 23, 2018 at 12:07 Montero, Antonio UTC CCS <[email protected]> wrote:
>
> ZMQ’s implementation of the PUB socket type does not allow receive calls to be made (zmq_recv is disabled), which is why I am trying to figure out how one triggers ZMQ to call “pgm_recv” on the PUB socket in order to get the PUB socket to process received NAKs from a remote SUB socket.
>
> I have tried querying the PUB socket state via ZMQ_EVENTS to trigger the processing of any commands available for the socket; however, that does not seem to move the PGM state machine in terms of processing NAKs.
>
> I am running both a PUB and a SUB in the same application on the same host. Although I see the same set of sockets being created at the PGM level for both the PUB and SUB ZMQ sockets, which includes multiple sockets binding to the same port, this does not appear to cause any issues in terms of my SUB socket being able to receive multicast messages from a remote PUB and respond with unicast NAKs when data loss is detected.
>
> Any ideas as to how a user should get the ZMQ lib to trigger NAK processing for a PUB socket using either the pgm or epgm transport?
>
> Thanks,
>
> Antonio Montero.
>
> *From:* zeromq-dev [mailto:[email protected]] *On Behalf Of *Steven McCoy
> *Sent:* Friday, March 23, 2018 9:55 AM
> *To:* ZeroMQ development list
> *Subject:* [External] Re: [zeromq-dev] A PGM/EPGM question
>
> You should check the PUB socket has a loop that is processing the incoming NAK requests; this is usually recv call based. The symptoms indicate that the protocol is operating TX-only.
>
> --
> Steve-o
>
> On Wed, Mar 21, 2018 at 19:50 Montero, Antonio UTC CCS <[email protected]> wrote:
>
> Hello,
>
> I am having a bit of a hard time getting a ZMQ PUB socket to react to PGM NAKs, which means at this point I am not able to recover lost packets.
>
> I have tried with both protocols (pgm and epgm). Still getting the same result.
>
> I have a setup where I create both a PUB and a SUB socket, in that order, in the same ZMQ context, running on the same host and connected to the same IPv6 multicast address and port.
>
> I have N nodes and each node has a PUB and a SUB. All N nodes send messages asynchronously and all N nodes receive all messages. My multicast network is working fine whether I use pgm or epgm, and all N nodes communicate with each other over IPv6 multicast.
>
> The issue I am having is that when a packet loss occurs, a remote SUB sends a unicast NAK back to the source PUB, but I am not seeing any NCF or RDATA being sent by the source PUB. I have verified that the packets in question are in fact still in the Tx Window as reported by the SPMs being sent by the source PUB. I have ongoing periodic traffic which triggers a send and a receive on the PUB and SUB sockets respectively, and I am clearing out the ZMQ_EVENTS after every send and/or receive. I also have a polling thread running every 150ms to check for ZMQ_EVENTS on both PUB and SUB.
>
> Nothing seems to work in terms of triggering the PUB to react and process the NAKs received from the remote SUB. Looking at the code a bit, I see this function zmq::pgm_socket_t::process_upstream but can’t tell if and how it is being triggered. It does not appear to be, from my perspective.
>
> Any help or direction would be appreciated. Thanks.
>
> --
> Antonio

_______________________________________________
zeromq-dev mailing list
[email protected]
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
