That is actually where my investigation was taking me. Any idea what is supposed to trigger zmq::pgm_sender_t::in_event?
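On the zmq_poll side, the behaviour I describe next looks like ordinary level-triggered readiness rather than anything PGM-specific: as far as I know zmq_poll is documented as level-triggered, so a socket that is always writable reports "writable" on every call. The sketch below shows the same effect with plain POSIX poll() on an idle UDP socket standing in for the PUB socket. It is only an analogy under that assumption; poll_once and make_idle_udp_socket are names made up for this illustration and are not libzmq or OpenPGM calls.

```c
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: poll a single descriptor for `events`.
 * Returns the revents mask, 0 on timeout, or -1 on error. */
int poll_once(int fd, short events, int timeout_ms)
{
    struct pollfd item;
    item.fd = fd;
    item.events = events;
    item.revents = 0;

    int rc = poll(&item, 1, timeout_ms);
    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return 0;               /* timed out, nothing ready */
    return item.revents;        /* readiness flags for this descriptor */
}

/* Hypothetical helper: an unconnected UDP socket that nobody sends to.
 * It stands in for a socket that is writable but never readable. */
int make_idle_udp_socket(void)
{
    return socket(AF_INET, SOCK_DGRAM, 0);
}
```

With these helpers, poll_once(fd, POLLIN | POLLOUT, -1) on a fresh socket comes back at once with POLLOUT set even though the timeout is "wait forever", while poll_once(fd, POLLIN, 50) simply times out. That matches what I see on the PUB socket, and it suggests that asking for the "out" event in the poll only produces a busy loop; it says nothing about whether the NAKs are being read.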
I did have a zmq_poll checking for both ZMQ_POLLOUT and ZMQ_POLLIN events, however when I enabled that code this is what happened: the thread calling zmq_poll with a timeout of -1 (wait forever) on the PUB socket returned from zmq_poll immediately every time, and ZMQ_POLLOUT was always set. I think this is because my PUB socket is writable all of the time. At the moment I am just sending a message every 5 seconds and doing nothing else. My understanding is that PUB sockets won’t get ZMQ_POLLIN set because they are not able to receive messages.

-----Original Message-----
From: zeromq-dev [mailto:[email protected]] On Behalf Of Luca Boccassi
Sent: Friday, March 23, 2018 4:17 PM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] [External] Re: A PGM/EPGM question

ZMQ_XPUB has reads enabled as well

On Fri, 2018-03-23 at 16:13 -0400, Steven McCoy wrote:
> Maybe have to use zmq_poll with both in and out events? Ultimately
> in_event() needs to fire on the pgm_sender which calls
> process_upstream() that processes a NAK.
>
> On 23 March 2018 at 13:43, Montero, Antonio UTC CCS <
> [email protected]> wrote:
>
> > Understood however that is not the behavior I am seeing. Although
> > that is likely to be the case for EPGM since those are UDP packets
> > although from my understanding regardless whether incoming data is
> > multicast or unicast, PGM is binding to any address and specific
> > port. The kernel will pass all data received on an interface to any
> > listening socket as long as the destination port patches that of the
> > socket binding.
> >
> > Now let’s put aside UDP for a sec, what about when using pgm
> > transport?
> > These are raw sockets and any unicast NAK are actually sent from
> > remote SUB to the PUB unicast address and source port (which is
> > randomly selected at the time of creating the raw PGM PUB socket).
> > At that point the PUB socket should be the only one listening on its
> > own unicast address and source port. Correct?
> >
> > *This is a snapshot of what my netstat -ln looks like at the moment.
> > This is with both ( PUB and SUB created and running on the same host
> > ).*
> >
> > Proto Recv-Q Send-Q Local Address                                Foreign Address State
> >
> > *Sockets associated with PUB:*
> >
> > raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*     113
> > raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*     113
> > raw   0      0      ::%2147479552:113                            ::%622984:*     113
> >
> > *Sockets associated with SUB:*
> >
> > raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%622984:*     113
> > raw   164672 0      2001:db8::2b0:19ff:fe73:d890%2147479552:113  ::%623304:*     113
> > raw   0      0      ::%2147479552:113                            ::%622984:*     113
> >
> > You would notice how the Recv-Q is full on both PUB and SUB related
> > send/router alert send sockets.
> >
> > These are my thoughts as to why they are full and not because of the
> > same reason:
> >
> > For the case of the SUB associated sockets the 2001 address ones
> > basically used to send NAKs to the remote PUB:
> >
> > These get full as soon as a remote PUB starts sending multicast
> > data. I think the SUB send socket is connecting with the destination
> > port used to send multicast traffic. I can see whenever a SUB sends
> > NAK to the PUB that the source port on unicast packet matches that
> > of the destination port of the multicast group. However this is not
> > really an issue since the SUB socket is configured in PGM as receive
> > only therefore any ODATA/SPM data received on its send socket is not
> > processed.
> > The SUB socket however is also getting the multicast data
> > via the local binding:
> > ::%2147479552:113
> > which as seen is emptying out its queue fine and I could verify the
> > node is receiving data at the application level.
> >
> > For the case of the PUB associated sockets the 2001 address ones
> > basically used to send ODATA/SPM/RDATA/NCF to remote SUB:
> >
> > Even though its local binding: ::%2147479552:113 is also receiving
> > the multicast data sent by remote PUB it is thrown out since the PUB
> > socket is configured as send only at PGM level and so ODATA/SPM data
> > received is thrown out.
> >
> > However, its send associated sockets do receive unicast NAKs from
> > remote SUB and as seen above they are being put on the socket’s
> > Recv-Q however the queue is full because NAKs are not being
> > processed by the PUB socket.
> >
> > Note: The exact same behavior is seen with EPGM the only difference
> > is that none of the sockets Recv-Q get full because they are being
> > emptied out at the UDP layer upon arrival however I suspect that
> > once forwarded to the PGM layer the PGM socket buffers would show
> > the same thing as netstat -ln above.
> >
> > Even though I think is redundant and probably not a good idea to
> > run the same code when creating either a ZMQ PUB and/or SUB socket
> > since essentially those socket types are restricted to do specific
> > things like send/receive only, that does not appear to be the cause
> > of the issue here.
> > I have read in some of the openpgm doc that it is necessary for the
> > application to frequently call pgm_recv as that somehow moves the
> > pgm state machine to do things, *however my issue here is how to
> > accomplish that from the ZMQ API layer*, that is the whole point of
> > using ZMQ in my case in the first place.
> >
> > Any thoughts? And thanks of the comments.
> >
> > *From:* zeromq-dev [mailto:[email protected]] *On
> > Behalf Of *Steven McCoy
> > *Sent:* Friday, March 23, 2018 12:53 PM
> > *To:* ZeroMQ development list
> > *Subject:* Re: [zeromq-dev] [External] Re: A PGM/EPGM question
> >
> > The problem is that the kernel will not multicast UDP unicast
> > packets to each socket listening so it is probable the wrong socket
> > is hearing the NAK.
> >
> > On Fri, Mar 23, 2018 at 12:07 Montero, Antonio UTC CCS <
> > [email protected]> wrote:
> >
> > ZMQ’s implementation of PUB socket type does not allow for receive
> > calls to be made (zmq_recv is disabled), hence why I am trying to
> > figure out how does one trigger ZMQ to call “pgm_recv” on the PUB
> > socket in order to get the PUB socket to processes received NAKs
> > from a remote SUB socket?
> >
> > I have tried querying the PUB socket state via ZMQ_EVENTS to
> > triggering the processing of any commands available for the socket
> > however that does not seem to move the PGM state machine in terms of
> > processing NAKs.
> >
> > I am running both a PUB and SUB on the same application on the same
> > host and although I see the same set of sockets being created at the
> > PGM level for both PUB and SUB ZMQ sockets which includes multiple
> > sockets binding to the same port, this does not appear to cause any
> > issues in terms of my SUB socket able to receive multicast messages
> > from a remote PUB and respond with unicast NAKs when data loss is
> > detected.
> >
> > Any ideas as to how a user should get ZMQ lib to trigger NAKs
> > processing for a PUB socket using either pgm/epgm transports?
> >
> > Thanks,
> >
> > Antonio Montero.
> >
> > *From:* zeromq-dev [mailto:[email protected]] *On
> > Behalf Of *Steven McCoy
> > *Sent:* Friday, March 23, 2018 9:55 AM
> > *To:* ZeroMQ development list
> > *Subject:* [External] Re: [zeromq-dev] A PGM/EPGM question
> >
> > You should check the PUB socket has a loop that is processing the
> > incoming NAK requests, this is usually recv call based. The symptoms
> > indicate that the protocol is operating TX-only.
> >
> > —
> >
> > Steve-o
> >
> > On Wed, Mar 21, 2018 at 19:50 Montero, Antonio UTC CCS <
> > [email protected]> wrote:
> >
> > Hello,
> >
> > I am having a bit of a hard time getting a ZMQ PUB socket reacting
> > to PGM NAKs which means at this point I am not able to recover lost
> > packets
> >
> > I have tried with both protocols: (pgm and epgm). Still getting the
> > same result.
> >
> > I have a setup where I create both a PUB and SUB sockets in that
> > order in the same ZMQ context running on the same host and connected
> > to the same IPv6 multicast address and port.
> >
> > I have N nodes and each node has a PUB and SUB. All N nodes send
> > messages asynchronously and all N nodes receive all messages. My
> > multicast network is working fine whether I use pgm or epgm and all
> > N nodes communicate with each other over IPv6 multicast.
> >
> > The issue I am having is when a packet loss occurs, a remote SUB
> > sends a unicast NAK back to the source PUB however I am not seeing
> > any NCF or RDATA being sent by the source PUB. I have verified that
> > the packets in question are in fact still in the Tx Window as
> > reported by the SPMs being sent by the source PUB. I have ongoing
> > traffic on a periodic basis which triggers a send and receive
> > respectably on the PUB and SUB sockets and I am clearing
> > out the ZMQ_EVENTS after every send and/or receive.
> > I also have a polling thread running every 150ms to check for
> > ZMQ_EVENTS on both PUB and SUB.
> >
> > Nothing seems to work in terms of triggering the PUB to react and
> > process the NAKs received from remote SUB. Looking at the code a bit
> > I see this function zmq::pgm_socket_t::process_upstream but can’t
> > tell if and how it is being triggered. It does not appear to be from
> > my perspective.
> >
> > Any help or direction would be appreciated. Thanks.
> >
> > --
> >
> > Antonio
> >
> > _______________________________________________
> > zeromq-dev mailing list
> > [email protected]
> > https://lists.zeromq.org/mailman/listinfo/zeromq-dev

--
Kind regards,
Luca Boccassi
