Hi Min, This is great, thanks for looking into it. I am looking forward to trying out the fix.
-Ritesh
Sent from my iPhone.

On Jan 27, 2013, at 9:16 AM, Yu Dongmin <[email protected]> wrote:

> Hi,
>
> I've found the culprit that caused the data loss.
>
> When ZMQ sends a large message, the stream_engine sends the data through
> multiple out_event calls.
>
> The ZMQ linger option only guarantees that messages are delivered to the
> peer pipe. Because of the speculative write, out_event is called at least
> once, but a large message requires multiple calls.
>
> The stream_engine can be terminated before enough out_event calls have
> finished.
>
> So a longer linger option will not resolve this issue. A workaround seems to
> be adding some sleeps before close.
>
> I'm going to submit a pull request to resolve the issue.
>
> Thanks
> Min
>
> On Jan 27, 2013, at 12:42 AM, Yu Dongmin <[email protected]> wrote:
>
>> Hi,
>>
>> My guess was that there might be an issue in libzmq (the zeromq C library)
>> when large messages are sent heavily.
>>
>> Thanks
>> Min
>>
>> On Jan 26, 2013, at 4:01 PM, Ritesh Adval <[email protected]> wrote:
>>
>>> Hi Min,
>>>
>>> Thanks for the update. Just to confirm:
>>> are you saying that this issue is in the zeromq C library or in the jzmq
>>> C wrapper?
>>>
>>> Just an update: when I replaced the DEALER socket that connects to the
>>> ROUTER socket of the broker with a REQ socket, and replaced the DEALER
>>> socket that connects to the DEALER socket of the broker with a REP socket,
>>> I no longer see message loss when running the same test. (The REQ socket
>>> does "send" and then "recv", and the REP socket does the opposite, "recv"
>>> and then "send".)
>>>
>>> -Ritesh
>>> Sent from my iPhone.
>>>
>>>
>>> On Jan 25, 2013, at 8:42 PM, Min <[email protected]> wrote:
>>>
>>>> I was able to reproduce the issue on jzmq, even on zeromq 3.2.2.
>>>>
>>>> What I discovered is that about the last 30K bytes of a 45K message were
>>>> sometimes not delivered to the in-router on raw close.
>>>> I didn't build equivalent C code, but as jzmq is a thin wrapper around
>>>> the native C library, it could have the same problem.
>>>>
>>>> But I haven't found a clear solution yet.
>>>>
>>>> Thanks
>>>> Min
>>>>
>>>>
>>>> On Thu, Jan 24, 2013 at 6:39 AM, Ritesh Adval <[email protected]>
>>>> wrote:
>>>> Hello,
>>>>
>>>> I have created a bug for this issue with instructions and a Java test
>>>> case. It's at https://zeromq.jira.com/browse/LIBZMQ-497
>>>>
>>>> Thanks
>>>> Ritesh
>>>>
>>>>
>>>> On Tue, Jan 22, 2013 at 6:30 PM, Ritesh Adval <[email protected]>
>>>> wrote:
>>>> Thanks Min,
>>>>
>>>> I will create a bug with instructions and a unit test. I was also
>>>> experimenting with the Java-only version of zeromq
>>>> (https://github.com/zeromq/jeromq). When running the same test it does
>>>> not drop messages but has some other issue.
>>>>
>>>> -Ritesh
>>>>
>>>>
>>>> On Mon, Jan 21, 2013 at 11:53 PM, Min <[email protected]> wrote:
>>>> Ritesh,
>>>>
>>>> If you can reproduce the problem, Java code should be fine.
>>>>
>>>> The community could look into it.
>>>>
>>>> Thanks
>>>> Min
>>>>
>>>> On Thursday, January 17, 2013, Ritesh Adval wrote:
>>>>
>>>> Hi Charles,
>>>>
>>>> I have a test program in Java. I am not a C programmer, so it will
>>>> probably take me some time to reproduce this in C. Can someone first take
>>>> a look at my Java program to check that I am not doing anything stupid?
>>>> Should I create a bug and attach the Java Maven project?
>>>> It's very easy to run: all you need is zeromq 2.2.0 installed and jzmq
>>>> built and installed (https://github.com/zeromq/jzmq).
>>>> I can add instructions to the bug report. Once it is confirmed that the
>>>> program looks right, I can try to create a C version of the test, but it
>>>> will take me some time.
>>>>
>>>> Let me know.
>>>>
>>>> Thanks
>>>> Ritesh
>>>>
>>>>
>>>> On Wed, Jan 16, 2013 at 10:55 PM, Charles Remes <[email protected]>
>>>> wrote:
>>>> On Jan 16, 2013, at 4:08 PM, Ritesh Adval <[email protected]> wrote:
>>>>
>>>> > Hi Charles,
>>>> >
>>>> > Yes, I close the socket in my thread after sending 100 messages, and I
>>>> > expect that LINGER will make sure the messages are sent to the other
>>>> > end. I expected that context termination would block and make sure any
>>>> > pending messages are sent, but that's not happening: context
>>>> > termination returns quickly.
>>>> >
>>>> > Just now I tried again in my unit test, setting LINGER to
>>>> > Integer.MAX_VALUE explicitly on all my sockets, and ran the test again.
>>>> > It still failed with messages getting dropped.
>>>> >
>>>> > The interesting thing is that only the 100th message (the last one)
>>>> > from some of my concurrent threads is getting dropped.
>>>>
>>>> Time to show someone the code. That's the easiest way to figure it out.
>>>> If you can reproduce this in C, that will get a lot more attention.
>>>>
>>>> Here's how to open an issue:
>>>>
>>>> http://www.zeromq.org/docs:issue-tracking
>>>>
>>>> cr
>>>>
>>>> _______________________________________________
>>>> zeromq-dev mailing list
>>>> [email protected]
>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
_______________________________________________ zeromq-dev mailing list [email protected] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
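
The failure mode Min describes in the thread (a large message needs several out_event calls, and the engine can be terminated before they all run) can be pictured with a small toy model. This is only a sketch in plain Java and is not libzmq or jzmq code: the class ToyStreamEngine, its methods, and the 8192-byte chunk size are all hypothetical, chosen purely for illustration.

```java
import java.io.ByteArrayOutputStream;

// Toy model only: this is NOT libzmq code. The class, its methods and the
// chunk size are hypothetical and exist just to illustrate the thread.
final class ToyStreamEngine {
    private final byte[] message;  // one large outbound message
    private final int chunkSize;   // bytes written per outEvent() call (assumed)
    private final ByteArrayOutputStream peer = new ByteArrayOutputStream(); // stands in for the remote peer
    private int offset = 0;
    private boolean terminated = false;

    ToyStreamEngine(byte[] message, int chunkSize) {
        this.message = message;
        this.chunkSize = chunkSize;
    }

    /** Writes at most one chunk; returns true while more data remains to send. */
    boolean outEvent() {
        if (terminated || offset >= message.length) {
            return false;
        }
        int n = Math.min(chunkSize, message.length - offset);
        peer.write(message, offset, n);
        offset += n;
        return offset < message.length;
    }

    /** Models the engine being torn down; any unsent tail is never written. */
    void terminate() {
        terminated = true;
    }

    int delivered() {
        return peer.size();
    }

    int lost() {
        return message.length - peer.size();
    }

    public static void main(String[] args) {
        // Early close: the engine is terminated after only two outEvent() calls.
        ToyStreamEngine early = new ToyStreamEngine(new byte[45_000], 8_192);
        early.outEvent();
        early.outEvent();
        early.terminate();
        System.out.println("early close:   delivered=" + early.delivered() + " lost=" + early.lost());

        // Drained close: keep calling outEvent() until nothing is left, then terminate.
        ToyStreamEngine drained = new ToyStreamEngine(new byte[45_000], 8_192);
        while (drained.outEvent()) {
            // each iteration sends one more chunk
        }
        drained.terminate();
        System.out.println("drained close: delivered=" + drained.delivered() + " lost=" + drained.lost());
    }
}
```

In this model the tail of the message is lost whenever terminate() runs before the send loop has drained, regardless of how long anything waits afterwards. That is consistent with Min's observation that a longer linger value does not help, while delaying the close (the sleep workaround) gives the remaining out_event calls time to run.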
