I’ve branched the code to add the proxy code for testing: https://github.com/davipt/libzmq/tree/fix-002-proxy_lat_thr
This now allows me to run:

1. The current PUSH/PULL end-to-end test:

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 &
local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=0 check=0 connect=0
idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 &
remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=0 check=0
message size: 500 [B]
message count: 10000000
mean throughput: 1380100 [msg/s]
mean throughput: 5520.400 [Mb/s]

2. The PUB/SUB end-to-end test:

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 1 &
local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=1 check=0 connect=0
idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 1 &
remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=1 check=0
message size: 500 [B]
message count: 10000000
mean throughput: 971666 [msg/s]
mean throughput: 3886.664 [Mb/s]

3. The same test via zmq_proxy, by switching local_thr from bind to connect:

idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 &
Proxy type=PULL|PUSH in=tcp://*:8881 out=tcp://*:8882
idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 32 &
local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=100000 type=32 check=0 connect=32
idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 &
remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 message-count=10000000 type=0 check=0
message size: 500 [B]
message count: 10000000
mean throughput: 92974 [msg/s]
mean throughput: 371.896 [Mb/s]

4. The same test via the proxy with PUB/SUB, including a check that every message arrives (*):

idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 1 &
Proxy type=XSUB|XPUB in=tcp://*:8881 out=tcp://*:8882
idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 49 &
local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=10000000 type=49 check=16 connect=32
idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 17 &
remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 message-count=10000000 type=17 check=16
message size: 500 [B]
message count: 10000000
mean throughput: 88721 [msg/s]
mean throughput: 354.884 [Mb/s]

(*) If check is enabled on remote_thr and the message size is greater than 16, each message carries a counter. local_thr then verifies that the counters arrive in the expected order and that no message is lost. This is why remote_thr needs to raise the HWM and sleep for one second in the PUB/SUB case.

So, once again: what is happening with zmq_proxy?

On Nov 7, 2013, at 22:15, Bruno D. Rodrigues <[email protected]> wrote:

> I’ve been testing a lot of combinations of ZeroMQ over Java, between the pure
> jeromq base and the jzmq JNI libzmq C code. Albeit my impression so far is
> that jeromq is way faster than the binding - not that the code isn’t great,
> but my feeling so far is that the JNI jump slows everything down - at a
> certain point I felt the need for a simple zmq_proxy network node and I was
> pretty sure that the C code must be faster than the jeromq. I have some ideas
> that can improve the jeromq proxy code, but it felt easier to just compile
> the zmq_proxy code from the book.
>
> Unfortunately something went completely wrong on my side so I need your help
> to understand what is happening here.
>
> Context:
> MacOSX Mavericks fully updated, MBPro i7 4x2 CPU 2.2Ghz 16GB
> libzmq from git head
> (same for jeromq and libzmq, albeit I’m using my own fork so I can send pulls
> back)
> my data are json lines that goes from about 100 bytes to some multi MB
> exceptions, but the average of those million messages is about 500bytes.
>
> Test 1: pure local_thr and remote_thr:
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> real 0m0.732s
> user 0m0.516s
> sys 0m0.394s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 1418029 [msg/s]
> mean throughput: 5672.116 [Mb/s]
>
> Test 2: change local_thr to perform connect instead of bind, and put a proxy
> in the middle.
> The proxy is the first C code example from the book, available here
> https://gist.github.com/davipt/7361477
> iDavi:c bruno$ gcc -o proxy proxy.c -I /usr/local/include/ -L /usr/local/lib/ -lzmq
> iDavi:c bruno$ ./proxy tcp://*:8881 tcp://*:8882 1
> Proxy type=PULL/PUSH in=tcp://*:8881 out=tcp://*:8882
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ message size: 500 [B]
> message count: 1000000
> mean throughput: 74764 [msg/s]
> mean throughput: 299.056 [Mb/s]
>
> real 0m10.358s
> user 0m0.668s
> sys 0m0.508s
>
>
> Test3: use the jeromq equivalent of the proxy:
> https://gist.github.com/davipt/7361623
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> [1] 15816
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> [2] 15830
> iDavi:perf bruno$
> real 0m3.429s
> user 0m0.654s
> sys 0m0.509s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 293532 [msg/s]
> mean throughput: 1174.128 [Mb/s]
>
> This performance coming out of Java is okish, it’s here just for comparison,
> and I’ll spend some time looking at it.
>
> The core question is the C proxy - why 10 times slower than the no-proxy
> version?
>
> One thing I noticed, by coincidence, is that on the upper side of the proxy,
> both with the C “producer” as well as the java one, tcpdump shows me
> consistently packets of 16332 (or the MTU size if using ethernet, 1438 I
> think). This value is consistent for the 4 combinations of producers and
> proxies (jeromq vs c).
>
> But on the other side of the proxy, the result is completely different. With
> the jeromq proxy, I see packets of 8192 bytes, but with the C code I see
> packets of either 509 or 1010. It feels like the proxy is sending the
> messages one by one. Again, this value is consistent with the PULL consumer
> after the proxy, being it C or java.
>
> So this is something on the proxy “backend” socket side of the zmq_proxy.
>
> Also, I see quite similar behavior with a PUB - [XSUB+Proxy+XPUB] - SUB
> version.
>
> What do I need to tweak on the proxy.c ?
>
> Thanks in advance
>
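
For reference, the counter check mentioned in the (*) note can be sketched roughly as below. This is only an illustration of the idea, not the code in the branch: the function names, the 8-byte counter, its position at the start of the payload, and the CHECK_MIN_SIZE constant are all assumptions made for the example. The sender stamps a sequence number into each message larger than 16 bytes, and the receiver compares it against the value it expects next.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed threshold: the counter is only present when size > 16. */
#define CHECK_MIN_SIZE 16

/* Sender side (hypothetical helper): copy the sequence number into the
   first 8 bytes of the payload. Returns 1 if the message was stamped,
   0 if it is too small to carry a counter. */
int stamp_counter(unsigned char *buf, size_t size, uint64_t seq)
{
    if (size <= CHECK_MIN_SIZE)
        return 0;
    memcpy(buf, &seq, sizeof seq);
    return 1;
}

/* Receiver side (hypothetical helper): read the counter back and compare
   it with the expected value. Returns 0 and advances *expected on a match,
   -1 on a lost or reordered message, 1 if the message is too small to
   carry a counter. */
int verify_counter(const unsigned char *buf, size_t size, uint64_t *expected)
{
    uint64_t seq;

    if (size <= CHECK_MIN_SIZE)
        return 1;
    memcpy(&seq, buf, sizeof seq);
    if (seq != *expected)
        return -1;
    (*expected)++;
    return 0;
}
```

In a sketch like this, the sender would call stamp_counter() on each buffer before sending, and the receiver would call verify_counter() on each received buffer with a running expected value starting at 0. A single dropped message shows up as a -1 on the next one, which is why the PUB/SUB sender has to avoid drops at start-up and at the HWM.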
