As requested, I’ve created a ticket and updated the branch with the latest code and a perf/README.txt explaining how to run it (basically the instructions below):
https://github.com/zeromq/libzmq/issues/757

On Nov 10, 2013, at 13:08, Bruno D. Rodrigues <[email protected]> wrote:

> I’ve branched the code to add the proxy code for testing:
> https://github.com/davipt/libzmq/tree/fix-002-proxy_lat_thr
>
> This now allows me:
>
> 1. current PUSH/PULL end-to-end test:
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 &
> local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=0 check=0 connect=0
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 &
> remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=0 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 1380100 [msg/s]
> mean throughput: 5520.400 [Mb/s]
>
> 2. PUB/SUB end-to-end test:
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 1 &
> local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=1 check=0 connect=0
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 1 &
> remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 type=1 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 971666 [msg/s]
> mean throughput: 3886.664 [Mb/s]
>
> 3. the same test via zmq_proxy, switching local_thr from bind to connect:
>
> idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 &
> Proxy type=PULL|PUSH in=tcp://*:8881 out=tcp://*:8882
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 32 &
> local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=10000000 type=32 check=0 connect=32
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 &
> remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 message-count=10000000 type=0 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 92974 [msg/s]
> mean throughput: 371.896 [Mb/s]
>
> 4. the same test via the proxy and PUB/SUB, including checking that every message arrives (*):
>
> idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 1 &
> Proxy type=XSUB|XPUB in=tcp://*:8881 out=tcp://*:8882
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 49 &
> local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=10000000 type=49 check=16 connect=32
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 17 &
> remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 message-count=10000000 type=17 check=16
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 88721 [msg/s]
> mean throughput: 354.884 [Mb/s]
>
> (*) if check is enabled on remote_thr, each message of size > 16 carries a counter. local_thr then verifies that the counters arrive in the expected order, without losing any message. That is why remote_thr needs to raise the HWM and sleep for one second in the PUB/SUB case.
>
> So, then again, what is happening with the zmq_proxy?
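For context on the check flag used in test 4: it boils down to stamping a sequence number into each outgoing message and verifying it on the receiving side. The sketch below only illustrates that idea; it is not the actual code on the fix-002 branch, the helper names are made up, and only the standard libzmq message API is assumed.

    /* Illustrative sketch of the check mechanism (not the code from the
       fix-002-proxy_lat_thr branch).  The sender stamps a sequence number
       into the body of every message large enough to carry one; the
       receiver verifies the numbers arrive in order with no gaps. */
    #include <stdint.h>
    #include <string.h>
    #include <zmq.h>

    /* Sender side: stamp message number 'seq' into the body. */
    static void stamp_counter (zmq_msg_t *msg, uint64_t seq)
    {
        if (zmq_msg_size (msg) > 16)
            memcpy (zmq_msg_data (msg), &seq, sizeof seq);
    }

    /* Receiver side: returns 0 if the message carries the expected number
       (or is too small to carry one), -1 on a lost or reordered message. */
    static int check_counter (zmq_msg_t *msg, uint64_t expected)
    {
        uint64_t seq;
        if (zmq_msg_size (msg) <= 16)
            return 0;
        memcpy (&seq, zmq_msg_data (msg), sizeof seq);
        return seq == expected ? 0 : -1;
    }

The HWM increase and the one-second sleep matter for PUB/SUB because messages published before the subscription propagates are silently dropped, and would otherwise show up as false losses.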
> On Nov 7, 2013, at 22:15, Bruno D. Rodrigues <[email protected]> wrote:
>
>> I’ve been testing a lot of combinations of ZeroMQ over Java, between the
>> pure jeromq base and the jzmq JNI binding to the libzmq C code. My
>> impression so far is that jeromq is considerably faster than the binding -
>> not that the binding code isn’t good, but the JNI hop seems to slow
>> everything down - and at a certain point I needed a simple zmq_proxy
>> network node, where I was fairly sure the C code would be faster than
>> jeromq. I have some ideas that could improve the jeromq proxy code, but it
>> was easier to just compile the zmq_proxy code from the book.
>>
>> Unfortunately something went completely wrong on my side, so I need your
>> help to understand what is happening here.
>>
>> Context:
>> Mac OS X Mavericks fully updated, MacBook Pro i7, 4x2 CPUs at 2.2GHz, 16GB
>> libzmq from git head
>> (same for jeromq and libzmq, albeit I’m using my own forks so I can send
>> pull requests back)
>> My data is JSON lines, ranging from about 100 bytes up to some multi-MB
>> exceptions, but the average over those million messages is about 500 bytes.
>>
>> Test 1: pure local_thr and remote_thr:
>>
>> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8881 500 1000000 &
>> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
>> real 0m0.732s
>> user 0m0.516s
>> sys 0m0.394s
>> message size: 500 [B]
>> message count: 1000000
>> mean throughput: 1418029 [msg/s]
>> mean throughput: 5672.116 [Mb/s]
>>
>> Test 2: change local_thr to connect instead of bind, and put a proxy in
>> the middle. The proxy is the first C code example from the book, available
>> here: https://gist.github.com/davipt/7361477
>>
>> iDavi:c bruno$ gcc -o proxy proxy.c -I /usr/local/include/ -L /usr/local/lib/ -lzmq
>> iDavi:c bruno$ ./proxy tcp://*:8881 tcp://*:8882 1
>> Proxy type=PULL/PUSH in=tcp://*:8881 out=tcp://*:8882
>>
>> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
>> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
>> iDavi:perf bruno$ message size: 500 [B]
>> message count: 1000000
>> mean throughput: 74764 [msg/s]
>> mean throughput: 299.056 [Mb/s]
>>
>> real 0m10.358s
>> user 0m0.668s
>> sys 0m0.508s
>>
>> Test 3: use the jeromq equivalent of the proxy:
>> https://gist.github.com/davipt/7361623
>>
>> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
>> [1] 15816
>> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
>> [2] 15830
>> iDavi:perf bruno$
>> real 0m3.429s
>> user 0m0.654s
>> sys 0m0.509s
>> message size: 500 [B]
>> message count: 1000000
>> mean throughput: 293532 [msg/s]
>> mean throughput: 1174.128 [Mb/s]
>>
>> This performance out of Java is OK-ish; it is here just for comparison,
>> and I will spend some time looking at it.
>>
>> The core question is the C proxy: why is it 10 times slower than the
>> no-proxy version?
>>
>> One thing I noticed, by coincidence, is that on the upstream side of the
>> proxy, with both the C producer and the Java one, tcpdump consistently
>> shows packets of 16332 bytes (or the MTU size when using Ethernet, 1438 I
>> think). This value is consistent across all four combinations of producers
>> and proxies (jeromq vs. C).
>>
>> But on the other side of the proxy the result is completely different:
>> with the jeromq proxy I see packets of 8192 bytes, but with the C code I
>> see packets of either 509 or 1010 bytes. It feels as if the proxy is
>> sending the messages one by one. Again, this value is consistent whether
>> the PULL consumer behind the proxy is C or Java.
>>
>> So this is something on the “backend” socket side of zmq_proxy.
>>
>> I also see quite similar behavior with a PUB - [XSUB+proxy+XPUB] - SUB
>> version.
>>
>> What do I need to tweak in proxy.c?
>>
>> Thanks in advance
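For readers without the gists open, the proxy being benchmarked is essentially the zmq_proxy example from the guide. The following is a reconstruction of its rough shape rather than the exact gist code, assuming libzmq 3.2+ for zmq_ctx_new and zmq_proxy:

    /* Minimal reconstruction of the proxy under test: bind two sockets and
       let zmq_proxy() shovel messages between them.  PULL/PUSH by default;
       a third argument switches to XSUB/XPUB for the pub/sub variant. */
    #include <stdio.h>
    #include <zmq.h>

    int main (int argc, char *argv [])
    {
        if (argc < 3) {
            fprintf (stderr, "usage: proxy <in-endpoint> <out-endpoint> [pubsub]\n");
            return 1;
        }
        int pubsub = argc > 3;

        void *ctx = zmq_ctx_new ();
        void *frontend = zmq_socket (ctx, pubsub ? ZMQ_XSUB : ZMQ_PULL);
        void *backend = zmq_socket (ctx, pubsub ? ZMQ_XPUB : ZMQ_PUSH);
        zmq_bind (frontend, argv [1]);   /* e.g. tcp://*:8881 */
        zmq_bind (backend, argv [2]);    /* e.g. tcp://*:8882 */

        printf ("Proxy type=%s in=%s out=%s\n",
            pubsub ? "XSUB|XPUB" : "PULL|PUSH", argv [1], argv [2]);

        zmq_proxy (frontend, backend, NULL);   /* blocks forever */

        zmq_close (frontend);
        zmq_close (backend);
        zmq_ctx_destroy (ctx);
        return 0;
    }

It builds exactly as shown above: gcc -o proxy proxy.c -lzmq.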
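As for what one might tweak while the small-write question stays open: one hypothetical diagnostic is to enlarge the backend socket’s kernel send buffer and send HWM before calling zmq_proxy, then watch whether the write sizes on the wire change. The values below are arbitrary, and this is an experiment, not a known fix:

    /* Hypothetical diagnostic, placed before the zmq_proxy() call in the
       sketch above: enlarge the backend's kernel send buffer (SO_SNDBUF)
       and its send HWM, then re-run tcpdump on the downstream side. */
    int sndbuf = 256 * 1024;    /* bytes, handed to SO_SNDBUF */
    int sndhwm = 100000;        /* messages queued before blocking */
    zmq_setsockopt (backend, ZMQ_SNDBUF, &sndbuf, sizeof sndbuf);
    zmq_setsockopt (backend, ZMQ_SNDHWM, &sndhwm, sizeof sndhwm);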
