I’ve created a branch that adds the proxy code used for these tests:
https://github.com/davipt/libzmq/tree/fix-002-proxy_lat_thr

This now allows me to run the following:

1. current PUSH/PULL end-to-end test:

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 &
local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 
type=0 check=0 connect=0

idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 &
remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 
message-count=10000000 type=0 check=0

message size: 500 [B]
message count: 10000000
mean throughput: 1380100 [msg/s]
mean throughput: 5520.400 [Mb/s]

2. PUB/SUB end-to-end test:

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 1 &
local_thr bind-to=tcp://127.0.0.1:5555 message-size=500 message-count=10000000 
type=1 check=0 connect=0

idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 1 &
remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500 
message-count=10000000 type=1 check=0

message size: 500 [B]
message count: 10000000
mean throughput: 971666 [msg/s]
mean throughput: 3886.664 [Mb/s]

3. the same test via zmq_proxy, switching local_thr from bind to connect:

idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 &
Proxy type=PULL|PUSH in=tcp://*:8881 out=tcp://*:8882

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 32 &
local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=10000000 
type=32 check=0 connect=32

idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 &
remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 
message-count=10000000 type=0 check=0

message size: 500 [B]
message count: 10000000
mean throughput: 92974 [msg/s]
mean throughput: 371.896 [Mb/s]

4. the same test via the proxy and PUB/SUB, this time checking that every 
message arrives (*)

idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 1 &
Proxy type=XSUB|XPUB in=tcp://*:8881 out=tcp://*:8882

idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 49 &
local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=10000000 
type=49 check=16 connect=32

idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 17 &
remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500 
message-count=10000000 type=17 check=16

message size: 500 [B]
message count: 10000000
mean throughput: 88721 [msg/s]
mean throughput: 354.884 [Mb/s]

(*) When check is enabled on remote_thr, each message (provided size > 16) 
carries a counter. local_thr then verifies that the counters arrive in the 
expected order and that no message is lost. This is also why remote_thr needs 
to increase the HWM and sleep for one second in the PUB/SUB case.
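
In case it helps to read the numbers above, the check is conceptually the
sketch below (a minimal illustration of the idea, not the exact code from the
branch; the function names and the counter-at-offset-0 layout are my
assumptions):

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <zmq.h>

/* Sender side (remote_thr with check enabled): stamp an increasing 64-bit
 * counter into the start of each payload (only meaningful when
 * message_size > 16, as in the tests above). */
static void send_checked (void *socket, size_t message_size, int message_count)
{
    uint64_t counter = 0;
    for (int i = 0; i != message_count; i++) {
        zmq_msg_t msg;
        zmq_msg_init_size (&msg, message_size);
        memset (zmq_msg_data (&msg), 0, message_size);
        counter++;
        memcpy (zmq_msg_data (&msg), &counter, sizeof counter);
        zmq_msg_send (&msg, socket, 0);
    }
}

/* Receiver side (local_thr with check enabled): verify the counters arrive
 * strictly in order, so any dropped or reordered message is reported. */
static void recv_checked (void *socket, int message_count)
{
    uint64_t expected = 0;
    for (int i = 0; i != message_count; i++) {
        zmq_msg_t msg;
        zmq_msg_init (&msg);
        zmq_msg_recv (&msg, socket, 0);
        uint64_t counter;
        memcpy (&counter, zmq_msg_data (&msg), sizeof counter);
        expected++;
        if (counter != expected)
            fprintf (stderr, "check failed: expected %llu, got %llu\n",
                     (unsigned long long) expected,
                     (unsigned long long) counter);
        zmq_msg_close (&msg);
    }
}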


So, once again: what is happening with zmq_proxy?
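
For anyone trying to reproduce this without the branch, the proxy in these
runs boils down to the stock zmq_proxy pattern below (a minimal sketch with
default socket options and no error handling; the actual code is in the gist
and in the branch linked above):

#include <stdio.h>
#include <zmq.h>

/* Minimal PULL->PUSH proxy: bind the frontend and backend, then let
 * zmq_proxy() shuttle messages between them until the context terminates.
 * Swapping ZMQ_PULL/ZMQ_PUSH for ZMQ_XSUB/ZMQ_XPUB gives the PUB/SUB
 * variant used in test 4. */
int main (int argc, char *argv [])
{
    if (argc != 3) {
        fprintf (stderr, "usage: proxy <frontend-endpoint> <backend-endpoint>\n");
        return 1;
    }

    void *ctx = zmq_ctx_new ();
    void *frontend = zmq_socket (ctx, ZMQ_PULL);
    void *backend = zmq_socket (ctx, ZMQ_PUSH);

    zmq_bind (frontend, argv [1]);   /* e.g. tcp://*:8881 */
    zmq_bind (backend, argv [2]);    /* e.g. tcp://*:8882 */

    zmq_proxy (frontend, backend, NULL);   /* blocks until zmq_ctx_term () */

    zmq_close (frontend);
    zmq_close (backend);
    zmq_ctx_term (ctx);
    return 0;
}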
 



On Nov 7, 2013, at 22:15, Bruno D. Rodrigues <[email protected]> wrote:

> I’ve been testing a lot of ZeroMQ combinations over Java, between the pure 
> jeromq implementation and the jzmq JNI binding over the C libzmq. So far my 
> impression is that jeromq is much faster than the binding - not that the 
> code isn’t great, but my feeling is that the JNI hop slows everything down. 
> At a certain point I needed a simple zmq_proxy network node, and I was 
> pretty sure the C code would be faster than jeromq. I have some ideas that 
> could improve the jeromq proxy code, but it felt easier to just compile the 
> zmq_proxy code from the book.
> 
> Unfortunately something went completely wrong on my side so I need your help 
> to understand what is happening here.
> 
> Context:
> Mac OS X Mavericks fully updated, MacBook Pro i7 4x2 CPU @ 2.2 GHz, 16 GB RAM
> libzmq from git head
> (same for jeromq and libzmq, although I’m using my own forks so I can send 
> pull requests back)
> my data is JSON lines ranging from about 100 bytes to some multi-MB 
> exceptions, but the average over those million messages is about 500 bytes.
> 
> Test 1: pure local_thr and remote_thr:
> 
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> real  0m0.732s
> user  0m0.516s
> sys   0m0.394s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 1418029 [msg/s]
> mean throughput: 5672.116 [Mb/s]
> 
> Test 2: change local_thr to perform connect instead of bind, and put a proxy 
> in the middle.
> The proxy is the first C code example from the book, available here 
> https://gist.github.com/davipt/7361477
> iDavi:c bruno$ gcc -o proxy proxy.c -I /usr/local/include/ -L /usr/local/lib/ 
> -lzmq
> iDavi:c bruno$ ./proxy tcp://*:8881 tcp://*:8882 1
> Proxy type=PULL/PUSH in=tcp://*:8881 out=tcp://*:8882
> 
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ message size: 500 [B]
> message count: 1000000
> mean throughput: 74764 [msg/s]
> mean throughput: 299.056 [Mb/s]
> 
> real  0m10.358s
> user  0m0.668s
> sys   0m0.508s
> 
> 
> Test 3: use the jeromq equivalent of the proxy: 
> https://gist.github.com/davipt/7361623
> 
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> [1] 15816
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> [2] 15830
> iDavi:perf bruno$ 
> real  0m3.429s
> user  0m0.654s
> sys   0m0.509s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 293532 [msg/s]
> mean throughput: 1174.128 [Mb/s]
> 
> This performance out of Java is OK-ish; it’s here just for comparison, and 
> I’ll spend some time looking at it.
> 
> The core question is the C proxy - why is it 10 times slower than the 
> no-proxy version?
> 
> One thing I noticed, by coincidence, is that on the upstream side of the 
> proxy, with both the C “producer” and the Java one, tcpdump consistently 
> shows packets of 16332 bytes (or the MTU size when going over ethernet, 
> 1438 I think). This value is consistent across the 4 combinations of 
> producers and proxies (jeromq vs C).
> 
> But on the other side of the proxy the result is completely different. With 
> the jeromq proxy I see packets of 8192 bytes, but with the C proxy I see 
> packets of either 509 or 1010 bytes. It feels like the proxy is sending the 
> messages one by one. Again, this value is consistent regardless of whether 
> the PULL consumer after the proxy is C or Java.
> 
> So this is something on the proxy “backend” socket side of the zmq_proxy.
> 
> Also, I see quite similar behavior with a PUB - [XSUB+Proxy+XPUB] - SUB 
> version.
> 
> What do I need to tweak in proxy.c?
> 
> Thanks in advance
> 
