Sorry, Steve, that's too much to read and do and digest in the random fractions of time we have to share here.
Can you cut this down to a single C program that does the client and server and sets the keys and reproduces the issue? You can use CZMQ or the lower level API. You'll see examples of multi-thread client/server security test cases in the zauth.c selftest method.

-Pieter

Ps. your fixed font makes the email wrap weirdly and be harder to read.

On Tue, Apr 15, 2014 at 4:44 PM, Steve Murphy <[email protected]> wrote:
> Hello!
>
> (This message looks best if viewed in a WIDE reader window!)
>
> It looks to me like there's a subtle bug in zmq, but... I've thought
> that generally many times before, only to find it was ME the whole time.
>
> I've reproduced the lack of encryption between two of my apps, into
> simpler files serveriron and clientiron. The way they use encryption is
> modeled after the ironhouse.c that is used as example by the zmq folks.
>
> I can reproduce the problems in these simple test cases, and I think I
> have enough evidence to report a bug, but... I'm very "human". Maybe you
> folks can spot what I'm doing wrong.
>
> I'm getting strangeness when I try to work with encrypted communications
> (via curve).
>
> Find attached my simple test cases. To run it, untar the attached tar
> file "security-blog.tar.gz", cd into the newly created security-blog dir,
> and type "make test". It will use the currently installed libzmq and
> libczmq's to build the apps. serveriron and clientiron are just the two
> halves you get when you rip the ironhouse example apart to run in
> separate executables, and put the certs in the executables, instead of
> generating new certs every time. Oh, and you use the certstore also for
> those two public cert files. The Makefile will set things up.
>
> I also run a 'littletest' that just calls the zauth self-test.
> (Had probs with CentOS 6.5)
>
> I also run "ironhouse2" which is the same as the ironhouse example,
> except it uses fixed certs for both client and server, and grabs the
> public server cert and feeds that to the client. It basically mirrors
> all the actions of the serveriron/clientiron processes, and runs them
> together in the same process, like ironhouse does.
>
> I've tried it with various combinations of czmq and zeromq/libzmq. The
> file "testscript" will build and install libzmq in versions 4.0.3, 4.0.4,
> and the current git version; and also czmq in versions 2.0.3, 2.1.0, and
> the current git latest version. It will run all 9 permutations of these.
>
> I have run testscript on a CentOS 6.5 system, and an Ubuntu 13.10 system.
> There are some differences in behavior, but generally, they give the same
> results.
>
> Here's what I see:
>
> for each czmq/zmq combo, "split" means ironhouse split into separate
> client/server processes, with both certstore and compiled-in certs.
>
> "lt" is "littletest" and represents running just the zauth self-test
> routine.
>
> "i2" is "ironhouse2", which is basically "split" remerged into a single
> process.
>
> "i3" is "ironhouse3", which is basically "split" remerged into a single
> process, but the client and server have their own contexts/zauth.
>
> "i3x" is "ironhouse3x", which is ironhouse3, except we swap the order of
> socket instantiation, so the client socket is instantiated before the
> server, which mimics real-life (split) behavior. After all, we do want to
> get the message from the server, so we have to start the client first, so
> when the server sends the hello, we are up and ready to receive it on the
> client side. Waiting 5 sec between running the client and server seems to
> be optimal. Longer yields no better result, shorter yields less (or so it
> seems).
>
> "i3s" is where I copy ironhouse3.c into cli-ih3.c and ser-ih3.c, and, in
> the cli-ih3.c, I remove all the server code, and in ser-ih3.c, I remove
> all the client code. This is like split, but arrived at step by step
> (just in case I missed something in split).
>
>                   LIBZMQ
>                   4.0.3                                4.0.4                                libzmq-git-latest
>
> CZMQ 2.0.3        split: runs OK, but no encryption!   split: runs OK, but no encryption!   czmq 2.0.3 doesn't compile
>                   lt : assert. fail (CentOS)           lt : assert. fail (CentOS)           on this libzmq.
>                   i2 : runs, but no encryption!        i2 : runs, but no encryption!
>                   i3 : runs, but no encryption!        i3 : runs, but no encryption!
>                   i3x: runs w/o encryption             i3x: runs w/o encryption
>                   i3s: runs w/o encryption             i3s: runs w/o encryption
>
> CZMQ 2.1.0        split: client hangs                  split: client hangs                  czmq 2.1.0 doesn't compile
>                   lt: runs OK                          lt: runs OK                          on this libzmq.
>                   i2: runs OK                          i2: runs OK
>                   i3: runs OK                          i3: runs OK
>                   i3x: runs OK                         i3x: runs OK
>                   i3s: client hangs                    i3s: client hangs
>
> CZMQ git latest   split: client hangs                  split: client hangs                  split: client hangs
>                   lt: OK                               lt: OK                               lt: OK
>                   i2: OK                               i2: OK                               i2: OK
>                   i3: OK                               i3: OK                               i3: OK
>                   i3x: OK                              i3x: OK                              i3x: OK
>                   i3s: client hangs                    i3s: client hangs                    i3s: client hangs
>
> Notes:
>
> When I say "client hangs" it means that I wait in zstr_recv() forever. I
> can run the server multiple times. The server never says "I: "-anything,
> so even if the client gets the message, I'd expect no encryption. I have
> found that if I repeat the tests, I can occasionally have the client get
> the hello; but this is not very frequent, and when it does, I get no
> encryption. Probability <20% maybe < 10%. I played with the sleep time
> between firing off the client and starting the server, and it *seems* I
> get better probability with sleep = 5 sec, but... I have not done a large
> statistics exercise, it could just be anecdotal, serendipitous type stuff.
> I did notice that binding to 127.0.0.1 instead of * increases the
> probability of the split/ih3 cases having the client receive the
> unencrypted message. But never at 100%. And never encrypted.
>
> zauth seems to have noticeable problems in czmq-2.0.3. How irontest works
> there, but irontest2 does not, would be extremely interesting to resolve.
> (This is on CentOS 6.5; Ubuntu didn't seem to notice. This might be a
> compiler/linker problem, as CentOS is NOT cutting edge versions!)
>
> In 2.1.0 (and higher), split's client hangs on the recv call, but
> ironhouse2's client does not. The only difference I can spot is that in
> ironhouse2, both client and server share the same context and process...
>
> So, I created ironhouse3, which is like ironhouse2, but server and client
> each use a different zctx. Since this also works, I have to conclude that
> split and i3 have only 2 differences: the order in which the
> connect/binds are done, and different processes.
>
> So, I created ironhouse3x, which swaps the order. Still works. What's
> left? Different processes. Is that crazy?
>
> But, perhaps, I'm missing something. Maybe it's a bug in my stuff, right
> under my own nose, but I can't spot it.
>
> So, at the moment, I think I've ruled out:
> A. My "REAL" application uses ROUTER sockets and has these problems, but
>    these test sets use PUSH/PULL, and demonstrate the problem, so it
>    doesn't look socket-type related.
> B. not connect/bind order dependent.
> C. not common/different zctx dependent.
> D. not compiled-in vs file cert dependent.
>
> Any help that anyone can give, will be highly appreciated!
>
>
> murf
>
> --
>
> Steve Murphy
> ParseTree Corporation
> 57 Lane 17
> Cody, WY 82414
> ✉ murf at parsetree dot com
> ☎ 307-899-5535
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
