OK, I found it. I started tracing how CURVE works underneath, and discovered something I hadn't considered: the encryption process is asynchronous. It occurred to me that I was terminating the server immediately after it wrote the data to the client. So I added a 1-second sleep between writing the "hello" to the client and destroying everything and exiting.
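For anyone who finds this later: a fixed sleep is the crudest way to hold the server back. A rough sketch of a more dependable approach is to block until the client acknowledges receipt, with a timeout as a fallback, and only then tear everything down. The sketch below is plain POSIX (poll() and read() on a file descriptor) standing in for a ZeroMQ socket, and wait_for_ack() is a hypothetical helper name of mine, not a CZMQ or libzmq call.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (NOT part of CZMQ or libzmq): block until one
   acknowledgement byte arrives on fd, or until timeout_ms elapses.
   Returns  1 if an acknowledgement byte was read,
            0 if the timeout expired first,
           -1 on error, or if the peer closed without sending anything. */
int wait_for_ack(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0)
        return 0;                   /* nothing arrived in time */
    if (rc < 0)
        return -1;                  /* poll itself failed */

    char ack;
    ssize_t n = read(fd, &ack, 1);  /* n == 0 means the peer hung up */
    return n == 1 ? 1 : -1;
}
```

In the server, the idea would be to call something like this after sending the hello and before destroying the auth, cert, and context. With ZeroMQ itself the same shape would need a second socket and a polled receive rather than a raw file descriptor.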
And everything works with cli-ih3.c and ser-ih3.c.

I then timed how long I was waiting in recv in ironhouse3.c: it averaged 2 msec, with a few runs as high as 7 msec (over 1000 runs). That wait held the server back from terminating the connection, and allowed the data to reach the client. When I split the client and server into two processes, that forced wait was lost.

The equivalent would be to open another socket and send a message to the server after the client got its data. If the server waited for such an acknowledgement to arrive before destroying the auth, cert, and context, it would not chop off the data still in flight.

I used zclock_sleep(4), and over 1000 runs no problems occurred. Of course, there are all sorts of ways to hold back the server, and a fixed wait is the least dependable way to do it, but in a pinch, with a generous enough wait, you can raise the probability of success to somewhere near 100%. Or, at least, you HOPE it will!

So, my apologies for jumping to conclusions; perhaps if anyone else hits these problems, they will be lucky enough to find this correspondence. Perhaps my fixed-up versions of cli-ih3.c and ser-ih3.c could be included somewhere as an example, with a well-commented line containing the zclock_sleep() call to explain why the wait is necessary? A missed little detail like this really tripped me up; maybe 1 in 1000 will follow my path....

murf

On Tue, Apr 15, 2014 at 11:06 AM, Steve Murphy <[email protected]> wrote:

> Pieter--
>
> Basically, ironhouse3.c runs fine
> as a single executable, but if you
> split it into a client and server, (cli-ih3.c, ser-ih3.c)
> it doesn't work. The data is sent
> unencrypted, if it is sent at all.
>
> Sorry about the formatting, I can't predict
> the end-viewer experience.
>
> murf
>
> On Tue, Apr 15, 2014 at 10:56 AM, Steve Murphy <[email protected]> wrote:
>
>> Pieter--
>>
>> If you want to concentrate, then do it on the
>> cli-ih3.c and ser-ih3.c, and ignore the
>> rest. If you can figure out what's going on there,
>> great. I tend to send too much, in favor of not
>> having lots of back-and-forth messages, answering
>> questions. the *-ih3.c files correspond to the
>> ironhouse3.c file, also there, if you need it.
>>
>> murf
>>
>> On Tue, Apr 15, 2014 at 9:44 AM, Pieter Hintjens <[email protected]> wrote:
>>
>>> Sorry, Steve, that's too much to read and do and digest in the random
>>> fractions of time we have to share here.
>>>
>>> Can you cut this down to a single C program that does the client and
>>> server and sets the keys and reproduces the issue? You can use CZMQ or
>>> the lower level API. You'll see examples of multi-thread client/server
>>> security test cases in the zauth.c selftest method.
>>>
>>> -Pieter
>>>
>>> Ps. your fixed font makes the email wrap weirdly and be harder to read.
>>>
>>> On Tue, Apr 15, 2014 at 4:44 PM, Steve Murphy <[email protected]> wrote:
>>> > Hello!
>>> >
>>> > (This message looks best if viewed in a WIDE reader window!)
>>> >
>>> > It looks to me like there's a subtle bug in zmq, but... I've
>>> > thought that generally many times before, only to find it was ME
>>> > the whole time.
>>> >
>>> > I've reproduced the lack of encryption between two of my apps, into
>>> > simpler files serveriron and clientiron. The way
>>> > they use encryption is modeled after the ironhouse.c that is used
>>> > as example by the zmq folks.
>>> >
>>> > I can reproduce the problems in these simple test cases, and I
>>> > think I have enough evidence to report a bug, but... I'm
>>> > very "human". Maybe you folks can spot what I'm doing wrong.
>>> >
>>> > I'm getting strangeness when I try to work with encrypted
>>> > communications (via curve).
>>> >
>>> > Find attached my simple test cases. To run it, untar the attached
>>> > tar file "security-blog.tar.gz", and cd into the newly
>>> > created security-blog dir, and type "make test". It will use the
>>> > currently installed libzmq and libczmq's to build the apps.
>>> > serveriron and clientiron are just the two halves you get
>>> > when you rip the ironhouse example apart to run in separate
>>> > executables, and put the certs in the executables, instead of
>>> > generating new certs every time. Oh, and you use the certstore
>>> > also for those two public cert files. The Makefile will set things up.
>>> >
>>> > I also run a 'littletest' that just calls the zauth self-test. (Had
>>> > probs with CentOS 6.5)
>>> >
>>> > I also run "ironhouse2" which is the same as the ironhouse
>>> > example, except it uses fixed certs for both client and server,
>>> > and grabs the public server cert and feeds that to the client.
>>> > It basically mirrors all the actions of the serveriron/clientiron
>>> > processes, and runs them together in the same process, like ironhouse
>>> > does.
>>> >
>>> > I've tried it with various combinations of czmq and zeromq/libzmq.
>>> > The file "testscript" will build and install libzmq in versions
>>> > 4.0.3, 4.0.4, and the current git version; and also czmq in versions
>>> > 2.0.3, 2.1.0, and the current git latest version. It will run all
>>> > 9 permutations of these.
>>> >
>>> > I have run testscript on a CentOS 6.5 system, and an Ubuntu 13.10
>>> > system. There are some differences in behavior, but generally, they
>>> > give the same results.
>>> >
>>> > Here's what I see:
>>> >
>>> > for each czmq/zmq combo, "split" means ironhouse split into
>>> > separate client/server processes, with both certstore and
>>> > compiled-in certs.
>>> >
>>> > "lt" is "littletest" and represents running just the zauth self-test
>>> > routine.
>>> >
>>> > "i2" is "ironhouse2", which is basically "split" remerged into a
>>> > single process.
>>> >
>>> > "i3" is "ironhouse3", which is basically "split" remerged into a
>>> > single process, but the client and server have their own
>>> > contexts/zauth.
>>> >
>>> > "i3x" is "ironhouse3x", which is ironhouse3, except we swap the order
>>> > of socket instantiation, so the client socket is instantiated before
>>> > the server, which mimics real-life (split) behavior. After all, we do
>>> > want to get the message from the server, so we have to start the
>>> > client first, so when the server sends the hello, we are up and ready
>>> > to receive it on the client side. Waiting 5 sec between running the
>>> > client and server, seems to be optimal. Longer yields no better
>>> > result, shorter yields less (or so it seems).
>>> >
>>> > "i3s" is where I copy ironhouse3.c into cli-ih3.c and ser-ih3.c, and,
>>> > in the cli-ih3.c, I remove all the server code, and in ser-ih3.c, I
>>> > remove all the client code. This is like split, but arrived at step
>>> > by step (just in case I missed something in split).
>>> >
>>> >                  LIBZMQ
>>> >                  4.0.3                         4.0.4                         libzmq-git-latest
>>> >
>>> > CZMQ 2.0.3       split: runs OK, but no        split: runs OK, but no        czmq 2.0.3 doesn't compile
>>> >                         encryption!                   encryption!            on this libzmq.
>>> >                  lt: assert. fail (CentOS)     lt: assert. fail (CentOS)
>>> >                  i2: runs, but no encryption!  i2: runs, but no encryption!
>>> >                  i3: runs, but no encryption!  i3: runs, but no encryption!
>>> >                  i3x: runs w/o encryption      i3x: runs w/o encryption
>>> >                  i3s: runs w/o encryption      i3s: runs w/o encryption
>>> >
>>> > CZMQ 2.1.0       split: client hangs           split: client hangs           czmq 2.1.0 doesn't compile
>>> >                  lt: runs OK                   lt: runs OK                   on this libzmq.
>>> >                  i2: runs OK                   i2: runs OK
>>> >                  i3: runs OK                   i3: runs OK
>>> >                  i3x: runs OK                  i3x: runs OK
>>> >                  i3s: client hangs             i3s: client hangs
>>> >
>>> > CZMQ git latest  split: client hangs           split: client hangs           split: client hangs
>>> >                  lt: OK                        lt: OK                        lt: OK
>>> >                  i2: OK                        i2: OK                        i2: OK
>>> >                  i3: OK                        i3: OK                        i3: OK
>>> >                  i3x: OK                       i3x: OK                       i3x: OK
>>> >                  i3s: client hangs             i3s: client hangs             i3s: client hangs
>>> >
>>> > Notes:
>>> >
>>> > When I say "client hangs" it means that I wait in zstr_recv() forever.
>>> > I can run the server multiple times. The server never says
>>> > "I: "-anything, so even if the client gets the message, I'd
>>> > expect no encryption. I have found, that if I repeat the tests, I can
>>> > occasionally have the client get the hello; but this is not very
>>> > frequent, and when it does, I get no encryption. Probability <20%
>>> > maybe < 10%. I played with the sleep time between firing
>>> > off the client, and starting the server, and it *seems* I get better
>>> > probability with sleep = 5 sec, but... I have not done a large
>>> > statistics exercise, it could just be anecdotal, serendipitous type
>>> > stuff. I did notice that binding to 127.0.0.1 instead of *
>>> > increases the probability of the split/ih3 cases having the client
>>> > receive the unencrypted message.
>>> > But never at 100%. And never encrypted.
>>> >
>>> > zauth seems to have noticeable problems in czmq-2.0.3. How irontest
>>> > works there, but irontest2 does not, would be extremely interesting
>>> > to resolve. (This is on CentOS 6.5; Ubuntu didn't
>>> > seem to notice. This might be a compiler/linker problem, as
>>> > CentOS is NOT cutting edge versions!)
>>> >
>>> > In 2.1.0 (and higher), split's client hangs on the recv call, but
>>> > ironhouse2's client does not. The only difference
>>> > I can spot is that in ironhouse2, both client and server share the
>>> > same context and process...
>>> >
>>> > So, I created ironhouse3, which is like ironhouse2, but server and
>>> > client each use a different zctx. Since
>>> > this also works, then I have to conclude that split and i3 have only 2
>>> > differences: order that the
>>> > connect/binds are done, and different processes.
>>> >
>>> > So, I created ironhouse3x, which swaps the order. Still works. What's
>>> > left? Different processes.
>>> > Is that crazy?
>>> >
>>> > But, perhaps, I'm missing something. Maybe it's a bug in my stuff,
>>> > right under my own nose, but I can't
>>> > spot it.
>>> >
>>> > So, at the moment, I think I've ruled out:
>>> > A. My "REAL" application uses ROUTER sockets and has these problems,
>>> >    but these test sets use PUSH/PULL, and demonstrate the problem,
>>> >    so it doesn't look socket-type related.
>>> > B. not connect/bind order dependent.
>>> > C. not common/different zctx dependent.
>>> > D. not compiled-in vs file cert dependent.
>>> >
>>> > Any help that anyone can give, will be highly appreciated!
>>> >
>>> > murf
>>> >
>>> > --
>>> >
>>> > Steve Murphy
>>> > ParseTree Corporation
>>> > 57 Lane 17
>>> > Cody, WY 82414
>>> > ✉ murf at parsetree dot com
>>> > ☎ 307-899-5535
>>> >
>>> > _______________________________________________
>>> > zeromq-dev mailing list
>>> > [email protected]
>>> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
