Hello!

(This message looks best if viewed in a WIDE reader window!)

It looks to me like there's a subtle bug in zmq, but... I've
thought that many times before, only to find it was ME
the whole time.

I've reproduced the lack of encryption between two of my apps in
two simpler programs, serveriron and clientiron. The way
they use encryption is modeled after ironhouse.c, which the zmq
folks use as an example.

I can reproduce the problems in these simple test cases, and I
think I have enough evidence to report a bug, but... I'm
very "human". Maybe you folks can spot what I'm doing wrong.


I'm getting strangeness when I try to work with encrypted
communications (via curve).

Find attached my simple test cases. To run them, untar the attached
tar file "security-blog.tar.gz", cd into the newly created
security-blog dir, and type "make test". It will use the
currently installed libzmq and libczmq to build the apps.
serveriron and clientiron are just the two halves you get
when you rip the ironhouse example apart to run in separate
executables, and put the certs in the executables instead of
generating new certs every time. Oh, and they also use the certstore
for those two public cert files. The Makefile will set things up.

I also run a 'littletest' that just calls the zauth self-test. (Had probs
with CentOS 6.5)

I also run "ironhouse2" which is the same as the ironhouse
example, except it uses fixed certs for both client and server,
and grabs the public server cert and feeds that to the client.
It basically mirrors all the actions of the serveriron/clientiron
processes, and runs them together in the same process, like ironhouse
does.

I've tried it with various combinations of czmq and zeromq/libzmq.
The file "testscript" will build and install libzmq in versions
4.0.3, 4.0.4, and the current git version; and also czmq in versions
2.0.3, 2.1.0, and the current git version. It will run all
9 combinations of these.
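
For anyone who wants to picture that matrix without opening the tarball, here is a minimal Python sketch of how the 9 combinations can be enumerated. It is NOT the actual testscript: the version labels come from the paragraph above, while the function name `version_matrix` and the "git" label are made up for illustration.

```python
from itertools import product

# Version labels taken from the message; "git" stands for the current git checkout.
LIBZMQ_VERSIONS = ["4.0.3", "4.0.4", "git"]
CZMQ_VERSIONS = ["2.0.3", "2.1.0", "git"]


def version_matrix(libzmq_versions, czmq_versions):
    """Return every (czmq, libzmq) pairing, grouped by czmq version."""
    return [(czmq, libzmq) for czmq, libzmq in product(czmq_versions, libzmq_versions)]


if __name__ == "__main__":
    for czmq, libzmq in version_matrix(LIBZMQ_VERSIONS, CZMQ_VERSIONS):
        # A real script would build and install both libraries here,
        # then run "make test" in the security-blog dir.
        print(f"czmq {czmq} on libzmq {libzmq}")
```

Grouping by czmq version first matches the row order of the results table.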

I have run testscript on a CentOS 6.5 system, and an Ubuntu 13.10
system. There are some differences in behavior, but generally, they
give the same results.

Here's what I see:

In the results for each czmq/zmq combo, "split" means ironhouse split
into separate client/server processes, with both certstore and
compiled-in certs.

"lt" is "littletest" and represents running just the zauth self-test
routine.

"i2" is "ironhouse2", which is basically "split" remerged into a
single process.

"i3" is "ironhouse3", which is basically "split" remerged into a
single process, but the client and server have their own contexts/zauth.

"i3x" is "ironhouse3x", which is ironhouse3, except we swap the order of
socket instantiation, so the client socket is instantiated before the
server, which mimics real-life (split) behavior. After all, we do want to
get the message from the server, so we have to start the client first, so
that when the server sends the hello, we are up and ready to receive it on
the client side. Waiting 5 sec between starting the client and the server
seems to be optimal. Longer yields no better result; shorter yields worse
results (or so it seems).

"i3s" is where I copy ironhouse3.c into cli-ih3.c and ser-ih3.c, and, in
cli-ih3.c, I remove all the server code, and in ser-ih3.c, I remove all the
client code. This is like split, but arrived at step by step (just in case
I missed something in split).

                                              LIBZMQ
                 4.0.3                      4.0.4                      libzmq-git-latest

CZMQ 2.0.3       split: runs OK, but no     split: runs OK, but no     czmq 2.0.3 doesn't compile
                        encryption!                encryption!         on this libzmq.
                 lt:    assert. fail        lt:    assert. fail
                        (CentOS)                   (CentOS)
                 i2:    runs, but no        i2:    runs, but no
                        encryption!                encryption!
                 i3:    runs, but no        i3:    runs, but no
                        encryption!                encryption!
                 i3x:   runs w/o encryption i3x:   runs w/o encryption
                 i3s:   runs w/o encryption i3s:   runs w/o encryption

CZMQ 2.1.0       split: client hangs        split: client hangs        czmq 2.1.0 doesn't compile
                 lt:    runs OK             lt:    runs OK             on this libzmq.
                 i2:    runs OK             i2:    runs OK
                 i3:    runs OK             i3:    runs OK
                 i3x:   runs OK             i3x:   runs OK
                 i3s:   client hangs        i3s:   client hangs

CZMQ git latest  split: client hangs        split: client hangs        split: client hangs
                 lt:    OK                  lt:    OK                  lt:    OK
                 i2:    OK                  i2:    OK                  i2:    OK
                 i3:    OK                  i3:    OK                  i3:    OK
                 i3x:   OK                  i3x:   OK                  i3x:   OK
                 i3s:   client hangs        i3s:   client hangs        i3s:   client hangs



Notes:


When I say "client hangs", it means that I wait in zstr_recv() forever. I
can run the server multiple times. The server never says "I: "-anything, so
even if the client gets the message, I'd expect no encryption. I have found
that if I repeat the tests, the client occasionally gets the hello; but this
is not very frequent, and when it does, I get no encryption. Probability
<20%, maybe <10%. I played with the sleep time between firing off the client
and starting the server, and it *seems* I get better probability with
sleep = 5 sec, but... I have not done a large statistics exercise, so it
could just be anecdotal, serendipitous type stuff. I did notice that binding
to 127.0.0.1 instead of * increases the probability of the split/ih3 cases
having the client receive the unencrypted message. But never to 100%. And
never encrypted.
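
To put error bars on guesses like "<20%, maybe <10%", one option is to count runs and compute a confidence interval for the success rate. The Python sketch below uses the Wilson score interval and assumes each run is an independent trial. The function name `wilson_interval` and the example counts (3 receptions in 20 runs) are hypothetical, not measurements from these tests.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a success rate; z=1.96 is roughly 95% confidence."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the edges.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts: the client received the hello in 3 of 20 runs.
    low, high = wilson_interval(3, 20)
    print(f"3/20 runs: plausible rate between {low:.0%} and {high:.0%}")
```

With only a couple dozen runs the interval stays wide, so telling a 10% rate from a 20% rate, or a 5 sec sleep from a shorter one, would need many more runs per setting.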

zauth seems to have noticeable problems in czmq-2.0.3. How irontest works
there, but irontest2 does not, would be extremely interesting to resolve.
(This is on CentOS 6.5; Ubuntu didn't seem to notice. This might be a
compiler/linker problem, as CentOS does NOT ship cutting-edge versions!)

In 2.1.0 (and higher), split's client hangs on the recv call, but
ironhouse2's client does not. The only difference I can spot is that in
ironhouse2, both client and server share the same context and process...

So, I created ironhouse3, which is like ironhouse2, but server and client
each use a different zctx. Since this also works, I have to conclude that
split and i3 have only 2 differences: the order in which the connects/binds
are done, and different processes.

So, I created ironhouse3x, which swaps the order. Still works. What's left?
Different processes. Is that crazy?

But perhaps I'm missing something. Maybe it's a bug in my stuff, right
under my own nose, but I can't spot it.

So, at the moment, I think I've ruled out:
  A. Socket type: my "REAL" application uses ROUTER sockets and has these
     problems, but these test cases use PUSH/PULL and demonstrate the same
     problem, so it doesn't look socket-type related.
  B. Connect/bind order.
  C. Shared vs. separate zctx.
  D. Compiled-in vs. file-based certs.
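
The elimination argument can also be written down as a small table and checked mechanically. The Python sketch below is only an illustration: the factor names are made up, the factor values are read from the variant descriptions earlier in this message (split and i3s start the client first, in separate processes), and the outcomes are the CZMQ 2.1.0 row of the results table.

```python
# Factor values per variant, as read from the descriptions in this message.
VARIANTS = {
    "i2":    {"separate_processes": False, "separate_contexts": False, "client_first": False},
    "i3":    {"separate_processes": False, "separate_contexts": True,  "client_first": False},
    "i3x":   {"separate_processes": False, "separate_contexts": True,  "client_first": True},
    "split": {"separate_processes": True,  "separate_contexts": True,  "client_first": True},
    "i3s":   {"separate_processes": True,  "separate_contexts": True,  "client_first": True},
}

# Outcome with czmq 2.1.0: does the client hang in zstr_recv()?
CLIENT_HANGS = {"i2": False, "i3": False, "i3x": False, "split": True, "i3s": True}


def factors_tracking_outcome(variants, outcome):
    """Return the factors whose value matches the outcome in every variant."""
    names = next(iter(variants.values())).keys()
    return [
        name for name in names
        if all(props[name] == outcome[variant] for variant, props in variants.items())
    ]


if __name__ == "__main__":
    print(factors_tracking_outcome(VARIANTS, CLIENT_HANGS))
```

This only shows which factor lines up with the hangs across these five variants. It does not prove the cause, and a factor missing from the table (something the split programs do differently that wasn't noticed) would not show up at all.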



Any help that anyone can give will be highly appreciated!


murf

-- 

Steve Murphy
ParseTree Corporation
57 Lane 17
Cody, WY 82414
✉  murf at parsetree dot com
☎ 307-899-5535

Attachment: security-blog.tar.gz
Description: GNU Zip compressed data
