Hi Devs,

I am running tests against libzmq master and hit a problem once the test grows to use many STREAM sockets.

The test is a client/server setup in which the server consists of a broker and workers. The client has its own context (1 thread, 100 sockets), while the workers (6 threads, 20 sockets each) and the broker (1 thread) share one context. I use 3 I/O threads for the client and 4 for the server, although this varies from test to test (first column below). Sometimes increasing the number of server I/O threads makes the test pass, but usually it does not.

hck_step_dur (us) is a delay I introduced in CURVE between a receive and a send during the handshake. CURVE is proxied between the client sockets and the workers. Each client owns a triplet of sockets (client, tunnel A frontend, tunnel A backend). The broker/proxy performs a sticky identity pairing. Nevertheless, I don't see any reason why the problem would come from this specific application.

_____________ client A ____________         _________________ server _________________
                                            _______ proxy _______         _ worker B _
               _____ tunnel A _____         ______ midpoint _____
Client --tcp-- frontend   / backend --tcp-- frontend / backend    --tcp-- Worker
DEALER         ZMQ_STREAM   DEALER          ROUTER     ZMQ_STREAM         DEALER
CURVE                                                                     CURVE
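
To make the "sticky identity pairing" concrete, here is a minimal, self-contained sketch of such a table. It is not the broker code from this test: the class name StickyPairing, its methods, and the round-robin choice of worker for a new client are all assumptions made for illustration, and it assumes the worker list is non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sticky pairing table: each client identity is bound to one
// worker identity the first time it is seen and keeps that binding afterwards.
class StickyPairing {
public:
    // Assumes `workers` is non-empty.
    explicit StickyPairing(std::vector<std::string> workers)
        : workers_(std::move(workers)), next_(0) {}

    // Returns the worker paired with client_id. A client that has not been
    // seen before is assigned the next worker in round-robin order
    // (the assignment policy is an assumption of this sketch).
    const std::string &worker_for(const std::string &client_id) {
        auto it = pairs_.find(client_id);
        if (it == pairs_.end()) {
            it = pairs_.emplace(client_id, workers_[next_]).first;
            next_ = (next_ + 1) % workers_.size();
        }
        return it->second;
    }

    // Drops the pairing, for example when the client disconnects.
    void release(const std::string &client_id) { pairs_.erase(client_id); }

private:
    std::vector<std::string> workers_;          // known worker identities
    std::size_t next_;                          // round-robin cursor
    std::map<std::string, std::string> pairs_;  // client identity -> worker identity
};
```

With two workers "w0" and "w1", the first client seen would be paired with "w0", the second with "w1", and every later message from the first client would still map to "w0" until release() is called for it.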


I/O thr   QT_W   QT_S/W   QT_CLI   hck_step_dur (us)   DURATION (s)
3/3       6      80       480      1900                Error
3/3       6      80       480      10000               52.6 or Error
3/3       6      80       480      20000               56.4 or Error
3/4       6      80       480      10000               51.9  52.4  52.5
3/4       6      80       480      1900                50.1  50.3
3/4       6      80       480      1900                52.9
3/5       6      80       480      1900                52.8
3/4       6      80       480      1900                80.4
3/3-150   6      80       480      190                 Error
3/4       6      80       480      900                 50.7
3/4       6      80       480      10000               66.3
3/5       6      80       480      10000               66.3
3/4       6      100      600      100000              226
3/4-10    6      100      600      1900                Error
3/4-8     6      100      600      10000               Error
3/9-10    6      100      600      10000               84.6
3/9       6      200      1200     10000               Segmentation fault
3/9-20    6      200      1000     10000               Error

*The trend: more activity with fewer I/O threads makes the error happen.*

The error is always raised at the same location (marked below). I added traces to find it.
Inside *src/stream.cpp*, in int zmq::stream_t::xsend (msg_t *msg_):

    if (it != outpipes.end ()) {
        current_out = it->second.pipe;
        if (!current_out->check_write ()) {
            it->second.active = false;
            current_out = NULL;
            errno = EAGAIN;
            // trace added for this test
            puts("E: EAGAIN zmq::stream_t::xsend (msg_t *msg_) / if (!current_out->check_write ()) l108\n");
            return -1;
        }
    }
    else {
        errno = EHOSTUNREACH;   // <-- the error is always raised here
        // trace added for this test
        puts("E: EHOSTUNREACH zmq::stream_t::xsend (msg_t *msg_) / else if (it != outpipes.end ()) l116\n");
        return -1;
    }
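
The two failure paths in that excerpt can be reproduced in isolation with a small, self-contained model. This is a sketch, not libzmq code: FakePipe, OutPipe, and route_first_frame are hypothetical stand-ins, and a std::map keyed by std::string is assumed in place of the real outpipes container.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>

// Hypothetical stand-in for a pipe: it only records whether a write
// would currently be accepted.
struct FakePipe {
    bool writable;
    bool check_write() const { return writable; }
};

// Hypothetical stand-in for the per-peer entry stored in outpipes.
struct OutPipe {
    FakePipe *pipe;
    bool active;
};

// Simplified model of the lookup in the excerpt above.
// Returns 0 and sets current_out on success; returns -1 with errno set
// to EHOSTUNREACH (identity not in the table) or EAGAIN (pipe not writable).
int route_first_frame(std::map<std::string, OutPipe> &outpipes,
                      const std::string &identity,
                      FakePipe *&current_out)
{
    auto it = outpipes.find(identity);
    if (it == outpipes.end()) {
        // No pipe is registered under this identity.
        errno = EHOSTUNREACH;
        return -1;
    }
    current_out = it->second.pipe;
    if (!current_out->check_write()) {
        // The pipe exists but cannot take more data right now.
        it->second.active = false;
        current_out = nullptr;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

In this model, EHOSTUNREACH only means that the identity given in the first frame has no entry in the table at the time of the call, whereas EAGAIN means the entry exists but its pipe is currently full. A caller could therefore treat EAGAIN as retryable and EHOSTUNREACH as "peer not known (or no longer known) to this socket".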

If I comment out the else branch, the test can pass. However, I assume that doing so introduces undefined side effects. Also, when I add sockets, I get many EAGAIN errors from the first branch of the if.

I would be interested to know whether any of you have encountered this problem, or have managed to build large applications with STREAM sockets.

Cheers,

Laurent.

_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
