Hi Matt,

Another thing, and sorry if I'm wrong: zmq_term in the child always returns EINTR. This is because most of the socket operations return EINTR when pid != getpid(). With your patch, the signaler will create a new eventfd (correct me if I'm wrong) and then return. It is up to the reaper thread to close the sockets, right? But since most operations just return EINTR, I wonder whether the sockets are really closed after the fork.
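To make the concern concrete, here is a minimal sketch of the kind of pid guard I mean. It is not libzmq's code: demo_ctx_t, demo_op and demo_fork_check are made-up names, and I am assuming the guard simply compares a pid stored at creation time against getpid(). Under that assumption, every guarded call made in the child fails with EINTR, including any call that was supposed to do cleanup.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a context object; not libzmq's actual type. */
typedef struct {
    pid_t owner_pid; /* pid of the process that created the context */
} demo_ctx_t;

static void demo_ctx_init(demo_ctx_t *ctx)
{
    ctx->owner_pid = getpid();
}

/* Hypothetical operation protected by the assumed pid guard:
 * it refuses to run in any process other than the creator. */
static int demo_op(const demo_ctx_t *ctx)
{
    if (ctx->owner_pid != getpid()) {
        errno = EINTR;
        return -1;
    }
    return 0;
}

/* Forks once and runs demo_op on the inherited context in both processes.
 * Returns 1 if the child saw EINTR and the parent call succeeded,
 * 0 if either expectation failed, and -1 if fork or waitpid failed. */
int demo_fork_check(void)
{
    demo_ctx_t ctx;
    demo_ctx_init(&ctx);

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: the copied context still carries the parent's pid. */
        int rc = demo_op(&ctx);
        _exit((rc == -1 && errno == EINTR) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;

    int child_saw_eintr = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_ok = (demo_op(&ctx) == 0);
    return child_saw_eintr && parent_ok;
}
```

Calling demo_fork_check() from a small main() is enough to try it. If the real guard behaves like this, a cleanup path in the child that goes through guarded operations would bail out early, which is why I am unsure the inherited sockets actually get closed.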
Best,
Selim Ciraci

On Mon, Sep 16, 2013 at 11:40 AM, Selim Ciraci <[email protected]> wrote:

> Hi Matt,
>
> It is not an assertion failure. The problem occurs in connections between
> router-dealer sockets. The send function in router.cpp returns no route to
> host because it cannot find the host_id in the outpipes_t. Careful
> debugging shows that the pipe from the dealer to the router has not
> actually been established. I put a printf in the xidentify_peer method in
> router.cpp; as far as I know, the new client ids are inserted into the
> outpipes_t in this method. The aim is to compare the child process ids
> with the ids the router socket received. The comparison showed that some
> child ids went missing (the router socket never received them). I must
> add that the ids went missing after a parent process terminates, though I
> need further testing to prove this.
>
> Any ideas what might be going wrong here? I'm going to try to implement a
> simple test case.
>
> Thanks,
> Selim
>
>
> On Mon, Sep 16, 2013 at 6:13 AM, Matt Connolly <[email protected]> wrote:
>
>> Hi Selim,
>>
>> I don’t have any ideas yet about why the parent would stop sending
>> messages after forking a second child.
>>
>> Is it possible to reproduce this in a simple test case?
>>
>> And when the no route to host error occurs, is that an assertion? If so,
>> can you provide a stack trace?
>>
>> -Matt
>>
>> On 14 Sep 2013, at 6:43 am, Selim Ciraci <[email protected]> wrote:
>>
>> > Hi Matt,
>> >
>> > Thanks for your reply. I actually found out about your patch after
>> > sending the email. I have updated zmq to head from github and tried it
>> > with my program. The parent sockets seem to have closed. But the
>> > problem is that every now and then I get "no route to host" errors in
>> > zmq_send. This usually happens when:
>> > parent forks a child; child calls zmq_term(parent_context), does work
>> > and then terminates (closes its context).
>> > parent, in parallel, uses parent_context, does work, learns the child
>> > has terminated, and forks a new child, child2.
>> > child2 calls zmq_term(parent_context), does work and then terminates
>> > (closes its context).
>> > after child2 terminates, parent cannot receive messages. Even though
>> > the parent is active, zmq_send in the server fails with no route to
>> > host.
>> >
>> > I have no idea why this fails. Any ideas what might be causing this?
>> >
>> > Best,
>> > Selim Ciraci
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
