Hi, thanks

I don't want to sound like a bungler, but wouldn't it be a shortcut to implement asserts that release the Event before aborting?
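
One way to picture such an assert is a helper that runs a caller-supplied release step before terminating. This is only a sketch: `assert_with_release` and its parameters are hypothetical names, not libzmq functions, and the terminate step is injectable purely so the behaviour can be exercised without killing the process. In make_fdpair(..) the release step would presumably be a call to SetEvent (sync).

```cpp
#include <cstdio>
#include <cstdlib>
#include <functional>

// Hypothetical helper (not part of libzmq). If `ok` is false it reports the
// failure, runs `release` to leave the system-wide critical section, and only
// then terminates. `terminate` defaults to std::abort.
inline void assert_with_release (
  bool ok,
  const char *what,
  const std::function<void ()> &release,
  const std::function<void ()> &terminate = [] { std::abort (); })
{
    if (ok)
        return;
    std::fprintf (stderr, "Assertion failed: %s\n", what);
    release ();   // e.g. SetEvent (sync) in make_fdpair
    terminate (); // the process goes down only after the release
}
```

Inside the critical section a call might look like `assert_with_release (rc != SOCKET_ERROR, "connect", [sync] { SetEvent (sync); });`. This covers only failures that go through the assert; a crash or a killed process inside the section would still leave the Event unsignaled.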

On 13/02/2013 9:54, KIU Shueng Chuan wrote:
Hi Pau,

The system-wide critical section is currently implemented using a win32 Event which, as you observed, can deadlock in the following situation:
1) Process A takes the Event
2) Process B tries to take the Event and blocks
3) Process A aborts within the critical section (due to an assertion being raised)
4) Since Process B has opened the Event, the OS will not clean up the Event.
5) Process B and any subsequent process will now block forever for the Event.

As I mentioned in the previous mail, if the critical section were to be implemented using a Mutex instead, then in step 5, Process B would be able to enter the critical section with a return code of WAIT_ABANDONED from WaitForSingleObject. (Or at least that's what I read from MSDN)
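
A minimal sketch of the Mutex variant, under some assumptions: the helper names `enter_port_sync` and `leave_port_sync` and the mutex name are hypothetical, and the code is Windows-only (it compiles to nothing elsewhere). It shows WaitForSingleObject returning WAIT_ABANDONED to the next waiter, which then owns the mutex and can proceed.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <cstdio>

// Hypothetical helper. Returns the mutex handle once ownership has been
// acquired, or NULL on failure.
HANDLE enter_port_sync ()
{
    // bInitialOwner = FALSE: ownership is taken by the wait below.
    // The object name is an assumption made for this sketch.
    HANDLE sync =
      CreateMutex (NULL, FALSE, TEXT ("zmq-signaler-port-sync-mutex"));
    if (sync == NULL)
        return NULL;

    DWORD rc = WaitForSingleObject (sync, INFINITE);
    if (rc == WAIT_OBJECT_0)
        return sync; // normal acquisition
    if (rc == WAIT_ABANDONED) {
        // The previous owner terminated without releasing the mutex.
        // Ownership has been granted to this thread, so we may continue.
        std::fprintf (stderr, "port-sync mutex was abandoned\n");
        return sync;
    }
    CloseHandle (sync); // WAIT_FAILED
    return NULL;
}

// Hypothetical helper. Leaves the critical section and drops the handle.
void leave_port_sync (HANDLE sync)
{
    ReleaseMutex (sync);
    CloseHandle (sync);
}
#endif
```

The sketch uses a different object name on purpose: named kernel objects share one namespace, so CreateMutex fails with ERROR_INVALID_HANDLE if an Event with the same name already exists. A different name also means that old and new library versions would no longer exclude each other.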

Note: If Process A aborted due to some exhaustion of resources, then Process B would likely hit the same assertion too.

The question is how to convert the Event to a Mutex and yet not break compatibility with existing applications using older versions of the library.



On Wed, Feb 13, 2013 at 3:28 PM, Pau <[email protected]> wrote:

    Hi,

    I am back with the asserts happening inside a critical section in
    signaler.cpp.
    The problem is still that make_fdpair(..) in signaler.cpp creates a
    system-wide critical section and does a number of things that can
    trigger a wsa_assert() or win_assert() before releasing the section.

    I have seen that in trunk someone has added a CloseHandle(sync)
    at the end of the function. I do not know whether that is related
    to this, but as I understand it the problem is still there:
    wsa_assert() and win_assert() end up in
    RaiseException (0x40000015, EXCEPTION_NONCONTINUABLE, 1, extra_info),
    which I understand is a cul-de-sac that offers no way to clean up
    before leaving.

    I guess we need a special assert function to use inside this
    critical section, but I'd like a better-informed opinion (Kiu?).

    thanks,

    Pau Ceano

    On 21/01/2013 23:37, KIU Shueng Chuan wrote:

    Hi Pau, a fix for the assertion on connection to port 5905 is in
    trunk branch.

    I think the dangling critical section possibility could be fixed
    by changing the Event to a Mutex. When an assertion occurs the
    mutex would just be abandoned. However this change will cause
    backward compatibility issues with older versions.

    On Jan 22, 2013 2:04 AM, "Pieter Hintjens" <[email protected]> wrote:

        Hi Pau,

        So there are two different problems here: one is that we're
        hitting a socket limit on WXP, and the other is that the asserts
        are happening inside a critical section.

        I don't think we can fix the first one easily, but we can
        presumably assert in a smarter way. Do you want to try making a
        patch for this?

        -Pieter

        On Mon, Jan 21, 2013 at 6:23 PM, Pau <[email protected]> wrote:
        >
        > Hi,
        >
        >
        > I am using ZMQ on Windows (not yet in production) and have found
        > what I think is a big problem for Windows users.
        > We use Windows XP and Windows 7 with several versions of Visual
        > C++, and ZMQ version 3.2.0 (as far as I can see, the same problem
        > exists in 3.2.2).
        >
        > I do not fully understand ZMQ internals, but I have seen that
        > every time a socket is created the function make_fdpair(..) is
        > called, and in signaler.cpp (line 244) a system event
        > "zmq-signaler-port-sync" is created.
        > This event is used as a system-wide critical section, so all
        > applications that open this event will wait in
        > WaitForSingleObject (sync, INFINITE) until SetEvent (...) is called.
        > The problem is that the code between:
        >   HANDLE sync = CreateEvent (NULL, FALSE, TRUE, TEXT ("zmq-signaler-port-sync"));
        > and
        >   SetEvent (sync);
        > is full of wsa_assert(..) calls that will terminate the
        > application if something goes wrong.
        >
        > Terminating the application without leaving the system-wide
        > critical section is clearly a bad idea, because all applications
        > in the system will hang and you will have to stop them all to
        > start again.
        > I understand that no errors should happen, but simply bailing out
        > on an error is not a good idea in this case.
        >
        > I do not know all the possible causes of a fatal wsa_assert(..),
        > but there is at least one:
        >
        > I have seen that on XP the call at line 301,
        >   rc = connect (*w_, (sockaddr *) &addr, sizeof (addr));
        > can return an error when a socket tries to connect to port 5905,
        > and this has happened many times.
        > Windows uses port numbers starting near 1400, and XP has a limit
        > at 5000. On W7 this does not seem to be a problem because the
        > maximum is 65000.
        > The limit sounds big enough, but apart from the fact that ZMQ uses
        > a large number of connections (at least in my tests), I have
        > observed that Windows jumps port numbers by 7, so 5000 is
        > sometimes reached, with catastrophic consequences.
        >
        > best,
        >
        > Pau Ceano
        > _______________________________________________
        > zeromq-dev mailing list
        > [email protected]
        > http://lists.zeromq.org/mailman/listinfo/zeromq-dev