I think the code could be modified such that if the signaler port is 5905,
then the old event mechanism is used, else if it is not 0, then a new mutex
mechanism would be used.

Signaler port equal to 0 would be the new default. People who still prefer
a fixed port would choose a port other than 5905.
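
For anyone unfamiliar with what port 0 means here: binding to port 0 asks
the OS to pick a free ephemeral port, which can then be read back with
getsockname. A minimal sketch, assuming POSIX sockets for brevity (the
thread is about Windows, where the Winsock calls are analogous);
bind_ephemeral_listener is a hypothetical helper and not libzmq code:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstring>

// Hypothetical helper, not part of libzmq.
// Binds a loopback TCP listener to port 0 so the OS chooses a free port.
// Returns the listening socket (or -1 on failure) and stores the chosen
// port in *port_out.
int bind_ephemeral_listener (int *port_out)
{
    assert (port_out != nullptr);

    int fd = socket (AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    sockaddr_in addr;
    std::memset (&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    addr.sin_port = htons (0); // 0 = let the OS pick an ephemeral port

    socklen_t len = sizeof addr;
    if (bind (fd, reinterpret_cast<sockaddr *> (&addr), sizeof addr) != 0
        || listen (fd, 1) != 0
        || getsockname (fd, reinterpret_cast<sockaddr *> (&addr), &len) != 0) {
        close (fd);
        return -1;
    }

    *port_out = ntohs (addr.sin_port); // the port the OS actually assigned
    return fd;
}
```

Since every call gets its own port, two callers never contend for the same
listening port, so no system-wide lock is needed around this step.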

It's still a compile-time choice though. If the default use of ephemeral
ports works, I don't think it's worth the effort to make it runtime
configurable.
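
Roughly, the selection I have in mind could be sketched like this. The
enum and function names are made up for illustration and are not existing
libzmq identifiers; only signaler_port (in config.hpp) and the value 5905
come from the discussion:

```cpp
// Illustrative only: these names are hypothetical, not libzmq identifiers.
enum class sync_mode
{
    none,         // ephemeral port, no cross-process lock needed
    legacy_event, // old named-event mechanism, kept for port 5905
    mutex         // proposed mutex mechanism for any other fixed port
};

constexpr int legacy_signaler_port = 5905;

// Maps the configured signaler port to the synchronization mechanism.
constexpr sync_mode choose_sync_mode (int signaler_port)
{
    return signaler_port == 0 ? sync_mode::none
         : signaler_port == legacy_signaler_port ? sync_mode::legacy_event
                                                 : sync_mode::mutex;
}

// Compile-time choice: edit this constant and rebuild.
constexpr int signaler_port = 0;

static_assert (choose_sync_mode (signaler_port) == sync_mode::none,
               "port 0 should select ephemeral ports with no global lock");
```

Keeping 5905 on the old event means builds using that port keep
synchronizing the way existing builds do, while any other fixed port gets
the mutex.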
On Dec 11, 2013 12:43 AM, "Koby Boyango" <[email protected]> wrote:

> Sorry for my late reply, been sick for a few days. I've done some tests
> using the make_fdpair from the master, and it seems like using the
> ephemeral port support and avoiding the locking solved it. Thanks!
> But I do believe that if supporting a fixed signaler port is still
> desired, we should better protect against the scenarios I've described in
> my first mail. What do you think?
>
> Koby
>
>
> On Tue, Dec 10, 2013 at 12:37 AM, KIU Shueng Chuan <[email protected]> wrote:
>
>> I believe no permission is needed to do a pull request. :)
>>
>> Upon rereading Koby's mail more closely, his problem can be reproduced by
>> having one background program use version 3.2.2. The leaked event handle
>> ensures that the global event stays alive and doesn't get recreated each
>> time by Windows.
>>  On Dec 10, 2013 2:44 AM, "Felipe Farinon" <
>> [email protected]> wrote:
>>
>>>  As Koby hasn't answered, and I can no longer reproduce the problem,
>>> could I make the modification anyway? (It will be tested indirectly,
>>> since I am going to run the modification in the same environment where
>>> the problem was happening.)
>>>
>>> On 01/12/2013 21:27, KIU Shueng Chuan wrote:
>>>
>>> In master, you can switch to using ephemeral ports by modifying
>>> signaler_port to 0 in config.hpp. A new ephemeral port is used per
>>> make_fdpair call and no critical section is used.
>>>
>>> Could you try that and see if it solves your problems?
>>> On Dec 1, 2013 9:39 PM, "Koby Boyango" <[email protected]> wrote:
>>>
>>>>   Hi
>>>>  I'm fairly new to ZeroMQ, and have been working on integrating it
>>>> using czmq in several projects, Windows only.
>>>>  I've opened an issue on GitHub, #767, and at Pieter's request I'm
>>>> moving the discussion here. So here is what I've written there:
>>>> While trying to integrate ZeroMQ in different modules\processes
>>>> (Windows only), I've encountered a problem where in some situations a
>>>> ZeroMQ call blocks - forever. After debugging the issue, I've found out
>>>> that zmq_init wasn't returning, and after further debugging and digging
>>>> through the code I've found out that the problem was in
>>>> signaler_t::make_fdpair, where the WaitForSingleObject on the
>>>> "zmq-signaler-port-sync" didn't return.
>>>> Initially I wasn't sure in which situations it occurs. So I did some
>>>> further investigation and found out that in my case:
>>>>
>>>>    - For some reason, when I close a test program with Ctrl+C, the
>>>>    event stays un-signaled. Not sure why yet, will need further debugging.
>>>>    - I had a node.js script, which uses ZeroMQ, running in the
>>>>    background. Because it uses version 3.2.2 of libzmq, which leaks the
>>>>    event handle, the existing event wasn't deleted, and stayed in an
>>>>    un-signaled state.
>>>>    - Basically, from that point no one on the system can use ZeroMQ.
>>>>
>>>> I find make_fdpair to be very problematic on Windows:
>>>>
>>>>    - If one call exits without signaling the event, while someone else
>>>>    is holding a handle to the event, all further calls on the system will
>>>>    block. It can happen, for example, if an assertion fails, and the
>>>>    process crashes because of the exception raised.
>>>>    - It can also happen if an assertion has failed, an exception was
>>>>    raised, but caught by the caller using a __try & __except block (SEH).
>>>>    We can't simply rely on the exception to crash the process (for
>>>>    example, a program might wrap calls to its plugins with __try &
>>>>    __except, so a faulty plugin won't crash the whole program).
>>>>    - So it basically means that one faulty program can cause other,
>>>>    unrelated programs, to block.
>>>>
>>>> I suggest:
>>>>
>>>>    - No matter which synchronization mechanism is used, wrap the code
>>>>    with __try & __finally, and release the lock in the finally block. This
>>>>    will make sure that we'll release in case of an exception (In my case,
>>>>    though, I tried it and it didn't help. the thread might be terminated
>>>>    during the call).
>>>>    - If possible, don't use a global, system wide, lock. From my
>>>>    understanding, it is used in order to reuse the signaler port. So either
>>>>    use a random, available, port, or make the port "libzmq instance"
>>>>    specific (the first call binds on a random port, further calls will
>>>>    reuse the port) and protect it with a critical section. This will at
>>>>    least limit the problems to the same process.
>>>>    - If the system wide lock is really needed, I suggest using a mutex
>>>>    instead of the event. When using a mutex, if the owning thread dies
>>>>    without releasing it, Windows automatically releases it and the next
>>>>    call to WaitForSingleObject will return WAIT_ABANDONED and not block.
>>>>    We can then check if the port was left in a "listening" state, close
>>>>    it if necessary, and "re-listen" with a new socket.
>>>>
>>>> I'm using libzmq 4.0.1 with czmq 2.0.2. I saw that the make_fdpair was
>>>> improved in the master, but I believe it still doesn't entirely solve it.
>>>>  What do you say?
>>>>
>>>> Koby
>>>>
>>>> _______________________________________________
>>>> zeromq-dev mailing list
>>>> [email protected]
>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev