I think there will be a way to fix this without changing the API, e.g. flagging zombies before calling the handlers.
Can you make a minimal test case that reproduces the problem? -Pieter On Fri, Dec 28, 2012 at 11:31 AM, Andy Ballingall TF <[email protected]> wrote: > Hi Pieter, > > > On 27 December 2012 17:25, Pieter Hintjens <[email protected]> wrote: >> Do you want to try fixing that bug, or shall I take a look? > > It might be best if you decide how to proceed on this occasion, the reason > being that the API and zombie representation would need modification to > allow the unique identification of an existing timer for removal... > > Were I to have a go, I'd probably: > > change the zombies list to store pointers to the allocated timers (rather > than the timers' 'arg'). > change zloop_timer_end() so that it adds to the zombies list any timer whose > arg is passed in the function - i.e. it would potentially add a number of > items to the zombies list rather than just one > at line 375, add a check for a removed timer being in the zombies list (e.g. > if zombies length > 0, then look for this timer in the zombies list and > remove if found) > Modify the loop at line 420 so that it compares the timer pointers with the > zombie pointers (which now are pointers to timers too) > (This is where you might have other ideas!) - Possibly add a new timer > creation function - e.g. zloop_timer_new() which returns a pointer to the > new timer or NULL on failure, and a new timer deletion function - e.g. > zloop_timer_destroy() which expects the pointer returned from > zloop_timer_new(). I'd then move the implementation of zloop_timer() into > zloop_timer_new() and make zloop_timer() call zloop_timer_new(), and in the > implementation of zloop_destroy(), it'd make an explicit check for the timer > existing and not already being in the zombies list. > > ...but I'm aware that it's probably not my place to start messing around > with the API. Anyway - how does that sound? > > Andy > > > >> >> -Pieter >> >> On Thu, Dec 27, 2012 at 5:41 PM, Andy Ballingall TF >> <[email protected]> wrote: >>> On 27 December 2012 16:22, Pieter Hintjens <[email protected]> wrote: >>>> Andy, >>>> >>>> You can trace this quite simply; print the tickless timer value before >>>> each zmq_poll in zloop. It should change when you add your new timer. >>> >>> Indeed, but even if the value is initially correct, the problem is >>> that the zombie deletion loop at the bottom of the reactor (line 420, >>> zloop.c) deletes any timers whose 'arg' matches the entry in the >>> zombies list, and the newly added timer's 'arg' matches the 'arg' of >>> the one I wanted to delete, so both will be removed by the loop, >>> rather than just the old one. >>> >>> Once the reactor returns to the top of the loop, there are no timers >>> left, which explains why it reverts to showing a poll time equivalent >>> to 1 hour. >>> >>> It's a by-product of the code deferring deletion of a timer when >>> zloop_timer_end() is called. If, after callling zloop_timer_end(), you >>> then immediately add another timer with the same 'arg' value, then the >>> reactor's zombie cleanup loop will blow away both. >>> >>> By the time the reactor gets back to recalculating the next delay (by >>> calling s_tickless_timer() on line 357 ), there are no timers in the >>> self->timers list, so a timeout of 1 hour is used. >>> >>> Andy >>> >>>> >>>> -Pieter >>>> >>>> On Thu, Dec 27, 2012 at 5:01 PM, Andy Ballingall TF >>>> <[email protected]> wrote: >>>>> Hi Pieter, >>>>> >>>>> On 27 December 2012 13:55, Pieter Hintjens <[email protected]> wrote: >>>>>> On Thu, Dec 27, 2012 at 2:53 PM, Andy Ballingall TF >>>>>> <[email protected]> wrote: >>>>>> >>>>>>> Note how my timer registration succeeds, but the polling seems to >>>>>>> ignore it? >>>>>> >>>>>> Then it's because the tickless timer is being miscalculated; it should >>>>>> be re-calculated each time you add or remove a timer handler. If not, >>>>>> there's a bug. >>>>> >>>>> I agree with Apostolis below that the zloop code seems correct, with >>>>> respect to the calculation of the next timer expiry. Replicating my >>>>> scenario with test code would be quite a bit of work, so I took >>>>> another look at the zloop implementation again, and I think I can see >>>>> what's happening. >>>>> >>>>> Firstly, my timer mutations were occurring within a message handler, >>>>> and the mutation consisted of: >>>>> a) Remove existing timer >>>>> b) Create a new one. >>>>> >>>>> The old removed timer and the new timer both have the same 'arg' (in >>>>> my case this is an instance of my communications class). >>>>> >>>>> The reactor loop (in function zloop_start() ) first processes any poll >>>>> handlers, then at the end, checks if the zombies list has any timers >>>>> to clear. >>>>> >>>>> By the time the zloop_start() function gets to line 420 (where it >>>>> checks for zombies), then the state is: >>>>> >>>>> Zombies: 1 (arg='this') >>>>> >>>>> Timers: 2 >>>>> 1 = old timer (arg='this', expiry=<old time>) >>>>> 2 = new timer (arg = 'this', expiry = <new time>) >>>>> >>>>> The loop at line 423 will blow away all the timers - including the one >>>>> I just added - as it matches the zombie arg. >>>>> >>>>> That explains why my newly added timer wasn't firing, and why the poll >>>>> timeout was set to 1 hour: It's because my newly added timer had been >>>>> removed almost immediately! >>>>> >>>>> If the timer deletion used a reference to the timer object rather than >>>>> 'arg' to reference it, then it would ensure that I could call >>>>> zloop_timer_end() and subsequently zloop_timer() from within a >>>>> handler, without the zloop_start() function removing that newly added >>>>> timer... >>>>> >>>>> In the end, this isn't a problem for me because I've changed my >>>>> implementation to use a heartbeat timer which never gets removed and >>>>> replaced, but I can see scenarios in the future where it would be good >>>>> to reset an existing timer, or delete and add a number of timers >>>>> inside a handler... >>>>> >>>>> Anyway, thanks for your help. >>>>> >>>>> Andy >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> -Pieter >>>>>> _______________________________________________ >>>>>> zeromq-dev mailing list >>>>>> [email protected] >>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev >>>>> >>>>> >>>>> >>>>> -- >>>>> Andy Ballingall >>>>> Senior Software Engineer >>>>> >>>>> The Foundry >>>>> 6th Floor, The Communications Building, >>>>> 48, Leicester Square, >>>>> London, WC2H 7LT, UK >>>>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 >>>>> Web: http://www.thefoundry.co.uk/ >>>>> >>>>> The Foundry Visionmongers Ltd. >>>>> Registered in England and Wales No: 4642027 >>>>> _______________________________________________ >>>>> zeromq-dev mailing list >>>>> [email protected] >>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev >>>> _______________________________________________ >>>> zeromq-dev mailing list >>>> [email protected] >>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev >>> >>> >>> >>> -- >>> Andy Ballingall >>> Senior Software Engineer >>> >>> The Foundry >>> 6th Floor, The Communications Building, >>> 48, Leicester Square, >>> London, WC2H 7LT, UK >>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 >>> Web: http://www.thefoundry.co.uk/ >>> >>> The Foundry Visionmongers Ltd. >>> Registered in England and Wales No: 4642027 >>> _______________________________________________ >>> zeromq-dev mailing list >>> [email protected] >>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev >> _______________________________________________ >> zeromq-dev mailing list >> [email protected] >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > > > > -- > Andy Ballingall > Senior Software Engineer > > The Foundry > 6th Floor, The Communications Building, > 48, Leicester Square, > London, WC2H 7LT, UK > Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 > Web: http://www.thefoundry.co.uk/ > > The Foundry Visionmongers Ltd. > Registered in England and Wales No: 4642027 > > _______________________________________________ > zeromq-dev mailing list > [email protected] > http://lists.zeromq.org/mailman/listinfo/zeromq-dev > _______________________________________________ zeromq-dev mailing list [email protected] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
