Hi Pieter, On 28 December 2012 10:45, Pieter Hintjens <[email protected]> wrote:
> I think there will be a way to fix this without changing the API, e.g. > flagging zombies before calling the handlers. > Can you make a minimal test case that reproduces the problem? > > I can do, though in my case, my timer deletion and creation calls are being made within a handler, so will that still be a problem? (My timers schedule work to occur in the future as a result of receipt of a particular message). BTW, another workaround occurred to me - I could ensure that each timer has a unique 'arg' by having a wrapper object created for each timer, that wrapper pointing to my class instance. I wish I'd thought of it earlier... Andy > -Pieter > > On Fri, Dec 28, 2012 at 11:31 AM, Andy Ballingall TF > <[email protected]> wrote: > > Hi Pieter, > > > > > > On 27 December 2012 17:25, Pieter Hintjens <[email protected]> wrote: > >> Do you want to try fixing that bug, or shall I take a look? > > > > It might be best if you decide how to proceed on this occasion, the > reason > > being that the API and zombie representation would need modification to > > allow the unique identification of an existing timer for removal... > > > > Were I to have a go, I'd probably: > > > > change the zombies list to store pointers to the allocated timers (rather > > than the timers' 'arg'). > > change zloop_timer_end() so that it adds to the zombies list any timer > whose > > arg is passed in the function - i.e. it would potentially add a number of > > items to the zombies list rather than just one > > at line 375, add a check for a removed timer being in the zombies list > (e.g. > > if zombies length > 0, then look for this timer in the zombies list and > > remove if found) > > Modify the loop at line 420 so that it compares the timer pointers with > the > > zombie pointers (which now are pointers to timers too) > > (This is where you might have other ideas!) - Possibly add a new timer > > creation function - e.g. zloop_timer_new() which returns a pointer to the > > new timer or NULL on failure, and a new timer deletion function - e.g. > > zloop_timer_destroy() which expects the pointer returned from > > zloop_timer_new(). I'd then move the implementation of zloop_timer() into > > zloop_timer_new() and make zloop_timer() call zloop_timer_new(), and in > the > > implementation of zloop_destroy(), it'd make an explicit check for the > timer > > existing and not already being in the zombies list. > > > > ...but I'm aware that it's probably not my place to start messing around > > with the API. Anyway - how does that sound? > > > > Andy > > > > > > > >> > >> -Pieter > >> > >> On Thu, Dec 27, 2012 at 5:41 PM, Andy Ballingall TF > >> <[email protected]> wrote: > >>> On 27 December 2012 16:22, Pieter Hintjens <[email protected]> wrote: > >>>> Andy, > >>>> > >>>> You can trace this quite simply; print the tickless timer value before > >>>> each zmq_poll in zloop. It should change when you add your new timer. > >>> > >>> Indeed, but even if the value is initially correct, the problem is > >>> that the zombie deletion loop at the bottom of the reactor (line 420, > >>> zloop.c) deletes any timers whose 'arg' matches the entry in the > >>> zombies list, and the newly added timer's 'arg' matches the 'arg' of > >>> the one I wanted to delete, so both will be removed by the loop, > >>> rather than just the old one. > >>> > >>> Once the reactor returns to the top of the loop, there are no timers > >>> left, which explains why it reverts to showing a poll time equivalent > >>> to 1 hour. > >>> > >>> It's a by-product of the code deferring deletion of a timer when > >>> zloop_timer_end() is called. If, after callling zloop_timer_end(), you > >>> then immediately add another timer with the same 'arg' value, then the > >>> reactor's zombie cleanup loop will blow away both. > >>> > >>> By the time the reactor gets back to recalculating the next delay (by > >>> calling s_tickless_timer() on line 357 ), there are no timers in the > >>> self->timers list, so a timeout of 1 hour is used. > >>> > >>> Andy > >>> > >>>> > >>>> -Pieter > >>>> > >>>> On Thu, Dec 27, 2012 at 5:01 PM, Andy Ballingall TF > >>>> <[email protected]> wrote: > >>>>> Hi Pieter, > >>>>> > >>>>> On 27 December 2012 13:55, Pieter Hintjens <[email protected]> wrote: > >>>>>> On Thu, Dec 27, 2012 at 2:53 PM, Andy Ballingall TF > >>>>>> <[email protected]> wrote: > >>>>>> > >>>>>>> Note how my timer registration succeeds, but the polling seems to > >>>>>>> ignore it? > >>>>>> > >>>>>> Then it's because the tickless timer is being miscalculated; it > should > >>>>>> be re-calculated each time you add or remove a timer handler. If > not, > >>>>>> there's a bug. > >>>>> > >>>>> I agree with Apostolis below that the zloop code seems correct, with > >>>>> respect to the calculation of the next timer expiry. Replicating my > >>>>> scenario with test code would be quite a bit of work, so I took > >>>>> another look at the zloop implementation again, and I think I can see > >>>>> what's happening. > >>>>> > >>>>> Firstly, my timer mutations were occurring within a message handler, > >>>>> and the mutation consisted of: > >>>>> a) Remove existing timer > >>>>> b) Create a new one. > >>>>> > >>>>> The old removed timer and the new timer both have the same 'arg' (in > >>>>> my case this is an instance of my communications class). > >>>>> > >>>>> The reactor loop (in function zloop_start() ) first processes any > poll > >>>>> handlers, then at the end, checks if the zombies list has any timers > >>>>> to clear. > >>>>> > >>>>> By the time the zloop_start() function gets to line 420 (where it > >>>>> checks for zombies), then the state is: > >>>>> > >>>>> Zombies: 1 (arg='this') > >>>>> > >>>>> Timers: 2 > >>>>> 1 = old timer (arg='this', expiry=<old time>) > >>>>> 2 = new timer (arg = 'this', expiry = <new time>) > >>>>> > >>>>> The loop at line 423 will blow away all the timers - including the > one > >>>>> I just added - as it matches the zombie arg. > >>>>> > >>>>> That explains why my newly added timer wasn't firing, and why the > poll > >>>>> timeout was set to 1 hour: It's because my newly added timer had been > >>>>> removed almost immediately! > >>>>> > >>>>> If the timer deletion used a reference to the timer object rather > than > >>>>> 'arg' to reference it, then it would ensure that I could call > >>>>> zloop_timer_end() and subsequently zloop_timer() from within a > >>>>> handler, without the zloop_start() function removing that newly added > >>>>> timer... > >>>>> > >>>>> In the end, this isn't a problem for me because I've changed my > >>>>> implementation to use a heartbeat timer which never gets removed and > >>>>> replaced, but I can see scenarios in the future where it would be > good > >>>>> to reset an existing timer, or delete and add a number of timers > >>>>> inside a handler... > >>>>> > >>>>> Anyway, thanks for your help. > >>>>> > >>>>> Andy > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>> > >>>>>> -Pieter > >>>>>> _______________________________________________ > >>>>>> zeromq-dev mailing list > >>>>>> [email protected] > >>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > >>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Andy Ballingall > >>>>> Senior Software Engineer > >>>>> > >>>>> The Foundry > >>>>> 6th Floor, The Communications Building, > >>>>> 48, Leicester Square, > >>>>> London, WC2H 7LT, UK > >>>>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 > >>>>> Web: http://www.thefoundry.co.uk/ > >>>>> > >>>>> The Foundry Visionmongers Ltd. > >>>>> Registered in England and Wales No: 4642027 > >>>>> _______________________________________________ > >>>>> zeromq-dev mailing list > >>>>> [email protected] > >>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > >>>> _______________________________________________ > >>>> zeromq-dev mailing list > >>>> [email protected] > >>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > >>> > >>> > >>> > >>> -- > >>> Andy Ballingall > >>> Senior Software Engineer > >>> > >>> The Foundry > >>> 6th Floor, The Communications Building, > >>> 48, Leicester Square, > >>> London, WC2H 7LT, UK > >>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 > >>> Web: http://www.thefoundry.co.uk/ > >>> > >>> The Foundry Visionmongers Ltd. > >>> Registered in England and Wales No: 4642027 > >>> _______________________________________________ > >>> zeromq-dev mailing list > >>> [email protected] > >>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > >> _______________________________________________ > >> zeromq-dev mailing list > >> [email protected] > >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev > > > > > > > > -- > > Andy Ballingall > > Senior Software Engineer > > > > The Foundry > > 6th Floor, The Communications Building, > > 48, Leicester Square, > > London, WC2H 7LT, UK > > Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 > > Web: http://www.thefoundry.co.uk/ > > > > The Foundry Visionmongers Ltd. > > Registered in England and Wales No: 4642027 > > > > _______________________________________________ > > zeromq-dev mailing list > > [email protected] > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev > > > _______________________________________________ > zeromq-dev mailing list > [email protected] > http://lists.zeromq.org/mailman/listinfo/zeromq-dev > > -- Andy Ballingall Senior Software Engineer The Foundry 6th Floor, The Communications Building, 48, Leicester Square, London, WC2H 7LT, UK Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906 Web: http://www.thefoundry.co.uk/ The Foundry Visionmongers Ltd. Registered in England and Wales No: 4642027
_______________________________________________ zeromq-dev mailing list [email protected] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
