+1 :)

Thanks, Dan, et al.

Worked great for me this morning. I'm a happy camper.

Anna :)

On Friday, January 23, 2015, Edward Galvez <[email protected]> wrote:

> Thank you so much!!! We really appreciate it!
>
> -Edward
>
>
>
> On Fri, Jan 23, 2015 at 9:31 AM, Dan Andreescu <[email protected]> wrote:
>
>> Wikimetrics has been having serious connectivity problems for a few
>> days.  It turned out to be solvable by using some new hostnames
>> (labsdb1002.eqiad.wmnet).  I fixed it just now; please retry your reports
>> and let me know if anything is still wrong.
>>
>> On Fri, Jan 23, 2015 at 10:46 AM, Dan Andreescu <[email protected]> wrote:
>>
>>> Hi everyone.  I will work on this as soon as I get into the office, in
>>> about an hour from now.  Yuvi suggested one thing that I wasn't aware of
>>> that might make this a simple fix.
>>>
>>>
>>> On Friday, January 23, 2015, Dan Higgins <[email protected]> wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Sorry to be a pest but do you have any update on sorting out the
>>>> Wikimetrics issues? It seems to have gotten worse since we last spoke to
>>>> you with around 1 in 10 reports going through.
>>>>
>>>> Thanks,
>>>>
>>>> Dan
>>>>
>>>> On Tue, Jan 20, 2015 at 7:17 PM, Kevin Leduc <[email protected]>
>>>> wrote:
>>>>
>>>>> All the developers are in transit to SF today.  Dan said he'd be in
>>>>> the office this afternoon.  First dev I see I'll notify them of problems
>>>>> in wikimetrics.
>>>>>
>>>>> On Tue, Jan 20, 2015 at 11:10 AM, Amanda Bittaker <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hello again gentlemen,
>>>>>>
>>>>>> I think Dan might have already pinged you, but just in case, I wanted
>>>>>> to let you know that we are getting these failures again.  It's kind
>>>>>> of crunch time for getting this data, so we're just banging our heads
>>>>>> against the wall and retrying the reports until they work (1 out of 4
>>>>>> times for me.)  Is there any way you all could work your magic again?
>>>>>>
>>>>>> Many thanks once again,
>>>>>> Amanda
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Dec 10, 2014 at 4:30 PM, Kevin Leduc <[email protected]>
>>>>>> wrote:
>>>>>> > It's good to hear it's working again.  Don't hesitate to reach out to us
>>>>>> > here or at [email protected] if you notice this kind of
>>>>>> > trouble again.
>>>>>> >
>>>>>> > On Wed, Dec 10, 2014 at 3:37 PM, Amanda Bittaker <[email protected]>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> It's working perfectly now--a thousand thank yous, Dan and Marcel.
>>>>>> >>
>>>>>> >> On Wed, Dec 10, 2014 at 3:24 PM, Edward Galvez <[email protected]>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> Thanks so much Dan and Marcel!
>>>>>> >>>
>>>>>> >>> -E
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Wed, Dec 10, 2014 at 3:08 PM, Dan Andreescu <[email protected]>
>>>>>> >>> wrote:
>>>>>> >>>>
>>>>>> >>>> forgot Marcel - my fault.  Jaime & folks, in general Marcel rules and
>>>>>> >>>> he's probably going to help you out faster / better than I can.
>>>>>> >>>>
>>>>>> >>>> On Wed, Dec 10, 2014 at 5:57 PM, Dan Andreescu
>>>>>> >>>> <[email protected]> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Ok, Amanda and anyone else who had problems.  Please try again.  I
>>>>>> >>>>> think I've cleared up some gunk and that might have helped things.
>>>>>> >>>>> We'll be looking at performance more closely soon.
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Steps taken, logging mostly for post-mortem purposes:
>>>>>> >>>>>
>>>>>> >>>>> * delete from report where recurrent_parent_id is null
>>>>>> >>>>>   and recurrent = 0 and created < date('2014-12-01');
>>>>>> >>>>> ** This deleted records that are not visible in the system anymore.
>>>>>> >>>>> They are recoverable from the wikimetrics database backups but we
>>>>>> >>>>> don't need them in the database.  These probably slowed some things
>>>>>> >>>>> down; in total the statement deleted 1623628 rows.
>>>>>> >>>>>
>>>>>> >>>>> * alter table report add column old_recurrent tinyint(1);
>>>>>> >>>>>   update report set recurrent = 0, old_recurrent = 1
>>>>>> >>>>>   where user_id = 461 and recurrent = 1;
>>>>>> >>>>> ** This disables WikimetricsBot recurrent reports, but preserves the
>>>>>> >>>>> data so we can deal with them later.  When labs is done
>>>>>> >>>>> re-synchronizing, we will be re-running these reports.  They feed
>>>>>> >>>>> data to Vital Signs, in case someone's curious about what they are.
>>>>>> >>>>>
>>>>>> >>>>> * Stopped and rebooted the system.  The backup system seems to be
>>>>>> >>>>> hanging or taking a really long time.  I'd like to take a look at
>>>>>> >>>>> this in more depth, but my guess is the amount it's transferring has
>>>>>> >>>>> gone beyond what we expected.
>>>>>> >>>>>
>>>>>> >>>>> On Wed, Dec 10, 2014 at 5:23 PM, Dan Andreescu
>>>>>> >>>>> <[email protected]> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> We're sorry - the problems we were facing last week have probably
>>>>>> >>>>>> festered.  I'm going to turn off some things and reset the system.
>>>>>> >>>>>> I'll report back.
>>>>>> >>>>>>
>>>>>> >>>>>> On Wed, Dec 10, 2014 at 4:47 PM, Amanda Bittaker
>>>>>> >>>>>> <[email protected]> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> Oh yes, and Jaime did have me restart my browser and clear the
>>>>>> >>>>>>> cache, but it did not help.
>>>>>> >>>>>>>
>>>>>> >>>>>>> Thanks again,
>>>>>> >>>>>>> Amanda
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Wed, Dec 10, 2014 at 1:45 PM, Amanda Bittaker
>>>>>> >>>>>>> <[email protected]> wrote:
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Hello Kevin,
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Jaime asked me to email you about some trouble I've been having
>>>>>> >>>>>>>> with Wikimetrics.  The whole team has been experiencing a pretty
>>>>>> >>>>>>>> high rate of failures in both report creation and cohort uploads.
>>>>>> >>>>>>>> Almost nothing has gotten through for me today:  of the last 13
>>>>>> >>>>>>>> reports I've run, 3 were successful.  Of the failures, I would say
>>>>>> >>>>>>>> maybe only two or three "pended" at all before becoming failures.
>>>>>> >>>>>>>> I've been experiencing the same problem with cohort uploads.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> The reports have been: Newly Registered, Edits, and Rolling Active
>>>>>> >>>>>>>> Editor using expanded cohorts.  Please find attached an example of
>>>>>> >>>>>>>> one of the reports.  I tried uploading cohorts using text files of
>>>>>> >>>>>>>> user names and pasting user names from Notepad into the "Paste
>>>>>> >>>>>>>> Usernames" field.  I do expand the cohorts every time.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Do you know why the failure rate is so high, especially this
>>>>>> >>>>>>>> morning, and is there a way to eliminate or mitigate this problem
>>>>>> >>>>>>>> in the future?
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Many thanks for the assistance, and please do let me know if you
>>>>>> >>>>>>>> need any more information from me on this.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Best,
>>>>>> >>>>>>>> Amanda
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Edward Galvez
>>>>>> >>> Program Evaluation Associate
>>>>>> >>> Wikimedia Foundation
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>
>
>
> --
> Edward Galvez
> Program Evaluation Associate
> Wikimedia Foundation
>


-- 
Sent from Gmail Mobile
_______________________________________________
Wikimetrics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikimetrics
