+1 :) Thanks, Dan, et al.
Worked great for me this morning. I'm a happy camper.

Anna :)

On Friday, January 23, 2015, Edward Galvez <[email protected]> wrote:

> Thank you so much!!! We really appreciate it!
>
> -Edward
>
> On Fri, Jan 23, 2015 at 9:31 AM, Dan Andreescu <[email protected]> wrote:
>
>> Wikimetrics has been having serious connectivity problems for a few
>> days. It turned out to be solvable by using some new hostnames
>> (labsdb1002.eqiad.wmnet). I fixed it just now, please retry your reports
>> and let me know if anything is still wrong.
>>
>> On Fri, Jan 23, 2015 at 10:46 AM, Dan Andreescu <[email protected]> wrote:
>>
>>> Hi everyone. I will work on this as soon as I get into the office, in
>>> about an hour from now. Yuvi suggested one thing that I wasn't aware of
>>> that might make this a simple fix.
>>>
>>> On Friday, January 23, 2015, Dan Higgins <[email protected]> wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Sorry to be a pest but do you have any update on sorting out the
>>>> Wikimetrics issues? It seems to have gotten worse since we last spoke to
>>>> you with around 1 in 10 reports going through.
>>>>
>>>> Thanks,
>>>>
>>>> Dan
>>>>
>>>> On Tue, Jan 20, 2015 at 7:17 PM, Kevin Leduc <[email protected]> wrote:
>>>>
>>>>> All the developers are in transit to SF today. Dan said he'd be in
>>>>> the office this afternoon. First dev I see I'll notify them of problems
>>>>> in wikimetrics.
>>>>>
>>>>> On Tue, Jan 20, 2015 at 11:10 AM, Amanda Bittaker <[email protected]> wrote:
>>>>>
>>>>>> Hello again gentlemen,
>>>>>>
>>>>>> I think Dan might have already pinged you, but just in case, I wanted
>>>>>> to let you know that we are getting these failures again. It's kind
>>>>>> of crunch time for getting this data, so we're just banging our heads
>>>>>> against the wall and retrying the reports until they work (1 out of 4
>>>>>> times for me.) Is there any way you all could work your magic again?
>>>>>>
>>>>>> Many thanks once again,
>>>>>> Amanda
>>>>>>
>>>>>> On Wed, Dec 10, 2014 at 4:30 PM, Kevin Leduc <[email protected]> wrote:
>>>>>> > It's good to hear it's working again. Don't hesitate to reach out to us
>>>>>> > here or at [email protected] if you notice this kind of
>>>>>> > trouble again.
>>>>>> >
>>>>>> > On Wed, Dec 10, 2014 at 3:37 PM, Amanda Bittaker <[email protected]> wrote:
>>>>>> >>
>>>>>> >> It's working perfectly now--a thousand thank yous, Dan and Marcel.
>>>>>> >>
>>>>>> >> On Wed, Dec 10, 2014 at 3:24 PM, Edward Galvez <[email protected]> wrote:
>>>>>> >>>
>>>>>> >>> Thanks so much Dan and Marcel!
>>>>>> >>>
>>>>>> >>> -E
>>>>>> >>>
>>>>>> >>> On Wed, Dec 10, 2014 at 3:08 PM, Dan Andreescu <[email protected]> wrote:
>>>>>> >>>>
>>>>>> >>>> forgot Marcel - my fault. Jaime & folks, in general Marcel rules and
>>>>>> >>>> he's probably going to help you out faster / better than I can.
>>>>>> >>>>
>>>>>> >>>> On Wed, Dec 10, 2014 at 5:57 PM, Dan Andreescu <[email protected]> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Ok, Amanda and anyone else who had problems. Please try again. I
>>>>>> >>>>> think I've cleared up some gunk and that might have helped things.
>>>>>> >>>>> We'll be looking at performance more closely soon.
>>>>>> >>>>>
>>>>>> >>>>> Steps taken, logging mostly for post-mortem purpose
>>>>>> >>>>>
>>>>>> >>>>> * delete from report where recurrent_parent_id is null and recurrent = 0 and created < date('2014-12-01');
>>>>>> >>>>> ** This deleted records that are not visible in the system anymore.
>>>>>> >>>>> They are recoverable from the wikimetrics database backups but we
>>>>>> >>>>> don't need them in the database. These probably slowed some things
>>>>>> >>>>> down, in total the statement deleted 1623628 rows.
>>>>>> >>>>>
>>>>>> >>>>> * alter table report add column old_recurrent tinyint(1); update report set recurrent = 0, old_recurrent = 1 where user_id = 461 and recurrent = 1;
>>>>>> >>>>> ** This disables WikimetricsBot recurrent reports, but preserves the
>>>>>> >>>>> data so we can deal with them later. When labs is done
>>>>>> >>>>> re-synchronizing, we will be re-running these reports. They feed
>>>>>> >>>>> data to Vital Signs, in case someone's curious about what they are.
>>>>>> >>>>>
>>>>>> >>>>> * Stopped and rebooted the system. The backup system seems to be
>>>>>> >>>>> hanging or taking a really long time. I'd like to take a look at
>>>>>> >>>>> this in more depth, but my guess is the amount it's transferring has
>>>>>> >>>>> gone beyond what we expected.
>>>>>> >>>>>
>>>>>> >>>>> On Wed, Dec 10, 2014 at 5:23 PM, Dan Andreescu <[email protected]> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> We're sorry - the problems we were facing last week have probably
>>>>>> >>>>>> festered. I'm going to turn off some things and reset the system.
>>>>>> >>>>>> I'll report back.
>>>>>> >>>>>>
>>>>>> >>>>>> On Wed, Dec 10, 2014 at 4:47 PM, Amanda Bittaker <[email protected]> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> Oh yes, and Jaime did have me restart my browser and clear the
>>>>>> >>>>>>> cache, but it did not help.
>>>>>> >>>>>>>
>>>>>> >>>>>>> Thanks again,
>>>>>> >>>>>>> Amanda
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Wed, Dec 10, 2014 at 1:45 PM, Amanda Bittaker <[email protected]> wrote:
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Hello Kevin,
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Jaime asked me to email you about some trouble I've been having
>>>>>> >>>>>>>> with Wikimetrics. The whole team has been experiencing a pretty
>>>>>> >>>>>>>> high rate of failures in both report creation and cohort uploads.
>>>>>> >>>>>>>> Almost nothing has gotten through for me today: of the last 13
>>>>>> >>>>>>>> reports I've run, 3 were successful. Of the failures, I would say
>>>>>> >>>>>>>> maybe only two or three "pended" at all before becoming failures.
>>>>>> >>>>>>>> I've been experiencing the same problem with cohort uploads.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> The reports have been: Newly Registered, Edits, and Rolling Active
>>>>>> >>>>>>>> Editor using expanded cohorts. Please find attached an example of
>>>>>> >>>>>>>> one of the reports. I tried uploading cohorts using text files of
>>>>>> >>>>>>>> user names and pasting user names from Notepad into the "Paste
>>>>>> >>>>>>>> Usernames" field. I do expand the cohorts every time.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Do you know why the failure rate is so high, especially this
>>>>>> >>>>>>>> morning, and is there a way to eliminate or mitigate this problem
>>>>>> >>>>>>>> in the future?
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Many thanks for the assistance, and please do let me know if you
>>>>>> >>>>>>>> need any more information from me on this.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Best,
>>>>>> >>>>>>>> Amanda
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Edward Galvez
>>>>>> >>> Program Evaluation Associate
>>>>>> >>> Wikimedia Foundation
>
> --
> Edward Galvez
> Program Evaluation Associate
> Wikimedia Foundation

--
Sent from Gmail Mobile
_______________________________________________
Wikimetrics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikimetrics
