Yep, labs is currently experiencing a disk failure which will affect our
instance.  The thread subject on the labs list:

[Labs-l] Partial outage in progress

On Tue, Feb 17, 2015 at 12:43 PM, Dan Andreescu <[email protected]>
wrote:

> Amanda, did you base your query on the pseudo-sql listed on the metric
> page?  In case you haven't seen it:
>
>
> https://github.com/wikimedia/analytics-wikimetrics/blob/master/wikimetrics/metrics/bytes_added.py#L26
>
> (Sorry I'm linking to github, labs seems to be suffering some serious DNS
> problems right now, all my attempts to load wikimetrics are failing)
>
> On Tue, Feb 17, 2015 at 12:40 PM, Amanda Bittaker <[email protected]>
> wrote:
>
>> On that note, Jonathan, do you have SQL queries that return the same
>> results as the Wikimetrics reports?  I tried writing my own for bytes
>> added, but it's pretty janky and takes forever to return anything in
>> Quarry.  Would you share your wisdom with a poor wayward amateur?
>>
>>
>> On Tue, Feb 17, 2015 at 9:35 AM, Amanda Bittaker <[email protected]>
>> wrote:
>>
>>> Sweet, thanks Jonathan.  I added a "Heartbreak" token, because at this
>>> point I am really far too emotionally attached to Wikimetrics.
>>>
>>> On Tue, Feb 17, 2015 at 9:25 AM, Jonathan Morgan <[email protected]>
>>> wrote:
>>>
>>>> Hi Amanda,
>>>>
>>>> Here's a ticket you can upvote:
>>>> https://phabricator.wikimedia.org/T87596
>>>>
>>>> I added a link to this thread to the task. I also added an "Evil Spooky
>>>> Haunted Tree" token to the task. Because... well it just felt like the
>>>> right thing to do.
>>>>
>>>> - J
>>>>
>>>> On Tue, Feb 17, 2015 at 7:58 AM, Dan Andreescu <
>>>> [email protected]> wrote:
>>>>
>>>>> I can't find a specific ticket, Nuria may know of one.  In general,
>>>>> this is the project that LabsDB tickets are tagged with:
>>>>> https://phabricator.wikimedia.org/tag/wikimedia-labs-infrastructure/
>>>>>
>>>>> On Tue, Feb 17, 2015 at 10:53 AM, Amanda Bittaker <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Good morning Dan,
>>>>>>
>>>>>> Thanks very much for the explanation.  Is there a Phabricator task we
>>>>>> can upvote (award a token?) to make this issue more visible?
>>>>>>
>>>>>> As always, we really appreciate your help with this.
>>>>>>
>>>>>> Best,
>>>>>> Amanda
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 17, 2015 at 7:20 AM, Dan Andreescu <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Sorry for the trouble, Amanda.  The problem is solely with the
>>>>>>> underlying database, which we don't maintain.  It's a sanitized replica
>>>>>>> of all the changes being made to all the wikis so it's a fairly
>>>>>>> complicated piece of infrastructure that sometimes has problems.  The
>>>>>>> folks who maintain it are aware of the issues, but we'll continue
>>>>>>> representing them until they're solved.
>>>>>>>
>>>>>>> On Mon, Feb 16, 2015 at 3:49 PM, Amanda Bittaker <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Oop, thanks for the ping, Nuria.  Wikimetrics seems to be working
>>>>>>>> better now.  I still get failures, especially when running three or
>>>>>>>> four reports in one batch, but the reports work if you rerun them
>>>>>>>> (sometimes a couple times).
>>>>>>>>
>>>>>>>> I'm still getting "PENDING"s that turn into "FAILURE"s sometimes,
>>>>>>>> which I just noticed for the first time last Thursday.  Also,
>>>>>>>> sometimes the "FAILURE"s change position in the Current Report Inbox
>>>>>>>> list, moving up or down a spot.  Not sure if that helps diagnose what
>>>>>>>> might be happening...
>>>>>>>>
>>>>>>>> In any case, Wikimetrics is mostly functioning but seems to be
>>>>>>>> having recurring troubles that sometimes blow up to freeze the whole
>>>>>>>> tool.  It would be great to resolve the troubles before the next
>>>>>>>> explosion--is there anything I can do to help?  Dan H and I still have
>>>>>>>> plenty of reports to run; we can keep you updated on the reports run
>>>>>>>> and the failure rate while you are fixing, if that would be useful.
>>>>>>>>
>>>>>>>> Many thanks,
>>>>>>>> Amanda
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 16, 2015 at 10:15 AM, Nuria Ruiz <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ping ....
>>>>>>>>>
>>>>>>>>> On Fri, Feb 13, 2015 at 2:19 PM, Nuria Ruiz <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Amanda,
>>>>>>>>>>
>>>>>>>>>> Looks like wikimetrics was able to run automatic reports last
>>>>>>>>>> night w/o big issues, are your reports still failing?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Nuria
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 12, 2015 at 1:42 PM, Amanda Bittaker <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Alright, thanks so much for your help once again, Nuria.
>>>>>>>>>>>
>>>>>>>>>>> If there's anything I can do or any information I can
>>>>>>>>>>> contribute, please don't hesitate to ping me.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Amanda
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 12, 2015 at 1:36 PM, Nuria Ruiz <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> DB connections in labs look to be failing. Unfortunately, I
>>>>>>>>>>>> think besides asking for help on the labs list there is not much
>>>>>>>>>>>> we can do there. I will start a thread in this regard.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Nuria
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 12, 2015 at 1:32 PM, Amanda Bittaker <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks so much for the quick response, Nuria.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran the exact same reports on the same cohort as one of the
>>>>>>>>>>>>> last batches that were failing.  Last time 2/4 of the reports
>>>>>>>>>>>>> failed; when I reran them individually they succeeded.  (But they
>>>>>>>>>>>>> don't always: I reran one report 3 times this morning before it
>>>>>>>>>>>>> worked.)  This time, my failure rate got worse: 4/4 failed,
>>>>>>>>>>>>> although they said "PENDING" for a few seconds first, which is new.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that useful information?  Please do let me know what else I
>>>>>>>>>>>>> can do to help solve this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>> Amanda
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 1:09 PM, Jonathan Morgan <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Nuria!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 12:57 PM, Nuria Ruiz <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If so a cohort + report to repro will be most useful.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Translation:* try to run the exact same reports on the same
>>>>>>>>>>>>>> cohort again, to see if the same metrics fail. Let us know what
>>>>>>>>>>>>>> you find. ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Same goes for anyone else who experiences these issues: the
>>>>>>>>>>>>>> more details we (users) can provide the engineers, the more
>>>>>>>>>>>>>> effective they can be at diagnosing and addressing the problems.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> - J
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *for anyone who is not 100% familiar with that hip, new
>>>>>>>>>>>>>> software engineering lingo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Nuria
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 12:35 PM, Dan Andreescu <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Recently there was a restart of the labsdb cluster.  I'm
>>>>>>>>>>>>>>>> sorry but I don't have time to check on it, but I bet that's
>>>>>>>>>>>>>>>> the problem.  I'm off tomorrow unfortunately but I'll try to check
>>>>>>>>>>>>>>>> tomorrow night :(  I hope someone else beats me to it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 3:20 PM, Jonathan Morgan <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (ping Kevin and Dan A.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Amanda, I've had some problems with report failures
>>>>>>>>>>>>>>>>> recently when I ran a few test cohorts. On the same cohort,
>>>>>>>>>>>>>>>>> when I ran multiple concurrent reports (say, bytes added, edits,
>>>>>>>>>>>>>>>>> and pages created), some would fail and others succeed. It wasn't
>>>>>>>>>>>>>>>>> clear what the issue was.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - J
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 12:16 PM, Amanda Bittaker <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am getting failures again, both when uploading cohorts
>>>>>>>>>>>>>>>>>> and running reports.  Strangely, it seems the more reports
>>>>>>>>>>>>>>>>>> you try to run in one batch the less likely it is any report will
>>>>>>>>>>>>>>>>>> succeed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is anyone else having these problems again?  Wonderful
>>>>>>>>>>>>>>>>>> Analytics people, could you please work your magic again?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks,
>>>>>>>>>>>>>>>>>> Amanda
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> Wikimetrics mailing list
>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/wikimetrics
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Jonathan T. Morgan
>>>>>>>>>>>>>>>>> Community Research Lead
>>>>>>>>>>>>>>>>> Wikimedia Foundation
>>>>>>>>>>>>>>>>> User:Jmorgan (WMF)
>>>>>>>>>>>>>>>>> <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
