https://bugzilla.wikimedia.org/show_bug.cgi?id=70118

            Bug ID: 70118
           Summary: Webstatscollector's pagecounts file shaky between
                    2014-08-24 14:00 and 2014-08-27 21:00
           Product: Analytics
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: General/Unknown
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected],
                    [email protected], [email protected],
                    [email protected], [email protected]
        Depends on: 70053
       Web browser: ---
   Mobile Platform: ---

TL;DR: For the files between 2014-08-24 14:00 and 2014-08-27 21:00,
webstatscollector's output shows some irregularities. This affects
both the pagecounts and projectcounts files from

  https://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-08/

and all services that process them.

--------------------------------------------

The pagecounts file for 2014-08-24 14:00 [1] did not show
irregularities. But the file for one day later shows a drop of 80% for
a few pages.
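
A minimal sketch of how such a drop could be spotted by comparing two
hourly pagecounts files. It assumes the usual space-separated
"project page_title count bytes" line format of the pagecounts-raw
files. The function names, the 80% threshold default, and the
min_before cutoff are illustrative assumptions, not part of
webstatscollector:

```python
import gzip
from collections import defaultdict


def parse_pagecounts(lines):
    """Parse 'project page_title count bytes' lines into {(project, title): count}."""
    counts = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        project, title, count, _size = parts
        try:
            counts[(project, title)] += int(count)
        except ValueError:
            continue  # skip lines whose count is not an integer
    return dict(counts)


def find_drops(before, after, min_before=100, threshold=0.8):
    """Return pages whose count fell by at least `threshold` (as a fraction).

    min_before (hypothetical cutoff) ignores low-traffic pages, whose
    counts fluctuate too much to be meaningful.
    """
    drops = []
    for key, old in before.items():
        if old < min_before:
            continue
        new = after.get(key, 0)  # a page missing from `after` counts as 0
        drop = (old - new) / old
        if drop >= threshold:
            drops.append((key, old, new, drop))
    # Largest relative drop first.
    return sorted(drops, key=lambda d: d[3], reverse=True)


def load(path):
    """Read one gzipped hourly pagecounts file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        return parse_pagecounts(f)
```

With two downloaded files, find_drops(load(earlier), load(later))
would list the affected pages, largest relative drop first.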

Because gadolinium (the host that writes the pagecounts files) showed
a high and still increasing process count from services that are no
longer needed (bug 70053), those services were turned off around
2014-08-26 19:00:00.

But although more resources were freed on gadolinium, its
webstatscollector's collector process degraded further. Since the
service did not recover (UDP Receive Buffers filled up again and
again, disks could not take the write load, and the service gathered
95GB of virtual memory since its last restart), the service got
restarted on 2014-08-28 ~15:32.
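
As an illustration of how filling UDP receive buffers can be observed
on a Linux host, here is a small sketch that reads the Udp counters
from text in the format of /proc/net/snmp and reports how much
RcvbufErrors grew between two snapshots. This is not necessarily how
the problem was diagnosed on gadolinium, and the function names are
made up for this example:

```python
def udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp-style text as a dict."""
    # The file holds a header line and a value line, both prefixed "Udp:".
    # "UdpLite:" lines do not start with "Udp:" and are therefore skipped.
    udp_lines = [l.split() for l in snmp_text.splitlines() if l.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value pair found")
    header, values = udp_lines[0], udp_lines[1]
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def rcvbuf_errors_delta(earlier_text, later_text):
    """Datagrams dropped due to full receive buffers between two snapshots."""
    earlier = udp_counters(earlier_text)
    later = udp_counters(later_text)
    return later["RcvbufErrors"] - earlier["RcvbufErrors"]
```

On the host itself, each snapshot would be taken with
open("/proc/net/snmp").read(). A delta that keeps growing means the
receiving process is not draining its socket fast enough.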

Since the restart did not relax the situation either, the service was
put on tmpfs on 2014-08-28 ~19:48, which reduced load on the disks and
made the service work again. The files starting at 2014-08-27 21:00 [2]
are good again.

Thanks ottomata for all those fixes!
More details are in the corresponding IRC channel logs [3].

Closer investigation of the files between 2014-08-24 14:00 and
2014-08-27 21:00 is still pending.


[1]
https://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-08/pagecounts-20140824-140000.gz
[2]
https://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-08/pagecounts-20140827-210000.gz
[3] http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-analytics/20140827.txt

