https://bugzilla.wikimedia.org/show_bug.cgi?id=47227

Diederik van Liere <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #2 from Diederik van Liere <[email protected]> ---
I posted the following explanation on the Village Pump
(http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Sudden_drop_in_pageviews):

Explanation: On March 25th, the Analytics Team removed SSL traffic from the
udp2log stream of webrequests. This webrequest stream is consumed by
webstatscollector, the tool that generates the data that is presented by
stats.grok.se. The reason we removed SSL traffic was twofold:
* Each logline is tagged with a unique number that allows us to see how many
loglines we lose (aka packetloss); this numbering system was not working for
SSL traffic, and hence our packetloss monitoring was inadequate. You can see a
clear drop in reported packetloss as a result of this fix.
* SSL traffic actually generates two hits in our log files: once when it hits
the SSL server (nginx) and a second time when it hits the cache server
(squid). Webstatscollector was not deduplicating these hits, so the drop in
pageviews that we are seeing means that we have gone back to the actual
pageview count.
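
To illustrate the first point, here is a minimal sketch of how gaps in
per-logline sequence numbers can be turned into a packetloss estimate. It
assumes a single sender whose counter increases by 1 per logline; the function
name and the sample numbers are made up for illustration and are not taken
from the actual monitoring code.

```python
def estimate_packet_loss(sequence_numbers):
    """Estimate the fraction of loglines lost, from their sequence numbers.

    Assumption: all numbers come from one sender whose counter goes up by 1
    for every logline it emits, so any gap means a lost logline.
    """
    seen = sorted(set(sequence_numbers))
    if len(seen) < 2:
        # Not enough data to detect a gap.
        return 0.0
    expected = seen[-1] - seen[0] + 1   # loglines the sender emitted in this range
    lost = expected - len(seen)         # numbers that never arrived
    return lost / expected


# Hypothetical sample: 103, 106 and 107 never arrived.
received = [100, 101, 102, 104, 105, 108]
loss = estimate_packet_loss(received)
print(f"estimated packetloss: {loss:.1%}")
```

If a class of traffic is not numbered consistently, as was the case for SSL
traffic, an estimate like this becomes unreliable, which is why those loglines
had to be kept out of the stream being monitored.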

So removing SSL traffic from the main webrequest stream was the cause of this
drop, but it did not introduce a bug; it actually fixed a previously unknown
bug of overreporting SSL-generated pageviews. Thanks to Wikid77, who got me
thinking about the SSL cause in the first place.
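
A toy example of the overreporting, under simplified assumptions: the record
layout, the source labels and the page titles below are hypothetical and only
mimic the fact that an SSL request is logged once by nginx and once by squid.
This is not how webstatscollector is implemented.

```python
from collections import Counter

# Hypothetical, simplified loglines: (logging server, page title).
# An SSL request shows up twice (nginx, then squid); a plain HTTP request
# shows up once (squid only).
log_lines = [
    ("squid", "Main_Page"),  # HTTP request
    ("nginx", "Main_Page"),  # SSL request, first hit
    ("squid", "Main_Page"),  # same SSL request, second hit
    ("squid", "Cat"),        # HTTP request
]


def count_pageviews(lines, skip_sources=()):
    """Count hits per title, ignoring loglines from the given servers."""
    counts = Counter()
    for source, title in lines:
        if source in skip_sources:
            continue
        counts[title] += 1
    return counts


naive = count_pageviews(log_lines)                               # counts every logline
corrected = count_pageviews(log_lines, skip_sources=("nginx",))  # SSL server lines left out

print(naive["Main_Page"], corrected["Main_Page"])
```

Here Main_Page is requested twice but counted three times when every logline
is counted; leaving out the nginx lines brings the count back to two.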

Potential Next Steps:
# WMF only recently started to enable SSL traffic, so it would be interesting
to see if we can find a sudden spike in pageviews for this article.
# We can try to get Google to link to the non-SSL page; it should not impact
the numbers anymore, but WMF's infrastructure is not quite ready to handle
massive volumes of SSL traffic for anonymous readers.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
