Hi Valerio,

Mako was referring to https://dumps.wikimedia.org/other/pagecounts-raw/ and
the current logging practices. My understanding is also that these things
are not logged on a routine basis. The Wikibench traces seem to have been a
special case.

> I've also contacted the researchers who partially released it, but making
> it publicly available is tricky for them, due to its size (12 TB), which
> might instead be somehow in the norms of the operations taken daily by
> Wikipedia servers.
>

Have the researchers looked into requester-pays data storage on Amazon or
another provider? They should be able to make it public without running
their own servers, with the download costs billed to whoever requests the
data, whatever the size.

Cheers,
Scott


On Wed, Sep 24, 2014 at 7:09 PM, Valerio Schiavoni <
valerio.schiav...@gmail.com> wrote:

> Hello Mako,
>
> On Wed, Sep 24, 2014 at 8:13 AM, Benj. Mako Hill <m...@atdot.cc> wrote:
>
>> > Users mostly read the most recent version of a given page, but from
>> > time to time, read accesses to the 'history' of a page happen.
>>
>> At least as far as I know, views to historical versions of webpages in
>> Wikipedia don't show up in the access logs at all because certain
>> kinds of requests (like requests to /w/index.php?oldid=NUMBER) don't
>> get recorded in the pageview data.
>>
>
> I'm sorry to contradict you, but at least on the Wikibench traces, that
> information is very well present. I see things like:
>
> 1609418296 1190438479.078
> http://en.wikipedia.org/w/index.php?title=Western_betrayal&oldid=9828122&action=raw
>
>
> That is, back in 2007, users were accessing a version of that page that
> dated back to 2005 or so.
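
For what it's worth, here is a minimal sketch of how a trace line like the
one quoted above could be classified. It assumes the first field is a
request counter and the second a Unix timestamp in seconds; that reading is
a guess from the sample line, not something taken from the Wikibench
documentation. The names classify_url and parse_trace_line are
hypothetical helpers made up for this example.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def classify_url(url):
    """Label a request URL by what the reader asked for."""
    query = parse_qs(urlsplit(url).query)
    if "oldid" in query:
        return "old_revision"      # a specific historical revision
    if query.get("action") == ["history"]:
        return "history_page"      # the revision history listing
    return "other"


def parse_trace_line(line):
    """Parse 'counter timestamp url'; the field meanings are assumed."""
    counter, timestamp, url = line.split()[:3]
    query = parse_qs(urlsplit(url).query)
    return {
        "counter": int(counter),
        "when": datetime.fromtimestamp(float(timestamp), tz=timezone.utc),
        "title": query.get("title", [None])[0],
        "oldid": query.get("oldid", [None])[0],
        "kind": classify_url(url),
    }


# The sample line from the message above, joined back onto one line.
line = ("1609418296 1190438479.078 "
        "http://en.wikipedia.org/w/index.php"
        "?title=Western_betrayal&oldid=9828122&action=raw")
record = parse_trace_line(line)
print(record["when"].year, record["kind"], record["oldid"])
```

Counting the "old_revision" and "history_page" labels over a whole trace
would give a rough idea of how often non-current versions are read.
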
>
>> > New versions of a page are created as well. Finally, users might
>> > potentially need to explore several old versions of a given web
>> > page, for example by accessing the details of its history[1].
>>
>> AFAIK, viewing the history page itself is also not recorded in the
>> page view data.
>>
>
> Sorry to contradict you again, but there are indeed logs for that as well:
>
> http://en.wikipedia.org/w/index.php?title=Marina_Nadiradze&action=history
>
>
> I'm quite surprised that such information is not known by the community
> of Wikipedia researchers.
>
> Best,
> Valerio
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>


-- 
Scott Hale
Oxford Internet Institute
University of Oxford
http://www.scotthale.net/
scott.h...@oii.ox.ac.uk
