Hello Giovanni,
thanks for the pointer to the Click dataset. I'd have to take a look at the complete dataset to see how many of those requests touch Wikipedia.
Also, one of the requirements for accessing that data is: "The Click Dataset is large (~2.5 TB compressed), which requires that it be transferred on a physical hard drive. You will have to provide the drive as well as pre-paid return shipment." I have to check whether this is possible and how long it might take to ship a hard drive from Switzerland and get it back.

I'll let you know!

Best,
Valerio

On Wed, Sep 17, 2014 at 4:09 PM, Giovanni Luca Ciampaglia <gciam...@indiana.edu> wrote:

> Valerio,
>
> I didn't know such data existed. As an alternative, perhaps you could have
> a look at our click datasets, which contain requests to the Web at large
> (i.e., not just Wikipedia) generated from within the campus of Indiana
> University over a period of several months. HTH
>
> http://carl.cs.indiana.edu/data/#click
>
> Cheers
>
> G
>
> Giovanni Luca Ciampaglia
>
> ✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA
> ☞ http://www.glciampaglia.com/
> ✆ +1 812 855-7261
> ✉ gciam...@indiana.edu
>
> 2014-09-17 9:53 GMT-04:00 Valerio Schiavoni <valerio.schiav...@gmail.com>:
>
>> Hello,
>> just bumping my email from last week, since so far I have not received
>> any answer.
>>
>> Should I consider that dataset to be somehow lost?
>>
>> I've also contacted the researchers who partially released it, but making
>> it publicly available is tricky for them due to its size (12 TB), which
>> might instead be well within the norm for the operations handled daily by
>> Wikipedia servers.
>>
>> Thanks again,
>> Valerio
>>
>>> On Wed, Sep 10, 2014 at 4:15 AM, Valerio Schiavoni <
>>> valerio.schiav...@gmail.com> wrote:
>>>
>>>> Dear Wikimedia Foundation,
>>>> in the context of an EU research project [1], we are interested in
>>>> accessing Wikipedia access traces.
>>>> In the past, such traces were given for research purposes to other
>>>> groups [2].
>>>> Unfortunately, only a small percentage (10%) of that trace has been
>>>> made available.
>>>> We are interested in accessing the totality of that same trace (or even
>>>> better, a more recent one, but the same one will do).
>>>>
>>>> If this is not the correct ML to use for such requests, could anyone
>>>> please redirect me to the correct one?
>>>>
>>>> Thanks again for your attention,
>>>>
>>>> Valerio Schiavoni
>>>> Post-Doc Researcher
>>>> University of Neuchatel, Switzerland
>>>>
>>>> 1 - http://www.leads-project.eu
>>>> 2 - http://www.wikibench.eu/?page_id=60
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l