On Thu, Sep 17, 2009 at 10:18 PM, Robert Rohde <[email protected]> wrote:
> On Thu, Sep 17, 2009 at 8:58 PM, Brian <[email protected]> wrote:
> > On Thu, Sep 17, 2009 at 9:55 PM, Robert Rohde <[email protected]> wrote:
> >
> >> On Thu, Sep 17, 2009 at 8:25 PM, Steve Bennett <[email protected]> wrote:
> >> > On Fri, Sep 18, 2009 at 12:20 PM, Robert Rohde <[email protected]> wrote:
> >> >> That particular result is unpublished. I could make you a list of
> >> >> infrequently viewed articles, but it would be quite long.
> >> >
> >> > Could you make a list of the 100 least viewed? Or are there a large
> >> > number which are essentially equal?
> >>
> >> My sample consisted of collating 30 non-consecutive hours of data on
> >> enwiki traffic where each hour was randomly chosen from any point
> >> during the last 8 months. This was filtered to only include page
> >> titles that were valid mainspace pages.
> >>
> >> In those 30 hours, there are 1.36 million valid article titles that
> >> are viewed exactly once [1].
> >>
> >> Examples include:
> >>
> >> 129342_Ependes
> >> 1421_in_literature
> >> Antiprotonic_helium
> >> Antonella_Mularoni
> >> Madhusoodhanan_Nair
> >> Blue_Murder_(play)
> >> Ozonotherapy
> >> Veronika_Krausas
> >> Verret,_New_Brunswick
> >> Bare_Truth_(Nat_album)
> >>
> >> As you can see, these are obscure topics, but they are not necessarily
> >> crazy topics. If I were to repeat it with a longer baseline (say 1000
> >> hours rather than 30) I suspect you might get more interesting
> >> information on the tail, but right now probably the best I can say is
> >> that a cumulatively significant amount of traffic goes to relatively
> >> obscure pages.
> >>
> >> -Robert Rohde
> >>
> >> [1] Note: Because the traffic data is based on url request strings, and
> >> some url strings map to the same pages, e.g. Blue_Ocean and
> >> Blue%20Ocean, the number of valid article titles is not necessarily
> >> the same as the number of distinct pages. For practical reasons my
> >> analysis was based on the url strings, and so probably overcounts the
> >> number of distinct articles involved, and to a degree overstates the
> >> fraction of traffic to obscure pages.
> >
> >
> > How sure are you that they were viewed by a person and not a bot?
>
> There is no differentiation between people and bots. (Some of these
> things are why it is an unpublished analysis. ;-) I was actually
> using traffic data for a totally different purpose, but decided to
> look at things like obscure pages, while I was at it.)
>
> -Robert Rohde

Oh I see. It would be reassuring to know that there were a million or so
articles not viewed at all?
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
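
For anyone who wants to try the "viewed exactly once" count with the
footnote's url-string problem taken into account, here is a rough sketch in
Python. It is not Robert's actual method, which was not posted. The function
names, the (url string, view count) input shape, and the toy records are all
made up for illustration, and it only handles percent-encoding and
space/underscore variants, not redirects or other aliases.

```python
from collections import Counter
from urllib.parse import unquote


def normalize_title(url_title: str) -> str:
    """Hypothetical helper: collapse url-string variants of one title.

    Decodes percent-escapes and turns spaces into underscores, so that
    "Blue%20Ocean" and "Blue_Ocean" are treated as the same page.
    """
    return unquote(url_title).replace(" ", "_")


def viewed_exactly_once(records):
    """Return the sorted titles whose summed view count is exactly 1.

    `records` is an assumed input shape: an iterable of
    (url_title, view_count) pairs collated from the sampled hours.
    """
    totals = Counter()
    for url_title, views in records:
        totals[normalize_title(url_title)] += views
    return sorted(title for title, n in totals.items() if n == 1)


# Toy records, invented for illustration only (not real traffic data).
sample = [
    ("Blue_Ocean", 1),
    ("Blue%20Ocean", 1),
    ("Ozonotherapy", 1),
    ("Antiprotonic_helium", 1),
    ("Main_Page", 5000),
]

print(viewed_exactly_once(sample))
```

Counting on the raw url strings would report Blue_Ocean and Blue%20Ocean as
two separate single-view titles. After normalization they merge into one
page with two views and drop out of the list, which is the overcounting the
footnote describes.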
