On Thu, Sep 17, 2009 at 10:18 PM, Robert Rohde <[email protected]> wrote:

> On Thu, Sep 17, 2009 at 8:58 PM, Brian <[email protected]> wrote:
> > On Thu, Sep 17, 2009 at 9:55 PM, Robert Rohde <[email protected]> wrote:
> >
> >> On Thu, Sep 17, 2009 at 8:25 PM, Steve Bennett <[email protected]>
> >> wrote:
> >> > On Fri, Sep 18, 2009 at 12:20 PM, Robert Rohde <[email protected]>
> >> wrote:
> >> >> That particular result is unpublished.  I could make you a list of
> >> >> infrequently viewed articles, but it would be quite long.
> >> >
> >> > Could you make a list of the 100 least viewed? Or are there a large
> >> > number which are essentially equal?
> >>
> >> My sample consisted of collating 30 non-consecutive hours of data on
> >> enwiki traffic where each hour was randomly chosen from any point
> >> during the last 8 months.  This was filtered to only include page
> >> titles that were valid mainspace pages.
> >>
> >> In those 30 hours, there are 1.36 million valid article titles that
> >> are viewed exactly once [1].
> >>
> >> Examples include:
> >>
> >> 129342_Ependes
> >> 1421_in_literature
> >> Antiprotonic_helium
> >> Antonella_Mularoni
> >> Madhusoodhanan_Nair
> >> Blue_Murder_(play)
> >> Ozonotherapy
> >> Veronika_Krausas
> >> Verret,_New_Brunswick
> >> Bare_Truth_(Nat_album)
> >>
> >> As you can see, these are obscure topics, but they are not necessarily
> >> crazy topics.  If I were to repeat it with a longer baseline (say 1000
> >> hours rather than 30) I suspect you might get more interesting
> >> information on the tail, but right now probably the best I can say is
> >> that a cumulatively significant amount of traffic goes to relatively
> >> obscure pages.
> >>
> >> -Robert Rohde
> >>
> >> [1] Note: Because the traffic data is based on url request strings, and
> >> some url strings map to the same pages, e.g. Blue_Ocean and
> >> Blue%20Ocean, the number of valid article titles is not necessarily
> >> the same as the number of distinct pages.  For practical reasons my
> >> analysis was based on the url strings, and so probably overcounts the
> >> number of distinct articles involved, and to a degree overstates the
> >> fraction of traffic to obscure pages.
> >
> >
> > How sure are you that they were viewed by a person and not a bot?
>
> There is no differentiation between people and bots.  (Some of these
> things are why it is an unpublished analysis.  ;-)  I was actually
> using traffic data for a totally different purpose, but decided to
> look at things like obscure pages, while I was at it.)
>
> -Robert Rohde
>
>
Oh, I see.  It would be reassuring to know whether there were a million or so
articles not viewed at all.
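
For what it's worth, the overcounting described in footnote [1] above could
probably be reduced by normalizing each request string before counting.
This is only a rough sketch, not the method used in the analysis: the
(title, views) input format, the sample data, and the function names are all
assumptions of mine, and the normalization shown (percent-decoding plus
spaces to underscores) covers only the Blue_Ocean / Blue%20Ocean case
mentioned in the footnote.

```python
from collections import Counter
from urllib.parse import unquote


def normalize_title(request_title):
    """Collapse URL variants such as Blue%20Ocean and Blue_Ocean into one key.

    Assumption: percent-decoding and space-to-underscore are enough for this
    illustration; a real pipeline would likely need further rules.
    """
    return unquote(request_title).replace(" ", "_")


def titles_viewed_once(hourly_counts):
    """Return the normalized titles whose total views across all hours is 1.

    hourly_counts: iterable of (request_title, views) pairs (hypothetical format).
    """
    totals = Counter()
    for request_title, views in hourly_counts:
        totals[normalize_title(request_title)] += views
    return sorted(title for title, total in totals.items() if total == 1)


# Made-up sample: two request strings that refer to the same page,
# plus two titles that really were requested only once.
sample = [
    ("Blue_Ocean", 1),
    ("Blue%20Ocean", 1),
    ("Ozonotherapy", 1),
    ("Antiprotonic_helium", 1),
]
print(titles_viewed_once(sample))
```

With the sample above, the two Blue Ocean request strings are merged into a
single title with two views, so it no longer shows up as viewed exactly once.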
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
