On Thu, Sep 17, 2009 at 8:58 PM, Brian <[email protected]> wrote:
> On Thu, Sep 17, 2009 at 9:55 PM, Robert Rohde <[email protected]> wrote:
>
>> On Thu, Sep 17, 2009 at 8:25 PM, Steve Bennett <[email protected]>
>> wrote:
>> > On Fri, Sep 18, 2009 at 12:20 PM, Robert Rohde <[email protected]>
>> wrote:
>> >> That particular result is unpublished.  I could make you a list of
>> >> infrequently viewed articles, but it would be quite long.
>> >
>> > Could you make a list of the 100 least viewed? Or are there a large
>> > number which are essentially equal?
>>
>> My sample consisted of collating 30 non-consecutive hours of data on
>> enwiki traffic where each hour was randomly chosen from any point
>> during the last 8 months.  This was filtered to only include page
>> titles that were valid mainspace pages.
>>
>> In those 30 hours, there are 1.36 million valid article titles that
>> are viewed exactly once [1].
>>
>> Examples include:
>>
>> 129342_Ependes
>> 1421_in_literature
>> Antiprotonic_helium
>> Antonella_Mularoni
>> Madhusoodhanan_Nair
>> Blue_Murder_(play)
>> Ozonotherapy
>> Veronika_Krausas
>> Verret,_New_Brunswick
>> Bare_Truth_(Nat_album)
>>
>> As you can see, these are obscure topics, but they are not necessarily
>> crazy topics.  If I were to repeat it with a longer baseline (say 1000
>> hours rather than 30) I suspect you might get more interesting
>> information on the tail, but right now probably the best I can say is
>> that a cumulatively significant amount of traffic goes to relatively
>> obscure pages.
>>
>> -Robert Rohde
>>
>> [1] Note: Because the traffic data is based on URL request strings, and
>> some URL strings map to the same page, e.g. Blue_Ocean and
>> Blue%20Ocean, the number of valid article titles is not necessarily
>> the same as the number of distinct pages.  For practical reasons my
>> analysis was based on the URL strings, and so probably overcounts the
>> number of distinct articles involved, and to a degree overstates the
>> fraction of traffic to obscure pages.
>
>
> How sure are you that they were viewed by a person and not a bot?

There is no differentiation between people and bots.  (Some of these
things are why it is an unpublished analysis.  ;-)  I was actually
using traffic data for a totally different purpose, but decided to
look at things like obscure pages while I was at it.)
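
For anyone who wants to try a similar tally, here is a minimal sketch.
It is not the script used for the numbers above.  It assumes each line
of an hourly traffic file has the form "project title count bytes",
which is an assumption about the input rather than something stated in
this thread, and the function names and the optional valid_titles set
are hypothetical.  It also merges URL strings such as Blue%20Ocean and
Blue_Ocean, which the analysis above did not do.

```python
from collections import Counter
from urllib.parse import unquote


def normalize_title(raw):
    """Percent-decode a request string and use underscores for spaces."""
    return unquote(raw).replace(" ", "_")


def count_views(lines, project="en", valid_titles=None):
    """Sum views per title over lines of 'project title count bytes'.

    The line format is assumed. valid_titles, if given, is a set of
    known mainspace titles used to filter out everything else.
    """
    views = Counter()
    for line in lines:
        parts = line.split()
        # Skip malformed lines and lines for other projects.
        if len(parts) < 3 or parts[0] != project or not parts[2].isdigit():
            continue
        title = normalize_title(parts[1])
        if valid_titles is not None and title not in valid_titles:
            continue
        views[title] += int(parts[2])
    return views


def viewed_exactly_once(views):
    """Return the sorted titles whose total view count is exactly 1."""
    return sorted(title for title, n in views.items() if n == 1)
```

Feeding the lines of all sampled hours into count_views and then
calling viewed_exactly_once gives the list of single-view titles.  Like
the analysis above, it makes no attempt to separate people from bots.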

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
