Domas Mituzas wrote:
> Hello Anthony,
>
> I'm back at my lair (phew, finally ;-)
>
>> Regarding the files at http://dammit.lt/wikistats/ :
>> What are "en.b", "en.d", "en2", etc?
>
> suffixes indicate projects - from
> http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/filter.c?revision=34989&view=markup
> :
>
> projects[] = {
>     {"wikipedia","",NULL},
>     {"wiktionary",".d",NULL},
>     {"wikinews",".n",NULL},
>     {"wikimedia",".m",check_wikimedia},
>     {"wikibooks",".b",NULL},
>     {"wikisource",".s",NULL},
>     {"mediawiki",".w",NULL},
>     {"wikiversity",".v",NULL},
>     {"wikiquote",".q",NULL},
>     NULL
> },
>
> en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a
> time, and apparently there're some referrals.
>
>> Are edits included, or only views?
>
> That is views only - though you can find actual logic in above file,
> it is mostly this pattern:
>
> http://*.*.org/wiki/*
>
> which is what we have for special pages and views.
>
>> Are the hit counts actual, or 1/10th sampled, or something else?
>
> They are actual, with duplicates removed (that is, we don't count in
> cache-to-cache traffic, only end-user-to-cache).
>
>> pagecounts-20090501-200000.gz<http://dammit.lt/wikistats/pagecounts-20090501-200000.gz>
>> is the hour *beginning* 20:00:00?
>
> ending, I think. let me check, yes, end time. logic is in
> produceDump() at
> http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/collector.c?revision=30113&view=markup
>
> :)
>
> I think I may end up documenting this somewhat more, but I need to do
> some promised and long overdue development on this project.

If no one minds, I think I will copy this email to the toolserver wiki :)
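
For the wiki copy, here is a rough Python sketch of how I read the two conventions above: the project suffixes and the end-time file names. The suffix table is copied from the quoted projects[] excerpt. Everything else is my own assumption, not code from webstatscollector: the function names, treating the part before the dot as the subdomain, ignoring the extra check_wikimedia test on ".m", and assuming each file covers exactly the one hour ending at the stamp in its name. The thread does not say which time zone the stamps use, so the datetimes are left naive.

```python
import re
from datetime import datetime, timedelta

# Suffix -> project, copied from the projects[] table quoted in the thread.
# The check_wikimedia callback attached to ".m" is not modelled here.
SUFFIX_TO_PROJECT = {
    "": "wikipedia",
    ".d": "wiktionary",
    ".n": "wikinews",
    ".m": "wikimedia",
    ".b": "wikibooks",
    ".s": "wikisource",
    ".w": "mediawiki",
    ".v": "wikiversity",
    ".q": "wikiquote",
}

# Hypothetical pattern for names like pagecounts-20090501-200000.gz
FILENAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{6})\.gz$")


def split_project_code(code):
    """Split a code such as 'en.b' into (prefix, project name).

    A code with no suffix, such as 'en' or 'en2', maps to wikipedia.
    """
    prefix, dot, suffix = code.partition(".")
    project = SUFFIX_TO_PROJECT.get(dot + suffix)
    if project is None:
        raise ValueError("unknown project suffix in %r" % code)
    return prefix, project


def covered_interval(filename):
    """Return (start, end) for a dump file, reading the stamp as the END time.

    Assumes hourly files, so start is simply end minus one hour.
    """
    match = FILENAME_RE.search(filename)
    if match is None:
        raise ValueError("unrecognised file name %r" % filename)
    end = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    return end - timedelta(hours=1), end


if __name__ == "__main__":
    print(split_project_code("en.b"))
    print(covered_interval("pagecounts-20090501-200000.gz"))
```

With that reading, pagecounts-20090501-200000.gz would hold the views counted between 19:00:00 and 20:00:00 on 2009-05-01, and "en.b" would be English Wikibooks.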
