On Fri, Sep 18, 2009 at 10:02 AM, Platonides <[email protected]> wrote:
> Erik Zachte wrote:
>> Sure, info gets lost. And the Long Tail is meaningful for some research no
>> doubt.
>> But my resources are finite.
>>
>> Actually I do store some all-inclusive counts in the compacted 24 hr file:
>>
>> # Lines starting with an at sign (@) show totals per 'namespace' (including
>>   omitted counts for low-traffic articles)
>> # Since valid namespace strings are not known in the compression script, any
>>   string followed by a colon (:) counts as a possible namespace string
>> # Please reconcile with real namespace name strings later
>> # 'namespaces' with count < 5 are combined in 'Other' (on larger wikis these
>>   are surely false positives)
>
> Making the script aware of namespace names would be quite easy.

For English this is obviously true, but Erik writes scripts intended
to be language-agnostic and to work with all WMF projects.  While it is
certainly possible to teach the script about namespaces in the general
sense, it would take quite a bit of effort to look up the local namespace
names and all legitimate variants for every project and language
in turn.
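
To make the language-agnostic heuristic from the quoted file header concrete,
here is a rough Python sketch. It is not Erik's actual script: the function
name namespace_totals, the (title, count) input format, and the "Main" label
for titles without a colon are all assumptions made for illustration.

```python
from collections import Counter


def namespace_totals(page_counts, min_count=5):
    """Sum view counts per apparent namespace.

    page_counts: iterable of (title, count) pairs (assumed input format).
    Any text before the first colon is treated as a possible namespace,
    because the real namespace names are not known at this stage.
    """
    totals = Counter()
    for title, count in page_counts:
        prefix, sep, _ = title.partition(":")
        # No colon, or nothing before it: file under a hypothetical "Main" label.
        name = prefix if sep and prefix else "Main"
        totals[name] += count

    # Fold low-count prefixes into 'Other'. On larger wikis these are
    # most likely false positives, e.g. article titles containing a colon.
    result = Counter()
    for name, total in totals.items():
        keep = total >= min_count or name == "Main"
        result[name if keep else "Other"] += total
    return dict(result)
```

The sketch needs no per-wiki configuration, which is the point of the
heuristic; the cost is that a title such as "Star_Trek:_First_Contact" is
counted under a bogus "Star_Trek" prefix until the totals are reconciled
with the real namespace names later.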

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l