https://bugzilla.wikimedia.org/show_bug.cgi?id=37291

       Web browser: ---
             Bug #: 37291
           Summary: updateArticleCount.php script is broken
           Product: MediaWiki
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: Maintenance scripts
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]
    Classification: Unclassified
   Mobile Platform: ---


In short: The updateArticleCount.php script is not counting articles correctly.

The evidence:

See the table I'm still filling out at [[m:User:Dcljr/Article counts]], which
collects (way too many) statistics based on the official database dumps. (In
particular, see the columns highlighted in pink, which show how far off the
"on-wiki" article counts were from the actual dump-based article counts, both
before and after the script was run.)

The longer version:

Ever since the resolution of bug 33253, which led to several wikis "losing" or
"gaining" huge numbers of articles (according to their {{NUMBEROFARTICLES}}
count), I have strongly suspected that the updateArticleCount.php script is
not counting articles correctly. Now I have firm evidence.

I wrote a Perl script to download and parse the relevant dumps from
<dumps.wikimedia.org>, thereby counting articles "from scratch" based on the
current "non-redirect with at least one wikilink" criterion (as well as some
more generous and less generous criteria that I'm trying out for comparison).
The results are being collected at the Meta page above.
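
For anyone who wants to reproduce the idea, here is a minimal sketch of the
"non-redirect with at least one wikilink" test in Python. It is not the Perl
script mentioned above, and it is not how MediaWiki itself counts. The Page
record, the restriction to namespace 0, and the regular expression used to
spot a wikilink are all assumptions made for illustration; a real run would
fill in the page records from a parsed dump.

```python
import re
from dataclasses import dataclass
from typing import Iterable

# Simplified wikilink pattern: "[[", some text without brackets, "]]".
# Assumption: this ignores nested links, templates and other edge cases.
WIKILINK_RE = re.compile(r"\[\[[^\[\]]+\]\]")


@dataclass
class Page:
    """Hypothetical record for one page taken from a dump."""
    namespace: int     # assumption: 0 is the only content namespace
    is_redirect: bool
    text: str          # wikitext of the current revision


def is_countable(page: Page) -> bool:
    """Apply the 'non-redirect with at least one wikilink' criterion."""
    return (
        page.namespace == 0
        and not page.is_redirect
        and WIKILINK_RE.search(page.text) is not None
    )


def count_articles(pages: Iterable[Page]) -> int:
    """Count the pages that satisfy the criterion."""
    return sum(1 for page in pages if is_countable(page))
```

The figure returned by count_articles() for a given dump could then be set
beside the on-wiki count for the same wiki to see how far apart they are.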

I've started with the Wiktionaries whose article counts dropped the most (in
percentage terms), so the table is currently showing huge undercounts. I
originally suspected that the wikis whose article counts rose the most would
show significant overcounts, but the handful of checks I've made of such wikis
(which haven't been added to the table yet) haven't shown this to be the case.

We Shall See...

Punchline: Someone needs to check the updateArticleCount.php script to see why
it's undercounting articles.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
