https://bugzilla.wikimedia.org/show_bug.cgi?id=37291
Web browser: ---
Bug #: 37291
Summary: updateArticleCount.php script is broken
Product: MediaWiki
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: Maintenance scripts
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected]
Classification: Unclassified
Mobile Platform: ---
In short: The updateArticleCount.php script is not counting articles correctly.
The evidence:
See the table I'm still filling out at [[m:User:Dcljr/Article counts]], which
collects (way too many) statistics based on the official database dumps. (In
particular, see the columns highlighted in pink, which show how far off the
"on-wiki" article counts were from the actual dump-based article counts, both
before and after the script was run.)
The longer version:
Ever since the resolution of bug 33253, which led to several wikis "losing" or
"gaining" huge numbers of articles (according to their {{NUMBEROFARTICLES}}
count), I've strongly suspected that the updateArticleCount.php script is
not counting articles correctly. Now I have firm evidence.
I wrote a Perl script to download and parse relevant dumps from
<dumps.wikimedia.org>, thereby counting articles "from scratch" based on the
current "non-redirect with at least one wikilink" criterion (as well as some
more and less generous criteria that I'm trying out for comparison). The
results are being collected at the Meta page above.
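For anyone who wants to reproduce the check, here is a minimal sketch of the
counting rule described above. It is written in Python rather than Perl and
is not the actual script. The record layout (namespace, is_redirect, wikitext),
the function names, and the regex-based link test are all assumptions made for
illustration. MediaWiki itself may decide "has a wikilink" differently (for
example, from its link tables rather than from raw text), so treat this only
as an approximation of the criterion.

```python
import re

# Assumption: a page "has a wikilink" if its raw wikitext contains [[...]].
WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")


def is_countable(namespace, is_redirect, text, content_namespaces=(0,)):
    """Approximate the 'non-redirect with at least one wikilink' test."""
    if namespace not in content_namespaces:
        return False
    if is_redirect:
        return False
    return WIKILINK.search(text) is not None


def count_articles(pages):
    """Count pages passing the test; `pages` yields (ns, is_redirect, text)."""
    return sum(1 for ns, redirect, text in pages
               if is_countable(ns, redirect, text))


def discrepancy(on_wiki_count, dump_count):
    """Return (difference, percent) of the on-wiki count vs. the dump count.

    Negative values mean the on-wiki figure is an undercount.
    """
    diff = on_wiki_count - dump_count
    pct = 100.0 * diff / dump_count if dump_count else float("nan")
    return diff, pct
```

Feeding count_articles() the page records parsed from a dump gives the
"from scratch" figure, and discrepancy() then expresses how far the on-wiki
count is from it, which is the sort of number shown in the pink columns.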
I've started with the Wiktionaries whose article counts dropped the most (in
terms of percentage), so the table is currently showing huge undercounts. I
originally suspected that the wikis whose article counts gained the most would
show significant overcounts, but the handful of checks I've made of such wikis
(which haven't been added to the table yet) haven't shown this to be the case.
We Shall See...
Punchline: Someone needs to check the updateArticleCount.php script to see why
it's undercounting articles.