https://bugzilla.wikimedia.org/show_bug.cgi?id=21195
Summary: Include page count in database dumps
Product: Wikimedia
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: Normal
Component: General/Unknown
AssignedTo: [email protected]
ReportedBy: [email protected]
The dumps "pages-meta-current" and "pages-articles", as well as the
hypothetical article-namespace-only dump that I would like to see (bug 18919),
should include the total number of pages in the dump at the start of the file
in the "siteinfo" section.
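To illustrate how a consumer might use such a field, here is a minimal sketch in Python. It assumes the count is exposed as a "pagecount" element inside "siteinfo"; that element name is hypothetical (the dumps have no such field today), and the sample XML is invented for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def read_page_count(stream):
    """Return the page count declared in <siteinfo>, or None if absent.

    'pagecount' is a hypothetical element name used only for this sketch.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "pagecount":
            return int(elem.text)
        if name == "siteinfo":
            # Reached the end of siteinfo without finding a count.
            return None
    return None


# Invented sample; a real dump would be streamed from disk instead.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <siteinfo>
    <sitename>Wikipedia</sitename>
    <pagecount>3</pagecount>
  </siteinfo>
  <page><title>A</title></page>
  <page><title>B</title></page>
  <page><title>C</title></page>
</mediawiki>"""

count = read_page_count(io.BytesIO(SAMPLE))
```

Because the parser stops as soon as "siteinfo" closes, the total is known after reading only the first few hundred bytes of the file, and a dump without the field simply yields None.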
Among other things, it would be useful for displaying dump search progress to
the user. Attempts to estimate the total number based on a small proportion of
the file seem to produce wildly inaccurate results, especially with the
en.wikipedia dump (pages are approximately ordered by creation time, and it
seems the older a page is, the larger it tends to be, which makes sense). Even
if such estimates were more accurate, it would be helpful to have the exact
number to hand. And obviously the extra few bytes in a 25GB file are negligible :)
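The estimation problem can be sketched with synthetic numbers. The page sizes below are invented purely to mimic "older pages first, and larger"; they are not measurements from any real dump, and the function name is an assumption for this example.

```python
def estimate_total_pages(pages_seen, bytes_read, total_bytes):
    """Extrapolate the page count, assuming page size is uniform across the file."""
    return round(pages_seen * total_bytes / bytes_read)


# Synthetic sizes in bytes: 100 large (older) pages, then 900 small ones.
page_sizes = [10_000] * 100 + [1_000] * 900
total_bytes = sum(page_sizes)

# Read whole pages until the next one would pass 5% of the file.
bytes_read = 0
pages_seen = 0
for size in page_sizes:
    if bytes_read + size > total_bytes * 0.05:
        break
    bytes_read += size
    pages_seen += 1

estimate = estimate_total_pages(pages_seen, bytes_read, total_bytes)
actual = len(page_sizes)
```

With these made-up sizes the sample covers 9 pages and the extrapolation gives 190 pages against an actual 1000, since the start of the file is not representative of the rest. An exact count in the header would avoid the problem entirely.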
An analogous thing could probably be done for some of the other dumps.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l