https://bugzilla.wikimedia.org/show_bug.cgi?id=24354
Summary: Categorymembers counting behaviour when searching in a
namespace
Product: MediaWiki
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: Normal
Component: API
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected], [email protected],
[email protected], [email protected]
When looking for subcategories of a known category, it is tempting to use the
API with &cmnamespace=14 (14 being the namespace ID for Category:). For
categories that have both subcategories and articles as members, this triggers
odd behaviour:
http://en.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Physics
This lists the first 10 (the default limit) categorymembers of
[[Category:Physics]] on enwp, sorted by sortkey.
Among those are two categories (subcategories of Category:Physics):
[[Category:Fundamental physics concepts]] (sortkey: "*") and
[[Category:Physicists]] (sortkey: "*Physicists"; possibly a mistake on the user
side for the sortkey, by the way).
Now let's check the same list when adding &cmnamespace=14:
http://en.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Physics&cmnamespace=14
The expected output would be the first 10 (default) members of the category
that are in NS 14. However, the actual output is different: it lists the pages
that are in NS 14 *and* among the first 10 categorymembers shown above.
We therefore get only 2 subcategories instead of the expected 10. A
query-continue is returned, and its value is the sortkey of a page outside
NS 14.
Similar output is shown for &cmlimit=45 (45 being the number of subcats
according to the GUI): only 10 subcats (a coincidence) are actually returned,
albeit with a query-continue, instead of the 45 expected.
http://en.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Physics&cmnamespace=14&cmlimit=45
The same happens with &cmnamespace=0, or any other namespace:
&cmnamespace=0&cmlimit=10 returns 7 results.
http://en.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Physics&cmnamespace=0&cmlimit=10
So &cmlimit behaves oddly here: instead of counting the results actually
returned, it counts "potential" results (in all namespaces) and only then
filters down to the elements of the requested namespace(s).
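The difference between the two orderings can be illustrated with a small
offline simulation. This is only a sketch of the behaviour as observed from the
outside, not the actual API code; the member list and both function names are
made up for illustration.

```python
# Hypothetical category contents, already sorted by sortkey: (title, namespace).
# Two subcategories sort first, then articles, then the other subcategories.
cats = [(f"Category:Sub{i}", 14) for i in range(12)]
articles = [(f"Article {i}", 0) for i in range(13)]
MEMBERS = cats[:2] + articles[:8] + cats[2:] + articles[8:]


def limit_then_filter(members, namespace, limit):
    """Observed: take the first `limit` members, then keep the namespace."""
    return [m for m in members[:limit] if m[1] == namespace]


def filter_then_limit(members, namespace, limit):
    """Expected: keep only the namespace, then take the first `limit`."""
    return [m for m in members if m[1] == namespace][:limit]


if __name__ == "__main__":
    print(len(limit_then_filter(MEMBERS, 14, 10)))  # 2 subcategories
    print(len(filter_then_limit(MEMBERS, 14, 10)))  # 10 subcategories
```

With this made-up data the first function returns 2 subcategories and the
second returns 10, which mirrors the [[Category:Physics]] example above.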
Although this is merely annoying for small categories such as
[[Category:Physics]] (283 articles and 45 subcats according to the GUI), it can
become a major problem when looking for subcategories of a far bigger category.
It could also put some strain on the servers, should anything try to list the
subcategories of a category with many pages and only a few subcats: many
requests will be needed for a rather limited result.
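To give a feel for the number of requests involved, here is a rough sketch of
the loop a client has to run today: keep following the continuation value until
enough members have been collected. Everything here is hypothetical:
fetch_page stands in for one API request, and make_fake_server merely imitates
the behaviour described in this report so the example runs without network
access.

```python
def collect_members(fetch_page, wanted):
    """Call fetch_page(token) until `wanted` members have been gathered.

    fetch_page is a hypothetical callable returning (members, next_token);
    next_token is None once the category is exhausted.
    Returns (members, number_of_requests).
    """
    results, token, requests = [], None, 0
    while len(results) < wanted:
        members, token = fetch_page(token)
        requests += 1
        results.extend(members)
        if token is None:
            break
    return results[:wanted], requests


def make_fake_server(members, namespace, limit):
    """Imitate the reported behaviour: apply the limit, then the namespace."""
    def fetch_page(token):
        start = token or 0
        end = start + limit
        page = [m for m in members[start:end] if m[1] == namespace]
        return page, (end if end < len(members) else None)
    return fetch_page


if __name__ == "__main__":
    # 200 members sorted by sortkey, only three of which are subcategories.
    members = [
        (f"Category:Sub{i}", 14) if i in (50, 120, 199) else (f"Article {i}", 0)
        for i in range(200)
    ]
    fetch = make_fake_server(members, namespace=14, limit=10)
    subcats, requests = collect_members(fetch, wanted=3)
    print(len(subcats), "subcategories in", requests, "requests")
```

In this made-up case, 20 requests are needed to retrieve 3 subcategories,
whereas a limit applied after the namespace filter would need a single one.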
One should be able to list:
- the {first X categorymembers in a given namespace} of a given category
- rather than the {categorymembers in a given namespace} among the {first X
categorymembers of a category}
I don't know if I'm very clear here, my own head is starting to ache :D