https://bugzilla.wikimedia.org/show_bug.cgi?id=1211
--- Comment #41 from Aryeh Gregor <[email protected]> 2008-12-20 23:13:38 UTC --- (In reply to comment #40) > I suggest that we add a cl_namespace column to the table and also add that > column to the cl_sortkey index. This will allow efficient independent > selection (and thus paging) of a category's members based on the namespace, > which would not only resolve this bug but would help with #450 as well. It's > really not feasible to resolve #450 unless the different namespaces can be > paged independently. > > I'll be happy to provide a software patch for both of these bugs (I'm almost > done with the one for #450) but want to be sure that the proposed schema > change > is acceptable, first. First of all, this implementation of a fix would depend on bug 16012. All schema changes to large tables are suspended until there's a reasonable way of executing them, apparently, and categorylinks is certainly a large table. Second of all, you're (further) denormalizing the database here. This might be justifiable -- certainly it seems like the only reasonable way to fix bug 450 -- but it's a cost. You'd have to make sure that anything that moves pages also updates cl_namespace. This is not currently a cost we have anywhere else, AFAIK: either things are associated with the page_id and don't have to be updated, or are associated with the namespace/title combination rather than the page itself (e.g., pl_namespace and pl_title) and therefore still don't have to be updated. We can expect that bugs will occur here as cl_namespace sometimes isn't updated on moves, for whatever reason. Thirdly, this would probably *require* bug 450 to be fixed to be reasonably efficient. In the current setup, we'd need something like 'SELECT ... WHERE cl_from=xxx AND cl_namespace NOT IN (' . NS_FILE . ',' . NS_CATEGORY . ') ORDER BY cl_sortkey', which MySQL (at least older versions like 4.0) is too stupid to execute efficiently: it can't use the index for ordering if there's a range condition like IN in the WHERE clause. (I think it would be smart enough in 5.0, or at least 5.1.) And as for the proposal in bug 450, I see two issues. First of all, you'd surely need one query per namespace on each category view, which could be a couple dozen on a lot of wikis. That seems kind of excessive, if it's avoidable. Second of all, the current structure of the category table reflects the pages/subcats/files breakdown. If counts are given separately for all namespaces, you'd need to store all the counts in the category table. This would either require an ALTER TABLE for every namespace added or removed (not happening), or else breaking off a new categorycount table like (cc_cat, cc_namespace, cc_count) to store the counts and deleting the cat_pages, cat_subcats, cat_files from the category table. -- Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. You are the assignee for the bug. _______________________________________________ Wikibugs-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
