On Sat, Nov 21, 2009 at 7:02 PM, Anthony <[email protected]> wrote: > You could build an index on page_is_redirect in the "page" table (see > enwiki-*-page.sql.gz). But I'm pretty sure Postgresql wouldn't use > it, and would do a sequential scan, since pretty much all the pages > are going to have to be accessed anyway. > > Five or ten minutes sounds about right. I can't imagine this is a > query you want to run over and over again. If it is, you'd probably > want to use partitioning > (http://www.postgresql.org/docs/8.4/interactive/ddl-partitioning.html), > but you're still not going to cut down the query time very much, as > it's going to be returning millions of rows.
Yeah, pretty much. If you had an index on (page_is_redirect, page_namespace, page_title) or such, that would speed it up significantly (at least in MySQL). It would have to scan through the whole index, but that only contains three columns plus a row id of some kind, so it should be quite a lot faster than scanning the whole table. But this isn't a likely query for optimization. > Yes, it can be very hard to switch your DBMS, and that's a very good > thing for MySQL. :) Let's not have a DBMS flame war here, please. _______________________________________________ Wikitech-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikitech-l
