On Sat, Nov 21, 2009 at 7:02 PM, Anthony <[email protected]> wrote:
> You could build an index on page_is_redirect in the "page" table (see
> enwiki-*-page.sql.gz).  But I'm pretty sure Postgresql wouldn't use
> it, and would do a sequential scan, since pretty much all the pages
> are going to have to be accessed anyway.
>
> Five or ten minutes sounds about right.  I can't imagine this is a
> query you want to run over and over again.  If it is, you'd probably
> want to use partitioning
> (http://www.postgresql.org/docs/8.4/interactive/ddl-partitioning.html),
> but you're still not going to cut down the query time very much, as
> it's going to be returning millions of rows.

Yeah, pretty much.  If you had an index on (page_is_redirect,
page_namespace, page_title) or such, that would speed it up
significantly (at least in MySQL).  It would have to scan through the
whole index, but that only contains three columns plus a row id of
some kind, so it should be quite a lot faster than scanning the whole
table.  But this isn't a likely query for optimization.

> Yes, it can be very hard to switch your DBMS, and that's a very good
> thing for MySQL.  :)

Let's not have a DBMS flame war here, please.

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to