On Wed, Feb 28, 2018 at 5:26 PM, Brad Jorsch (Anomie) <bjor...@wikimedia.org
> wrote:

> On Wed, Feb 28, 2018 at 8:47 AM, Jaime Crespo <jcre...@wikimedia.org>
> wrote:
>
> > Very recently I have been experimenting with recursive Common Table
> > Expressions [2], which are or will be available on the latest versions of
> > MySQL and MariaDB.
> >
>
> Do the other databases MediaWiki tries to support have that feature?
>

Actually, MySQL/MariaDB is the *last* database to conform to the sql:1999
WITH standard: https://modern-sql.com/feature/with#compatibility Even
sqlite suported a limited set of those!

The good news is that, probably because it arrived last, it got a pretty
feature-full implementation:
https://twitter.com/MarkusWinand/status/852862475699707904


>
> > With a single query on can obtain all titles directly or indirectly in a
> > category:
> >
> > WITH RECURSIVE cte (cl_from, cl_type) AS
> > (
> >     SELECT cl_from, cl_type FROM categorylinks WHERE cl_to =
> > 'Database_management_systems' -- starting category
> >     UNION
> >     SELECT categorylinks.cl_from, categorylinks.cl_type FROM cte JOIN
> page
> > ON
> > cl_from = page_id JOIN categorylinks ON page_title = cl_to WHERE
> > cte.cl_type
> > = 'subcat' -- subcat addition on each iteration
> > )
> > SELECT page_title FROM cte JOIN page ON cl_from = page_id WHERE
> > page_namespace = 0 ORDER BY page_title; -- printing only articles in the
> > end
> > , ordered by title
> >
>
> Does that work efficiently on huge categories, or does it wind up fetching
> millions of rows and filesorting?


Needs more testing-- that query worked for me well enough on my laptop to
expose it directly on a webrequest (<0.1 s), but I only imported the
categorylinks and page tables, so I was working with memory. Obviously, the
more complex the query, and the more results it returns, the less likely it
is to be able to be exposed to, e.g. a public API. But at least there are
configurable limits on recursivitiy and max execution time.

Honestly, given it is a new feature, I don't expect mediawiki --which at
the moment has to support 5.5- to embrace it any time soon. However, I
wanted to ask if it was interesting enough to setup some test hosts for
mediawiki to "play" with it --e.g. evaluate performance--, and maybe (?)
some upgraded mariadb/mysql servers for WMF labsdb (for long-running
analytics or gadgets that generates reports).

-- 
Jaime Crespo
<http://wikimedia.org>
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to