https://bugzilla.wikimedia.org/show_bug.cgi?id=43652

--- Comment #4 from MZMcBride <b...@mzmcbride.com> ---
(In reply to comment #3)
> I like implementing this in Labs because it could be a real performance
> drain on the production infrastructure if done there. OTOH, if we put the
> wikitext in Elasticsearch, we could have it run the regexes pretty easily.
> The only trouble would be making sure the regexes don't cause a performance
> problem, and I'm not sure that is possible.

Can you please ballpark how much work would be involved in setting up
Elasticsearch on Labs with the most recent English Wikipedia page text
(wikitext) dump, for use with sane regular expressions? The current dump is
about 19.1 GB compressed (cf. <http://dumps.wikimedia.org/enwiki/20140102/>).
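
For concreteness, a rough sketch of what the Labs setup might look like,
using the elasticsearch Python client against a local node. The index name
(enwiki_text), dump filename, and XML namespace version are assumptions, not
anything decided on this bug; treat this as an outline, not a plan.

    import bz2
    import xml.etree.ElementTree as ET

    from elasticsearch import Elasticsearch

    # Namespace depends on the dump's export schema version; adjust to match.
    NS = '{http://www.mediawiki.org/xml/export-0.8/}'
    es = Elasticsearch()  # assumes Elasticsearch running locally on Labs

    def pages(dump_path):
        """Stream (title, wikitext) pairs from a pages-articles .xml.bz2 dump."""
        with bz2.BZ2File(dump_path) as f:
            for _, elem in ET.iterparse(f):
                if elem.tag == NS + 'page':
                    title = elem.find(NS + 'title').text
                    rev = elem.find(NS + 'revision')
                    text = rev.find(NS + 'text').text or ''
                    yield title, text
                    elem.clear()  # free memory while streaming the 19 GB dump

    for i, (title, text) in enumerate(
            pages('enwiki-20140102-pages-articles.xml.bz2')):
        es.index(index='enwiki_text', doc_type='page', id=i,
                 body={'title': title, 'text': text})

    # Once indexed, a regexp query runs against the text field. This is
    # where the performance concern from comment #3 comes in: the regexp
    # query matches individual terms from the analyzed text, not the raw
    # wikitext, and unanchored patterns over many terms get expensive.
    result = es.search(index='enwiki_text', body={
        'query': {'regexp': {'text': 'colou?r'}}
    })
    print(result['hits']['total'])

For the full ~4M articles you'd presumably want the bulk indexing helper
rather than one request per page, and probably a not_analyzed copy of the
text field if the goal is regexes over raw wikitext rather than over terms.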
