--- Comment #5 from Nik Everett <neverett+bugzi...@wikimedia.org> ---
I suppose that depends on how good you need it to be. I spent half an hour
this morning and have an instance loading the data. It is using the wikipedia
river which is some toy thing the Elasticsearch folks maintain, ostensibly for
testing. It isn't what we want in the end for a great many reasons, not least
of which is that it munges the wikitext something fierce, but it is something.
It was easy to set up and gives us something to
I think what you are asking for is actually a few pieces:
* A tool to keep the index up to date - My guess is this'd take a day to get to
know labs, another day or two to get it working the first time, then about a
week of bug fixes spread out over the first couple of months.
* A tool to dispatch queries against it sanely - I'm less sure about this.
Anywhere from a couple of days to a month depending on surprises. I can't
really estimate bug fixes because my estimate for the tool itself is so rough.
I'll play with the wikipedia river instance and see what kind of queries I can
fire off against it manually.
Finally, if we forgo making the second tool then users could technically just
use it as an Elasticsearch instance with wikitext on it. I'm not sure how many
people that would be useful for and what kind of protection it'd need to have.
I imagine hiding it in the labs network and making folks sign in to labs with
port forwarding would be safe enough.
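
For illustration, a manual query against such an instance through a forwarded
port might look roughly like the sketch below. The address localhost:9200, the
index name "wikipedia", and the field name "text" are all assumptions here, not
details confirmed in this comment; the helper names are hypothetical too.

```python
import json
import urllib.request

# Assumptions (not confirmed by the comment above): the Elasticsearch
# instance is reachable on localhost:9200 through a forwarded port, the
# index is named "wikipedia", and the page body lives in a "text" field.
BASE_URL = "http://localhost:9200"
INDEX = "wikipedia"


def build_search_request(phrase, size=5):
    """Return (url, body_bytes) for a simple full-text match query."""
    body = {
        "size": size,
        "query": {"match": {"text": phrase}},
    }
    url = "{}/{}/_search".format(BASE_URL, INDEX)
    return url, json.dumps(body).encode("utf-8")


def search(phrase, size=5):
    """Send the query and return the decoded JSON response."""
    url, data = build_search_request(phrase, size)
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the tunnel open, search("infobox")["hits"]["hits"] would return the
matching documents. Building the request separately from sending it makes it
possible to inspect the query body without a live instance.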