Hi,

I've been reading the mw.org and wikitech pages on CirrusSearch (and the code), hoping to understand how page content is transformed before being sent to ES and how it is stored there. I have a few questions:
1. Is the documentation available anywhere? I don't see it on https://doc.wikimedia.org/
2. What part of the whole ecosystem transforms the wikitext into indexable text, and where can I find it? It should be somewhere downstream from CirrusSearch\Updater::updateFromTitle(), but I can't figure out where exactly. If this transformation doesn't happen, where is the searchable text obtained from?
3. Where can I find the ES schema used for wiki pages? Is it different for images/categories?

Thanks,
Strainu
