https://bugzilla.wikimedia.org/show_bug.cgi?id=67521
Nik Everett <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |Normal

--- Comment #2 from Nik Everett <[email protected]> ---
(In reply to Tisza Gergő from comment #0)
> Steps to reproduce:
>
> 1. visit http://hu.wikipedia.org
> 2. type "kurtvirag" in the search box
>
> Expected: [[hu:Kürtvirág]] is suggested
>
> Actual: no suggestions

> The ideal behavior would be to index both the exact and the stripped title,
> and give more weight to the first; so search suggestions with different
> diacritics would not crowd out better matches but would still appear if
> there is no perfect match.

Two solutions:

Better suggestions: Add an ASCII-normalized lookup for suggestions. It looks
like German already does this, so I'd just have to figure out how and use it
in more places.

Weighted search: Everywhere we search, look both with the diacritics and
without them; the match with diacritics gets more boost.

Hmmm - so we already perform some weighted search: exact matches are worth
more than normalized (non-conjugated, non-declined, etc.) matches. I'm worried
adding another layer would be nasty from a performance perspective. The
suggestions might be faster. I'm not really sure. I'll have to sleep on it.

(In reply to Mikko Silvonen from comment #1)
> Can the new search be configured per site?

It certainly can. If the language is in this list then it already is:
arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech,
danish, dutch, english, finnish, french, galician, german, greek, hindi,
hungarian, indonesian, irish, italian, norwegian, persian, portuguese,
romanian, russian, sorani, spanish, swedish, turkish, thai. Both Finnish and
Hungarian are in the list, so they are getting whatever the Lucene project
thinks are good defaults. I'm happy to customize it from there.

In the meantime, I'm setting this to "Normal" priority. It won't be at the top
of my list but it's certainly on it.
Feel free to poke the priority if lack of this makes search horrible for you.
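
For illustration, the idea discussed above (look titles up both exactly and with diacritics stripped, and rank exact matches higher) can be sketched in plain Python. This is only a rough sketch of the concept, not how CirrusSearch, Elasticsearch, or Lucene implements it. The names `fold` and `suggest` and the two boost values are hypothetical, chosen for the example, and stripping Unicode combining marks is just an approximation of real ASCII folding.

```python
import unicodedata

# Hypothetical weights: an exact-prefix match outranks a diacritic-folded one.
EXACT_BOOST = 2.0
FOLDED_BOOST = 1.0


def fold(text: str) -> str:
    """Case-fold the text and strip combining marks (approximate ASCII folding)."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def suggest(prefix: str, titles: list[str], limit: int = 5) -> list[str]:
    """Return titles matching the prefix, exact matches before folded ones."""
    exact_query = prefix.casefold()
    folded_query = fold(prefix)
    scored = []
    for title in titles:
        if title.casefold().startswith(exact_query):
            scored.append((EXACT_BOOST, title))
        elif fold(title).startswith(folded_query):
            scored.append((FOLDED_BOOST, title))
    # Highest boost first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


if __name__ == "__main__":
    titles = ["Kürtvirág", "Kurt Gödel", "Budapest"]
    print(suggest("kurtvirag", titles))
    print(suggest("kurt", titles))
```

With this ordering, a title that differs only in diacritics still shows up when nothing matches exactly, but it never crowds out a title that matches the typed text as-is.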
