https://bugzilla.wikimedia.org/show_bug.cgi?id=67521

Nik Everett <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |Normal

--- Comment #2 from Nik Everett <[email protected]> ---
(In reply to Tisza Gergő from comment #0)
> Steps to reproduce:
> 
> 1. visit http://hu.wikipedia.org
> 2. type "kurtvirag" in the search box
> 
> Expected: [[hu:Kürtvirág]] is suggested
> 
> Actual: no suggestions

> The ideal behavior would be to index both the exact and the stripped title,
> and give more weight to the first; so search suggestions with different
> diacritics would not crowd out better matches but would still appear if
> there is no perfect match.

Two solutions:
Better suggestions: Add an ASCII-normalized lookup for suggestions.  It looks
like German already does this so I'd just have to figure out how and use it in
more places.

Weighted search: Everywhere we search, look both with the diacritics and
without them; the match with diacritics gets more boost.

Hmmm - so we already perform some weighted search: exact matches are worth more
than normalized (non-conjugated, non-declined, etc) matches.  I'm worried
adding another layer would be nasty from a performance perspective.  The
suggestions might be faster.  I'm not really sure.  I'll have to sleep on it.


(In reply to Mikko Silvonen from comment #1)
> Can the new search be configured per site?

It certainly can.  If the language is in this list then it already is:
arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech,
danish, dutch, english, finnish, french, galician, german, greek, hindi,
hungarian, indonesian, irish, italian, norwegian, persian, portuguese,
romanian, russian, sorani, spanish, swedish, turkish, thai.


Both Finnish and Hungarian are in the list so they are getting whatever the
Lucene project thinks are good defaults.  I'm happy to customize it from there.
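
As a sketch of what such a customization could look like, assuming the
settings are expressed Elasticsearch-style: the asciifolding token filter with
preserve_original keeps both the original and the folded token.  The function
name and the "suggest_folding" / "suggest_folded" names are hypothetical, and
this is not the configuration the sites currently use.

```python
def folding_analysis_settings(preserve_original=True):
    """Build Elasticsearch-style analysis settings with ASCII folding.

    The filter and analyzer names are illustrative only.
    """
    return {
        "analysis": {
            "filter": {
                "suggest_folding": {
                    "type": "asciifolding",
                    # Emit the original token as well as the folded one.
                    "preserve_original": preserve_original,
                }
            },
            "analyzer": {
                "suggest_folded": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "suggest_folding"],
                }
            },
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(folding_analysis_settings(), indent=2))
```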



In the meantime, I'm setting this to "Normal" priority.  It won't be at the top
of my list but it's certainly on it.  Feel free to poke the priority if lack of
this makes search horrible for you.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
