[Bug 73605] No normalization for ancient greek accents in searches

bugzilla-daemon Wed, 19 Nov 2014 05:25:24 -0800

https://bugzilla.wikimedia.org/show_bug.cgi?id=73605


Nik Everett <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #2 from Nik Everett <[email protected]> ---
Cirrus uses Elasticsearch for the analysis, which in turn uses Apache Lucene.  I
imagine the right place to implement this is there.

It looks like
https://github.com/apache/lucene-solr/blob/trunk/lucene/analysis/common/src/java/org/apache/lucene/analysis/el/GreekLowerCaseFilter.java
implements the normalization.  I'd file a bug over there.  It doesn't _look_
like adding the extra normalization would be that hard.  I suppose you'd have
to decide with them whether it should be enabled by default (so you could
just add it to that file) or optional.  If optional, you'd just make a new
filter, I believe.
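
To illustrate the kind of folding being discussed, here is a minimal sketch in
Python.  It is not the Lucene filter and not how GreekLowerCaseFilter is
written; it only assumes that "normalization" here means lowercasing and
dropping polytonic diacritics (accents, breathings, iota subscript).  The
function name fold_greek is hypothetical.

```python
import unicodedata


def fold_greek(text: str) -> str:
    """Lowercase text and drop combining marks (hypothetical helper).

    Assumption: search normalization means ignoring accents, breathings
    and iota subscript, so polytonic and unaccented spellings compare equal.
    """
    # NFD splits precomposed letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters whose canonical combining class is 0 (base characters).
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    # Lowercase, then recompose whatever remains.
    return unicodedata.normalize("NFC", stripped.lower())


if __name__ == "__main__":
    # A polytonic query and its unaccented form fold to the same string.
    print(fold_greek("ἀρχή") == fold_greek("αρχη"))
```

A real Lucene filter would work on the token's character buffer rather than on
whole strings, and whether to fold by default or only in an optional filter is
the open question above.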

After it's released in Lucene and Elasticsearch, we could enable it by default
for Greek across the site, I think.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
