https://bugzilla.wikimedia.org/show_bug.cgi?id=54875

       Web browser: ---
            Bug ID: 54875
           Summary: Automatic stopwords for the 200+ languages without
                    their own analyzer available
           Product: MediaWiki extensions
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: CirrusSearch
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected],
                    [email protected]
    Classification: Unclassified
   Mobile Platform: ---

Split from bug 54022: for languages other than the 30 currently supported,
rather than using the bare default analyzer we should probably use stopwords
calculated automatically while we wait for custom ones to be made.
It seems the cutoff_frequency setting and the common_terms query could be used
for this purpose.
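
For illustration only, here is a rough sketch of what a request body using the
common terms query with cutoff_frequency might look like, built as a plain
Python dict. The field name "text", the example search string, the 0.001 cutoff
and the helper name build_common_terms_query are all assumptions made up for
this sketch; none of them come from CirrusSearch. The idea is that terms found
in more than the cutoff fraction of documents are treated as high-frequency
(stopword-like) and only contribute to scoring, so no per-language stopword
list is needed.

```python
import json


def build_common_terms_query(field, text, cutoff_frequency=0.001):
    """Build a search body using Elasticsearch's `common` (common terms) query.

    `field` and `text` are hypothetical inputs chosen for this sketch.
    A cutoff_frequency between 0 and 1 is read as a fraction of documents;
    terms above it are treated as high-frequency, much like stopwords.
    """
    return {
        "query": {
            "common": {
                field: {
                    "query": text,
                    "cutoff_frequency": cutoff_frequency,
                    # Require all low-frequency (informative) terms to match.
                    "low_freq_operator": "and",
                }
            }
        }
    }


if __name__ == "__main__":
    # Assumed field name and search string, for demonstration only.
    body = build_common_terms_query("text", "the history of the city")
    print(json.dumps(body, indent=2))
```

The resulting dict can be serialized and sent to a _search endpoint. The right
cutoff would have to be tuned per wiki, and the common terms query is only
available in Elasticsearch versions that still ship it.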

I'd say that this is currently low priority, but it should probably be done
before expanding Elasticsearch beyond the ~30 supported languages.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
