https://bugzilla.wikimedia.org/show_bug.cgi?id=54875
Web browser: ---
Bug ID: 54875
Summary: Automatic stopwords for the 200+ languages without
their own analyzer available
Product: MediaWiki extensions
Version: master
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: CirrusSearch
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected],
[email protected]
Classification: Unclassified
Mobile Platform: ---
Split from bug 54022: for languages other than the 30 currently supported,
rather than use the bare default analyzer we should probably use automatically
calculated stopwords while we wait for custom ones to be made.
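
As a rough illustration of what "calculated in an automatic way" could mean,
here is a minimal sketch that treats any term appearing in more than a given
fraction of documents as a stopword. The function name automatic_stopwords,
the whitespace tokenization and the sample documents are assumptions made for
the example, not anything CirrusSearch does today.

```python
from collections import Counter


def automatic_stopwords(documents, cutoff_frequency=0.5):
    """Return terms found in more than cutoff_frequency of the documents.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and would not suit every language.
    """
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document (document frequency).
        doc_freq.update(set(text.lower().split()))
    threshold = cutoff_frequency * len(documents)
    return {term for term, count in doc_freq.items() if count > threshold}


# Made-up sample corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog ate the bone",
    "a bird sang on the roof",
    "the fish swam away",
]
stopwords = automatic_stopwords(docs, cutoff_frequency=0.7)
```

Lowering cutoff_frequency makes the list more aggressive; a real threshold
would have to be tuned per wiki.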
It seems the cutoff_frequency setting and the common_terms query may be used
for this purpose.
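
For reference, a sketch of the request body such a query might use, built as
a plain Python dict. The "common" query type and its cutoff_frequency and
low_freq_operator parameters follow the Elasticsearch query DSL of this era
(the query was deprecated in later releases); the field name "text", the
0.001 threshold and the helper name common_terms_query are assumptions chosen
for illustration.

```python
import json


def common_terms_query(field, text, cutoff_frequency=0.001):
    """Build a common terms query body (hypothetical helper).

    Terms more frequent than cutoff_frequency are treated like stopwords
    at query time: they only influence scoring of documents that already
    match the rarer terms.
    """
    return {
        "query": {
            "common": {
                field: {
                    "query": text,
                    "cutoff_frequency": cutoff_frequency,
                    # Require all low-frequency terms to match.
                    "low_freq_operator": "and",
                }
            }
        }
    }


# "text" is an assumed field name, not necessarily the CirrusSearch mapping.
body = common_terms_query("text", "the history of the city")
print(json.dumps(body, indent=2))
```

This needs no per-language stopword list, since the cutoff is evaluated
against term frequencies in the index itself.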
I'd say that this is currently low priority but should probably be done before
expanding Elasticsearch beyond the ~30 supported languages.