Smalyshev updated the task description.
CHANGES TO TASK DESCRIPTION
After talking with @dcausse, we decided that setting up two custom analyzers (a stemmed and a non-stemmed one) for every language in descriptions is wasteful, since not all of them are useful for the Wikibase use case. We want to create stemmed analyzers only for the languages where they add value, and use only the plain (non-stemmed) analyzer for the others.
Here is the list of languages for which we have a "non-trivial" configuration for the stemming (`text`) analyzer:
```
ar
bg
ca
ckb
cs
da
de
el
en
en-ca
en-gb
es
eu
fa
fi
fr
ga
gl
hi
hu
hy
id
it
ja
ko
lt
lv
nb
nl
nn
pt
pt-br
ro
ru
simple
sv
th
tr
```
Here, "non-trivial" covers named analyzer types (e.g. 'bulgarian') as well as specialized filters or tokenizers.
Note that we are only concerned with whether the `text` analyzer adds value over the `plain` analyzer, since we are keeping the `plain` one anyway, and only in the context of common Wikibase/Wikidata usage of descriptions.
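A minimal sketch of the selection rule described above, assuming the language list is kept as a simple set. The function name `description_analyzers` and the `<lang>_plain` / `<lang>_text` analyzer names are hypothetical and only for illustration; they are not the actual CirrusSearch or Wikibase configuration keys.

```python
from typing import List

# Languages from the list above: those with a "non-trivial" `text` analyzer.
STEMMED_LANGUAGES = frozenset(
    """
    ar bg ca ckb cs da de el en en-ca en-gb es eu fa fi fr ga gl hi hu hy
    id it ja ko lt lv nb nl nn pt pt-br ro ru simple sv th tr
    """.split()
)


def description_analyzers(language_code: str) -> List[str]:
    """Return the analyzers to set up for a description in this language.

    The analyzer names are hypothetical placeholders, not real config keys.
    """
    # The plain (non-stemmed) analyzer is kept for every language.
    analyzers = [f"{language_code}_plain"]
    # The stemmed analyzer is added only where it has non-trivial configuration.
    if language_code in STEMMED_LANGUAGES:
        analyzers.append(f"{language_code}_text")
    return analyzers
```

For example, `description_analyzers("bg")` yields both a plain and a stemmed analyzer name, while a language outside the list, such as `"zu"`, gets only the plain one.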
To: Smalyshev
Cc: TJones, Aklapper, EBernhardson, Lydia_Pintscher, hoo, aude, Smalyshev, dcausse, Lahi, GoranSMilovanovic, QZanden, EBjune, Avner, debt, Gehel, Jdrewniak, FloNight, Wikidata-bugs, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
