Smalyshev created this task. Smalyshev added projects: Discovery, Wikidata, CirrusSearch. Herald added a subscriber: Aklapper. Herald added a project: Discovery-Search.
TASK DESCRIPTION
Right now the tuning parameters we use for Wikidata search (both prefix and fulltext) are more or less invented out of thin air. I wonder whether we could use ML (or some other technique) with actual user click data to tune those parameters better.
Potential targets:
- Entity weight parameters (both satu params and weights of features on entities). We currently use only incoming link and sitelink counts; maybe we should use more features?
- Relative weights of various matches - label, alias, description, other language, etc.?
- For fulltext possibly also more advanced features that we're building with Mjolnir?
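As a rough illustration of the first target, here is a minimal sketch of an entity weight built from saturated feature counts. It assumes that "satu" means a saturation function of the form x^a / (k^a + x^a). The parameter names (weight, k, a) and all values below are hypothetical, chosen only to show which knobs would be tuned. They are not the production settings.

```python
def satu(x, k, a):
    """Saturation function x^a / (k^a + x^a): 0 at x=0, 0.5 at x=k, approaches 1."""
    if x <= 0:
        return 0.0
    return x ** a / (k ** a + x ** a)


def entity_score(features, params):
    """Weighted sum of saturated feature values.

    features: dict of feature name -> raw count (e.g. incoming links).
    params:   dict of feature name -> {"weight", "k", "a"} (hypothetical layout).
    """
    score = 0.0
    for name, p in params.items():
        score += p["weight"] * satu(features.get(name, 0), p["k"], p["a"])
    return score


# Hypothetical parameters: these are the values a learning step would tune.
example_params = {
    "incoming_links": {"weight": 0.6, "k": 100, "a": 2},
    "sitelinks": {"weight": 0.4, "k": 20, "a": 2},
}

example_score = entity_score(
    {"incoming_links": 100, "sitelinks": 20}, example_params
)
```

Adding a new feature in this sketch means adding one more entry to the params dict, so the tuning problem becomes a search over the weight, k and a values of each feature.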
A first step would be to build a data pipeline that tells us which search result the user chose, especially for prefix search, which is used ~1M times a day.
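To make the pipeline idea a bit more concrete, here is a minimal sketch of what could be computed once such click data exists. The event layout (a dict with "query", "results" and "clicked") is an assumption made for illustration and is not an existing logging schema. The metrics shown, click position counts and mean reciprocal rank, are just one possible way to compare parameter settings.

```python
from collections import Counter


def click_positions(events):
    """Count how often the chosen result sat at each 1-based position.

    Events with no click (or a click outside the shown results) are
    counted under the key None.
    """
    positions = Counter()
    for event in events:
        clicked = event.get("clicked")
        if clicked is None or clicked not in event["results"]:
            positions[None] += 1
        else:
            positions[event["results"].index(clicked) + 1] += 1
    return positions


def mean_reciprocal_rank(events):
    """Average of 1/position of the chosen result; abandoned searches count as 0."""
    if not events:
        return 0.0
    total = 0.0
    for event in events:
        clicked = event.get("clicked")
        if clicked is not None and clicked in event["results"]:
            total += 1.0 / (event["results"].index(clicked) + 1)
    return total / len(events)


# Hypothetical sample events; the entity IDs are placeholders.
sample_events = [
    {"query": "berl", "results": ["Q64", "Q821244"], "clicked": "Q64"},
    {"query": "pari", "results": ["Q167646", "Q90"], "clicked": "Q90"},
    {"query": "xyz", "results": ["Q1", "Q2"], "clicked": None},
]

sample_positions = click_positions(sample_events)
sample_mrr = mean_reciprocal_rank(sample_events)
```

A higher mean reciprocal rank under one set of parameters than another would suggest that users find their target nearer the top, although position bias in clicks would need to be considered before trusting such a comparison.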
As this is an exploratory task, suggestions about what else could be done here are welcome.
To: Smalyshev
Cc: Aklapper, Smalyshev, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Avner, Gehel, FloNight, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb
