https://bugzilla.wikimedia.org/show_bug.cgi?id=53474
Web browser: ---
Bug ID: 53474
Summary: Allow more_like_this searches (return articles similar
to query text)
Product: MediaWiki extensions
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: enhancement
Priority: Unprioritized
Component: CirrusSearch
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected],
[email protected]
Classification: Unclassified
Mobile Platform: ---
Elasticsearch provides a "More Like This" query, which - given some initial
text - extracts the key terms and uses those to build a new query, returning
the documents that best match those terms.
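To make the "extracts the key terms" step concrete, here is a rough sketch in Python. It is a simplification and not how Elasticsearch implements it: it ranks terms by their frequency in the input text only, whereas the real query also weighs terms against the index. The function name key_terms and the default values are made up for illustration.

```python
import re
from collections import Counter


def key_terms(text, min_term_freq=2, max_query_terms=25):
    """Toy stand-in for term selection: keep the most frequent terms in `text`."""
    # Lowercase and split into simple alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(words)
    # Drop terms that occur fewer than min_term_freq times.
    frequent = [(term, n) for term, n in counts.items() if n >= min_term_freq]
    # Most frequent first; ties broken alphabetically for a stable order.
    frequent.sort(key=lambda item: (-item[1], item[0]))
    # Cap the number of terms that would go into the generated query.
    return [term for term, _ in frequent[:max_query_terms]]


print(key_terms("search index search query index search"))
# ['search', 'index']
```

The selected terms would then be combined into an ordinary search query against the chosen fields.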
If this were available in MediaWiki's search API, it would allow the index to be
queried by example. This can be useful for finding the Wikipedia articles that
are most similar to a starting document (e.g. showing "Wikipedia articles
related to this page" alongside a news story), and also for automatically
categorising documents (using the categories that have been attached to the
most similar Wikipedia articles).
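The categorisation idea could be sketched roughly as follows, in Python. This assumes the search has already returned the most similar articles as a list of dicts with a "categories" key; that result shape, and the function name suggest_categories, are assumptions made for the example rather than anything the API currently returns.

```python
from collections import Counter


def suggest_categories(similar_articles, top_n=3):
    """Suggest categories by counting how many similar articles carry each one."""
    counts = Counter()
    for article in similar_articles:
        # Use a set so a category listed twice on one article counts once.
        counts.update(set(article.get("categories", [])))
    # Most common first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, _ in ranked[:top_n]]


hits = [
    {"title": "A", "categories": ["Search engines", "Free software"]},
    {"title": "B", "categories": ["Search engines"]},
    {"title": "C", "categories": ["Free software", "Search engines"]},
]
print(suggest_categories(hits, top_n=2))
# ['Search engines', 'Free software']
```

A simple count treats every similar article equally; weighting each vote by the article's match score would be an obvious refinement.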
An example query: https://gist.github.com/hubgit/6365895
Most of those parameters (fields to query, fields to return, number of items to
return, query text) can be passed through as query parameters, and the others
(min_term_freq, max_query_terms, percent_terms_to_match) can be hard-coded to
values appropriate for the index.
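As a sketch of that split between pass-through and hard-coded parameters, the request body sent to Elasticsearch might be assembled like this (Python). The layout is modelled on the more_like_this query syntax of current Elasticsearch releases ("like_text" for the query text) and may not match the gist exactly; the constant values are placeholders, not values tuned for any real index, and build_mlt_body is a name invented for the example.

```python
import json

# Hard-coded tuning values; placeholders only, to be chosen per index.
MIN_TERM_FREQ = 1
MAX_QUERY_TERMS = 25
PERCENT_TERMS_TO_MATCH = 0.3


def build_mlt_body(query_text, query_fields, return_fields, size):
    """Build a more_like_this search body from the caller-supplied parameters."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(query_fields),  # fields to query
                "like_text": query_text,       # query text (may be a whole document)
                "min_term_freq": MIN_TERM_FREQ,
                "max_query_terms": MAX_QUERY_TERMS,
                "percent_terms_to_match": PERCENT_TERMS_TO_MATCH,
            }
        },
        "fields": list(return_fields),  # fields to return
        "size": size,                   # number of items to return
    }


body = build_mlt_body("Text of a news story", ["text"], ["title"], 10)
print(json.dumps(body, indent=2))
```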
It might be appropriate to use POST for the query, as the query text can be a
whole document.
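For illustration, a client could send the text in a form-encoded POST body instead of the URL, which avoids the length limits that servers and proxies commonly place on URLs. The sketch below (Python) only builds the request and does not send it. The endpoint URL and the parameter names "text" and "limit" are hypothetical; no such API exists yet.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
API_URL = "https://example.org/w/api.php"


def build_search_request(query_text, limit=10):
    """Build (but do not send) a POST request carrying the query text in its body."""
    # "text" and "limit" are made-up parameter names.
    params = {"text": query_text, "limit": str(limit)}
    body = urlencode(params).encode("ascii")
    return Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_search_request("Full text of a long document ...", limit=5)
print(request.get_method(), len(request.data), "bytes in body")
print(parse_qs(request.data.decode("ascii"))["limit"])
```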
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l