https://bugzilla.wikimedia.org/show_bug.cgi?id=53474

       Web browser: ---
            Bug ID: 53474
           Summary: Allow more_like_this searches (return articles similar
                    to query text)
           Product: MediaWiki extensions
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: Unprioritized
         Component: CirrusSearch
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected],
                    [email protected]
    Classification: Unclassified
   Mobile Platform: ---

Elasticsearch provides a "More Like This" query, which - given some initial
text - extracts the key terms and uses those to build a new query, returning
the documents that best match those terms.
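
As a rough illustration of the "extract the key terms" step, here is a toy
sketch. It is not Elasticsearch's implementation: it ranks terms by raw
frequency only, does no stop-word filtering, and the function name key_terms
is invented for this example. Elasticsearch's own selection also takes into
account how rare a term is across the index.

```python
import re
from collections import Counter


def key_terms(text, min_term_freq=2, max_query_terms=5):
    """Toy term selection: keep the most frequent words in the input text."""
    # Lowercase and split into alphabetic tokens (a crude stand-in for an analyzer).
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # Drop terms that occur fewer than min_term_freq times.
    frequent = [(term, n) for term, n in counts.items() if n >= min_term_freq]
    # Most frequent first; break ties alphabetically so the result is stable.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    # Cap the number of terms that would go into the generated query.
    return [term for term, _ in frequent[:max_query_terms]]
```

The selected terms would then be combined into an ordinary search query
against the index.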

If this were available in MediaWiki's search API, it would allow the index to
be queried by example. That would be useful for finding the Wikipedia articles
most similar to a starting document (e.g. showing "Wikipedia articles related
to this page" alongside a news story), and also for automatically categorising
documents (using the categories that have been attached to the most similar
Wikipedia articles).
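
The categorisation idea could be sketched as a simple vote over the categories
of the most similar articles. This is only one possible approach; the function
name suggest_categories and the shape of the input (a list of dicts with a
"categories" key) are assumptions made for this example, not an existing API.

```python
from collections import Counter


def suggest_categories(similar_articles, top_n=3):
    """Return the categories shared by the most of the similar articles."""
    votes = Counter()
    for article in similar_articles:
        # Count each category at most once per article.
        votes.update(set(article.get("categories", [])))
    return [category for category, _ in votes.most_common(top_n)]
```

For instance, if three similar articles are returned and two of them are in
"Renewable energy", that category would be ranked ahead of one that appears
on only a single article.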

An example query: https://gist.github.com/hubgit/6365895

Most of the parameters in that query (fields to query, fields to return, number
of items to return, query text) could be passed through as API query
parameters, and the others (min_term_freq, max_query_terms,
percent_terms_to_match) could be hard-coded to values appropriate for the
index.
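
A minimal sketch of that split between caller-supplied and hard-coded
parameters follows. It assumes the more_like_this query syntax of Elasticsearch
releases current at the time of this report ("like_text" for the input text,
top-level "fields" and "size" for the returned fields and result count). The
function name and the three constant values are placeholders, not tuned
recommendations.

```python
# Placeholder tuning values; suitable numbers depend on the index.
MIN_TERM_FREQ = 1
MAX_QUERY_TERMS = 25
PERCENT_TERMS_TO_MATCH = 0.3


def build_mlt_request(text, query_fields, return_fields, size):
    """Build a search request body for a more_like_this query."""
    return {
        "query": {
            "more_like_this": {
                # Supplied by the API caller.
                "fields": list(query_fields),
                "like_text": text,
                # Hard-coded on the server side.
                "min_term_freq": MIN_TERM_FREQ,
                "max_query_terms": MAX_QUERY_TERMS,
                "percent_terms_to_match": PERCENT_TERMS_TO_MATCH,
            }
        },
        # Supplied by the API caller.
        "fields": list(return_fields),
        "size": size,
    }
```

The resulting dict can be serialised with json.dumps and sent to the search
endpoint of the index.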

It might be appropriate to use POST for the query, as the query text can be a
whole document.
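
To illustrate the POST suggestion, the sketch below puts the document text in
the request body instead of the URL. The endpoint URL is a placeholder and the
"srmorelike" parameter name is invented for this example; no such parameter is
implied to exist. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://example.org/w/api.php"  # placeholder endpoint


def build_post_request(document_text, limit=10):
    """Build a POST request carrying the full document text in its body."""
    params = {
        "action": "query",
        "list": "search",
        "srmorelike": document_text,  # hypothetical parameter name
        "srlimit": limit,
        "format": "json",
    }
    # Form-encode the parameters into the body so the URL stays short.
    body = urlencode(params).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(API_URL, data=body, headers=headers, method="POST")
```

Because the text travels in the body, a whole article can be submitted without
running into URL length limits that a GET request might hit.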

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
