Cross-posting from wikitech-l. Please reply there.

---------- Forwarded message ----------
From: Dan Garry <dga...@wikimedia.org>
Date: 1 September 2015 at 20:43
Subject: Discovery Department A/B testing an alternative to prefix search
next week
To: Wikimedia developers <wikitec...@lists.wikimedia.org>


Hi everyone,

*tl;dr: Discovery Department to run A/B test
<https://phabricator.wikimedia.org/T111078> comparing new search suggester
to prefix search, to see if it can reduce zero results rate.*

As I'm sure you're all aware, the search box at the top right of every page
on desktop uses prefix search to generate its results. The main reason for
this is that prefix search is incredibly fast; that search box sees a lot
of traffic, and it's important to keep it scalable.

However, we know that there are numerous problems with prefix search.
Prefix searches are prone to give you no results; if you make even a slight
typo, then you won't get the result you want. And thus a complex system of
manually curated redirects was born to try to alleviate this navigation
issue. Wouldn't it be nice if we could work towards a solution that doesn't
require the manual curation of redirects, thus freeing up Wikimedians to do
other more meaningful tasks? And make search a bit better in the process,
too? That's a long term goal of mine... emphasis on the long. ;-)
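
To make the typo problem concrete, here is a toy sketch of prefix matching.
It is only an illustration: the title list and the `prefix_search` function
are hypothetical and are not how the production prefix search is
implemented.

```python
# Toy illustration only: the titles and function name are hypothetical,
# not the actual prefix search implementation.
TITLES = ["Albert Einstein", "Albania", "Alberta", "Algebra"]


def prefix_search(query, titles=TITLES):
    """Return the titles that start with the query, ignoring case."""
    q = query.casefold()
    return [t for t in titles if t.casefold().startswith(q)]


# A correctly typed prefix finds the matching titles.
print(prefix_search("alber"))

# A single transposition ("ablert") matches no title at all,
# which is the zero results case described above.
print(prefix_search("ablert"))
```

A redirect is effectively a manual entry that catches one such misspelling,
which is why so many of them are needed.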

The Q1 2015-16 (Jul - Sep 2015) goal of the Search Team in the Discovery
Department is to reduce the zero results rate
<https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q1_Goals#Search>.
Amongst other things, we've been working to build an alternative to prefix
search <https://phabricator.wikimedia.org/T105746>. Documentation on the
API is pretty light right now because we're scrambling to get it up and
running (but there's a task for that!
<https://phabricator.wikimedia.org/T111139>).

An initial version of the suggestion API is now in production on enwiki and
dewiki [1], but is currently not being used for anything. Our initial tests
<https://phabricator.wikimedia.org/T109729> of the API show that it's
incredibly promising for reducing the zero results rate. But we need more
data!

We're planning on running an A/B test on whether this API is better at
reducing zero results. We're aiming to start on Tuesday 8th September and
run the test for two weeks. This is documented in T111078
<https://phabricator.wikimedia.org/T111078>.

A very important note here is that we currently have no way of
quantitatively measuring result relevance (although we're working on it
<https://phabricator.wikimedia.org/T109482>), so this test will be highly
limited in scope, only measuring the zero results rate. Given these limits,
even massive success in this test would not be enough to deploy this API as
a full replacement for prefix search; we'd need additional data. But that's
not stopping us from gathering initial data from this test.
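
For anyone unfamiliar with the metric, the zero results rate is the share
of searches that return nothing. A minimal sketch of comparing it across
the two test buckets might look like the following. The bucket names, the
event format, and the sample numbers are all made up for illustration; this
is not the actual analysis code or real data.

```python
from collections import defaultdict


def zero_results_rate(events):
    """Compute the fraction of zero-result searches per bucket.

    `events` is an iterable of (bucket, result_count) pairs. This event
    format is an assumption for the sketch, not the real logging schema.
    """
    totals = defaultdict(int)
    zeros = defaultdict(int)
    for bucket, result_count in events:
        totals[bucket] += 1
        if result_count == 0:
            zeros[bucket] += 1
    return {bucket: zeros[bucket] / totals[bucket] for bucket in totals}


# Made-up sample events, purely to show the shape of the calculation.
sample_events = [
    ("prefix", 0), ("prefix", 3), ("prefix", 0), ("prefix", 5),
    ("suggester", 2), ("suggester", 0), ("suggester", 4), ("suggester", 1),
]

rates = zero_results_rate(sample_events)
print(rates)
```

Note that a lower rate only says that more searches returned something; it
says nothing about whether those results were relevant.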

As always, if you have any questions, let me know.

Thanks,
Dan

[1]: The API is actually live on all wikis, but we only built the search
indices for enwiki and dewiki since they're our biggest content wikis and
this is an early test. Attempting to use the API on any other wiki will get
you a cirrus backend error.

-- 
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation



_______________________________________________
Wikimedia-search mailing list
Wikimedia-search@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimedia-search
