That's very cool! Have you stress-tested it at all? Like, what happens if
you search 10 wikipedias at once? (Because you know I want to search 10
wikis at once. <g>)

Trey Jones
Software Engineer, Discovery
Wikimedia Foundation

On Mon, Sep 21, 2015 at 11:15 AM, Erik Bernhardson <
ebernhard...@wikimedia.org> wrote:

> Just to follow up here: I've updated the `async` branch of my Elastica
> fork. It now completely passes the test suite, so it might be ready for
> further CirrusSearch testing.
>
> On Wed, Sep 9, 2015 at 12:23 PM, Erik Bernhardson <
> ebernhard...@wikimedia.org> wrote:
>
>> This would allow them to be run in parallel, yes. Because the wikis are in
>> separate databases, the extra last-minute checks for existence or security
>> (a safeguard in case deletes get missed) are skipped when interwiki links
>> are generated, but overall that is not a big deal and is an expected loss
>> as part of the interwiki search.
>>
>> On Wed, Sep 9, 2015 at 10:11 AM, Kevin Smith <ksm...@wikimedia.org>
>> wrote:
>>
>>> Would this help if we wanted to simultaneously search multiple wikis, or
>>> are those in separate databases so it would have no effect?
>>>
>>>
>>> Kevin Smith
>>> Agile Coach, Wikimedia Foundation
>>>
>>>
>>> On Wed, Sep 9, 2015 at 5:18 AM, David Causse <dcau...@wikimedia.org>
>>> wrote:
>>>
>>>> Thanks Erik!
>>>>
>>>> This is very promising and it opens a lot of new possibilities.
>>>> Estimating the gain is pretty hard, but I think we run many small requests
>>>> where network overhead is quite high compared to the actual work done by
>>>> Elasticsearch. This would definitely help.
>>>>
>>>> On 08/09/2015 at 21:01, Erik Bernhardson wrote:
>>>>
>>>> The PHP engine the WMF uses in production, HHVM, has built-in support for
>>>> cooperative (non-preemptive) concurrency via the async/await keywords[1][2].
>>>> Over the weekend I spent some time converting the Elastica client library we
>>>> use to work asynchronously, which would essentially let us continue
>>>> performing other calculations in the web request while network requests are
>>>> in flight. I've only ported the client library[3], not the CirrusSearch
>>>> code. This is also not a complete port: a couple of code paths work, but
>>>> most of the test suite still fails.
>>>>
>>>> The most obvious place we could see a benefit from this is when
>>>> multiple queries are issued to Elasticsearch from a single web request. If
>>>> the second query doesn't depend on the results of the first, it can be
>>>> issued in parallel. This is actually a somewhat common use case, for example
>>>> doing a full-text and a title search in the same request. I'm wary of
>>>> guessing at the actual latency reduction we could expect, but it is maybe
>>>> on the order of 50 to 100 ms in cases where we currently perform requests
>>>> serially and have enough work to process. Really, it's hard to say at this
>>>> point.
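
To make the serial-versus-parallel point above concrete, here is a minimal
sketch in Python's asyncio rather than Hack. It is not the Elastica port:
`fake_search`, the two query names, and the 0.1 s delay are all hypothetical
stand-ins, and the sleep only simulates network latency.

```python
import asyncio
import time


async def fake_search(kind: str, delay: float) -> str:
    # Hypothetical stand-in for one Elasticsearch round trip. The sleep
    # simulates network latency; no real request is made.
    await asyncio.sleep(delay)
    return f"{kind} results"


async def search_serially() -> list:
    # The second query only starts after the first one has finished.
    full_text = await fake_search("full text", 0.1)
    title = await fake_search("title", 0.1)
    return [full_text, title]


async def search_concurrently() -> list:
    # Neither query depends on the other, so both waits overlap.
    results = await asyncio.gather(
        fake_search("full text", 0.1),
        fake_search("title", 0.1),
    )
    return list(results)


def timed(coro_fn):
    # Run one coroutine function to completion and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_fn())
    return results, time.perf_counter() - start


serial_results, serial_time = timed(search_serially)
concurrent_results, concurrent_time = timed(search_concurrently)
print(f"serial: {serial_time:.3f}s, concurrent: {concurrent_time:.3f}s")
```

Both versions return the same results in the same order; only the waiting
overlaps. With real queries the saving depends on how long each round trip
takes, so the simulated delays say nothing about the actual gain.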
>>>>
>>>> In addition to making some existing code faster, the ability to do
>>>> multiple network operations asynchronously opens up other possibilities
>>>> for things we implement in the future. In closing, this currently isn't
>>>> going anywhere; it was just something interesting to toy with. I think it
>>>> could be quite interesting to investigate further.
>>>>
>>>> [1] http://docs.hhvm.com/manual/en/hack.async.php
>>>> [2] https://phabricator.wikimedia.org/T99755
>>>> [3] https://github.com/ebernhardson/Elastica/tree/async
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikimedia-search mailing list
>>>> Wikimedia-search@lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/wikimedia-search