That's very cool! Have you stress-tested it at all? Like, what happens if you search 10 wikipedias at once? (Because you know I want to search 10 wikis at once. <g>)
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation

On Mon, Sep 21, 2015 at 11:15 AM, Erik Bernhardson <[email protected]> wrote:

> Just to follow up here, I've updated the `async` branch of my Elastica
> fork. It now completely passes the test suite, so it might be ready for
> further CirrusSearch testing.
>
> On Wed, Sep 9, 2015 at 12:23 PM, Erik Bernhardson <[email protected]> wrote:
>
>> This would allow them to be run in parallel, yes. Being in separate
>> databases means extra last-minute checks for existence or security (CYA for
>> if deletes get missed) are skipped as interwiki links are generated, but
>> overall not a big deal and an expected loss as part of the interwiki search.
>>
>> On Wed, Sep 9, 2015 at 10:11 AM, Kevin Smith <[email protected]> wrote:
>>
>>> Would this help if we wanted to simultaneously search multiple wikis, or
>>> are those in separate databases so it would have no effect?
>>>
>>> Kevin Smith
>>> Agile Coach, Wikimedia Foundation
>>>
>>> On Wed, Sep 9, 2015 at 5:18 AM, David Causse <[email protected]> wrote:
>>>
>>>> Thanks Erik!
>>>>
>>>> This is very promising and it opens a lot of new possibilities.
>>>> Guessing the gain is pretty hard, but I think we run many small requests
>>>> where network overhead is quite high compared to the actual work done by
>>>> elastic. This would definitely help.
>>>>
>>>> On 08/09/2015 21:01, Erik Bernhardson wrote:
>>>>
>>>> The PHP engine used in prod by the WMF, HHVM, has built-in support for
>>>> shared (non-preemptive) concurrency via async/await keywords[1][2]. Over
>>>> the weekend I spent some time converting the Elastica client library we use
>>>> to work asynchronously, which would essentially let us continue on
>>>> performing other calculations in the web request while network requests are
>>>> processing. I've only ported over the client library[3], not the
>>>> CirrusSearch code. Also, this is not a complete port: a couple of code
>>>> paths work, but most of the test suite still fails.
>>>>
>>>> The most obvious place we could see a benefit from this is when
>>>> multiple queries are issued to Elasticsearch from a single web request. If
>>>> the second query doesn't depend on the results of the first, it can be
>>>> issued in parallel. This is actually a somewhat common use case, for example
>>>> doing a full text and a title search in the same request. I'm wary of
>>>> making much of a guess in terms of actual latency reduction we could
>>>> expect, but maybe on the order of 50 to 100 ms in cases where we currently
>>>> perform requests serially and we have enough work to process. Really, it's
>>>> hard to say at this point.
>>>>
>>>> In addition to making some existing code faster, having the ability to
>>>> do multiple network operations in an async manner opens up other
>>>> possibilities when we are implementing things in the future. In closing,
>>>> this currently isn't going anywhere; it was just something interesting to
>>>> toy with. I think it could be quite interesting to investigate further.
>>>>
>>>> [1] http://docs.hhvm.com/manual/en/hack.async.php
>>>> [2] https://phabricator.wikimedia.org/T99755
>>>> [3] https://github.com/ebernhardson/Elastica/tree/async
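
A rough illustration of the point in the quoted thread about issuing independent queries in parallel. This is a sketch in Python's asyncio, not HHVM's async/await and not the Elastica client; `fake_search`, `serial`, `concurrent` and `timed` are made-up names, and the latencies are simulated with sleeps, so it only shows the shape of the saving rather than real numbers.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple


async def fake_search(kind: str, latency: float) -> str:
    """Hypothetical stand-in for one network round trip to a search backend."""
    await asyncio.sleep(latency)  # simulated network wait, not a real request
    return f"{kind} results"


async def serial() -> List[str]:
    # The title query waits for the full text query even though it does not
    # depend on its results, so the two waits add up.
    full_text = await fake_search("full text", 0.05)
    title = await fake_search("title", 0.05)
    return [full_text, title]


async def concurrent() -> List[str]:
    # Both queries are in flight at the same time; the total wait is roughly
    # that of the slower query. gather() returns results in argument order.
    return list(await asyncio.gather(
        fake_search("full text", 0.05),
        fake_search("title", 0.05),
    ))


def timed(coro_fn: Callable[[], Awaitable[List[str]]]) -> Tuple[List[str], float]:
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (serial, concurrent):
        results, elapsed = timed(fn)
        print(f"{fn.__name__}: {results} in {elapsed * 1000:.0f} ms")
```

The same pattern would extend to a fan-out across several wikis by passing more coroutines to `gather`, though how a real cluster behaves under that load is exactly the open stress-testing question.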
_______________________________________________ Wikimedia-search mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikimedia-search
