https://bugzilla.wikimedia.org/show_bug.cgi?id=25931
--- Comment #27 from Asher Feldman 2012-04-27 06:47:47 UTC ---

Our current build of lsearchd won't go deeper than an offset of 100000
(SearchEngine.java: protected static int maxoffset = 100000;), so for
categories like Living People we wouldn't be able to provide random results
over the full set, just the first 100k as they appear in the index, which
appears to be ordered on create time.

Actually getting the 100kth result (upper latency bound) takes ~280ms:

    asher@bast1001:~/srchtest$ curl 'http://search1001:8123/search/enwiki/incategory:%22Living%20people%22?limit=1&offset=99999&searchall=0'
    567274
    #info search=[search1001,search1001], highlight=[search1005] in 283 ms
    #no suggestion
    #interwiki 0 0
    #results 1
    1.4743276 0 Boris_Boillon

If you ditch the join and take the same approach with mysql, it's several
times faster than lucene:

    mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1 offset 99999;
    +----------+
    | cl_from  |
    +----------+
    | 13546433 |
    +----------+
    1 row in set (0.06 sec)

The worst case for Living_people isn't great (~350ms), but still faster than
lucene would be if we upped lsearchd's max offset:

    mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1 offset 560000;
    +----------+
    | cl_from  |
    +----------+
    | 27345638 |
    +----------+
    1 row in set (0.35 sec)
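
For illustration only, here is a rough Python sketch of the "random offset" approach discussed above, and of how an offset cap limits which rows can ever be returned. It is not MediaWiki or lsearchd code. Assumptions: it uses an in-memory SQLite table as a stand-in for MySQL's categorylinks, the helper names build_demo_db and random_category_member are hypothetical, and an ORDER BY cl_from is added (the queries above have none) so the result for a given offset is deterministic.

```python
import random
import sqlite3

MAX_OFFSET = 100000  # same value as the lsearchd maxoffset quoted above


def build_demo_db(n_rows, category="Living_people"):
    """Create an in-memory stand-in for categorylinks (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
    conn.executemany(
        "INSERT INTO categorylinks (cl_from, cl_to) VALUES (?, ?)",
        [(page_id, category) for page_id in range(1, n_rows + 1)],
    )
    conn.execute("CREATE INDEX cl_to_from ON categorylinks (cl_to, cl_from)")
    return conn


def random_category_member(conn, category, rng, max_offset=None):
    """Return one cl_from picked uniformly from the reachable rows, or None.

    With max_offset set, only the first max_offset rows can be returned,
    which is the limitation described for the capped search backend.
    """
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM categorylinks WHERE cl_to = ?", (category,)
    ).fetchone()
    reachable = total if max_offset is None else min(total, max_offset)
    if reachable == 0:
        return None
    offset = rng.randrange(reachable)  # 0 .. reachable - 1
    row = conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? "
        "ORDER BY cl_from LIMIT 1 OFFSET ?",
        (category, offset),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    demo = build_demo_db(1000)
    rng = random.Random()
    # Capped at 100 rows: page ids above 100 are never returned.
    print(random_category_member(demo, "Living_people", rng, max_offset=100))
    # Uncapped: any of the 1000 rows may come back.
    print(random_category_member(demo, "Living_people", rng))
```

Note that the cost of LIMIT 1 OFFSET n grows with n, since the rows before the offset still have to be walked, which matches the 0.06 sec vs 0.35 sec timings above.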
