https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #27 from Asher Feldman <[email protected]> 2012-04-27 06:47:47 UTC ---
Our current build of lsearchd won't go deeper than an offset of 100000
(SearchEngine.java: protected static int maxoffset = 100000;), so for categories
like Living People we couldn't provide random results over the full set, just
over the first 100k as they appear in the index, which appears to be ordered by
creation time.

Actually getting the 100kth result (the upper latency bound) takes ~280ms:

asher@bast1001:~/srchtest$ curl \
'http://search1001:8123/search/enwiki/incategory:%22Living%20people%22?limit=1&offset=99999&searchall=0'
567274
#info search=[search1001,search1001], highlight=[search1005] in 283 ms
#no suggestion
#interwiki 0 0
#results 1
1.4743276 0 Boris_Boillon
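
To put the cap in perspective, here is a rough calculation in Python. It
assumes the first line of the response above (567274) is the total hit count
for the category; the function name is made up for illustration.

```python
# Assumption: 567274 is the total hit count reported by lsearchd above.
TOTAL_HITS = 567274
# Cap taken from SearchEngine.java (maxoffset).
MAX_OFFSET = 100000


def reachable_fraction(total_hits: int, max_offset: int) -> float:
    """Fraction of a result set reachable when offsets are capped."""
    if total_hits <= 0:
        return 0.0
    return min(total_hits, max_offset) / total_hits


print(f"{reachable_fraction(TOTAL_HITS, MAX_OFFSET):.1%}")
```

Under that assumption, a random pick limited by the current cap would only
ever draw from a bit under a fifth of the category.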

If you ditch the join and take the same approach with MySQL, it's several times
faster than Lucene:

mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1 offset 99999;
+----------+
| cl_from  |
+----------+
| 13546433 |
+----------+
1 row in set (0.06 sec)

The worst case for Living_people isn't great (~350ms), but it's still faster
than Lucene would be if we upped lsearchd's max offset:

mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1 offset 560000;
+----------+
| cl_from  |
+----------+
| 27345638 |
+----------+
1 row in set (0.35 sec)
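
For illustration, a minimal sketch of the random-offset approach built on the
same query. It is not the actual implementation: it uses Python's sqlite3 with
an in-memory stand-in for categorylinks instead of the production MySQL
server, the function name is hypothetical, and it runs COUNT(*) only to stay
self-contained (a real version would presumably read the member count from
somewhere cheaper).

```python
import random
import sqlite3


def random_category_member(conn, category, rng=random):
    """Return one random cl_from for the category, or None if it is empty."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM categorylinks WHERE cl_to = ?", (category,)
    ).fetchone()
    if count == 0:
        return None
    # Pick a uniformly random row position, then fetch just that row.
    offset = rng.randrange(count)
    row = conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? LIMIT 1 OFFSET ?",
        (category, offset),
    ).fetchone()
    return row[0]


# Toy data standing in for the real table (assumption: 1000 members).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?)",
    [(page_id, "Living_people") for page_id in range(1, 1001)],
)
print(random_category_member(conn, "Living_people"))
```

The cost of each pick scales with the chosen offset, which is why the two
timings above differ: the server still has to walk past every skipped row.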
