https://bugzilla.wikimedia.org/show_bug.cgi?id=57176

Aaron Schulz <aschulz4...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aschulz4...@gmail.com

--- Comment #3 from Aaron Schulz <aschulz4...@gmail.com> ---
el_id would help when the el_index query prefix is the full el_index key length
(60 chars) or more. Even if all results are all actually < 60 chars, all rows
would have to be scanned to know that. In any case, if the query prefix is < 60
chars, then you no longer have the property of easily getting a range of index
records having an equal first part and are ordered by the second part. Ordering
by el_id in that case could be slower than the OFFSET query used now since
MySQL would have to quicksort for all matches just to return the top X.

Option A:
Maybe we could have two more or more el_index indexes, accept they'd be on
smaller prefixes (e.g. a 30 and 15). The index with the largest prefix that is
smaller than the query prefix could be used. Non-matching rows (were only the
prefix in the index matched, not the whole el_index field) would just be
eliminated via scanning.

Option B:
It could help to have a new field:

-- Bucket to page on for large prefix queries
el_bucket UNSIGNED INTEGER DEFAULT MOD(el_id,1024)

...and a new index:

el_bucket_index ON externallinks (el_bucket,el_index (60))

When a LIKE query is estimated to match many rows (100000+), the API query
could page through el_bucket from 0 to 1023, doing the LIKE query for each
bucket.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to