https://bugzilla.wikimedia.org/show_bug.cgi?id=56798
--- Comment #3 from Nik Everett <[email protected]> ---
Something like this:

    while count > 0:
        SELECT MAX(pl_page_id), COUNT(*) FROM (SELECT pl_page_id FROM page_link WHERE pl_page_id > $last_max$ LIMIT 10000)

?

We're sure we'll get rid of the SQL-based counting in the normal update case, but in the population/outage recovery case (both in-process and job-queue-based) I was thinking of keeping it (or modifying it like you suggest). The idea is that SQL-based counting will be right even if Elasticsearch is super out of date, and it'll certainly be out of date in the population case. Without it we'd need a second pass at populating Elasticsearch to count the links, which just seems complicated/burdensome/nasty.

I had a look at BacklinkCache a while ago, but it looked like it was pulling all the backlinks into memory to count them. That didn't seem pretty.
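The batched count proposed in the comment could be sketched roughly as follows. This is only an illustration, not the extension's actual code. It reuses the `page_link` table and `pl_page_id` column named in the pseudocode, runs against SQLite so that it is self-contained, and uses a hypothetical function name, `count_links_batched`. The `ORDER BY` is an addition, since without it `LIMIT` may return an arbitrary subset and rows could be skipped. The sketch also assumes `pl_page_id` is a non-negative integer and is unique among the rows being counted. If values repeat, a batch boundary that falls inside a run of equal ids would drop the rest of that run.

```python
import sqlite3


def count_links_batched(conn, batch_size=10000):
    """Count rows in page_link in bounded batches, keyed on pl_page_id.

    Hypothetical helper: assumes pl_page_id values are unique,
    non-negative integers among the rows being counted.
    """
    total = 0
    last_max = -1  # lower than any valid id under the assumption above
    while True:
        batch_max, batch_count = conn.execute(
            "SELECT MAX(pl_page_id), COUNT(*) FROM ("
            "  SELECT pl_page_id FROM page_link"
            "  WHERE pl_page_id > ?"
            "  ORDER BY pl_page_id"  # added so LIMIT takes the next ids in order
            "  LIMIT ?"
            ") AS batch",
            (last_max, batch_size),
        ).fetchone()
        if batch_count == 0:
            break  # no rows beyond last_max, so counting is finished
        total += batch_count
        last_max = batch_max  # resume after the highest id seen so far
    return total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_link (pl_page_id INTEGER)")
    conn.executemany(
        "INSERT INTO page_link VALUES (?)", [(i,) for i in range(1, 26)]
    )
    print(count_links_batched(conn, batch_size=10))  # prints 25
```

Each query touches at most `batch_size` rows, so no single statement scans the whole table and nothing beyond a running total is held in memory.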
