https://bugzilla.wikimedia.org/show_bug.cgi?id=53577
Sean Pringle <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #4 from Sean Pringle <[email protected]> ---

Larger WMF wikis are seeing more spikes of LogPager queries, of which an
example follows. These have had some recent tuning, initially by removing
forced indexes that no longer suited the data set sizes, which has helped
(10min down to ~100s), but we need to do more.

mysql wmf db73 dewiki> explain
    SELECT /* IndexPager::buildQueryInfo (LogPager) xxx.xxx.xxx.xxx */
        log_id, log_type, log_action, log_timestamp, log_user,
        log_user_text, log_namespace, log_title, log_comment, log_params,
        log_deleted, user_id, user_name, user_editcount, ts_tags
    FROM `logging`
    LEFT JOIN `user` ON ((log_user=user_id))
    LEFT JOIN `tag_summary` ON ((ts_log_id=log_id))
    WHERE (log_type NOT IN ('suppress','spamblacklist'))
        AND log_user = ?
        AND ((log_deleted & 4) = 0)
        AND (log_type != 'review')
    ORDER BY log_timestamp DESC
    LIMIT 51\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: logging
         type: index
possible_keys: type_time,user_time,log_user_type_time
          key: times
      key_len: 16
          ref: NULL
         rows: 6705
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: user
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: dewiki.logging.log_user
         rows: 1
        Extra:
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: tag_summary
         type: ref
possible_keys: ts_log_id,tag_summary_log_id
          key: ts_log_id
      key_len: 5
          ref: dewiki.logging.log_id
         rows: 20172  <-- often much larger
        Extra: Using where
3 rows in set (0.29 sec)

The remaining problems are:

- The queries are fast enough until one hits cold data, leading a slave to
  bottleneck on disk.
  The buffer pool churn has a flow-on effect on other normally fast
  queries, which become false positives in the slow log. This is
  especially problematic when a crawler like MSNBOT triggers a spike of
  these.
- The tag_summary table often has relatively low cardinality for
  ts_log_id (many nulls, few ids).
- Sometimes a slave logs multiple identical copies of a query (always a
  form filtered by log_user) at intervals of several seconds. Presumably
  someone is refreshing a page that is too slow to respond.

Possibilities:

- Pull out only log_id,log_user first to reduce the impact of the
  filesort, and grab the user and tag data in a follow-up batch query. If
  the query plan switches to using Index Condition Pushdown ("Using index
  condition" in explain), cold data is less of an issue, as fewer table
  rows are hit and fewer pages read.
- Limit the size of the data set to be considered. By timestamp perhaps,
  last X months maybe. For heavier users only?
- Improve the method of retrieving ts_tags using something other than
  tag_summary. Or remove it entirely.
- Isolate these queries to specific slaves (like watchlist).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
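The first possibility above (narrow first pass, then a batch fetch) can be sketched as follows. This is a minimal illustration, not the actual MediaWiki change: SQLite stands in for MySQL, and the tables are a cut-down subset of the real logging/user/tag_summary schema. The point is that the sort and LIMIT run over small index-sized rows, and only the 51 surviving ids pay the cost of the wide joins.

```python
# Two-phase version of the LogPager query (illustrative sketch only).
# Phase 1 selects just log_id under the same filters, so the filesort
# handles narrow rows; phase 2 batch-fetches user and tag data for the
# ids that survived the LIMIT.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE logging (
        log_id INTEGER PRIMARY KEY,
        log_type TEXT,
        log_user INTEGER,
        log_timestamp TEXT,
        log_deleted INTEGER
    );
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE tag_summary (ts_log_id INTEGER, ts_tags TEXT);
""")
cur.execute("INSERT INTO user VALUES (7, 'Example')")
for i in range(1, 200):
    cur.execute("INSERT INTO logging VALUES (?, 'move', 7, ?, 0)",
                (i, "201309%03d" % i))
cur.execute("INSERT INTO tag_summary VALUES (199, 'mw-undo')")

# Phase 1: narrow scan; only (log_id, log_timestamp) feed the sort.
cur.execute("""
    SELECT log_id FROM logging
    WHERE log_user = ?
      AND log_type NOT IN ('suppress', 'spamblacklist')
      AND (log_deleted & 4) = 0
    ORDER BY log_timestamp DESC LIMIT 51
""", (7,))
ids = [row[0] for row in cur.fetchall()]

# Phase 2: batch-fetch the wide row data for just those 51 ids.
placeholders = ",".join("?" * len(ids))
cur.execute("""
    SELECT log_id, log_timestamp, user_name, ts_tags
    FROM logging
    LEFT JOIN user ON log_user = user_id
    LEFT JOIN tag_summary ON ts_log_id = log_id
    WHERE log_id IN (%s)
    ORDER BY log_timestamp DESC
""" % placeholders, ids)
rows = cur.fetchall()
print(len(rows), rows[0])  # 51 (199, '201309199', 'Example', 'mw-undo')
```

On MySQL, phase 1 would also be a good candidate for a covering index on (log_user, log_timestamp, log_id), keeping cold-data reads to the index pages alone.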

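For readers puzzling over the `(log_deleted & 4) = 0` predicate in the query: log_deleted is a bitfield. Assuming the usual MediaWiki LogPage constants (DELETED_ACTION = 1, DELETED_COMMENT = 2, DELETED_USER = 4, DELETED_RESTRICTED = 8), the check hides entries whose performing user has been revision-deleted:

```python
# Bitfield semantics of the (log_deleted & 4) = 0 filter.
# Constant values assumed from MediaWiki's LogPage class.
DELETED_ACTION, DELETED_COMMENT, DELETED_USER, DELETED_RESTRICTED = 1, 2, 4, 8

def user_visible(log_deleted):
    """True when the performer of the log entry is not hidden."""
    return (log_deleted & DELETED_USER) == 0

entries = [0, DELETED_COMMENT, DELETED_USER,
           DELETED_USER | DELETED_RESTRICTED]
print([user_visible(e) for e in entries])  # [True, True, False, False]
```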