https://bugzilla.wikimedia.org/show_bug.cgi?id=70558
Bug ID: 70558
Summary: categorylinks InnoDB extended indexes behavior differs
between enwiki slaves
Product: MediaWiki
Version: 1.24-git
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: Database
Assignee: [email protected]
Reporter: [email protected]
Web browser: ---
Mobile Platform: ---
The API runs the following query:
SELECT /* ApiQueryCategoryMembers::run */
cl_from,cl_sortkey,cl_type,page_namespace,page_title,cl_timestamp FROM
page,categorylinks FORCE INDEX (cl_timestamp) WHERE cl_to =
'Copy_to_Wikimedia_Commons_(bot-assessed)' AND (cl_from=page_id) ORDER BY
cl_timestamp,cl_from LIMIT 501;
CREATE TABLE categorylinks (
cl_from int(8) unsigned NOT NULL DEFAULT '0',
cl_to varbinary(255) NOT NULL DEFAULT '',
cl_sortkey varbinary(230) NOT NULL DEFAULT '',
cl_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
CURRENT_TIMESTAMP,
cl_sortkey_prefix varbinary(255) NOT NULL DEFAULT '',
cl_collation varbinary(32) NOT NULL DEFAULT '',
cl_type enum('page','subcat','file') NOT NULL DEFAULT 'page',
UNIQUE KEY cl_from (cl_from,cl_to),
KEY cl_timestamp (cl_to,cl_timestamp),
KEY cl_collation (cl_collation),
KEY cl_sortkey (cl_to,cl_type,cl_sortkey,cl_from)
) ENGINE=InnoDB DEFAULT CHARSET=binary
The forced cl_timestamp index is expected to utilize InnoDB extended indexes
(where secondary indexes also store the clustered primary key fields) to avoid
a filesort. Enwiki slaves up to MariaDB 5.5.34 do this correctly.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: categorylinks
type: ref
possible_keys: cl_timestamp
key: cl_timestamp
key_len: 257
ref: const
rows: 552330
Extra: Using index condition; Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: page
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: enwiki.categorylinks.cl_from
rows: 1
Extra:
2 rows in set (0.38 sec)
Slaves running 5.5.36+ revert to using a filesort despite apparently having
identical categorylinks tables. Handler% stats suport the observation.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: categorylinks
type: ref
possible_keys: cl_from,cl_timestamp,cl_sortkey
key: cl_sortkey
key_len: 257
ref: const
rows: 466736
Extra: Using index condition; Using where; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: page
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: enwiki.categorylinks.cl_from
rows: 1
Extra:
2 rows in set (0.32 sec)
Important to note that the first enwiki 5.5.36 slave, from which others were
cloned, did have data dumped and reloaded rather than being upgraded in place,
so categorylinks was definitely recreated. Possibly something has changed with
how the first UNIQUE index can be chosen as primary key, and the 6-byte rowid
is in use instead?
Simply rebuilding the table has no effect:
ALTER TABLE categorylinks ENGINE=InnoDB;
But making the primary key explicit restores the expected behavior:
ALTER TABLE categorylinks DROP INDEX cl_from, ADD PRIMARY KEY (cl_from, cl_to);
Need to investigate what caused the change in behavior, and whether this
affects any other mediawiki tables.
--
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l