Ladsgroup added a comment.

With the deduplication script now past Q11M, the duplicate rows have not vanished completely, but very few remain:

mysql:[email protected] [wikidatawiki]> SELECT COUNT(*) FROM wb_terms AS t1 WHERE term_type != 'alias' AND EXISTS(SELECT 1 FROM wb_terms AS t2 USE INDEX(wb_terms_entity_id) WHERE t1.term_entity_id < 1000000 and t2.term_entity_id < 1000000 and t1.term_entity_type = t2.term_entity_type AND t1.term_entity_id = t2.term_entity_id AND t1.term_type = t2.term_type AND t1.term_language = t2.term_language AND t1.term_search_key = t2.term_search_key AND t1.term_row_id != t2.term_row_id);
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    3108283
Current database: wikidatawiki

+----------+
| COUNT(*) |
+----------+
|     1309 |
+----------+
1 row in set (6 hours 20 min 4.14 sec)
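
For anyone who wants to check what that query actually counts, here is a minimal sketch that runs the same self-join against a toy table in an in-memory SQLite database. The rows are invented for illustration, the MySQL-specific USE INDEX hint is dropped because SQLite does not support it, and the name DUPLICATE_COUNT_SQL is only a label for this example.

```python
import sqlite3

# Toy stand-in for wb_terms. Column names follow the query above;
# the rows are invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE wb_terms (
        term_row_id INTEGER PRIMARY KEY,
        term_entity_id INTEGER,
        term_entity_type TEXT,
        term_language TEXT,
        term_type TEXT,
        term_search_key TEXT
    )
""")
rows = [
    (1, 42, "item", "en", "label", "douglas adams"),
    (2, 42, "item", "en", "label", "douglas adams"),  # twin of row 1
    (3, 42, "item", "de", "label", "douglas adams"),  # other language, no twin
    (4, 42, "item", "en", "alias", "dna"),
    (5, 42, "item", "en", "alias", "dna"),            # alias twin, excluded by the filter
    (6, 2000000, "item", "en", "label", "x"),
    (7, 2000000, "item", "en", "label", "x"),         # twin, but beyond the ID cutoff
]
conn.executemany("INSERT INTO wb_terms VALUES (?, ?, ?, ?, ?, ?)", rows)

# Same conditions as the production query, minus the index hint.
DUPLICATE_COUNT_SQL = """
    SELECT COUNT(*) FROM wb_terms AS t1
    WHERE term_type != 'alias'
      AND EXISTS (
        SELECT 1 FROM wb_terms AS t2
        WHERE t1.term_entity_id < 1000000
          AND t2.term_entity_id < 1000000
          AND t1.term_entity_type = t2.term_entity_type
          AND t1.term_entity_id = t2.term_entity_id
          AND t1.term_type = t2.term_type
          AND t1.term_language = t2.term_language
          AND t1.term_search_key = t2.term_search_key
          AND t1.term_row_id != t2.term_row_id
      )
"""
duplicate_count = conn.execute(DUPLICATE_COUNT_SQL).fetchone()[0]
print(duplicate_count)
```

On this toy data the count includes both row 1 and row 2, so the query counts every row that has at least one twin, not just the surplus copies. The number of rows that would actually be deleted is therefore smaller than the reported count.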

Note that this covers only the first Q1M. The real result once the script is done should be around 40 times bigger than that, but compared to the size of the table, that is nothing.


TASK DETAIL
https://phabricator.wikimedia.org/T163551

To: hoo, Ladsgroup
Cc: Lydia_Pintscher, Krinkle, Ladsgroup, gerritbot, daniel, Smalyshev, jcrespo, aude, Aklapper, hoo, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Lewizho99, Maathavan, Wikidata-bugs, Mbch331