hoo created this task.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION

There is apparently a huge number of duplicate rows in wb_terms. For example:

mysql:wikiadmin@db1045 [wikidatawiki]> SELECT * FROM wb_terms WHERE term_entity_type = 'item' AND term_entity_id = 2807 AND term_language = 'es' AND term_type = 'description'\G 
*************************** 1. row ***************************
     term_row_id: 709061768
  term_entity_id: 2807
term_entity_type: item
   term_language: es
       term_type: description
       term_text: ciudad, capital de España
 term_search_key: ciudad, capital de españa
     term_weight: 0.254
*************************** 2. row ***************************
     term_row_id: 709061771
  term_entity_id: 2807
term_entity_type: item
   term_language: es
       term_type: description
       term_text: ciudad, capital de España
 term_search_key: ciudad, capital de españa
     term_weight: 0.254
2 rows in set (0.01 sec)

I found the above example merely by looking up a single term. Testing other combinations also quickly brought up duplicates, so I assume there are a lot of these.

I'm currently trying to find out how many such duplicates we have, but the query is still running.
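
For illustration, one way such duplicates could be counted is to group on the columns that identify a term and keep only groups with more than one row. This is a minimal sketch and not necessarily the query that is running. It uses an in-memory SQLite table as a hypothetical stand-in for wb_terms (production runs on MySQL/MariaDB), the choice of grouping columns is an assumption, and the third sample row is made up for contrast.

```python
import sqlite3

# Hypothetical in-memory stand-in for wikidatawiki.wb_terms. Only the columns
# visible in the console output above are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE wb_terms (
        term_row_id      INTEGER PRIMARY KEY,
        term_entity_id   INTEGER,
        term_entity_type TEXT,
        term_language    TEXT,
        term_type        TEXT,
        term_text        TEXT,
        term_search_key  TEXT,
        term_weight      REAL
    )
    """
)

rows = [
    # The two duplicate rows quoted in the task description.
    (709061768, 2807, "item", "es", "description",
     "ciudad, capital de España", "ciudad, capital de españa", 0.254),
    (709061771, 2807, "item", "es", "description",
     "ciudad, capital de España", "ciudad, capital de españa", 0.254),
    # Made-up non-duplicate row, for contrast only (not real data).
    (1, 2807, "item", "en", "label", "Madrid", "madrid", 0.254),
]
conn.executemany("INSERT INTO wb_terms VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Assumption: two rows count as duplicates when every column except
# term_row_id that identifies the term matches.
DUPLICATE_GROUPS_SQL = """
SELECT term_entity_type, term_entity_id, term_language, term_type, term_text,
       COUNT(*) AS copies
FROM wb_terms
GROUP BY term_entity_type, term_entity_id, term_language, term_type, term_text
HAVING COUNT(*) > 1
"""

duplicate_groups = conn.execute(DUPLICATE_GROUPS_SQL).fetchall()

# Rows that could be dropped while keeping one copy per group.
surplus_rows = sum(copies - 1 for *_, copies in duplicate_groups)

print(len(duplicate_groups), surplus_rows)
```

A grouping like this has to read the whole table, so on a table as large as wb_terms it would likely run for a long time.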


TASK DETAIL
https://phabricator.wikimedia.org/T163551

To: hoo
Cc: MediaWiki-extensions-WikibaseRepository, Wikidata, Aklapper, hoo
_______________________________________________
Wikidata-bugs mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
