Addshore added a comment.

  The odd rows we are currently seeing in the DB show that not only the text row has been removed, but also the text in lang row.
  
    mysql:[email protected] [wikidatawiki]> SELECT * FROM wbt_property_terms
        LEFT JOIN wbt_term_in_lang ON wbpt_term_in_lang_id = wbtl_id
        LEFT JOIN wbt_type ON wbtl_type_id = wby_id
        LEFT JOIN wbt_text_in_lang ON wbtl_text_in_lang_id = wbxl_id
        LEFT JOIN wbt_text ON wbxl_text_id = wbx_id
        WHERE wby_name = 'label' AND wbx_text IS NULL
        ORDER BY wbpt_property_id;
    +---------+------------------+----------------------+-----------+--------------+----------------------+--------+----------+---------+---------------+--------------+--------+----------+
    | wbpt_id | wbpt_property_id | wbpt_term_in_lang_id | wbtl_id   | wbtl_type_id | wbtl_text_in_lang_id | wby_id | wby_name | wbxl_id | wbxl_language | wbxl_text_id | wbx_id | wbx_text |
    +---------+------------------+----------------------+-----------+--------------+----------------------+--------+----------+---------+---------------+--------------+--------+----------+
    |  325236 |              225 |            388713206 | 388713206 |            1 |            383127030 |      1 | label    |    NULL | NULL          |         NULL |   NULL | NULL     |
    |  325246 |              433 |            388715670 | 388715670 |            1 |            379975720 |      1 | label    |    NULL | NULL          |         NULL |   NULL | NULL     |
    +---------+------------------+----------------------+-----------+--------------+----------------------+--------+----------+---------+---------------+--------------+--------+----------+
    2 rows in set (0.95 sec)
  
  The same patch touches similar code in `cleanTermInLangIds`, which probably causes the same issue.
  
  - `cleanTermInLangIds` is called with `$termInLangIds`, which contains the term in lang IDs that are not used in the property or item tables (this is correct)
    - Example: ID 999 (some ID from the edit that triggered the deletion in the case seen above)
  - Text in lang IDs are then selected from `wbt_term_in_lang` where the text in lang ID is in `$potentiallyUnusedTextInLangIds`
    - `$potentiallyUnusedTextInLangIds` would contain all of the text in lang IDs for term in lang 999, let's say 11, 12, 383127030
  - All of the `$termInLangIds` are then deleted (this is correct)
    - Term in lang ID 999 has been deleted
  - Each of the `$potentiallyUnusedTextInLangIds`, which currently still look fine, is then selected from `wbt_term_in_lang` a final time, and `$unusedTextInLangIds` is built up when no rows are found for the text in lang ID in the term in lang table (correct)
    - `$unusedTextInLangIds` now contains all of the text in lang IDs that are still in use, so this should be 11 and 12
  - `$unusedTextInLangIds` are then selected from `wbt_term_in_lang` FOR UPDATE, setting `$stillUsedTextInLangIds` to the resulting IDs
    - `$stillUsedTextInLangIds` would then still include 11 and 12
  - A diff is then taken, keeping the IDs that are in `$potentiallyUnusedTextInLangIds` and not in `$stillUsedTextInLangIds`
    - So the IDs that are in 11, 12, 383127030 and not in 11, 12: thus 383127030 is passed down for deletion when it should not be.
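
  The final diff step can be replayed outside the DB as a rough sketch. This is not the Wikibase code: the Python variable names are made up for illustration, 11 and 12 are the placeholder IDs from the walk-through above, and the surviving rows are copied from the query output earlier in this comment.

```python
# Example values from the walk-through; 11 and 12 are placeholders.
potentially_unused_text_in_lang_ids = {11, 12, 383127030}
still_used_text_in_lang_ids = {11, 12}

# The final step as described: everything potentially unused that was not
# reported as still used is passed down for deletion.
passed_down_for_deletion = (
    potentially_unused_text_in_lang_ids - still_used_text_in_lang_ids
)

# (wbtl_id, wbtl_text_in_lang_id) pairs copied from the query output:
# these term in lang rows still exist and still point at a text in lang ID.
surviving_term_in_lang_rows = [
    (388713206, 383127030),
    (388715670, 379975720),
]
still_referenced = {
    text_in_lang_id for _, text_in_lang_id in surviving_term_in_lang_rows
}

# Any overlap is a text in lang ID deleted while a term in lang row still uses it.
wrongly_deleted = passed_down_for_deletion & still_referenced

print(sorted(passed_down_for_deletion))  # [383127030]
print(sorted(wrongly_deleted))           # [383127030]
```

  With these example values the diff hands over exactly the ID that `wbt_term_in_lang` row 388713206 still references, which matches the orphaned row in the query result.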

TASK DETAIL
  https://phabricator.wikimedia.org/T237984

