aude added a comment.

I am looking at a few pages that I found on English Wikipedia.  These are all 
related topics and share some categories and templates, but not all 
pages in those categories are affected.

they are missing sidebar links and the "Data item" link, as well as the 
wgWikibaseItemId js config variable.

the parser output (from cache) has entity usage of only X + T, but the 
wbc_entity_usage table now has X + T + S for the page.

however, I also see:

Number of Wikibase entities loaded: 1

and the page is in Category:Coordinates on Wikidata

this indicates that the lookup of the item id in the wb_items_per_site table 
failed.

the relevant Wikidata item was edited on January 12, and the Wikipedia page was 
last edited in September.  The page table entry for the page has 
page_links_updated set to January 18.  I first noticed the page later in the 
day on January 19, so I am not convinced that my viewing the page triggered 
anything.

the item definitely has an entry for the page in the wb_items_per_site table, 
and also has X + T + S entries in the wbc_entity_usage table, as well as an 
entry in the page_props table for "wikibase_item".

other things noticed:

- the page_touched timestamp of the enwiki page is 18 seconds later than the 
page_touched timestamp of the Wikidata item.
- SiteLinkTable::getItemIdForLink uses a slave connection.

possible ideas:

- could be just a database query error (e.g. a timeout) when parsing the enwiki 
page.
- could be a race condition in which the site link was missing from the site 
link table (on the slave) at the moment the lookup was done (e.g. the site links 
get deleted and re-added in the process of parsing on Wikidata? or were 
previously missing?).  Below is what is done when saving site links:

  public function saveLinksOfItem( Item $item ) {
      //First check whether there's anything to update
      $newLinks = $item->getSiteLinkList()->toArray();
      $oldLinks = $this->getSiteLinksForItem( $item->getId() );

      $linksToInsert = array_udiff( $newLinks, $oldLinks, array( $this, 'compareSiteLinks' ) );
      $linksToDelete = array_udiff( $oldLinks, $newLinks, array( $this, 'compareSiteLinks' ) );

      if ( !$linksToInsert && !$linksToDelete ) {
          wfDebugLog( __CLASS__, __FUNCTION__ . ": links did not change, returning." );
          return true;
      }

      $ok = true;
      $dbw = $this->getConnection( DB_MASTER );

      //TODO: consider doing delete and insert in the same callback, so they share a transaction.

      if ( $ok && $linksToDelete ) {
          wfDebugLog( __CLASS__, __FUNCTION__ . ": " . count( $linksToDelete ) . " links to delete." );
          $ok = $dbw->deadlockLoop( array( $this, 'deleteLinksInternal' ), $item, $linksToDelete, $dbw );
      }

      if ( $ok && $linksToInsert ) {
          wfDebugLog( __CLASS__, __FUNCTION__ . ": " . count( $linksToInsert ) . " links to insert." );
          $ok = $dbw->deadlockLoop( array( $this, 'insertLinksInternal' ), $item, $linksToInsert, $dbw );
      }

      $this->releaseConnection( $dbw );

      return $ok;
  }
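
To make the race idea concrete, here is a minimal, standalone sketch of the 
diff step in saveLinksOfItem. It is not the Wikibase code: site links are 
modelled as plain arrays, and compareSiteLinksSketch and diffSiteLinks are 
hypothetical names, on the assumption that the real compareSiteLinks compares 
site id and page name. It shows that a link whose title changed ends up in both 
the delete set and the insert set, so if the two steps do not share a 
transaction, a reader could see no row for that site in between. A link that 
did not change appears in neither set, so this would only matter if the enwiki 
link itself was touched by the edit.

```php
<?php
// Illustration only: site links are plain arrays here, not Wikibase SiteLink objects.

// Hypothetical stand-in for compareSiteLinks: orders by site id, then page name.
function compareSiteLinksSketch( array $a, array $b ): int {
    return strcmp( $a['site'] . "\n" . $a['page'], $b['site'] . "\n" . $b['page'] );
}

// Computes the same two sets that saveLinksOfItem builds with array_udiff.
function diffSiteLinks( array $oldLinks, array $newLinks ): array {
    return [
        'insert' => array_values( array_udiff( $newLinks, $oldLinks, 'compareSiteLinksSketch' ) ),
        'delete' => array_values( array_udiff( $oldLinks, $newLinks, 'compareSiteLinksSketch' ) ),
    ];
}

// Made-up data: the enwiki link is retitled, the dewiki link is unchanged.
$old = [
    [ 'site' => 'enwiki', 'page' => 'Old title' ],
    [ 'site' => 'dewiki', 'page' => 'Titel' ],
];
$new = [
    [ 'site' => 'enwiki', 'page' => 'New title' ],
    [ 'site' => 'dewiki', 'page' => 'Titel' ],
];

// The retitled enwiki link lands in both sets; the dewiki link lands in neither.
$diff = diffSiteLinks( $old, $new );
```
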

suggestions:

- probably would help to instead use a master connection for site link lookup 
when parsing is happening.
- and the way sitelinks get updated seems suboptimal and could be improved.
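
A rough sketch of the first suggestion, to show the effect rather than the 
actual change: the two "databases" below are in-memory arrays, and 
SiteLinkLookupSketch, its $forParse flag and the item id Q42 are all made up 
for illustration (none of them exist in Wikibase). The assumption is that the 
slave has not yet received the site link row at the moment the page is parsed.

```php
<?php
// Sketch only: rows are keyed by "siteId|pageTitle" and map to an item id.
final class SiteLinkLookupSketch {
    private array $slaveRows;
    private array $masterRows;

    public function __construct( array $slaveRows, array $masterRows ) {
        $this->slaveRows = $slaveRows;
        $this->masterRows = $masterRows;
    }

    // Hypothetical flag: read from the master while a parse is in progress.
    public function getItemIdForLink( string $siteId, string $pageTitle, bool $forParse = false ): ?string {
        $rows = $forParse ? $this->masterRows : $this->slaveRows;
        return $rows[$siteId . '|' . $pageTitle] ?? null;
    }
}

// The master already has the row; the slave copy is still missing it.
$master = [ 'enwiki|Example page' => 'Q42' ];
$slave = [];

$lookup = new SiteLinkLookupSketch( $slave, $master );

// The default lookup misses the row; the parse-time lookup finds it.
$fromSlave = $lookup->getItemIdForLink( 'enwiki', 'Example page' );
$fromMaster = $lookup->getItemIdForLink( 'enwiki', 'Example page', true );
```

The trade-off is extra read load on the master for every parse, which is why 
the flag is limited to the parse path in this sketch.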


TASK DETAIL
  https://phabricator.wikimedia.org/T47839
