Addshore added a comment.

This is likely related to the switch to CirrusSearch / Elasticsearch for the search mechanism here, instead of our wb_terms table.

So it looks like ItemNotabilityFilter::selectPagePropsPage must be failing to select any page props for the given items.
As a result, getPagePropsByItem returns an array that has no key for the given item ID.
getNotableEntityIds then assumes that every entity we enquired about will have an entry in the returned array.
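
To make the failure mode concrete, here is a minimal Python sketch (the real code is PHP; the function names below only mirror the method names above, and their bodies, the row layout, the "Q_TARGET" ID and the sitelink threshold are hypothetical stand-ins, not the actual implementation). It shows how an item with no page_props rows ends up without a key, and how an unguarded lookup then fails.

```python
def get_page_props_by_item(item_ids, page_props_rows):
    """Hypothetical stand-in for getPagePropsByItem.

    page_props_rows is an assumed list of (item_id, prop_name, value) tuples.
    An item with no rows (e.g. a redirect) gets no key at all in the result.
    """
    wanted = set(item_ids)
    props = {}
    for item_id, prop_name, value in page_props_rows:
        if item_id in wanted:
            props.setdefault(item_id, {})[prop_name] = value
    return props


def get_notable_entity_ids_unsafe(item_ids, props_by_item, min_sitelinks=1):
    """Mirrors the assumption that every requested ID has an entry.

    The notability rule (a sitelink threshold) is an assumption made
    for this illustration only.
    """
    notable = []
    for item_id in item_ids:
        # Raises KeyError when item_id is missing from props_by_item.
        props = props_by_item[item_id]
        if int(props.get("wb-sitelinks", 0)) >= min_sitelinks:
            notable.append(item_id)
    return notable
```

Calling get_notable_entity_ids_unsafe with an ID that has no page props raises a KeyError, which is the Python analogue of the missing-key error seen in the logs.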


Looking specifically at a couple of error instances I spotted on bn.wikipedia.org:
https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2018.10.20/mediawiki/?id=AWaPoc2mN-apXXYu4qqy
https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2018.10.19/mediawiki/?id=AWaNNjeE00on8STvZ6qi

These refer to 2 missing keys / entity IDs: Q56258025 and Q56508858.

Looking at these on Wikidata, they are both redirects.
https://www.wikidata.org/w/index.php?title=Q56258025
https://www.wikidata.org/w/index.php?title=Q56508858

The redirect creation date, however, was far from the time of the error...

Looking at the error, the page props have correctly been moved to the next entity (the redirect target).

mysql:[email protected] [wikidatawiki]> select * from page_props where pp_page = 56196493;
Empty set (0.02 sec)

mysql:[email protected] [wikidatawiki]> select * from page_props where pp_page = 13271670;
+----------+-----------------+-----------------------------+------------+
| pp_page  | pp_propname     | pp_value                    | pp_sortkey |
+----------+-----------------+-----------------------------+------------+
| 13271670 | page_image_free | Humayun_Ahmed_13Nov2010.jpg |       NULL |
| 13271670 | wb-claims       | 52                          |         52 |
| 13271670 | wb-identifiers  | 20                          |         20 |
| 13271670 | wb-sitelinks    | 11                          |         11 |
+----------+-----------------+-----------------------------+------------+
4 rows in set (0.01 sec)

My theory is that the search index takes some time to update, so for a while it could still return an old item ID that is now a redirect and thus has no page props.

We could:

  • Just ignore items like this, i.e. skip keys that do not exist in the array returned by the get page props method.
  • Add some redirect resolving into the mix here at some point, so that we get the page props for the ID that our ID redirects to.
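
The two options above could be sketched roughly as follows. This is illustrative Python rather than the PHP in question; the function names, the redirect_targets mapping, the "Q_TARGET" ID and the sitelink threshold are all assumptions made for the example.

```python
def get_notable_entity_ids(item_ids, props_by_item, min_sitelinks=1):
    """Option 1: skip IDs that have no entry in props_by_item."""
    notable = []
    for item_id in item_ids:
        props = props_by_item.get(item_id)
        if props is None:
            # Probably a redirect that the search index has not caught up with.
            continue
        if int(props.get("wb-sitelinks", 0)) >= min_sitelinks:
            notable.append(item_id)
    return notable


def resolve_redirects(item_ids, redirect_targets):
    """Option 2: replace each ID with its redirect target, if it has one.

    redirect_targets is an assumed {source_id: target_id} mapping;
    only a single redirect hop is followed.
    """
    return [redirect_targets.get(item_id, item_id) for item_id in item_ids]
```

The two can also be combined: resolve redirects first, then still skip anything that has no page props, so a stale search result never triggers the missing-key error.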

TASK DETAIL
https://phabricator.wikimedia.org/T207235

To: Addshore
Cc: Addshore, Ladsgroup, Krinkle, Aklapper, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, cmadeo, LawExplorer, Wikidata-bugs, aude, jayvdb, Ricordisamoa, Jdforrester-WMF, Mbch331, Jay8g, Krenair