Neil_P._Quinn_WMF added a comment.

Thank you so much, @Neil_P._Quinn_WMF! Really appreciate you catching that and correcting it. I had incorrectly assumed that initial metadata would not be included. I'm currently looking into your suggested method of filtering revisions and comparing it to using revision_parent_id > 0, which should theoretically yield the same result but in practice does not.

Glad to help! mediawiki_history is full of so many gotchas; I've fallen into plenty myself!

I don't understand the point of this, since the NOT revision_is_deleted should have already removed deleted files. (Also the page_id isn't necessarily null for deleted pages; after all the MediaWiki archive table has ar_page_id.)

https://commons.wikimedia.org/wiki/File:Box-Front.jpg is a deleted file with a null page_id and it gets included in summarized_revisions otherwise.

True, but its revisions do have revision_is_deleted set, so you've already filtered them out of your query.
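
To make the point concrete, here is a toy sketch in plain Python rather than the actual query. The rows and their values are invented for illustration, and it assumes every revision of a deleted file carries the revision_is_deleted flag, as the Box-Front.jpg revisions do. Under that assumption, an extra page_id check after NOT revision_is_deleted removes nothing further:

```python
# Invented rows standing in for mediawiki_history revision events.
rows = [
    {"revision_id": 1, "page_id": 10, "revision_is_deleted": False},
    # Deleted file with a null page_id (the Box-Front.jpg situation).
    {"revision_id": 2, "page_id": None, "revision_is_deleted": True},
    # Deleted page that still has a page_id (compare ar_page_id).
    {"revision_id": 3, "page_id": 11, "revision_is_deleted": True},
]


def keep_live(revisions):
    """Rough equivalent of WHERE NOT revision_is_deleted."""
    return [r for r in revisions if not r["revision_is_deleted"]]


def keep_live_with_page_id(revisions):
    """Rough equivalent of WHERE NOT revision_is_deleted AND page_id IS NOT NULL."""
    return [r for r in keep_live(revisions) if r["page_id"] is not None]


live = keep_live(rows)
live_with_page_id = keep_live_with_page_id(rows)

print([r["revision_id"] for r in live])
print(live == live_with_page_id)
```

Note that revision 3 shows the other half of the point: a page_id check on its own would not have caught that deleted revision, since its page_id is not null.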


TASK DETAIL
https://phabricator.wikimedia.org/T213597

To: Neil_P._Quinn_WMF
Cc: Neil_P._Quinn_WMF, chelsyx, MNeisler, mpopov, kzimmerman, Ramsey-WMF, Abit, JKSTNK, Lahi, PDrouin-WMF, E1presidente, Cparle, Anooprao, SandraF_WMF, Tramullas, Acer, Silverfish, Susannaanas, Jane023, Wikidata-bugs, Base, matthiasmullie, Ricordisamoa, Wesalius, Lydia_Pintscher, Fabrice_Florin, Raymond, Steinsplitter