GoranSMilovanovic added a comment.
@Addshore Well, now it sounds even more complicated than in the ticket description. I am for a call on this too. Let me just provide a few observations on what has been said and suggested so far.

> I do not think we want to use the wb_changes table.

Why? Its documentation <https://www.mediawiki.org/wiki/Wikibase/Schema/wb_changes> says that the `change_info` field "Stores the new full page data in JSON format", and this schema also holds the `change_revision_id` (docs: "This is equal to the rev_id of the edit made by user"). That sounds like exactly what we need. Once again, under these assumptions, we might:

(1) use `wmf.mediawiki_history` to select the revisions that we are interested in (as @WMDE-leszek explains in the task description),
(2) use `wb_changes` to fetch the JSON representations, selecting by `rev_id` as the key, and
(3) compare the JSON representations to see whether a change in a statement's value is followed by a change in that statement's reference.

What is wrong with this approach, before we abandon it?

> (2) use the API to collect the JSON representation of the revised entity by revision-id,
> Note, this will have to be done using Special:EntityData and the revision parameter (wbgetentities doesn't have this functionality)

@Addshore Thanks for the hint on this one. However, I am not sure that `wmf.mediawiki_history` and the API are the way to go at all, since:

- `wmf.mediawiki_history` receives only a monthly update, which would constrain our reporting to monthly updates too, while
- making tons of API calls does not sound feasible either.
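To make step (3) of the approach outlined above concrete, here is a minimal sketch of comparing two JSON representations of the same entity. It assumes the standard Wikibase entity JSON layout (`claims` keyed by property, each statement carrying an `id`, a `mainsnak`, and optional `references` with a `hash`); whether `change_info` in `wb_changes` actually delivers the full entity in that layout is exactly the assumption under discussion. The function names and the sample data are hypothetical, for illustration only.

```python
from typing import Any, Dict

Entity = Dict[str, Any]


def index_statements(entity: Entity) -> Dict[str, dict]:
    """Map statement GUID -> statement for one entity JSON document."""
    return {
        statement["id"]: statement
        for statements in entity.get("claims", {}).values()
        for statement in statements
    }


def reference_hashes(statement: dict) -> frozenset:
    """Set of reference hashes attached to a statement (empty if none)."""
    return frozenset(ref.get("hash") for ref in statement.get("references", []))


def classify_change(old: Entity, new: Entity) -> Dict[str, Dict[str, bool]]:
    """For statements present in both revisions, report what changed."""
    old_idx, new_idx = index_statements(old), index_statements(new)
    report = {}
    for guid in old_idx.keys() & new_idx.keys():
        before, after = old_idx[guid], new_idx[guid]
        value_changed = (
            before["mainsnak"].get("datavalue") != after["mainsnak"].get("datavalue")
        )
        refs_changed = reference_hashes(before) != reference_hashes(after)
        if value_changed or refs_changed:
            report[guid] = {
                "value_changed": value_changed,
                "references_changed": refs_changed,
            }
    return report


# Hypothetical sample: the value changes, the reference stays the same.
old_rev = {"claims": {"P1": [{
    "id": "Q1$abc",
    "mainsnak": {"datavalue": {"value": "10", "type": "string"}},
    "references": [{"hash": "r1"}],
}]}}
new_rev = {"claims": {"P1": [{
    "id": "Q1$abc",
    "mainsnak": {"datavalue": {"value": "12", "type": "string"}},
    "references": [{"hash": "r1"}],
}]}}

print(classify_change(old_rev, new_rev))
```

Running this over consecutive revision pairs of an entity would show, per statement, whether a value change in one pair is followed by a reference change in a later pair. Statements added or removed between revisions are deliberately ignored in this sketch.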
> Without adding anything to wikibase i guess the general approach has to be:
> Find revisions that touch statement mainsnak values and or references
> Considerations:
> These values can be touched using a variety of different api modules and with a variety of different summaries, so not just wbsetclaim-update, if anything working with a blacklist of summaries might be easier (eliminate things that only touch terms for examples)

@Addshore Please define "summaries" and "blacklist of summaries".

> Fetch the entity either side of the change and see what happened and classify that?

@Addshore This I do not understand at all.

> Once this has been done for a window of data try to figure out exactly what is happening to the statements based on the classifications?

@Addshore Nor do I understand what you mean here.

TASK DETAIL
https://phabricator.wikimedia.org/T240466
