WMDE-leszek added a comment.
@GoranSMilovanovic thanks for the update. That indeed does not sound great. Forgive my naive question: would it be significantly less efficient to use a JSON dump of Wikidata lexemes and items (in this case a historical one, since we are looking for data on a past state), e.g. the one from https://dumps.wikimedia.org/wikidatawiki/entities/20210222/? As I understand it, you are essentially reconstructing the state of each lexeme from the wikitext history table in order to obtain its JSON structure at the requested point in time. I don't know whether some kind of //offline// JSON dump processing would be "easier"/"faster", but I am mentioning this alternative in case it has not yet been considered. Apologies if this is stating the obvious and the option has already been ruled out a while ago.

PS. For reasons unknown to me, the "oldest" available JSON dump is dated Feb 22nd, not quite Jan 1st. @Lydia_Pintscher do you happen to remember whether there was some interruption in the dump service? Or are older dumps simply wiped regularly to save disk space?
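If the dump route turns out to be viable, the offline processing itself could stay fairly simple. Below is a minimal sketch (Python), assuming the usual one-entity-per-line layout of the Wikidata entity dumps; the file name is a hypothetical local copy of a lexemes dump from the directory above, and the toy aggregation in the loop would of course be replaced by whatever statistics are actually needed.

```
# Minimal sketch: stream a Wikidata lexemes JSON dump offline.
# Assumes the usual entity-dump layout: one JSON array, one entity per line
# (with trailing commas), bzip2-compressed. The file name below is a
# hypothetical local copy -- adjust it to whatever the 20210222 dump
# directory actually provides.

import bz2
import json

DUMP_PATH = "wikidata-20210222-lexemes.json.bz2"  # hypothetical local file

def iter_entities(path):
    """Yield one parsed entity dict per line of the dump."""
    with bz2.open(path, mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line in ("[", "]"):               # skip the enclosing array brackets
                continue
            yield json.loads(line.rstrip(","))   # drop the trailing comma

def main():
    # Toy aggregation: count lexemes per lexical category (a Q-id).
    counts = {}
    for entity in iter_entities(DUMP_PATH):
        if entity.get("type") != "lexeme":
            continue
        category = entity.get("lexicalCategory")
        counts[category] = counts.get(category, 0) + 1
    for category, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(category, n)

if __name__ == "__main__":
    main()
```

Since the dump is read as a stream, memory use stays flat regardless of dump size, although a full pass over the file is of course still required.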