Ladsgroup added subscribers: JAllemandou, Ladsgroup.
Ladsgroup added a comment.
In T208425#5836066 <https://phabricator.wikimedia.org/T208425#5836066>, @jcrespo wrote:

> Thanks a lot for the work on this. May I suggest a step before the next step (after rebuilding): "checking all data, old and new, is consistent"? This is a lot of data, and even with well-thought-out processes, missing rows were discovered after I requested a comparison on another well-thought-out migration; they were caused by mistakes, existing inconsistencies, or aborts. May I request such a step, which could be as fast as a simple join query (<5m to run) between old and new, to check that no rows are missing or extra and that both sides have equivalent data?

We already do this with sqoop and the analytics team (thanks to @JAllemandou). Here's an example <https://analytics.wikimedia.org/published/datasets/one-off/wikidata/addshore/T239470-notebook.html>. That's how we discovered things like T243944: Really large holes in the new term store (again) <https://phabricator.wikimedia.org/T243944>.

TASK DETAIL
  https://phabricator.wikimedia.org/T208425

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ladsgroup
Cc: Ladsgroup, JAllemandou, ArielGlenn, jcrespo, Joe, Gehel, alaa_wmde, Marostegui, Jdforrester-WMF, Addshore, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331
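
For illustration only, the kind of old-versus-new join check requested in the quoted comment could look roughly like the sketch below. It is not the sqoop/notebook process linked above. The table names (old_terms, new_terms), the column names (id, term_text), and the in-memory SQLite database are all assumptions made up for this example; the real term store schema is different.

```python
import sqlite3


def compare_tables(conn, old, new, key, value):
    """Return (missing, extra, mismatched) lists of keys.

    missing:    keys in `old` with no matching row in `new`
    extra:      keys in `new` with no matching row in `old`
    mismatched: keys present in both whose `value` column differs

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    cur = conn.cursor()
    # Anti-join: old rows that found no partner in new.
    missing = [row[0] for row in cur.execute(
        f"SELECT o.{key} FROM {old} o "
        f"LEFT JOIN {new} n ON o.{key} = n.{key} "
        f"WHERE n.{key} IS NULL ORDER BY o.{key}")]
    # Anti-join in the other direction: new rows with no partner in old.
    extra = [row[0] for row in cur.execute(
        f"SELECT n.{key} FROM {new} n "
        f"LEFT JOIN {old} o ON o.{key} = n.{key} "
        f"WHERE o.{key} IS NULL ORDER BY n.{key}")]
    # Inner join: rows on both sides whose data is not equivalent.
    # SQLite's IS NOT is a NULL-safe inequality.
    mismatched = [row[0] for row in cur.execute(
        f"SELECT o.{key} FROM {old} o "
        f"JOIN {new} n ON o.{key} = n.{key} "
        f"WHERE o.{value} IS NOT n.{value} ORDER BY o.{key}")]
    return missing, extra, mismatched


def build_demo():
    """Create a tiny hypothetical old/new pair with known differences."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE old_terms (id INTEGER PRIMARY KEY, term_text TEXT)")
    conn.execute("CREATE TABLE new_terms (id INTEGER PRIMARY KEY, term_text TEXT)")
    conn.executemany("INSERT INTO old_terms VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (3, "c")])
    conn.executemany("INSERT INTO new_terms VALUES (?, ?)",
                     [(1, "a"), (3, "x"), (4, "d")])
    return conn


if __name__ == "__main__":
    demo = build_demo()
    print(compare_tables(demo, "old_terms", "new_terms", "id", "term_text"))
```

In the demo data, id 2 exists only in the old table, id 4 only in the new one, and id 3 carries different text on each side, so each of the three lists should contain exactly one key.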
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
