Ladsgroup added subscribers: JAllemandou, Ladsgroup.
Ladsgroup added a comment.


  In T208425#5836066 <https://phabricator.wikimedia.org/T208425#5836066>, 
@jcrespo wrote:
  
  > Thanks a lot for the work on this. May I suggest a step before the next 
step (after rebuilding) of "checking all data, old and new, is consistent"? 
This is a lot of data, and even in well-thought-out processes, missing rows 
were discovered after I requested a comparison on another well-thought-out 
migration; they were caused by mistakes, existing inconsistencies, and aborts. 
May I request such a step, which could be as fast as a simple join query (<5m 
to run) between old and new, to check that no rows are missing or extra and 
that both have equivalent data?
  
  We already do this with Sqoop and the Analytics team (thanks to @JAllemandou). 
Here's an example: 
<https://analytics.wikimedia.org/published/datasets/one-off/wikidata/addshore/T239470-notebook.html>.
That's how we discovered things like T243944: Really large holes in the new 
term store (again) <https://phabricator.wikimedia.org/T243944>.
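
  For illustration only, the kind of old-versus-new join check requested above 
could look roughly like the sketch below. This is not the Sqoop-based 
comparison linked above; the table names (old_terms, new_terms), the columns, 
and the helper compare_tables are all hypothetical and do not reflect the 
real term store schema. It uses an in-memory SQLite database so that it runs 
on its own.

```python
import sqlite3


def compare_tables(conn, old, new, key, cols):
    """Return (missing, extra, mismatched) key lists for two tables.

    missing:    keys present in `old` but absent from `new`
    extra:      keys present in `new` but absent from `old`
    mismatched: keys present in both whose `cols` values differ
    All names passed in here are hypothetical, trusted identifiers.
    """
    cur = conn.cursor()
    missing = [r[0] for r in cur.execute(
        f"SELECT o.{key} FROM {old} o LEFT JOIN {new} n ON o.{key} = n.{key} "
        f"WHERE n.{key} IS NULL ORDER BY o.{key}")]
    extra = [r[0] for r in cur.execute(
        f"SELECT n.{key} FROM {new} n LEFT JOIN {old} o ON o.{key} = n.{key} "
        f"WHERE o.{key} IS NULL ORDER BY n.{key}")]
    # SQLite's IS NOT is a NULL-safe inequality, so NULL vs. value counts as a diff.
    diff = " OR ".join(f"o.{c} IS NOT n.{c}" for c in cols)
    mismatched = [r[0] for r in cur.execute(
        f"SELECT o.{key} FROM {old} o JOIN {new} n ON o.{key} = n.{key} "
        f"WHERE {diff} ORDER BY o.{key}")]
    return missing, extra, mismatched


def demo():
    """Build two tiny hypothetical tables and compare them."""
    conn = sqlite3.connect(":memory:")
    for table in ("old_terms", "new_terms"):
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, lang TEXT, text TEXT)")
    conn.executemany("INSERT INTO old_terms VALUES (?, ?, ?)",
                     [(1, "en", "cat"), (2, "en", "dog"), (3, "de", "Hund")])
    conn.executemany("INSERT INTO new_terms VALUES (?, ?, ?)",
                     [(1, "en", "cat"), (3, "de", "Katze"), (4, "fr", "chien")])
    return compare_tables(conn, "old_terms", "new_terms", "id", ["lang", "text"])


if __name__ == "__main__":
    missing, extra, mismatched = demo()
    print("missing:", missing, "extra:", extra, "mismatched:", mismatched)
```

  An empty result for all three lists would mean the two tables agree on the 
compared columns. At production scale the same anti-join idea would have to 
run wherever both datasets are available side by side.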

TASK DETAIL
  https://phabricator.wikimedia.org/T208425

To: Ladsgroup
Cc: Ladsgroup, JAllemandou, ArielGlenn, jcrespo, Joe, Gehel, alaa_wmde, 
Marostegui, Jdforrester-WMF, Addshore, Aklapper, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Lydia_Pintscher, Mbch331