Ladsgroup created this task.
Ladsgroup added projects: Wikidata-Campsite (Wikidata-Campsite-Iteration-∞),
Wikidata.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: User-Ladsgroup.
TASK DESCRIPTION
The holes in items make up around 50% of the total number. Extracting them
gives us a 462 MB file, which we can't just feed to the maintenance script in
one go. So I split it into lists covering a million IDs each (from Q2M to Q3M,
for example), put them in a directory under /tmp/ on mwmaint, and wrote this
bash script to go through them one by one:
for file in /tmp/wb_terms_T219123/*
do
    if [[ -f $file ]]; then
        mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php \
            --wiki=wikidatawiki --batch-size=100 --sleep=2 --file="$file"
    fi
done
This is not that bad, but if even one batch fails, the maintenance script
aborts, the rest of that file is skipped, and the loop happily jumps to the
next million:
Rebuilding Q2097701 till Q2097963
[ERROR] commitAndWaitForReplication() timed out, aborting
Done.
Rebuilding Q3000002 till Q3000212
We need to fix this.
My suggestion for now is to redirect all of the output (stdout and stderr) to
a file, then check it and clean up the mess afterwards.
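
A minimal sketch of that suggestion, assuming one log per input file and that
a failed run shows up as an [ERROR] line like the one in the output above.
LOG_DIR and the function names rebuild_file, run_all and failed_files are made
up for illustration; they are not part of Wikibase or mwscript.

```shell
#!/bin/bash
# INPUT_DIR is the directory from the loop above; LOG_DIR is an assumed location.
INPUT_DIR=${INPUT_DIR:-/tmp/wb_terms_T219123}
LOG_DIR=${LOG_DIR:-/tmp/wb_terms_T219123_logs}

# Hypothetical wrapper around the same mwscript call as in the loop above.
rebuild_file() {
    mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php \
        --wiki=wikidatawiki --batch-size=100 --sleep=2 --file="$1"
}

# Run every input file and keep stdout and stderr together in one log per file.
run_all() {
    local file
    mkdir -p "$LOG_DIR"
    for file in "$INPUT_DIR"/*; do
        [[ -f $file ]] || continue
        rebuild_file "$file" > "$LOG_DIR/$(basename "$file").log" 2>&1
    done
}

# Print the name of each input file whose log contains an [ERROR] line.
failed_files() {
    local log
    for log in "$LOG_DIR"/*.log; do
        [[ -f $log ]] || continue
        if grep -q -F '[ERROR]' "$log"; then
            basename "$log" .log
        fi
    done
}
```

Calling run_all and then failed_files would list the files that need another
look. The check relies on the log text rather than the exit status, since the
output above shows "Done." right after the error.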
TASK DETAIL
https://phabricator.wikimedia.org/T242366
WORKBOARD
https://phabricator.wikimedia.org/project/board/3539/
To: Ladsgroup
Cc: Aklapper, Addshore, WMDE-leszek, Ladsgroup, Iflorez, darthmon_wmde,
alaa_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer,
_jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher,
Mbch331