hoo added a comment.

I profiled this in production (on mwdebug1001, using HHVM and dumping the first 40,000 entities from shard 0/4):

Command line:
sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php extensions/Wikidata/extensions/Wikibase/repo/maintenance/dumpJson.php --wiki wikidatawiki --sharding-factor 4 --shard 0 --snippet --limit 40000 --profiler=text
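For reference, --sharding-factor 4 --shard 0 means this run only covers a quarter of the entity id space. Below is a minimal sketch of the usual modulo partitioning scheme; entityBelongsToShard() is an illustrative helper, not necessarily how dumpJson.php actually selects ids:

<?php
function entityBelongsToShard( $numericId, $shardingFactor, $shard ) {
    // Shard k of n handles the ids congruent to k mod n, so n parallel
    // runs together cover every entity exactly once.
    return $numericId % $shardingFactor === $shard;
}

var_dump( entityBelongsToShard( 8, 4, 0 ) ); // bool(true):  dumped by this run
var_dump( entityBelongsToShard( 9, 4, 0 ) ); // bool(false): left to another shard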

Result extract:
43.96% 713097.698 814618 - Wikibase\DataModel\Deserializers\StatementDeserializer::deserialize

Given that our dumps (all shards combined) have an average run time of about 65 hours, we spend almost 29 hours of every run in this function (0.4396 × 65 h ≈ 28.6 h). Even a small saving of 5% within this function would shave roughly 85 minutes (0.05 × 28.6 h ≈ 1.4 h) off each dump run.
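
To make that 5% concrete: 814,618 deserialize() calls for 40,000 entities is roughly 20 calls per entity on average, so identical statement serializations (shared references, repeated qualifier values) plausibly come up more than once. The memoisation sketch below is purely hypothetical — MemoizingStatementDeserializer is not an existing Wikibase class, and whether the duplicate rate is high enough to pay off would need measuring:

<?php
class MemoizingStatementDeserializer {
    private $inner;
    private $cache = [];

    public function __construct( $inner ) {
        // $inner: the real Wikibase\DataModel\Deserializers\StatementDeserializer.
        $this->inner = $inner;
    }

    public function deserialize( array $serialization ) {
        // Identical serializations only pay the deserialization cost once.
        $key = md5( json_encode( $serialization ) );
        if ( !isset( $this->cache[$key] ) ) {
            $this->cache[$key] = $this->inner->deserialize( $serialization );
        }
        // Hand out a copy so callers can't mutate the cached statement
        // (shallow clone; good enough for a sketch).
        return clone $this->cache[$key];
    }
}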


TASK DETAIL
https://phabricator.wikimedia.org/T157013
