Halfak added a comment.

xz does not have the nice built-in support in distributed processing frameworks that bz2 has.
It may be worth reiterating that I am not concerned about compression ratio. The purpose of this task is to make Wikidata JSON dumps easy to process in Hadoop/Spark. A quick reading suggests that Hadoop/Spark has no native support at all for xz.

TASK DETAIL
https://phabricator.wikimedia.org/T115222

To: Halfak
Cc: daniel, hoo, NealMcB, Halfak, Aklapper, Wikidata-bugs, aude, Svick, jeremyb
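
For illustration only, here is a minimal sketch of the line-oriented processing that a bz2 dump allows. It uses only the Python standard library, not Hadoop or Spark. It assumes the dump is a JSON array with one entity per line and a trailing comma after each entity. The names `iter_entities` and `write_sample_dump` and the sample records are hypothetical and are not part of any Wikimedia tooling.

```python
import bz2
import json


def iter_entities(path):
    """Yield one parsed entity per line from a bz2-compressed JSON dump.

    Assumes one JSON object per line, wrapped in a top-level array
    ("[" on the first line, "]" on the last), with trailing commas.
    """
    with bz2.open(path, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            # Drop surrounding whitespace and the trailing comma, if any.
            line = line.strip().rstrip(",")
            # Skip the array brackets and blank lines.
            if line in ("[", "]", ""):
                continue
            yield json.loads(line)


def write_sample_dump(path):
    """Write a tiny, made-up dump in the assumed layout (for demonstration)."""
    sample = (
        "[\n"
        '{"id": "Q1", "type": "item"},\n'
        '{"id": "Q2", "type": "item"}\n'
        "]\n"
    )
    with bz2.open(path, mode="wt", encoding="utf-8") as out:
        out.write(sample)
```

Because each entity sits on its own line, the same per-line parsing step could in principle be applied to each record of a text input in Hadoop or Spark, provided the framework can decompress the file itself.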
