Halfak added a comment.

xz does not have the nice built-in support in distributed processing frameworks
that bz2 has.

It may be worth reiterating that I am not concerned about compression ratio.
The purpose of this task is to make Wikidata JSON dumps easy to process in
Hadoop/Spark.

A quick reading suggests that Hadoop and Spark have no native support at all
for xz.
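
To illustrate what "easy to process" could look like, here is a minimal
sketch of the per-line parsing a worker might do on a bz2-compressed dump. It
assumes the dump is one JSON array with a single entity per line, each line
ending in a comma. The names iter_entities and write_sample_dump, and the
sample entities, are hypothetical and only there to make the sketch
self-contained. It uses Python's standard bz2 module, not Hadoop or Spark.

```python
import bz2
import json
import os
import tempfile


def iter_entities(path):
    """Yield one entity dict per line of a bz2-compressed JSON dump.

    Assumes the file is a JSON array with one entity per line, so each
    line can be parsed on its own once the trailing comma is removed.
    """
    with bz2.open(path, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip().rstrip(",")
            # Skip blank lines and the array brackets on the first/last line.
            if line in ("", "[", "]"):
                continue
            yield json.loads(line)


def write_sample_dump(path):
    """Write a tiny stand-in dump (hypothetical data, not real entities)."""
    entities = [{"id": "Q1", "type": "item"}, {"id": "Q2", "type": "item"}]
    body = "[\n" + ",\n".join(json.dumps(e) for e in entities) + "\n]\n"
    with bz2.open(path, mode="wt", encoding="utf-8") as out:
        out.write(body)


if __name__ == "__main__":
    sample_path = os.path.join(tempfile.mkdtemp(), "sample.json.bz2")
    write_sample_dump(sample_path)
    for entity in iter_entities(sample_path):
        print(entity["id"])
```

Because every entity sits on its own line, the same per-line function could
in principle be applied to each record of a split, which is the property that
makes a line-oriented bz2 dump attractive here.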


TASK DETAIL
  https://phabricator.wikimedia.org/T115222

To: Halfak
Cc: daniel, hoo, NealMcB, Halfak, Aklapper, Wikidata-bugs, aude, Svick, jeremyb


