Mitar added a comment.

  OK, so it seems the problem is in pbzip2: it cannot decompress in 
parallel unless the file was also compressed with pbzip2. lbzip2, however, 
decompresses both kinds of file in parallel.
  
  See:
  
    $ time bunzip2 -c -k latest-lexemes.json.bz2 > /dev/null
    
    real        1m0.101s
    user        0m59.912s
    sys         0m0.180s
    $ time pbzip2 -d -k -c latest-lexemes.json.bz2 > /dev/null
    
    real        0m57.662s
    user        0m57.792s
    sys         0m0.180s
    $ time lbunzip2 -c -k latest-lexemes.json.bz2 > /dev/null
    
    real        0m13.346s
    user        1m35.951s
    sys         0m2.342s
    $ lbunzip2 -c -k latest-lexemes.json.bz2 > serial.json
    $ pbzip2 -z < serial.json > parallel.json.bz2
    $ time lbunzip2 -c -k parallel.json.bz2 > /dev/null
    
    real        0m16.270s
    user        1m43.004s
    sys         0m2.262s
    $ time pbzip2 -d -c -k parallel.json.bz2 > /dev/null
    
    real        0m17.324s
    user        1m52.946s
    sys         0m0.659s
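
  One way to read these timings is to divide CPU time (user + sys) by 
wall-clock time (real): a ratio near 1 suggests a single busy core, a larger 
ratio suggests parallel work. The sketch below does that arithmetic for two 
of the runs above; the helper names to_seconds and cpu_ratio are made up for 
this illustration and are not part of any of the tools.

```shell
# Convert a `time`-style duration such as "1m35.951s" to seconds.
to_seconds() {
    printf '%s\n' "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print (user + sys) / real, i.e. roughly how many cores were kept busy.
# Arguments: real, user, sys (in the format printed by `time`).
cpu_ratio() {
    LC_ALL=C awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" \
        -v s="$(to_seconds "$3")" 'BEGIN { printf "%.1f\n", (u + s) / r }'
}

cpu_ratio 0m57.662s 0m57.792s 0m0.180s   # pbzip2 -d on latest-lexemes.json.bz2
cpu_ratio 0m13.346s 1m35.951s 0m2.342s   # lbunzip2 on the same file
```

  For the numbers above this prints 1.0 for pbzip2 and 7.4 for lbunzip2, 
which matches the observation that pbzip2 ran effectively single-threaded on 
the original file.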
  
  The two files are very similar in size:
  
    $ ll parallel.json.bz2 latest-lexemes.json.bz2 
    -rw-rw-r-- 1 mitar mitar 168657719 Jun 15 20:36 latest-lexemes.json.bz2
    -rw-rw-r-- 1 mitar mitar 168840138 Jun 20 07:35 parallel.json.bz2
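
  A possible explanation, not verified here: as far as I understand it, 
pbzip2 compresses the input in chunks and writes each chunk as an 
independent bzip2 stream, so its output is a concatenation of many streams, 
and its parallel decompressor relies on finding those stream boundaries. A 
file written by plain bzip2 is a single stream, so there is nothing for 
pbzip2 to split. The sketch below counts byte-aligned stream headers as a 
rough check of that assumption. The function name count_bz2_streams is 
hypothetical, it assumes a grep that supports -a and -o (such as GNU grep), 
and a chance match inside compressed data is unlikely but possible.

```shell
# Count byte-aligned bzip2 stream headers in a file: "BZh", a level digit
# 1-9, then the block magic "1AY&SY" (0x314159265359).
# Heuristic only: this does not parse the file.
count_bz2_streams() {
    LC_ALL=C grep -a -o 'BZh[1-9]1AY&SY' "$1" | wc -l | tr -d ' '
}

# Example (file names from the listing above):
#   count_bz2_streams latest-lexemes.json.bz2
#   count_bz2_streams parallel.json.bz2
```

  If the assumption holds, the first file should report a single header and 
the pbzip2-compressed one many.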

TASK DETAIL
  https://phabricator.wikimedia.org/T222985

To: Mitar
Cc: Mitar, ImreSamu, hoo, Smalyshev, ArielGlenn, Liuxinyu970226, bennofs, 
Invadibot, maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331