Arkanosis added a comment.
In T179681#3736916, @Lucas_Werkmeister_WMDE wrote: I ran the conversion directly from the ttl.gz file
Interesting, I couldn’t get that to work and had to pipe gunzip output into the program.
Interesting, indeed… Could it be that you added the -f ttl flag afterwards? I couldn't get it to accept a gzip file as input without this flag (I assume it does file format detection based on the file extension).
Also, I had to install zlib-devel to get rdfhdt to compile on a CentOS 6 container — there might be some non-zlib-enabled build on Debian that isn't available on RedHat.
I also tried converting the latest dump, and since I don’t have access to any system with that much RAM, I thought I could perhaps trade some execution time for swap space. Bad idea :) the process got through 20% of the input file and then slowed to a crawl, at data rates of single-digit kilobytes per second. It would’ve taken half a year to finish at that rate.
Thanks for testing! That would have required a hell of a lot of swap space anyway. Easy to set up for whoever does this on a regular basis, but for casual needs, I've never seen a machine with 200+ GiB of swap space.
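As a rough sanity check on the "half a year" figure quoted above, here is a minimal sketch of the arithmetic. The thread gives neither the remaining input size nor the exact data rate, so the 80 GB and 5 kB/s below are purely hypothetical values chosen for illustration, and `eta_days` is a made-up helper name.

```python
def eta_days(remaining_bytes: float, bytes_per_second: float) -> float:
    """Days needed to process remaining_bytes at a constant throughput."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    seconds = remaining_bytes / bytes_per_second
    return seconds / 86_400  # 86,400 seconds in a day


# Hypothetical numbers, NOT taken from the thread:
# 80 GB of input still to read at a steady 5 kB/s.
print(round(eta_days(80e9, 5e3)), "days")
```

With those assumed inputs the estimate comes out at about six months, so a single-digit kB/s rate does put a dump of that order of magnitude out of reach once the process starts swapping.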
But FWIW, here’s the command I used, with a healthy dose of systemd sandboxing since it’s a completely unknown program I’m running:
<snip>
Thanks for sharing the sandboxing bits! :-)
Cc: Addshore, Smalyshev, Ladsgroup, Arkanosis, Tarrow, Lucas_Werkmeister_WMDE, Aklapper, Lahi, GoranSMilovanovic, QZanden, Wikidata-bugs, aude, Mbch331
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
