Nikerabbit created this task.
Nikerabbit added projects: Wikibase-Containers, User-Nikerabbit.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION

  Symptoms:
  - The script uses all available CPU for hours without producing any output.

  Steps to reproduce:
  1. Install wdqs using wikibase-docker (version 0.3.10)
  2. docker-compose exec wdqs mkdir -p data/split
  3. time docker-compose exec wdqs curl -L https://nimiarkisto.fi/dumps/nimiarkisto.fi-CC-BY-4.0_2020-09-09.rdf.bz2 -o data/dump.rdf.bz2
  4. time docker-compose exec wdqs ./munge.sh -c 50000 -f data/dump.rdf.bz2 -d data/split -l en,fi,sv -s

  I have checked that this is not just slow. With the Wikidata Lexemes dump, the script does write output to the log and to the split files. With the Nimiarkisto dump, I only get:

    root@nimiarkisto-qs:~/nimiarkisto-qs# time docker-compose exec wdqs ./munge.sh -c 5000 -f data/dump.rdf -d data/split -l en,fi,sv -s
    #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
    10:03:13.441 [main] INFO org.wikidata.query.rdf.tool.Munge - Switching to data/split/wikidump-000000001.ttl.gz
    ^C

  The file `data/split/wikidump-000000001.ttl.gz` contains no output.

  Is it possible to enable more verbose logging to debug this further?

TASK DETAIL
  https://phabricator.wikimedia.org/T263427

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Nikerabbit
Cc: Aklapper, Nikerabbit, Samantha_Alipio_WMDE, Akuckartz, darthmon_wmde, Jelabra, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Asahiko, despens, Wikidata-bugs, aude, Nemo_bis, Addshore, Mbch331
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs