bking created this task.
bking added projects: Wikidata-Query-Service, Data-Platform-SRE.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION

Blazegraph (the application that serves WDQS) stores all its data in a single JNL file. The WDQS file is very large (~1.2TB), so moving it on and off the hosts tends to be difficult (see T344732 <https://phabricator.wikimedia.org/T344732> and this blog post <https://addshore.com/2023/08/wikidata-query-service-blazegraph-jnl-file-on-cloudflare-r2-and-internet-archive/>).

We've had to do this more than once, and my general rule is that if you have to do something more than twice, you need to automate it.

Creating this ticket to:
- Document the process of extracting a JNL file from a wdqs host
- Solicit feedback from co-workers/community members, and make a decision on whether to automate this process

Note that this **does not** mean we'll constantly run this process like we do for the TTL dumps. It only means that we'll have a ready-made script that starts with a JNL file on a wdqs server and ends with a file that can be publicly downloaded.

TASK DETAIL
https://phabricator.wikimedia.org/T347605

To: bking
Cc: Aklapper, bking, AWesterinen, BTullis, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
_______________________________________________ Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org