Topic: I removed semantic-web and guile, and added wikidata-tech. Let's move the conversation to wikidata-tech@.
Please remove [email protected] next time you reply.

On Sun, Dec 22, 2019 at 23:35, Ted Thibodeau Jr <[email protected]> wrote:
>
>
> On Dec 22, 2019, at 03:17 PM, Amirouche Boubekki
> <[email protected]> wrote:
> >
> > Hello all ;-)
> >
> >
> > I ported the code to Chez Scheme to do an apples-to-apples comparison
> > between GNU Guile and Chez and took the time to launch a few queries
> > against Virtuoso available in Ubuntu 18.04 (LTS).
>
> Hi, Amirouche --
>
> Kingsley's points about tuning Virtuoso to use available
> RAM [1] and other system resources are worth looking into,
> but a possibly more important first question is --
>
> Exactly what version of Virtuoso are you testing?
>
> If you followed the common script on Ubuntu 18.04, i.e., --
>
>    sudo apt update
>
>    sudo apt install virtuoso-opensource
>
> -- then you likely have version 6.1.6 of VOS, the Open Source
> Edition of Virtuoso, which shipped 2012-08-02 [2], and is far
> behind the latest version of both VOS (v7.2.5+) and Enterprise
> Edition (v8.3+)!
>
> The easiest way to confirm what you're running is to review
> the first "paragraph" of output from the command corresponding
> to the name of your Virtuoso binary --
>
>    virtuoso-t -?

$ virtuoso-t -?
Virtuoso Open Source Edition (multi threaded)
Version 6.1.6.3127-pthreads as of Feb 6 2018

>
>    virtuoso-iodbc-t -?
>

I do not have that command. I use isql-vt:

$ isql-vt --help
OpenLink Interactive SQL (Virtuoso), version 0.9849b.

> If I'm right, and you're running 6.x, you'll get much better
> test results just by running a current version of Virtuoso.
>
> You can build VOS 7.2.6+ from source [3] (we'd recommend the
> develop/7 branch [4] for the absolute latest), or download a
> precompiled binary [5] of VOS 7.2.5.1 or 7.2.6.dev.
>
> You can also try Enterprise Edition at no cost for 30 days [5].
>

Next round I will try the develop branch.

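For that next round, it would help to time both engines the same way, from the client side, instead of relying on timings reported by the engine itself. Below is a minimal sketch of such a harness, assuming the engine under test answers SPARQL queries over HTTP (an assumption: nomunofu's interface may differ, in which case only the `run` callable changes). The function names are illustrative and do not come from either project.

```python
import statistics
import time
import urllib.parse
import urllib.request


def run_sparql_over_http(endpoint, query, timeout=60):
    """Send one SPARQL query over HTTP GET and read the whole response.

    Reading the full body includes the cost of serializing the results,
    which an engine-reported timing may leave out.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def time_runs(run, repeats=5):
    """Call run() `repeats` times; return wall-clock durations in ms."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations):
    """Separate the first (cold) run from the later (warm) runs."""
    warm = durations[1:]
    return {
        "cold_ms": durations[0],
        "warm_median_ms": statistics.median(warm) if warm else None,
    }
```

A run against a local Virtuoso could then look like `summarize(time_runs(lambda: run_sparql_over_http("http://localhost:8890/sparql", query)))`, where the endpoint URL is an assumption to adjust to the actual setup. Reporting the cold run apart from the warm median keeps the cache effect visible instead of averaging it away.
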
Like I said previously, those benchmarks must be taken with a grain of salt. First, the Virtuoso timings are reported by Virtuoso itself. Second, on the nomunofu side, I do not convert the internal representation into the external representation. Third, and most important, this is just a glimpse of the full picture.

My emails are mainly trying to spark some interest or discussion with Wikidata and Wikimedia, so that I can work full time on this. I already described my intent: to create a benchmark tool based on the Wikidata SPARQL logs [*], then use it to realistically benchmark Virtuoso, the current solution, and a new solution (nomunofu) that I am working on.

[*] https://iccl.inf.tu-dresden.de/web/Wissensbasierte_Systeme/WikidataSPARQL/en

Raw benchmarks would not tell the whole truth, because nomunofu can rely on both WiredTiger and FoundationDB, which, as far as I know, claim stronger guarantees than Virtuoso. The only way to know whether Virtuoso is comparable to FoundationDB or WiredTiger would be for Virtuoso to pass the Jepsen tests (https://jepsen.io/).

I have not put all my eggs in one basket; I am considering other options. But I think working for Wikimedia, on contract or in a permanent position, would be best overall.

I will make another WDQS proposal, based on feedback I was given on IRC, with more technical details (and an improved roadmap).

>
> [1] http://vos.openlinksw.com/owiki/wiki/VOS/VirtRDFPerformanceTuning
>
> [2] http://vos.openlinksw.com/owiki/wiki/VOS/VOSNews2012#2012-08-02%20--%20Announcing%20Virtuoso%20Open-Source%20Edition%20v6.1.6.
>
> [3] http://vos.openlinksw.com/owiki/wiki/VOS/VOSBuild
>
> [4] https://github.com/openlink/virtuoso-opensource/tree/develop/7
>
> [5] https://sourceforge.net/projects/virtuoso/files/virtuoso/
>
>
>
> >
> > Spoiler: the new code is always faster.
> >
> > The hard disk is SATA, and the CPU is dubbed: Intel(R) Xeon(R) CPU
> > E3-1220 V2 @ 3.10GHz
> >
> > I imported latest-lexeme.nt (6GB) using guile-nomunofu, chez-nomunofu
> > and Virtuoso:
> >
> > - Chez takes 40 minutes to import 6GB
> > - Chez is 3 to 5 times faster than Guile
> > - Chez is 11% faster than Virtuoso
>
>
> How did you load the data? Did you use Virtuoso's bulk-load
> facilities? This is the recommended method [6].
>
> [6] http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader
>
>
> > Regarding query time, Chez is still faster than Virtuoso with or
> > without cache. The query I am testing is the following:
> >
> > SELECT ?s ?p ?o
> > FROM <http://fu>
> > WHERE {
> >   ?s <http://purl.org/dc/terms/language> <http://www.wikidata.org/entity/Q150> .
> >   ?s <http://wikiba.se/ontology#lexicalCategory> <http://www.wikidata.org/entity/Q1084> .
> >   ?s <http://www.w3.org/2000/01/rdf-schema#label> ?o
> > };
> >
> > Virtuoso first query takes: 1295 msec.
> > The second query takes: 331 msec.
> > Then it stabilizes around: 200 msec.
> >
> > chez nomunofu takes around 200ms without cache.
> >
> > There is still an optimization I can do to speed up nomunofu a little.
> >
> >
> > Happy hacking!
>
>
> I'll be interested to hear your new results, with a current build,
> and with proper INI tuning in place.

What will be the INI options I need to use? Thanks!

>
> Regards,
>
> Ted
>
>
>
> --
> A: Yes.                      http://www.idallen.com/topposting.html
> | Q: Are you sure?
> | | A: Because it reverses the logical flow of conversation.
> | | | Q: Why is top posting frowned upon?
>
> Ted Thibodeau, Jr.           // voice +1-781-273-0900 x32
> Senior Support & Evangelism  // mailto:[email protected]
>                              // http://twitter.com/TallTed
> OpenLink Software, Inc.      // http://www.openlinksw.com/
> 20 Burlington Mall Road, Suite 322, Burlington MA 01803
> Weblog    -- http://www.openlinksw.com/blogs/
> Community -- https://community.openlinksw.com/
> LinkedIn  -- http://www.linkedin.com/company/openlink-software/
> Twitter   -- http://twitter.com/OpenLink
> Facebook  -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
>
>
>

Regards,

Amirouche ~ zig ~ https://hyper.dev
_______________________________________________
Wikidata mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata
