On 1/12/23 3:39 AM, Larry Gonzalez wrote:
Dear Kingsley,

Let me start by saying that I appreciate the effort of loading the complete Wikidata dataset into a graph database and making a SPARQL endpoint available. I know it is not an easy task.

I just tried out the new Virtuoso-hosted SPARQL endpoint with some queries. My experiments are not exhaustive at all, but I wanted to raise two concerns that I detected.

Consider a (very simple) query that counts all humans:

'''
SELECT (count(?human) as ?c) WHERE { ?human wdt:P31 wd:Q5 . }
'''

I get a result of 10396057, which is OK considering the dataset that you are using.

But if we try to export all instances of human (to a TSV file) with the following query:

'''
SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . }
'''

then I only get 100000 results. Is there a limit on the number of results that a query can return?
Yes, because these services are primarily for ad-hoc querying rather than wholesale data exports. If you want to export massive amounts of data then you can do so using OFFSET and LIMIT.
Alternatively, you can instantiate your own instance in the Azure or AWS cloud and use it as you see fit.
As with our DBpedia service, there's a server-side configuration in place for enforcing a "fair use" policy :)
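The OFFSET and LIMIT approach mentioned above can be sketched roughly as follows. This is only an illustration, not an official client: the helper name paged_queries and the page size are assumptions (the page size simply mirrors the 100000-row cap reported in this thread), and the wdt:/wd: prefixes are assumed to be predefined on the endpoint, as they are in the queries quoted here. The sketch only builds the query strings and sends nothing.

```python
from typing import Iterator

# Query taken from this thread; prefixes are assumed to be predefined server-side.
BASE_QUERY = "SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . }"

# Assumption: one page per request, sized to the cap observed in this thread.
PAGE_SIZE = 100_000


def paged_queries(base_query: str, total: int, page_size: int = PAGE_SIZE) -> Iterator[str]:
    """Yield one query per page, each with its own LIMIT and OFFSET (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total, page_size):
        yield f"{base_query} LIMIT {page_size} OFFSET {offset}"


# 10396057 is the count reported by the COUNT query in this thread.
queries = list(paged_queries(BASE_QUERY, total=10_396_057))
print(len(queries))
print(queries[0])
print(queries[-1])
```

One caveat: without an ORDER BY, SPARQL does not promise that rows come back in the same order from one request to the next, so pages fetched this way may overlap or miss rows. Whether that matters in practice on this endpoint is something to verify rather than assume.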
Furthermore, if we want to get all humans ordered by ID, then the endpoint times out. The following is the query:

'''
SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . } ORDER BY DESC(?human)
'''
If you set the query timeout to a value over 1000 msecs, the Virtuoso Anytime Query feature will provide you with a partial solution, which you can use in conjunction with OFFSET and LIMIT to create an interactive cursor (or scrollable cursor). Beyond that, it's back to the "fair use" policy and the option to instantiate your own service-specific instance using our cloud offerings.
Regards, Kingsley
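For readers who want to try the timeout-plus-paging combination described above, here is a minimal sketch of how such a request URL might be assembled. It rests on assumptions: the function name build_request_url is hypothetical, and the request parameter name "timeout" (in milliseconds) reflects my understanding of how the Virtuoso /sparql endpoint accepts it, so please check it against the endpoint's own form before relying on it. The code builds and inspects a URL only; it does not contact the server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint URL as given in the announcement quoted in this thread.
ENDPOINT = "https://wikidata.demo.openlinksw.com/sparql"


def build_request_url(query: str, limit: int, offset: int, timeout_ms: int = 30_000) -> str:
    """Build a GET URL for one page of results (hypothetical helper).

    Assumption: the endpoint reads a "timeout" parameter in milliseconds.
    The reply above says the value must exceed 1000 msecs for partial results.
    """
    if timeout_ms <= 1000:
        raise ValueError("timeout_ms must be over 1000")
    params = {
        "query": f"{query} LIMIT {limit} OFFSET {offset}",
        "timeout": timeout_ms,
    }
    return f"{ENDPOINT}?{urlencode(params)}"


# The ordered query from this thread, requested one small page at a time.
ORDERED_QUERY = "SELECT ?human WHERE { ?human wdt:P31 wd:Q5 . } ORDER BY DESC(?human)"
url = build_request_url(ORDERED_QUERY, limit=1000, offset=0)

# Decode the URL again to show what would be sent.
sent = parse_qs(urlsplit(url).query)
print(sent["timeout"][0])
print(sent["query"][0])
```

Stepping the offset argument forward by the page size on each call gives the scrollable-cursor behaviour described in the reply, with the caveat that a partial solution may not contain every matching row.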
Thank you again for all your efforts. I am looking forward to seeing how this new endpoint works. :)

Are you planning to update the dataset regularly?

All the best!
Larry
https://iccl.inf.tu-dresden.de/web/Larry_Gonzalez

On 11.01.23 21:51, Kingsley Idehen via Wikidata wrote:

All,

We are pleased to announce the immediate availability of a new Virtuoso-hosted Wikidata instance based on the most recent datasets. This instance comprises 17 billion+ RDF triples.

Host Machine Info:

Item    Value
CPU     2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Cores   24
Memory  378 GB
SSD     4x Crucial M4 SSD 500 GB

Cloud-related costs for a self-hosted variant, assuming:

 * dedicated machine for 1 year without upfront costs
 * 128 GiB memory
 * 16 cores or more
 * 512 GB SSD for the database
 * 3T outgoing internet traffic (based on our DBpedia statistics)

vendor  machine type   memory   vCPUs  monthly machine  monthly disk  monthly network  monthly total
Amazon  r5a.4xlarge    128 GiB  16     $479.61          $55.96        $276.48          $812.05
Google  e2-highmem-16  128 GiB  16     $594.55          $95.74        $255.00          $945.30
Azure   D32a           128 GiB  32     $769.16          $38.40        $252.30          $1,060.06

SPARQL Query and Full Text Search service endpoints:

 * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services Endpoint
 * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing

Additional Information

 * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source - Announcements - OpenLink Software Community (openlinksw.com) <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>

Happy New Year!
--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
        : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this

_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/TI7U5Q6ZBEEPCNSTZ2KYLEXEDO4E4GMG/
To unsubscribe send an email to [email protected]
_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/I5BQ5ORIAENKE5RTJWM4JSAL52JXWP3F/
To unsubscribe send an email to [email protected]
_______________________________________________ Wikidata mailing list -- [email protected] Public archives at https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/SPZNSYFEVRK6NYA5YO7ORZDA4EHSP37R/ To unsubscribe send an email to [email protected]
