Dear Kingsley,

Let me start by saying that I appreciate the effort of loading the complete Wikidata dataset into a graph database and making a SPARQL endpoint available. I know it is not an easy task.

I just tried out the new Virtuoso-hosted SPARQL endpoint with a few queries. My experiments are by no means exhaustive, but I wanted to raise two concerns that I noticed.

Consider a (very simple) query that counts all humans:

'''
SELECT (count(?human) as ?c)
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
'''

I get a result of 10396057, which is OK considering the dataset that you are using.

But if we try to export all instances of human (to a TSV file) with the following query:

'''
SELECT ?human
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
'''

Then I only get 100000 results. Is there a limit on the number of results that a query can return?
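
If the cap applies per response, one possible workaround would be to page through the results with LIMIT and OFFSET. The following is only a minimal Python sketch under that assumption: the page size of 100000 is taken from the number of rows I received, and `fetch_page` is a hypothetical callable standing in for whatever HTTP client sends the query to the endpoint and returns the rows of one page. Note that without an ORDER BY clause SPARQL does not guarantee a stable row order between requests, so pages could overlap or miss rows.

```python
from typing import Callable, Iterator, List

# Assumption: matches the 100000-row cap observed in the export above.
PAGE_SIZE = 100_000

# Same query as above, with LIMIT/OFFSET appended for paging.
QUERY_TEMPLATE = """
SELECT ?human
WHERE
{{
  ?human wdt:P31 wd:Q5 .
}}
LIMIT {limit}
OFFSET {offset}
"""


def build_page_query(offset: int, limit: int = PAGE_SIZE) -> str:
    """Return the query text for one page starting at `offset`."""
    return QUERY_TEMPLATE.format(limit=limit, offset=offset)


def iter_all_rows(
    fetch_page: Callable[[str], List[str]],
    limit: int = PAGE_SIZE,
) -> Iterator[str]:
    """Yield rows page by page until a page comes back shorter than `limit`.

    `fetch_page` is a hypothetical helper: it receives the query text and
    returns the rows of that page as a list of strings.
    """
    offset = 0
    while True:
        rows = fetch_page(build_page_query(offset, limit))
        yield from rows
        if len(rows) < limit:
            break
        offset += limit
```

With 10396057 matching rows, this would need about 104 requests of 100000 rows each.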


Furthermore, if we want to get all humans ordered by ID, the endpoint times out. The query is the following:

'''
SELECT ?human
WHERE
{
  ?human wdt:P31 wd:Q5 .
}
ORDER BY DESC(?human)
'''
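
If the timeout comes from the server-side sort, one option might be to fetch the rows unsorted and sort them on the client. A minimal Python sketch, assuming the rows are full entity IRIs of the form `http://www.wikidata.org/entity/Q…`; both function names are hypothetical. As far as I understand the SPARQL 1.1 specification, ORDER BY compares IRIs as strings, so the first function mimics the query above, while the second sorts by the numeric part of the ID, which may be closer to what one actually wants.

```python
from typing import Iterable, List


def sort_like_order_by_desc(iris: Iterable[str]) -> List[str]:
    """Sort IRIs in descending string order, as ORDER BY DESC(?human) would."""
    return sorted(iris, reverse=True)


def sort_by_numeric_id_desc(iris: Iterable[str]) -> List[str]:
    """Sort IRIs by the integer after the final '/Q', largest first."""
    return sorted(
        iris,
        key=lambda iri: int(iri.rsplit("/Q", 1)[1]),
        reverse=True,
    )
```

The two orders differ: as strings, Q5 sorts after Q42 and Q100, whereas numerically Q100 is the largest.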

Thank you again for all your efforts. I am looking forward to seeing how this new endpoint works. :)

Are you planning to update the dataset regularly?

All the best!
Larry

https://iccl.inf.tu-dresden.de/web/Larry_Gonzalez



On 11.01.23 21:51, Kingsley Idehen via Wikidata wrote:
All,

We are pleased to announce the immediate availability of a new Virtuoso-hosted Wikidata instance based on the most recent datasets. This instance comprises 17 billion+ RDF triples.

Host Machine Info:

Item    Value
CPU     2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Cores   24
Memory  378 GB
SSD     4x Crucial M4 SSD 500 GB


Cloud related costs for a self-hosted variant, assuming:

  * dedicated machine for 1 year without upfront costs
  * 128 GiB memory
  * 16 cores or more
  * 512GB SSD for the database
  * 3T outgoing internet traffic (based on our DBpedia statistics)


vendor  machine type  memory   vCPUs  monthly machine  monthly disk  monthly network  monthly total
Amazon  r5a.4xlarge   128 GiB  16     $479.61          $55.96        $276.48          $812.05
Google  e2highmem-16  128 GiB  16     $594.55          $95.74        $255.00          $945.30
Azure   D32a          128 GiB  32     $769.16          $38.40        $252.30          $1,060.06


SPARQL Query and Full Text Search service endpoints:

  * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services Endpoint
  * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing


Additional Information

  * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source - Announcements - OpenLink Software Community (openlinksw.com)
    <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>


Happy New Year!

--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Home Page:http://www.openlinksw.com
Community Support:https://community.openlinksw.com
Weblogs (Blogs):
Company Blog:https://medium.com/openlink-software-blog
Virtuoso Blog:https://medium.com/virtuoso-blog
Data Access Drivers 
Blog:https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog:https://medium.com/@kidehen
Legacy Blogs:http://www.openlinksw.com/blog/~kidehen/
               http://kidehen.blogspot.com

Profile Pages:
Pinterest:https://www.pinterest.com/kidehen/
Quora:https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter:https://twitter.com/kidehen
Google+:https://plus.google.com/+KingsleyIdehen/about
LinkedIn:http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal:http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
         
:http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/TI7U5Q6ZBEEPCNSTZ2KYLEXEDO4E4GMG/
To unsubscribe send an email to [email protected]
