Ah, thanks Jerven.  How do you deal with HTTP-layer query timeouts?  Are
you able to predict them for certain common queries rather than waiting
for the timeout to hit?  S

On Fri, Jan 13, 2023 at 4:02 AM Jerven Tjalling Bolleman
<[email protected]> wrote:

> Hi All,
>
> Regarding these "fair use" settings: they are tunable and may be turned off,
> so the specific values that OpenLink uses may or may not apply if Wikidata
> were to host a Virtuoso instance itself.
>
> E.g. for sparql.uniprot.org you are unlikely to run into these limits (as
> the values are set very high indeed)
> and are more likely to suffer from settings around the HTTP layer that
> limit query run time due to connection issues.
>
> Regards,
> Jerven
>
> On 1/12/23 11:45 PM, Kingsley Idehen via Wikidata wrote:
>
>
> On 1/12/23 3:39 AM, Larry Gonzalez wrote:
>
> Dear Kingsley,
>
> Let me start by saying that I appreciate the effort of loading the
> complete Wikidata dataset into a graph database and making a SPARQL
> endpoint available. I know it is not an easy task.
>
> I just tried out the new Virtuoso-hosted SPARQL endpoint with some
> queries. My experiments are not exhaustive at all, but I wanted to
> raise two concerns that I noticed.
>
> Consider a (very simple) query that counts all humans:
>
> '''
> SELECT (count(?human) as ?c)
> WHERE
> {
>   ?human wdt:P31 wd:Q5 .
> }
> '''
>
> I get a result of 10396057, which is OK considering the dataset that you
> are using.
>
> But if we try to export all instances of human (to a TSV file) with the
> following query:
>
> '''
> SELECT ?human
> WHERE
> {
>   ?human wdt:P31 wd:Q5 .
> }
> '''
>
> Then I only get 100000 results. Is there a limit on the number of
> results that a query can return?
>
>
>
> Yes, because these services are primarily for ad-hoc querying rather than
> wholesale data exports. If you want to export massive amounts of data then
> you can do so using OFFSET and LIMIT.
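
The OFFSET/LIMIT approach suggested above can be sketched as a small query generator. This is a minimal illustration, not an official recipe: the function name `paged_queries` is hypothetical, the page size of 100000 is simply the result cap observed earlier in this thread, and the total of 10396057 is the count reported above. No request is sent; the code only builds the query strings. Note that without ORDER BY, SPARQL does not guarantee a stable row order across pages, and the ORDER BY variant is reported in this thread to time out on the full result set.

```python
from typing import Iterator

# Query text taken from the thread; the wdt:/wd: prefixes are assumed to be
# predefined by the endpoint, as they are not declared in the original query.
BASE_QUERY = """SELECT ?human
WHERE
{
  ?human wdt:P31 wd:Q5 .
}"""


def paged_queries(base_query: str, total: int, page_size: int) -> Iterator[str]:
    """Yield one query per page by appending LIMIT and OFFSET to base_query.

    Hypothetical helper for illustration only.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total, page_size):
        yield f"{base_query}\nLIMIT {page_size}\nOFFSET {offset}"


# 10396057 is the count reported in the thread; 100000 is the observed cap.
queries = list(paged_queries(BASE_QUERY, total=10396057, page_size=100000))
print(len(queries))
print(queries[-1])
```

Each generated string can then be submitted to the endpoint one page at a time, ideally with a pause between requests to stay within the "fair use" policy.
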
>
> Alternatively, you can instantiate your own instance in the Azure or AWS
> cloud and use as you see fit.
>
> As with our DBpedia service, there's a server-side
> configuration in place for enforcing a "fair use" policy :)
>
>
>
>
> Furthermore, if we want to get all humans ordered by id, then the endpoint
> times out. The following is the query:
>
> '''
> SELECT ?human
> WHERE
> {
>   ?human wdt:P31 wd:Q5 .
> }
> ORDER BY DESC(?human)
> '''
>
>
>
> If you set the query timeout to a value over 1000 msecs, the Virtuoso
> Anytime Query feature will provide you with a partial solution, which you
> can use in conjunction with OFFSET and LIMIT to create an interactive
> cursor (or scrollable cursor). Beyond that, it's back to the "fair use"
> policy and the option to instantiate your own service-specific instance
> using our cloud offerings.
>
>
> Regards,
>
> Kingsley
>
>
>
> Thank you again for all your efforts. I am looking forward to seeing how
> this new endpoint works :)
>
> Are you planning to update the dataset regularly?
>
> All the best!
> Larry
>
> https://iccl.inf.tu-dresden.de/web/Larry_Gonzalez
>
>
>
> On 11.01.23 21:51, Kingsley Idehen via Wikidata wrote:
>
> All,
>
> We are pleased to announce immediate availability of a new
> Virtuoso-hosted Wikidata instance based on the most recent datasets. This
> instance comprises 17 billion+ RDF triples.
>
> Host Machine Info:
>
> Item     Value
> CPU      2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
> Cores    24
> Memory   378 GB
> SSD      4x Crucial M4 SSD 500 GB
>
>
> Cloud related costs for a self-hosted variant, assuming:
>
>   * dedicated machine for 1 year without upfront costs
>   * 128 GiB memory
>   * 16 cores or more
>   * 512GB SSD for the database
>   * 3T outgoing internet traffic (based on our DBpedia statistics)
>
>
> vendor  machine type  memory   vCPUs  monthly machine  monthly disk  monthly network  monthly total
> Amazon  r5a.4xlarge   128 GiB  16     $479.61          $55.96        $276.48          $812.05
> Google  e2highmem-16  128 GiB  16     $594.55          $95.74        $255.00          $945.30
> Azure   D32a          128 GiB  32     $769.16          $38.40        $252.30          $1,060.06
>
>
> SPARQL Query and Full Text Search service endpoints:
>
>   * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query Services
>     Endpoint
>   * https://wikidata.demo.openlinksw.com/fct -- Faceted Search & Browsing
>
> Additional Information
>
>   * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source -
>     Announcements - OpenLink Software Community (openlinksw.com)
>     <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>
>
>
> Happy New Year!
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Home Page:http://www.openlinksw.com
> Community Support:https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog:https://medium.com/openlink-software-blog
> Virtuoso Blog:https://medium.com/virtuoso-blog
> Data Access Drivers Blog:
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog:https://medium.com/@kidehen
> Legacy Blogs:http://www.openlinksw.com/blog/~kidehen/
>                http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest:https://www.pinterest.com/kidehen/
> Quora:https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter:https://twitter.com/kidehen
> Google+:https://plus.google.com/+KingsleyIdehen/about
> LinkedIn:http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal:http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
> :
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
> _______________________________________________
> Wikidata mailing list -- [email protected]
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/TI7U5Q6ZBEEPCNSTZ2KYLEXEDO4E4GMG/
> To unsubscribe send an email to [email protected]
>
>


-- 
Samuel Klein          @metasj           w:user:sj          +1 617 529 4266
_______________________________________________
Wikidata mailing list -- [email protected]
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/26CGOE5LZESW5Q5ADJP4Z3ZDX6MU6SBT/
To unsubscribe send an email to [email protected]
