Re: [Wikidata-tech] Thoughts on (not) exposing a SPARQL endpoint

Daniel Kinzler Wed, 11 Mar 2015 03:24:29 -0700

On 11.03.2015 at 10:08, Markus Krötzsch wrote:
> What I don't see is how the use of a WDQ API on top of SPARQL would make the
> overall setup any less vulnerable; it mainly introduces an additional component
> on top of SPARQL, and we can have a simpler SPARQL-based filter component there
> if we want, which is likely to be more effective in controlling usage.


I disagree on both points: I believe it would be neither simpler, nor more
effective. That's pretty much the core of it.

However, I admit that this is currently a gut feeling, a concern I want to share
and discuss. It should be investigated before making a decision.

> There is a huge cost to
> designing a query API from scratch, and I would really like to avoid this.

Which is why I want to use one that already exists (WDQ), and back it by
something that already exists (SPARQL).

> Supporting WDQ on top of SPARQL would retain WDQ in its current form and still
> support standards -- 

That's exactly what I propose.

> if we want to develop an official custom API, we will give
> up on both of these benefits, and at the same time push the ETA for Wikidata
> queries far into the future.

I disagree. If, as I believe, sandboxing WDQ is simpler than sandboxing SPARQL,
using WDQ would allow us to have a public query API sooner. But whether my
belief is correct needs to be investigated, of course.

> All of this has been discussed and considered in the past. I don't see why one
> would be kicking off discussions now that question everything decided in
> meetings and telcos over the past weeks. There is absolutely no new information
> compared to what has led to the consensus that we all (including Daniel) had
> reached.

The consensus as I remember it was "we should be able to expose SPARQL safely,
if we invest enough time to sandbox it". The issue of lock-in was mentioned but
not really assessed. The relative cost of sandboxing WDQ vs SPARQL, and the
impact on the ETA, were not discussed much. The ad-hoc evaluation spreadsheet
ranks WDQ second to SPARQL (ahead of MQL and ASK), mainly because SPARQL is
more powerful.

The downside of that power doesn't factor into the evaluation, nor does the
factor of lock-in. Shifting the relative weight in the spreadsheet from power to
sustainability makes WDQ come out at the top.

After the initial enthusiasm, this has made me increasingly uneasy over the last
few weeks. Hence my mail to this list.


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
