Hi Dmitriy,
Yes, you are right. There are some countries that have a value for
"dissolved or abolished" (P576) but no end date for "instance of"
"sovereign state". This includes the "United Kingdom of the
Netherlands". Probably these should get an end date for their P31, or we
should use P576 in the query.
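As a rough sketch of the second option (not run against the endpoint, and assuming the dissolution date is stored as a qualifier-free statement so that :P576c is available), the query for currently existing countries could additionally exclude every item that has a P576 value. The rdfs: prefix is declared explicitly here in case the endpoint does not predefine it.

```sparql
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?country ?countryName
WHERE {
  ?country :P31s ?statement .
  ?statement :P31v :Q3624078 .
  # No end date on the "instance of sovereign state" statement ...
  FILTER NOT EXISTS { ?statement :P582q ?endDate }
  # ... and no "dissolved or abolished" (P576) value on the item itself.
  FILTER NOT EXISTS { ?country :P576c ?dissolutionDate }
  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
```

If some P576 statements carry qualifiers, (:P576s/:P576v) would probably be needed in place of :P576c, in the same way as for P47 in the neighbours query.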
A list of all states that have no end date to their P31 sovereign state
but that do have a P576 date:
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
?country :P576c ?dissolutionDate .
?country :P31s ?statement .
?statement :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement :P582q ?endDate }
?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
A list of all states that have an end date to their P31 sovereign state
but that do not have a P576 date:
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
?country :P31s ?statement .
?statement :P31v :Q3624078 .
?statement :P582q ?endDate .
FILTER NOT EXISTS { ?country :P576c ?dissolutionDate }
?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
Of course, this is all based on the dump we use. Some of these things
might have been fixed already.
Cheers,
Markus
On 20.03.2015 20:04, Dmitriy Sintsov wrote:
Hi Markus,
Is ?statement :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement :P582q ?endDate }
really enough to filter out countries that no longer exist?
I ask because I have the following code in my Python bot:
http://paste.debian.net/162319/
And even with that many filters, I still get a somewhat strange "Kingdom of
Netherlands" item, which duplicates "Netherlands" but has only a few cities.
Dmitriy
On Fri, Mar 20, 2015 at 9:08 PM, Markus Kroetzsch
<markus.kroetz...@tu-dresden.de>
wrote:
Dear all,
Thanks to the people at the Center of Semantic Web Research in Chile
[1], we have a very first public SPARQL endpoint for Wikidata
running. This is very preliminary, so do not rely on it in
applications and expect things to fail, but you may still enjoy some
things.
http://milenio.dcc.uchile.cl/sparql
The endpoint has all the data from our current RDF exports in one
big database [2]. Below this email are some example queries to get
you started (this is a bit of a learning-by-doing crash course in
SPARQL too, but you may want to consult a tutorial if you don't know
it ;-).
There are some known bugs in the RDF that we will hopefully fix soon
[3]. Also, the service uses a dump that is already a few weeks old
now. We are more interested in testing functions right now before
going production. Also, this is a raw API interface, not a proposal
for a nice UI.
Feedback (and other interesting queries) are welcome :-)
Cheers,
Markus
[1] http://ciws.cl/ -- a joint team from University of Chile and
Pontificia Universidad Catolica de Chile
[2] http://tools.wmflabs.org/wikidata-exports/rdf/
[3]
https://github.com/Wikidata/Wikidata-Toolkit/issues?q=is%3Aopen+is%3Aissue+label%3A%22RDF+export%22
==Lighthouses (Q39715) with their English label (LIMIT 100 for demo)==
PREFIX : <http://www.wikidata.org/entity/>
SELECT *
WHERE {
?lighthouse a :Q39715 .
?lighthouse rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 100
(Just paste the query into the box at
http://milenio.dcc.uchile.cl/sparql)
The actual query condition is in the WHERE {...} part. Things
starting with ? are variables. Basic conditions take the form of
triples: "subject property value". For example, "?lighthouse a
:Q39715" looks for things that are a lighthouse ("a" is short for
"rdf:type" which we use to encode P31 statements without
qualifiers). The dot "." is used as a separator between triples.
Note that the label output is a bit cumbersome because you want to
filter by language (without the FILTER you get all labels in all
languages). A future UI would better fetch the labels after the
query, similar to WDQ, to get smaller & faster queries.
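To make the "a" shorthand concrete, here is the same lighthouse query with rdf:type written out. The rdf: and rdfs: prefix declarations are added on the assumption that the endpoint may not predefine them; otherwise the query should behave exactly like the one above.

```sparql
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE {
  # "a" in the original query is just shorthand for rdf:type.
  ?lighthouse rdf:type :Q39715 .
  ?lighthouse rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 100
```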
==People born in the same place that they died in==
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?person ?personname ?placename
WHERE {
?person a :Q5 .
?person :P19c ?place .
?person :P20c ?place .
?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
?place rdfs:label ?placename FILTER(LANG(?placename) = "en")
} LIMIT 100
Here we use a few actual Wikidata properties. Properties in their
simple form (Entity->Value) use ids with a "c" at the end, like
:P19c here. Only qualifier-free statements will be available in this
form right now. Note that we use the variable ?place in two places
as a value. This is how we query for things that have the same place
in both cases.
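As a contrast (a sketch only, not run against the endpoint), using two different variables plus an explicit inequality should find people who died somewhere other than where they were born. The variable names ?birthplace and ?deathplace are made up for this example.

```sparql
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personname
WHERE {
  ?person a :Q5 .
  ?person :P19c ?birthplace .
  ?person :P20c ?deathplace .
  # Two distinct variables match independently; the FILTER keeps
  # only the rows where the two places differ.
  FILTER(?birthplace != ?deathplace)
  ?person rdfs:label ?personname FILTER(LANG(?personname) = "en")
} LIMIT 100
```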
==People who have Wikipedia (Q52) accounts==
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?person ?personname ?username
WHERE {
?person :P553s ?statement .
?statement :P553v :Q52 .
?statement :P554q ?username .
?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
} LIMIT 100
This query needs to access qualifiers of a statement for "website
account on" (P553). To do this in RDF (and SPARQL), we access the
statement object instead of using simple property :P553c (which
would only give us the value). The statement is found through an
"...s" property; its value is found through a "...v" property; its
qualifiers are found through "...q" properties. Check out the graph
in our paper to get the picture
(http://korrekt.org/page/Introducing_Wikidata_to_the_Linked_Data_Web).
There you can also find how references are accessed.
==Currently existing countries==
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
?country :P31s ?statement .
?statement :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement :P582q ?endDate }
?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
Similar pattern as with the Wikipedia accounts, but now we check
that a certain qualifier (end time) does not exist. You could also
find currently married people in this way, etc.
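For the married-people case, a minimal sketch could look as follows. It assumes that marriages are recorded as "spouse" (P26) statements and that ended marriages carry an end time (P582) qualifier; it has not been run against the endpoint, and it does not account for marriages that ended through a death that was never recorded as an end time.

```sparql
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personname ?spouse
WHERE {
  # Go through the statement node so that qualifiers can be checked.
  ?person :P26s ?statement .
  ?statement :P26v ?spouse .
  # Keep only spouse statements without an end time qualifier.
  FILTER NOT EXISTS { ?statement :P582q ?endDate }
  ?person rdfs:label ?personname FILTER(LANG(?personname) = "en")
} LIMIT 100
```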
==Descendants of Queen Victoria (Q9439) ==
PREFIX : <http://www.wikidata.org/entity/>
SELECT DISTINCT *
WHERE {
:Q9439 ((^:P25c|^:P22c)+) ?person .
?person rdfs:label ?label
FILTER(LANG(?label) = "en")
} LIMIT 1000
Here, ((^:P25c|^:P22c)+) is a property path, i.e. a regular expression
over properties; ^ reverses the direction of a property (has mother ->
mother of ...); | is for "or"; + is for one or more repetitions.
==Currently existing countries, ordered by the number of their
current neighbours==
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?countryName (COUNT (DISTINCT ?neighbour) AS ?neighbours)
WHERE {
?country :P31s ?statement .
?statement :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement :P582q ?endDate }
?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
OPTIONAL { ?country (:P47s/:P47v) ?neighbour .
?neighbour :P31s ?statement2 .
?statement2 :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement2 :P582q ?endDate2 }
}
} ORDER BY DESC(?neighbours)
Just to give an example of a slightly more complex query ;-) Note
how we use the expression (:P47s/:P47v) rather than :P47c to access
the value of potentially qualified statements here (since qualified
statements are currently not converted to direct :P47c statements).
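To see the (:P47s/:P47v) path in isolation, here is a smaller sketch that lists the neighbours of a single country. Q183 (Germany) is just an example item picked for illustration, and the query has not been run against the endpoint.

```sparql
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?neighbour ?neighbourName
WHERE {
  # The s/v path follows item -> statement -> value, so it also
  # reaches values of statements that have qualifiers.
  :Q183 (:P47s/:P47v) ?neighbour .
  ?neighbour rdfs:label ?neighbourName FILTER(LANG(?neighbourName) = "en")
}
```

Replacing the path with :P47c should return only the neighbours whose statements have no qualifiers, which makes the difference between the two forms easy to compare.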
--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/
_________________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l