Something seems to be wrong with the order, though. Munich (pop > 1m in all statements) is listed way after Chemnitz (pop < 300k in all statements). Any idea why?
Oh... maybe quantity values are sorted in alphanumeric order, because they are stored as decimal strings? They should be xsd:decimal...

On 20.04.2015 at 22:18, Markus Krötzsch wrote:
> Hi all,
>
> For many years, Denny and I have been giving talks about why we need to improve
> the data management in Wikipedia. To explain and motivate this, we have often
> asked the simple question: "What are the world's largest cities with a female
> mayor?" The information to answer this is clearly in Wikipedia, but it would be
> painfully hard to get the result by reading articles.
>
> I recently had the occasion of actually phrasing this in SPARQL, so that an
> answer can now, finally, be given. The query to run at
>
> http://milenio.dcc.uchile.cl/sparql
>
> is as follows (with some explaining comments inline):
>
> PREFIX : <http://www.wikidata.org/entity/>
> SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
>   ?city :P31c/:P279c* :Q515 .  # find instances of subclasses of city
>   ?city :P6s ?statement .      # with a P6 (head of government) statement
>   ?statement :P6v ?mayor .     # ... that has the value ?mayor
>   ?mayor :P21c :Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
>   FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier
>
>   # Now select the population value of the ?city
>   # (the number is reached through a chain of three properties)
>   ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .
>
>   # Optionally, find English labels for city and mayor:
>   OPTIONAL {
>     ?city rdfs:label ?citylabel .
>     FILTER ( LANG(?citylabel) = "en" )
>   }
>   OPTIONAL {
>     ?mayor rdfs:label ?mayorlabel .
>     FILTER ( LANG(?mayorlabel) = "en" )
>   }
> } ORDER BY DESC(?population) LIMIT 100
>
> To see the results, just paste this into the box at
> http://milenio.dcc.uchile.cl/sparql and press "Run query".
>
> The query does not filter the most recent population but relies on Virtuoso to
> pick the biggest value for DESC sorting, and on the world to have (mostly)
> cities with increasing population numbers over time. This is also the reason why
> the population is not printed (it would give you more than one match per city
> then, even with DISTINCT). Picking the current population will become easier
> once ranks are used more widely to mark it.
>
> There might also be some inaccuracies in cases where a past mayor does not have
> an "end date" set in Wikidata (Madrid has a suspiciously large number of current
> mayors ...), but a query can only ever be as good as its input data.
>
> I hope this is inspiring to some of you. One could also look for the world's
> youngest or oldest current mayors with similar queries, for example.
>
> Cheers,
>
> Markus
>
> _______________________________________________
> Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
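
PS: To illustrate the suspicion above, here is a minimal Python sketch of how a descending sort behaves when population values are compared as strings rather than as numbers. The two population figures are made-up placeholders (the thread only says Munich is above 1 million and Chemnitz below 300,000), and `rank_desc` is a hypothetical helper. It says nothing about what the endpoint actually does internally; it only shows that a string comparison would produce the ordering I saw.

```python
from decimal import Decimal

# Illustrative values only (NOT taken from Wikidata): chosen to match the
# thread's rough bounds of > 1 million for Munich and < 300,000 for Chemnitz.
populations = {"Munich": "1400000", "Chemnitz": "240000"}


def rank_desc(values, key):
    """Return city names sorted by descending population under the given key."""
    return sorted(values, key=lambda city: key(values[city]), reverse=True)


# Lexicographic comparison: "240000" > "1400000" because '2' > '1'.
as_strings = rank_desc(populations, key=str)

# Numeric comparison: 1400000 > 240000.
as_decimals = rank_desc(populations, key=Decimal)

print("sorted as strings: ", as_strings)
print("sorted as decimals:", as_decimals)
```

If the stored values really are plain strings, casting them to a numeric type before sorting should restore the expected order.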
