On 21.04.2015 02:05, James Douglas wrote:
This is super cool, thanks for sharing!  Would you mind if I write it up
for the Wikidata Query Service docs?


No, of course not. We could certainly use some more documentation. Be aware, however, that the RDF export format is still subject to change, so the query will have to change accordingly in the future.

Markus

On Mon, Apr 20, 2015 at 3:50 PM, Markus Krötzsch wrote:

    On 20.04.2015 23:47, Daniel Kinzler wrote:

        Something seems to be wrong with the order, though. Munich
        (pop > 1m in all statements) is listed way after Chemnitz
        (pop < 300k in all statements). Any idea why?


    Good catch. My query was too simple (using one "random" population
    instead of the biggest one). Here is a better query, this time even
    with populations given:

    PREFIX : <http://www.wikidata.org/entity/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?city (MAX(?population) AS ?max_population) ?citylabel ?mayorlabel
    WHERE {
      ?city :P31c/:P279c* :Q515 .  # find instances of subclasses of city
      ?city :P6s ?statement .      # with a P6 (head of government) statement
      ?statement :P6v ?mayor .     # ... that has the value ?mayor
      ?mayor :P21c :Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
      # ... but the statement has no P582 (end date) qualifier
      FILTER NOT EXISTS { ?statement :P582q ?x }

      # Now select the population value of the ?city
      # (the number is reached through a chain of three properties)
      ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

      # Optionally, find English labels for city and mayor:
      OPTIONAL {
        ?city rdfs:label ?citylabel .
        FILTER ( LANG(?citylabel) = "en" )
      }
      OPTIONAL {
        ?mayor rdfs:label ?mayorlabel .
        FILTER ( LANG(?mayorlabel) = "en" )
      }
    } GROUP BY ?city ?citylabel ?mayorlabel
    ORDER BY DESC(?max_population) LIMIT 100
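
The effect of switching from one arbitrary population value to MAX with GROUP BY can be illustrated outside SPARQL. The following is only a minimal Python sketch and not part of the original messages: the city names, the population figures, and the helper names `rank_by_first_value` and `rank_by_max_value` are made-up placeholders. It shows how ranking by whichever value comes first can misplace a city that has several population statements, while ranking by the per-city maximum does not.

```python
from collections import defaultdict

# Hypothetical (city, population) rows, as a query might return them when a
# city has several population statements. All names and numbers are made up.
rows = [
    ("city_a", 500_000),
    ("city_b", 800_000),
    ("city_a", 1_200_000),
    ("city_b", 900_000),
]


def rank_by_first_value(rows):
    """Rank cities by whichever population value happens to come first."""
    first = {}
    for city, population in rows:
        first.setdefault(city, population)
    return sorted(first, key=first.get, reverse=True)


def rank_by_max_value(rows):
    """Rank cities by their largest population value (like GROUP BY + MAX)."""
    values = defaultdict(list)
    for city, population in rows:
        values[city].append(population)
    return sorted(values, key=lambda city: max(values[city]), reverse=True)


print(rank_by_first_value(rows))
print(rank_by_max_value(rows))
```

In this toy data, the first ranking depends on row order, whereas the second depends only on the values themselves.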


        Oh... maybe quantity values are sorted in alphanumeric order,
        because they are decimal strings? They should be xsd:decimal...


    They are.

    Markus
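
The concern about alphanumeric ordering can be checked in isolation. As a small Python sketch (the values are arbitrary placeholders, not Wikidata data), sorting decimal strings lexicographically gives a different order than sorting the same values as numbers, which is why the datatype matters for ORDER BY:

```python
from decimal import Decimal

# Arbitrary placeholder values written as decimal strings.
values = ["1400000", "250000", "99000"]

# Lexicographic (string) order compares character by character.
as_strings = sorted(values, reverse=True)

# Numeric order compares the parsed decimal values.
as_decimals = sorted(values, key=Decimal, reverse=True)

print(as_strings)
print(as_decimals)
```

According to the reply above, the values are typed as xsd:decimal, so the numeric ordering is the one that should apply.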



        On 20.04.2015 22:18, Markus Krötzsch wrote:

            Hi all,

            For many years, Denny and I have been giving talks about why we
            need to improve the data management in Wikipedia. To explain and
            motivate this, we have often asked the simple question: "What are
            the world's largest cities with a female mayor?" The information
            to answer this is clearly in Wikipedia, but it would be painfully
            hard to get the result by reading articles.

            I recently had the occasion of actually phrasing this in SPARQL,
            so that an answer can now, finally, be given. The query to run at

            http://milenio.dcc.uchile.cl/sparql

            is as follows (with some explaining comments inline):

            PREFIX : <http://www.wikidata.org/entity/>
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT DISTINCT ?city ?citylabel ?mayorlabel
            WHERE {
               ?city :P31c/:P279c* :Q515 .  # find instances of subclasses of city
               ?city :P6s ?statement .      # with a P6 (head of government) statement
               ?statement :P6v ?mayor .     # ... that has the value ?mayor
               ?mayor :P21c :Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
               # ... but the statement has no P582 (end date) qualifier
               FILTER NOT EXISTS { ?statement :P582q ?x }

               # Now select the population value of the ?city
               # (the number is reached through a chain of three properties)
               ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

               # Optionally, find English labels for city and mayor:
               OPTIONAL {
                 ?city rdfs:label ?citylabel .
                 FILTER ( LANG(?citylabel) = "en" )
               }
               OPTIONAL {
                 ?mayor rdfs:label ?mayorlabel .
                 FILTER ( LANG(?mayorlabel) = "en" )
               }
            } ORDER BY DESC(?population) LIMIT 100

            To see the results, just paste this into the box at
            http://milenio.dcc.uchile.cl/sparql and press "Run query".

            The query does not filter for the most recent population but
            relies on Virtuoso to pick the biggest value for DESC sorting,
            and on the world to have (mostly) cities with increasing
            population numbers over time. This is also the reason why the
            population is not printed (it would give you more than one match
            per city then, even with DISTINCT). Picking the current
            population will become easier once ranks are used more widely to
            mark it.

            There might also be some inaccuracies in cases where a past mayor
            does not have an "end date" set in Wikidata (Madrid has a
            suspiciously large number of current mayors ...), but a query can
            only ever be as good as its input data.

            I hope this is inspiring to some of you. One could also look for
            the world's youngest or oldest current mayors with similar
            queries, for example.

            Cheers,

            Markus


            _______________________________________________
            Wikidata-l mailing list
            [email protected]
            <mailto:[email protected]>
            https://lists.wikimedia.org/mailman/listinfo/wikidata-l








