Here's the (nearly) equivalent query for the statements dump[1] loaded into
Blazegraph:

PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
  ?city      wd:P31s/wd:P31v wd:Q515     .      # find instances of subclasses of city
  ?city      wd:P6s          ?statement  .      # with a P6 (head of government) statement
  ?statement wd:P6v          ?mayor      .      # ... that has the value ?mayor
  ?mayor     wd:P21s/wd:P21v wd:Q6581072 .      # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement wd:P582q ?x }  # ... but the statement has no P582 (end date) qualifier

  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city wd:P1082s/wd:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

  # Optionally, find English labels for city and mayor:
  OPTIONAL {
    ?city wd:P373s/wd:P373v ?citylabel .
    # FILTER ( LANG(?citylabel) = "en" )
  }
  OPTIONAL {
    ?mayor wd:P373s/wd:P373v ?mayorlabel .
    # FILTER ( LANG(?mayorlabel) = "en" )
  }

} ORDER BY DESC(?population) LIMIT 100

Free beer to anyone who can figure out how to use those language filters.
Would we also need to load the property definitions[2]?
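For what it's worth, a sketch of one possible answer (untested, and it assumes
the terms/labels export exposes language-tagged rdfs:label triples, which the
statements dump alone does not): the P373 values are plain literals, so LANG()
on them would always be empty. With rdfs:label the commented-out filters could
be enabled like this:

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Assumption: a labels dump loaded alongside the statements provides
# rdfs:label triples with language tags, so LANG() is meaningful.
OPTIONAL {
  ?city rdfs:label ?citylabel .
  FILTER ( LANG(?citylabel) = "en" )
}
OPTIONAL {
  ?mayor rdfs:label ?mayorlabel .
  FILTER ( LANG(?mayorlabel) = "en" )
}
```

If that assumption holds, the property definitions[2] would not be needed for
labels, only a dump that actually contains them.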

[1] http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-statements.nt.gz
[2] http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-properties.nt.gz

On Tue, Apr 21, 2015 at 11:13 AM, Jeremy Baron <[email protected]> wrote:

> Hi,
>
> On Tue, Apr 21, 2015 at 5:05 PM, Thad Guidry <[email protected]> wrote:
> > We had US Census, World Bank, and UN Data as our primary data sources
> for any /statistics/ of a City/Town/Village.  Here's Houston -
> https://www.freebase.com/m/03l2n#/location/statistical_region
>
> I don't understand where a lot of those numbers are from.
>
> Also, maybe Houston is a bad example because the Census Bureau revised
> numbers after the data was released.[0] Even some official Census
> Bureau sites still report the old, pre-appeal number.[1]
>
> There are multiple years that have duplicate conflicting values after
> clicking "65 values total »" at your link. At first I was thinking it
> may be something like estimates base vs. estimate vs. decennial.
> However, for 2010 and 2011 there's one value that matches estimate
> from [1] (source = [2]) and a larger value (source = [3]) that does
> not match any other data I've seen. [2] and [3] both use the same
> "Attribution URI" [4].
>
> In any case, why take this from freebase instead of importing directly
> from Census Bureau data? It's available in bulk. Format isn't great
> but isn't horrible either. (at least the 5-year ACS is inconsistent
> about upper/lower case for state two letter abbreviations. and, I
> think, most humans would prefer something like a geoid as a key rather
> than a dataset specific key used to look up the geoid in a different
> file. and other quirks)
>
> -Jeremy
>
> [0]
> http://www.chron.com/news/houston-texas/houston/article/City-wins-census-appeal-count-adjusted-4087372.php
> [1]
> http://factfinder.census.gov/bkmk/table/1.0/en/PEP/2013/PEPANNRES/1620000US4835000
> [2] https://www.freebase.com/g/11x1k306j
> [3] https://www.freebase.com/m/0jst35z
> [4] http://www.census.gov/popest/about/terms.html
>
> _______________________________________________
> Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
