Here's the (nearly) equivalent query for the statements dump[1] loaded into Blazegraph:
Here's the (nearly) equivalent query for the statements dump[1] loaded into Blazegraph:

PREFIX wd: <http://www.wikidata.org/entity/>

SELECT DISTINCT ?city ?citylabel ?mayorlabel
WHERE {
  ?city wd:P31s/wd:P31v wd:Q515 .      # find items that are instances of city (Q515)
  ?city wd:P6s ?statement .            # with a P6 (head of government) statement
  ?statement wd:P6v ?mayor .           # ... that has the value ?mayor
  ?mayor wd:P21s/wd:P21v wd:Q6581072 . # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement wd:P582q ?x }  # ... but the statement has no P582 (end date) qualifier

  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city wd:P1082s/wd:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

  # Optionally, find English labels for city and mayor:
  OPTIONAL { ?city wd:P373s/wd:P373v ?citylabel .
             # FILTER ( LANG(?citylabel) = "en" )
  }
  OPTIONAL { ?mayor wd:P373s/wd:P373v ?mayorlabel .
             # FILTER ( LANG(?mayorlabel) = "en" )
  }
}
ORDER BY DESC(?population)
LIMIT 100

Free beer to anyone who can figure out how to use those language filters. Would we also need to load the property definitions[2]?

1. http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-statements.nt.gz
2. http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-properties.nt.gz

On Tue, Apr 21, 2015 at 11:13 AM, Jeremy Baron <[email protected]> wrote:
> Hi,
>
> On Tue, Apr 21, 2015 at 5:05 PM, Thad Guidry <[email protected]> wrote:
> > We had US Census, World Bank, and UN Data as our primary data sources
> > for any /statistics/ of a City/Town/Village. Here's Houston -
> > https://www.freebase.com/m/03l2n#/location/statistical_region
>
> I don't understand where a lot of those numbers are from.
>
> Also, maybe Houston is a bad example because the Census Bureau revised
> numbers after the data was released.[0] Even some official Census
> Bureau sites still report the old, pre-appeal number.[1]
>
> There are multiple years that have duplicate conflicting values after
> clicking "65 values total »" at your link. At first I was thinking it
> may be something like estimates base vs. estimate vs. decennial.
> However, for 2010 and 2011 there's one value that matches estimate
> from [1] (source = [2]) and a larger value (source = [3]) that does
> not match any other data I've seen. [2] and [3] both use the same
> "Attribution URI" [4].
>
> In any case, why take this from freebase instead of importing directly
> from Census Bureau data? It's available in bulk. Format isn't great
> but isn't horrible either. (At least the 5-year ACS is inconsistent
> about upper/lower case for state two-letter abbreviations. And, I
> think, most humans would prefer something like a geoid as a key rather
> than a dataset-specific key used to look up the geoid in a different
> file. And other quirks.)
>
> -Jeremy
>
> [0] http://www.chron.com/news/houston-texas/houston/article/City-wins-census-appeal-count-adjusted-4087372.php
> [1] http://factfinder.census.gov/bkmk/table/1.0/en/PEP/2013/PEPANNRES/1620000US4835000
> [2] https://www.freebase.com/g/11x1k306j
> [3] https://www.freebase.com/m/0jst35z
> [4] http://www.census.gov/popest/about/terms.html
>
> _______________________________________________
> Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
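P.S. Coming back to the Blazegraph query at the top: for anyone scripting against a local endpoint, Blazegraph can return results in the standard SPARQL 1.1 JSON results format, which is easy to post-process. Here is a minimal Python sketch of flattening those results into plain dicts — the sample payload, entity ID, and label values below are invented for illustration, not real Wikidata output:

```python
import json

# Invented sample shaped like SPARQL 1.1 JSON results for the query
# above (?city, ?citylabel, ?mayorlabel). NOT real Wikidata data.
sample = json.dumps({
    "head": {"vars": ["city", "citylabel", "mayorlabel"]},
    "results": {"bindings": [
        {"city": {"type": "uri",
                  "value": "http://www.wikidata.org/entity/Q123456"},
         "citylabel": {"type": "literal", "value": "ExampleCity"},
         "mayorlabel": {"type": "literal", "value": "ExampleMayor"}},
    ]},
})

def rows(results_json):
    """Yield each result row as a plain dict of variable name -> value."""
    data = json.loads(results_json)
    for binding in data["results"]["bindings"]:
        # Variables from OPTIONAL clauses may be absent from a binding,
        # so iterate over the variables actually bound in each row.
        yield {var: binding[var]["value"] for var in binding}

for row in rows(sample):
    print(row["citylabel"], row.get("mayorlabel"))
```

Querying the endpoint itself would just be an HTTP GET/POST with `Accept: application/sparql-results+json`; the parsing above is the same either way.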
