Dear Wikidata Query Service experts,

In connection with an editathon, I have compiled statistics on the number of women and men on the Danish Wikipedia. I used WDQS for this, and the query is listed below:

SELECT ?count ?gender ?genderLabel
WITH {
  SELECT ?gender (COUNT(*) AS ?count) WHERE {
    ?item wdt:P31 wd:Q5 .
    ?item wdt:P21 ?gender .
    ?article schema:about ?item.
    ?article schema:isPartOf <https://da.wikipedia.org/>
  }
  GROUP BY ?gender
} AS %results
WHERE {
  INCLUDE %results
  SERVICE wikibase:label { bd:serviceParam wikibase:language "da,en". }
}
ORDER BY DESC(?count)
LIMIT 25

http://tinyurl.com/y8twboe5

As the statistics could potentially create some discussion (and already seem to have), I am wondering whether some experts could peer review the SPARQL query and tell me if there are any issues. I hope I have not made a blunder...

The minor issues I can think of are:

- Missing gender in Wikidata. We have around 360 of these.

- People on the Danish Wikipedia not on Wikidata. Probably tens-ish or hundreds-ish!?

- Gendered items that are not instances of human (Q5). The ones I sampled were all fictional humans.
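
The first issue above could be counted with something like the following. This is only a sketch, not a query I have verified: it reuses the patterns from the main query and assumes that a FILTER NOT EXISTS on wdt:P21 is an adequate test for "no gender stated".

```sparql
# Sketch: count humans with a Danish Wikipedia article but no P21 (sex or gender) statement.
# Assumption: relies on the prefixes WDQS predefines (wd:, wdt:, schema:).
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5 .
  ?article schema:about ?item .
  ?article schema:isPartOf <https://da.wikipedia.org/> .
  # Keep only items for which no truthy gender statement exists
  FILTER NOT EXISTS { ?item wdt:P21 ?gender . }
}
```

COUNT(DISTINCT ?item) is used here so that an item with several "instance of" statements is not counted more than once.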


We previously reached 17.2% females. Now we are below 17%, due to a mass import of Japanese football players, as far as we can see.


best regards
Finn Årup Nielsen
http://people.compute.dtu.dk/faan/

_______________________________________________
Wikidata mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata
