Very useful, Amir, thanks! I just ran it for occupation=painter (p=P106&q=Q1028181). Am I correct in my interpretation that, in general, painters have fewer claims than the entire population of items with the property occupation?
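In case anyone wants to script such comparisons, here is a minimal sketch of how the query URLs appear to be put together, judging only from the links in this thread: `p` carries the property and `q` carries one or more values separated by a pipe (encoded as `%7C`). The helper name `wd_analyst_url` is made up for illustration, and I am assuming the tool accepts no parameters beyond these two.

```python
from urllib.parse import urlencode

# Base address of the tool, as given in the announcement.
BASE_URL = "http://tools.wmflabs.org/wd-analyst/index.php"


def wd_analyst_url(prop, values=()):
    """Build a wd-analyst query URL (hypothetical helper, not part of the tool).

    prop   -- a property ID such as "P106"
    values -- optional item IDs such as ["Q1028181"]; several values are
              joined with "|", which urlencode escapes as "%7C"
    """
    params = {"p": prop}
    if values:
        params["q"] = "|".join(values)
    return BASE_URL + "?" + urlencode(params)


# The painter query from this reply.
print(wd_analyst_url("P106", ["Q1028181"]))
# The Germany/US comparison from the announcement.
print(wd_analyst_url("P27", ["Q30", "Q183"]))
```

Calling it with only a property, e.g. `wd_analyst_url("P31")`, leaves out `q` entirely, which matches the property-only link in the announcement.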
On Tue, Dec 8, 2015 at 6:48 PM, Amir Ladsgroup <ladsgr...@gmail.com> wrote:

> Hey,
> There have been several discussions regarding the quality of information in
> Wikidata. I wanted to work on the quality of Wikidata, but we don't have any
> good source of information to see where we are ahead and where we are
> behind. So I thought the best thing I can do is to make something to show
> people, in detail, how well sourced our data is. So here we have
> http://tools.wmflabs.org/wd-analyst/index.php
>
> You can give only a property (let's say P31) and it gives you the four
> most used values plus an analysis of sources and quality overall (check this
> out <http://tools.wmflabs.org/wd-analyst/index.php?p=P31>),
> and then you can see that about 33% of them are sourced, of which 29.1%
> are based on Wikipedia.
> You can give a property and multiple values you want. Let's say you want
> to compare P27:Q183 (Country of citizenship: Germany) and P27:Q30 (US).
> Check this out
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P27&q=Q30%7CQ183>. And
> you can see US biographies are more abundant (300K versus 200K) but German
> biographies are more descriptive (3.8 descriptions per item versus 3.2
> descriptions per item).
>
> One important note: compare P31:Q5 (a trivial statement): 46% of them are
> not sourced at all and 49% of them are based on Wikipedia. *But* look at
> these statistics for population properties (P1082
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P1082>). It's not a
> trivial statement and we need to be careful about them. It turns out there
> is slightly more than one reference per statement and only 4% of them are
> based on Wikipedia. So we can relax and enjoy this highly sourced data.
>
> Requests:
>
>    - Please tell me whether you want this tool at all
>    - Please suggest more ways to analyze and catch unsourced material
>
> Future plan (if you agree to keep using this tool):
>
>    - Support more datatypes (e.g. date of birth based on year,
>    coordinates)
>    - Sitelink-based and reference-based analysis (to check how many of the
>    articles of, let's say, Chinese Wikipedia are unsourced)
>    - Free-style analysis: there is a database for this tool that can be
>    used for way more applications. You can get the most unsourced statements
>    of P31 and then you can go and fix them. I'm trying to build a playground
>    for this kind of task.
>
> I hope you like this and rock on!
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P136&q=Q11399>
> Best
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata