Follow-up: I now see all of the extra information on the WikiProject Ontology page, https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology, and will look into contributing there.
On Sat, Jun 15, 2019 at 5:10 PM Gabriel Altay <[email protected]> wrote:

> Hello everyone,
>
> I was playing around with a recent Wikidata dump and extracted the items
> that "looked" like classes based on the definition here,
>
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology/Classes
>
> Specifically, an item is a class-item if any of the following are true,
> * the item is the value of a P31 ("instance of") statement
> * the item has a P279 ("subclass of") statement (subclass)
> * the item is the value of a P279 ("subclass of") statement (superclass)
>
> Once I extracted all items that met these criteria (2,399,621 items
> from wikidata-20190603-all.json.bz2) I started examining the results. One
> of the things I found slightly surprising is that there are about 23k
> badminton events that are classes because they have "subclass of
> https://www.wikidata.org/wiki/Q13357858" statements. SPARQL query
> below.
>
> https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D
>
> It also looks like there is a badminton project page,
> https://www.wikidata.org/wiki/Category:WikiProject_Badminton
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Badminton/Subclass
>
> I'd like to remove these statements, as it seems that a particular instance
> of a badminton tournament
> https://www.wikidata.org/wiki/Q121940
> is not a class.
>
> It seems that this pattern is also in place for about 1,000,000 items
> which are instances of gene (e.g. https://www.wikidata.org/wiki/Q40108).
>
> I had a couple of questions for the mailing list,
>
> 1) Do folks know if there is an active group working on the Wikidata
> ontology?
> 2) I've read a few messages about shape expressions. Would it be
> worthwhile to set up a shape expression that prevents most items from
> having both "instance of" and "subclass of" statements?
> 3) If these entries are generated by bots, what is the best way to get in
> touch with the owner? Their user talk page?
>
> I am probably missing a lot of information about what has been done so far
> in the community, but I'm happy to read anything someone points me towards.
>
> best,
> -Gabriel
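
For anyone who wants to try the same kind of extraction, the three class-item criteria in the quoted message can be sketched roughly as below. This is only an illustration, not the script that produced the 2,399,621 figure. The function names (iter_dump, statement_targets, find_class_items, items_with_both) are hypothetical, and the code assumes the usual layout of the Wikidata JSON dump: a JSON array with one entity per line, where item-valued statements carry the target Q-id under mainsnak.datavalue.value.id.

```python
import bz2
import json
from typing import Iterable, Iterator, Set


def iter_dump(path: str) -> Iterator[dict]:
    """Stream entities from a dump such as wikidata-20190603-all.json.bz2.

    Assumes one entity per line inside a JSON array ("[" first, "]" last).
    """
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip().rstrip(",")
            if line in ("[", "]", ""):
                continue
            yield json.loads(line)


def statement_targets(entity: dict, prop: str) -> Set[str]:
    """Return the Q-ids that `entity` points to through property `prop`."""
    targets = set()
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" snaks have no target item
        value = snak.get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            targets.add(value["id"])
    return targets


def find_class_items(entities: Iterable[dict]) -> Set[str]:
    """Collect ids of items that meet any of the three criteria."""
    classes: Set[str] = set()
    for entity in entities:
        classes |= statement_targets(entity, "P31")   # value of "instance of"
        classes |= statement_targets(entity, "P279")  # value of "subclass of"
        if entity.get("claims", {}).get("P279"):
            classes.add(entity["id"])                 # has a "subclass of" statement
    return classes


def items_with_both(entities: Iterable[dict]) -> Set[str]:
    """Ids of items that have both P31 and P279 statements (question 2)."""
    return {
        entity["id"]
        for entity in entities
        if entity.get("claims", {}).get("P31")
        and entity.get("claims", {}).get("P279")
    }
```

Under these assumptions, find_class_items(iter_dump("wikidata-20190603-all.json.bz2")) would walk the whole dump in a single pass. The set of ids is held in memory, which should be manageable at a few million items but is worth keeping in mind.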
_______________________________________________
Wikidata mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata
