Thanks Jan, I will pursue the badminton discussion on the talk page.

On Sat, Jun 15, 2019 at 5:49 PM Jan Ainali <[email protected]> wrote:

> Hello Gabriel,
>
> I agree with you about the badminton tournaments; that seems odd. There
> appears to already be a discussion about that on the talk page of the only
> participant in the badminton project:
> https://www.wikidata.org/wiki/User_talk:Florentyna#subclass_of:_badminton_tournament
>
> Perhaps it is best to continue the discussion there?
>
> /Jan Ainali
> http://ainali.com
>
>
> On Sat, Jun 15, 2019 at 23:11, Gabriel Altay <[email protected]> wrote:
>
>> Hello everyone,
>>
>> I was playing around with a recent wikidata dump and extracted the items
>> that "looked" like classes based on the definition here,
>>
>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology/Classes
>>
>> Specifically, an item is a class-item if any of the following are true,
>>   * the item is the value of a P31 ("instance of") statement
>>
>>   * the item has a P279 ("subclass of") statement (subclass)
>>
>>   * the item is the value of a P279 ("subclass of") statement (superclass)
>>
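The three criteria above can be sketched in Python roughly as follows. This is a minimal sketch, not the script actually used for the extraction: it assumes the dump is read as one JSON entity per line inside a JSON array, and the function names (`iter_entities`, `item_values`, `find_class_items`) are hypothetical.

```python
import json
from typing import Iterable, Iterator, Set

INSTANCE_OF = "P31"
SUBCLASS_OF = "P279"


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield entities from dump lines (assumed: one JSON object per line)."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        yield json.loads(line)


def item_values(entity: dict, prop: str) -> Iterator[str]:
    """Yield the item ids used as main values of `prop` on this entity."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" snaks carry no item id
        value = snak.get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            yield value["id"]


def find_class_items(lines: Iterable[str]) -> Set[str]:
    """Collect ids of items that meet any of the three class criteria."""
    classes: Set[str] = set()
    for entity in iter_entities(lines):
        # Criterion 1: the item is the value of a P31 statement.
        classes.update(item_values(entity, INSTANCE_OF))
        # Criterion 3: the item is the value of a P279 statement.
        classes.update(item_values(entity, SUBCLASS_OF))
        # Criterion 2: the item itself has a P279 statement.
        if entity.get("type") == "item" and entity.get("claims", {}).get(SUBCLASS_OF):
            classes.add(entity["id"])
    return classes
```

For a compressed dump, the lines could be supplied with `bz2.open(path, "rt", encoding="utf-8")` from the standard library, which streams the file without decompressing it to disk first.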
>> Once I extracted all items that met these criteria (2,399,621 items
>> from wikidata-20190603-all.json.bz2) I started examining the results.  One
>> of the things I found slightly surprising is that there are about 23k
>> badminton events that are classes b/c they have "subclass of
>> https://www.wikidata.org/wiki/Q13357858" statements.  SPARQL query
>> below.
>>
>>
>> https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D
>>
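The query link above is percent-encoded, which makes it hard to read in plain text. As a small sketch using only the Python standard library, the query text can be recovered from the URL fragment; the helper name `decode_query` is hypothetical. The decoded query selects items that are both instance of Q57733494 and subclass of Q13357858.

```python
from urllib.parse import unquote, urlsplit

# The link from the message, split across lines only for readability.
QUERY_URL = (
    "https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A"
    "%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A"
    "%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A"
    "%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20"
    "wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D"
)


def decode_query(url: str) -> str:
    """Return the SPARQL text stored in the URL fragment (the part after '#')."""
    return unquote(urlsplit(url).fragment)


if __name__ == "__main__":
    print(decode_query(QUERY_URL))
```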
>> It also looks like there is a badminton project page,
>> https://www.wikidata.org/wiki/Category:WikiProject_Badminton
>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Badminton/Subclass
>>
>>
>> I'd like to remove these statements as it seems that a particular
>> instance of a badminton tournament
>> https://www.wikidata.org/wiki/Q121940
>> is not a class.
>>
>> It seems that this pattern is also in place for about 1,000,000 items
>> that are instances of gene (e.g. https://www.wikidata.org/wiki/Q40108).
>>
>> I had a couple questions for the mailing list,
>>
>>  1) Do folks know if there is an active group working on the Wikidata
>> ontology?
>>  2) I've read a few messages about shape expressions.  Would it be
>> worthwhile to set up a shape expression that prevents most items from having
>> both "instance of" and "subclass of" statements?
>>  3) If these entries are generated by bots, what is the best way to get
>> in touch with the owner? Their user talk page?
>>
>> I am probably missing a lot of information about what has been done so
>> far in the community, but I'm happy to read anything someone points me
>> towards.
>>
>> best,
>> -Gabriel
>> _______________________________________________
>> Wikidata mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>