In case anyone is a bit lost, Tom is proposing an approach to
classification we've been calling "explicit metamodeling".  Simply put,
let's say you have a class hierarchy:

A subclass of B
B subclass of C
C subclass of D

The proposal, as I understand it, is to add "instance of" claims for almost
all classes in Wikidata, which would yield classifications like:

A subclass of B
A instance of 'type of B'
B subclass of C
B instance of 'type of C'
C subclass of D
C instance of 'type of D'

The rationale for this is to enable querying "direct subclasses" or "immediate
subclasses" of any given class.  This approach might be theoretically valid
in all classification, but I don't think it's a sensible solution for most
classification.  As you can see, explicit metamodeling introduces claims
that are rather redundant.  Tom's idea seems to be to use this approach for
almost all classification on Wikidata.
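
To make the redundancy concrete, here is a small Python sketch. It is my
own illustration with made-up names (SUBCLASS_OF, metamodel_claims), not
anything Tom proposed or that Wikidata provides. Under the reading of the
proposal given above, each "instance of" claim can be derived mechanically
from an existing "subclass of" claim:

```python
# Hypothetical toy data mirroring the A/B/C/D example above.
SUBCLASS_OF = [("A", "B"), ("B", "C"), ("C", "D")]


def metamodel_claims(subclass_claims):
    """Derive the 'instance of' claim implied by each 'subclass of' claim.

    Assumption: the proposal pairs every "X subclass of Y" with
    "X instance of 'type of Y'", as in the listing above.
    """
    return [(sub, "instance of", f"type of {sup}") for sub, sup in subclass_claims]


if __name__ == "__main__":
    for claim in metamodel_claims(SUBCLASS_OF):
        print(" ".join(claim))
```

If the second set of claims can always be generated from the first like
this, storing both adds no new information.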

I am not enthusiastic about pervasively using that approach to
classification throughout Wikidata.  There are other ways to get direct
subclasses, several of which are described in
http://answers.semanticweb.com/questions/14699/get-immediate-subclasses-of-a-class.
For example, you could turn off entailments / inferencing in a query engine
you're using.  You could do a typical subclass query and filter out
non-direct subclasses.
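
As a rough illustration of the second option, here is a minimal Python
sketch, assuming the engine hands back the fully inferred "subclass of"
pairs. The data and the function names (transitive_closure,
direct_subclasses) are hypothetical stand-ins, not a real query engine's
API:

```python
# Hypothetical toy data: the asserted "subclass of" claims from the example.
ASSERTED = {("A", "B"), ("B", "C"), ("C", "D")}


def transitive_closure(edges):
    """Return every (sub, super) pair an inferencing engine would entail."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure


def direct_subclasses(cls, inferred):
    """Keep only the subclasses of cls that have no class in between."""
    subs = {x for (x, y) in inferred if y == cls and x != cls}
    return {
        x for x in subs
        # Drop x if it is also a subclass of another subclass of cls.
        if not any((x, mid) in inferred and mid != x for mid in subs)
    }


if __name__ == "__main__":
    inferred = transitive_closure(ASSERTED)
    print(direct_subclasses("D", inferred))
```

With the A/B/C/D hierarchy above, the inferred subclasses of D are A, B
and C, and the filter keeps only C, without any extra "instance of" claims
in the data.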

Those querying approaches seem much simpler and more conventional than
saturating a concept hierarchy with redundant "instance of 'type of Foo'"
statements to enable querying of direct subclasses.

The extended discussions Tom refers to can be found at
https://www.wikidata.org/wiki/Wikidata_talk:Country_subdivision_task_force#layers.
In addition to introducing substantial redundancy, you might get the
feeling, as I did when reading through that discussion, that widespread use
of "explicit metamodeling" would be quite confusing for users.

What are others' thoughts?

Thanks,
Eric
https://www.wikidata.org/wiki/User:Emw




On Wed, Jun 11, 2014 at 6:56 AM, Thomas Douillard <
thomas.douill...@gmail.com> wrote:

>
>
>
> I'm still talking about the model I proposed in my first post in this thread.
> I did give an advantage: you can very simply query the types of units
> a country uses to classify its administrative units, like <Region> and
> <Departement> (as two items) for France, with a single request to
> the future simple query module: just retrieve the instances of the class
> <French type of administrative units>. This model also applies to any
> country's administrative territorial division.
>
> I think <French type of administrative units> is the "auxiliary item"
> Markus mentioned. We can define precisely what it is because this class
> <French type of administrative units> groups the types used by France to
> classify cities, departments ... so clearly if we talk of "Paris", there are
> several possibilities, like the region of Paris, the City of Paris ... <The
> city of Paris> item is the only one that is clearly defined by
> French law, hence it is an instance of the class <French ville>, which in
> turn is an instance of <French type of administrative units>.
>
> This seems to me a useful model, which can generalise easily to classify
> things like "Urban units", which are used for statistical purposes and are
> defined by the national statistical body in each country, such as INSEE in
> France. Then we could also have a <Urban unit> class in Wikidata, but this is
> ambiguous.
>
> This class could have a subclass <Urban unit as defined by INSEE>, with
> instances such as <Parisian urban unit>, for which it gives statistical
> information. <Urban unit as defined by INSEE> in turn may be a subclass of
> <any geographical unit defined by INSEE>. Then if you want INSEE
> geographical units, you query all instances of both <Urban unit> and <any
> geographical unit defined by INSEE>.
>
> But now let's say you want to find the definition of urban unit by the
> INSEE itself, not the instances. One way to do that would be to look at the
> subclass tree of  <any geographical unit defined by INSEE> or the one of
> <Urban unit>, or compute the intersection of both trees.
>
> One alternative, using metamodelling this time, would be to have a class
> regrouping all definitions of statistical units used by INSEE. Those
> definitions identify to some class we already have, such as <Urban unit as
> defined by INSEE> identifies to the definition of urban unit by INSEE. Then
> the class of all definitions used by INSEE would identify to the class of
> all classes with a name <... unit defined by INSEE>. I propose to create
> the item <any type of unit defined by INSEE> ; with <Urban unit as defined
> by INSEE> an instance of it (actually I might have already have done it /o\)
>
> This is an (non mutually exclusive, more complementary) alternative to
> just class administrative units instances. In a way, this is just
> identifying (reifying) a classification system and putting an ''instance
> of'' <this item> statements to its classes.
>
> I think this is interesting in Wikidata as we actually are using a lot of
> different classification system, having a clear and practicable pattern
> like this one, using just a few tools like reasonator, is useful.
>
>
> For example if I want to compute the subclass tree of statistical units,
> but only the classes used by INSEE, I can compute the subclass tree of the
> <statistical geographical unit> class but filtering only the instances of
> <any type of unit defined by INSEE>, instead of computing a tree
> intersection.
>
> This kind of patterns occurs also for example in biological taxonomy. They
> use classes such as "Genera", "taxon", "clade". Imo any instance of taxon
> can be seen as a class of living organisms. Then Taxon is a class of class,
> just as <any type of unit defined by INSEE> can be seen by a class of
> class. <Clade> is a subclass of taxon because any clade is a taxon. So
> clade is also a class of class.
>
> This makes this pattern actually already used, and studied in the world of
> semantic web. As it's simple and can be found in a lot of domains, I think
> we should use it in Wikidata.
>
>
>
> 2014-06-11 11:51 GMT+02:00 Gerard Meijssen <gerard.meijs...@gmail.com>:
>
> Hoi,
>> You lost me. What have classes to do with this?
>>
>> The system is flexible and you assume that there are special cases...
>> Name one and it can be fit in.
>> Thanks,
>>      GerardM
>>
>>
>> On 11 June 2014 11:48, Thomas Douillard <thomas.douill...@gmail.com>
>> wrote:
>>
>>> There is always special cases that needs special treatments, inference
>>> is done again and again, there is no point to recompute everything with an
>>> algorithm or a query with hard coded exceptions when we have a simple and
>>> regular system who can handle and put the exceptions in the datas.
>>>
>>> Class of classes is such a system. I'll quote the python programming
>>> language motto here "explicit is better than implicit", which is not really
>>> different from "avoid redundances at all cost is not always a good thing".
>>>
>>> The same reason we have classes in the first time, can also apply to
>>> classes of classes. Some class membership can be inferred by a query, I
>>> don't think it's always a bad idea to state the membership explicitely in
>>> all cases.
>>>
>>>
>>> 2014-06-11 11:21 GMT+02:00 Gerard Meijssen <gerard.meijs...@gmail.com>:
>>>
>>> Hoi,
>>>> A bot run by Amir has remedied many of the issues that resulted from an
>>>> import of data for the United States. The fix was to only point to one
>>>> level up and not have a reference to the state from every location. It is
>>>> implicitly there.. in the final analysis we do not need to know in what
>>>> country something is as it can be inferred.
>>>>
>>>> This system assumes that we build the upper layers as is relevant to a
>>>> specific country,. So yes it is usable for any country, type of
>>>> administrative or territorial entity including how for instance the Roman
>>>> Catholic church does its thing.
>>>> Thanks,
>>>>      Gerard
>>>>
>>>>
>>>> On 11 June 2014 11:08, Thomas Douillard <thomas.douill...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, Gerard, I don't understand, As needed for what ?
>>>>> In your example it is enough to retrieve all the territorial entities
>>>>> a location is in.
>>>>>
>>>>> But let's say I want to get the administrative territorial
>>>>> organisation of France (Wikipedias probably ), I mean like "france is
>>>>> divided in regions, regions are divided in departments, and so on), for
>>>>> example, do we have enough in your model ?
>>>>>
>>>>> I propose to add to the classes like <French Region> for example an
>>>>> "instance of" claim that states <French Region> instance of <French
>>>>> administrative division type> to reflect that in Wikidata.
>>>>>
>>>>> Then if I want to know how france is administratively divided, I query
>>>>> all the instances in that class.
>>>>>
>>>>> This is a complement to <Pays de la Loire> instance of <French region>
>>>>> for example.
>>>>>
>>>>>
>>>>>
>>>>> 2014-06-11 10:37 GMT+02:00 Gerard Meijssen <gerard.meijs...@gmail.com>
>>>>> :
>>>>>
>>>>> Hoi,
>>>>>> Important to recognise is that there can be as many layers as are
>>>>>> needed.. ie a roller coaster can be in a park, a park can be in a
>>>>>> settlement, a settlement in a municipality, a municipality in a county, a
>>>>>> county in a province, a province in a state and finally a state in a
>>>>>> country (that is on a continent)...
>>>>>>
>>>>>> This is how it effectively is already in Wikidata for many "locations"
>>>>>> Thanks,
>>>>>>       Gerard
>>>>>>
>>>>>>
>>>>>> On 11 June 2014 09:48, Thomas Douillard <thomas.douill...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi, I basically proposed a two layers model in extended discussions:
>>>>>>>
>>>>>>> Administrative units | Administrative unit type | Administrative unit classes by country
>>>>>>> City Of London       | City of the UK           | Type of administrative unit of the UK
>>>>>>> Lorraine             | French Region            | Type of administrative unit of France
>>>>>>>
>>>>>>> Where going one step left in the table reads ''instance of''. This
>>>>>>> seems close to your ''helper item'' model.
>>>>>>>
>>>>>>>
>>>>>>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
>
