Hi Eric,

I have been following this issue for a long time, and I haven't found a
satisfactory answer from anyone or from any source. Totally disappointed, I
decided to scrap everything and start again from the beginning, working on
a foundation that would eventually allow representing emergent systems and
natural language. Pulling on that thread I got some interesting insights
that I hope you find useful:
https://docs.google.com/document/d/1dc99zyxdPX8t6Ept_JjSxNalij2PNdYt-A3TG2e4cIw/edit?usp=sharing

If you don't have the time or patience to read the whole 27 pages (which are
just an introduction that needs refinement and expansion), here is the gist:
- in real life there is no difference between calling something a
"metaclass", an "identifier" or "information": they all refer to an observer
system encoding an observed signal together with a pattern recognition model
- in real life instantiation is an observer system putting a frame on an
observed system. This frame might be more or less justified considering the
intrinsic qualities of what is placed inside, but in the end it is entirely
observer-defined, so it is of little use other than to point to an "agreed or
arbitrary bottom concept chosen by the observer"

(I want to emphasize "in real life" because I sometimes notice a focus on
devising "mathematically sound" approaches rather than approaches based
on reality itself.)

In my opinion, the solution for atoms is to separate identifiers from the
entity with a new property ("has/identifier of" or "has/manifestation of")
and to use instance_of only when really necessary.

I hope we don't have to resort to Captain Metaphysics :)
http://existentialcomics.com/comic/47

Cheers,
Micru

On Fri, Sep 26, 2014 at 2:59 PM, Emw <emw.w...@gmail.com> wrote:

> The statement "ethanol *instance of* chemical compound" is ontologically
> incorrect.  Importantly, it is also incompatible with ChEBI, the most
> widely-used chemistry ontology.
>
> The matter of how to apply *instance of* (P31, rdf:type) and *subclass of*
> (P279, rdfs:subClassOf) on Wikidata in relation to chemical entities has
> been, as Thomas puts it, a long discussion [1-5].  Hopefully with a wider
> audience and experts like Markus Krötzsch and Denny Vrandečić now
> interested, we can come to a resolution at least in the particular domain
> of chemical compounds.  Since it concerns interoperability with another
> large Semantic Web project, I have copied Janna Hastings and Alan
> Ruttenberg on this discussion.  Janna coordinates ChEBI.  Alan coordinates
> BFO, the upper ontology used by ChEBI and many other major ontologies in
> the natural sciences, like Gene Ontology and Disease Ontology.
>
> Denny indicates how the statement "Porsche 356 *instance of* car" would
> be incorrect in Wikidata even though "Porsche 356 *is a* car" is
> acceptable in everyday speech.  Similarly, "ethanol *instance of*
> chemical compound" is incorrect in Wikidata even though "ethanol *is a*
> chemical compound" is acceptable in less formal contexts.
>
> A key difference between talk about cars and talk about chemicals is that,
> with cars, we have familiar terms like "car model" that distinguish
> concrete instances (that *particular* car you see on the street) from
> abstract "instances" (i.e. metaclasses, classes that are also instances,
> the *kind* of car that you see on the street).  We do not have a
> well-known term like "chemical model" or "chemical compound type" to
> distinguish classes (types) of chemicals and instances (tokens) of
> chemicals.  When one speaks of the properties of ethanol or hydrogen, it is
> understood that the subject is *all concrete, particular, spatiotemporal
> tokens, i.e. instances* of ethanol and hydrogen -- not just a specific
> ethanol molecule floating in that container before you on a Saturday with
> friends, but all molecules that we label "ethanol" everywhere.
>
> Thus, in order to formally classify ethanol itself as opposed to some
> particular ethanol molecule, we must say for an item like
> http://www.wikidata.org/wiki/Q153: "ethanol *subclass of* chemical
> compound" and not "ethanol *instance of* chemical compound".  (On
> Wikidata, the statement is more precisely "ethanol *subclass of* alcohol",
> but it is entailed from the statements "alcohol *subclass of* organic
> compound" and "organic compound *subclass of* chemical compound" that
> "ethanol *subclass of* chemical compound".)
>
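The entailment described above follows from *subclass of* being transitive.
As a minimal sketch (the dictionary SUBCLASS_OF and the helper names are
hypothetical, chosen only for illustration; this is not how Wikidata or
ChEBI store or reason over statements), it can be checked in a few lines of
Python:

```python
# Toy subclass-of (P279) edges, taken from the three statements quoted above.
# The data structure and names are illustrative assumptions.
SUBCLASS_OF = {
    "ethanol": {"alcohol"},
    "alcohol": {"organic compound"},
    "organic compound": {"chemical compound"},
}


def superclasses(item, edges):
    """Return every class reachable from `item` via subclass-of edges."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass_of(item, target, edges):
    """True if "item subclass of target" is entailed by transitivity."""
    return target in superclasses(item, edges)


if __name__ == "__main__":
    print(is_subclass_of("ethanol", "chemical compound", SUBCLASS_OF))
```

Only "ethanol subclass of alcohol" is stated directly; the link to "chemical
compound" is derived by walking up the chain.
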
> A common defense of statements like "ethanol *instance of* chemical
> compound" is that Wikidata will never have items about any concrete
> molecules of ethanol, so, since ethanol is a "leaf node" in our concept
> taxonomy, it makes sense to state that ethanol is an instance.  That
> interpretation of "instance" is short-sighted.  It precludes us from ever
> talking about particular tokens of ethanol, or particular aggregates of
> such objects, without overhauling our chemistry ontology.  Excluding
> consideration of metaclasses like "chemical compound type", the fact that
> an entity is a leaf node in a concept hierarchy is a necessary but not
> sufficient condition for using *instance of*.
>
> Another common suggestion is that we should state something like "ethanol
> *instance of* chemical compound type" and "ethanol *subclass of* chemical
> compound".
>
> To see where that gets us, try wrapping your head around this:
> https://commons.wikimedia.org/wiki/File:Atom_classes.svg.  Really, take a
> look.  If we want Wikidata's concept hierarchy to be seen as dauntingly
> complex, pervasively applying that kind of three-layer classification
> scheme will do.
>
> The kind of explicit metamodeling seen when punning things like cars and
> car models, ships and ship classes, biological taxa and organisms, etc.
> works reasonably well in certain domains.  But, while we hold that hammer
> in one hand, we should be careful not to see everything as a nail.  Outside
> domains that have established vocabulary for metaclasses, imposing explicit
> metamodeling with statements like "ethanol *instance of* chemical
> compound type" or "hydrogen *instance of* atom type" will strike users as
> unduly complex.
>
> Without such metamodeling, though, querying for a list of chemical
> compounds becomes murkier.  Surely we would want to return "ethanol" and
> not "organic compound" in such a list.  How about "alcohol"?  Relatedly, if
> we don't state "oxygen *instance of* chemical element", then how can we
> easily query for all the elements in the Periodic Table of Elements without
> including in the results any potential subclasses of oxygen (e.g.,
> isotopes of oxygen like oxygen-16, oxygen-17, etc.)?
>
> There are ways to achieve that in SPARQL using rdfs:subClassOf / P279 /
> *subclass of*, but they require adhering to certain conventions.  When
> faced with
> requiring many potential query users to learn some Wikidata MetaObject
> Protocol, though, I'm inclined to make some sacrifices for simplicity,
> ontological correctness, and consistency with major existing ontologies.
>
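One possible convention of this kind, sketched in Python rather than SPARQL:
treat only the *direct* subclasses of "chemical element" as the elements.
The toy data and function names below are assumptions for illustration, and
whether Wikidata items would reliably follow such a convention is exactly
the open question. The contrast is roughly that between a single
rdfs:subClassOf step and the transitive path rdfs:subClassOf+ in SPARQL 1.1.

```python
# Toy subclass-of (P279) edges; an illustrative assumption, not real
# Wikidata content. Isotopes are modelled as subclasses of their element.
SUBCLASS_OF = {
    "oxygen": {"chemical element"},
    "hydrogen": {"chemical element"},
    "oxygen-16": {"oxygen"},
    "oxygen-17": {"oxygen"},
}


def direct_subclasses(target, edges):
    """Items with a stated subclass-of edge pointing straight at `target`."""
    return sorted(item for item, parents in edges.items() if target in parents)


def all_subclasses(target, edges):
    """Items whose subclass-of chain reaches `target` in one or more steps."""
    found = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for item, parents in edges.items():
            if current in parents and item not in found:
                found.add(item)
                frontier.append(item)
    return sorted(found)


if __name__ == "__main__":
    print(direct_subclasses("chemical element", SUBCLASS_OF))
    print(all_subclasses("chemical element", SUBCLASS_OF))
```

The direct query returns only oxygen and hydrogen, while the transitive one
also pulls in the isotopes. The convention breaks as soon as someone inserts
an intermediate class (say, a grouping between "chemical element" and
"oxygen"), which is why it has to be learned and maintained by everyone.
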
> In summary, this ball has been punted around for over a year now.  Because
> of the impasse in how to classify chemical entities, we now have showcase
> items that have obvious problems like entailing that something is both a
> class and an instance of chemical compound.  We need input from a wider
> group of
> people knowledgeable about ontology or chemistry, ideally both.  Hopefully
> with a Wikimedian in Residence at the Royal Society of Chemistry [6] we'll
> get some more focused resources on this.  All major scientific ontologies
> use *subclass of* (rdfs:subClassOf), not *instance of* (rdf:type), to
> classify such things.  In my opinion, Wikidata should maintain technical
> and philosophical compatibility with ontologies like ChEBI and remove
> statements like "ethanol *instance of* chemical compound".  This would
> improve interoperability between Wikidata and the rest of the Semantic Web.
>
> Thanks,
> Eric
>
> https://www.wikidata.org/wiki/User:Emw
>
> 1. https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Forth_and_back_conversions_of_items_between_class_and_instance
> 2. https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/05#chemical_element
> 3. https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry#Germanium_subclass_tree
> 4. https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Subclass_of_two_different_things
> 5. https://www.wikidata.org/wiki/Help_talk:Basic_membership_properties#Proposition_of_definition
> 6. http://pigsonthewing.org.uk/wikimedian-residence-royal-society-chemistry/
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
>


-- 
Etiamsi omnes, ego non