Hi Eric,
I have been following this issue for a long time, and I haven't found a
satisfactory answer from anyone or from any source. Totally disappointed, I
decided to scrap everything and start again from the beginning, working on
a foundation that would eventually make it possible to represent emergent
systems and natural language. Pulling on that thread, I got some interesting
insights that I hope you find useful:
https://docs.google.com/document/d/1dc99zyxdPX8t6Ept_JjSxNalij2PNdYt-A3TG2e4cIw/edit?usp=sharing
If you don't have the time or patience to read the whole 27 pages (which are
just an introduction that needs refinement and expansion), here is the gist:
- in real life there is no difference between calling something
"metaclass", "identifier" or "information", they all refer to an observer
system encoding an observed signal together with a pattern recognition model
- in real life instantiation is an observer system putting a frame on an
observed system. This frame might be more or less justified considering the
intrinsic qualities of what is placed inside, but in the end it is entirely
observer-defined, so quite useless other than to point to an "agreed or
arbitrary bottom concept chosen by the observer"
(I want to emphasize "in real life" because I sometimes notice a focus on
devising "mathematically sound" approaches rather than approaches based on
reality itself.)
In my opinion the solution to atoms is to separate identifiers from the
entity with a new property ("has/identifier of" or "has/manifestation of")
and use instance_of only when really necessary.
I hope we don't have to resort to Captain Metaphysics :)
http://existentialcomics.com/comic/47
Cheers,
Micru
On Fri, Sep 26, 2014 at 2:59 PM, Emw <[email protected]> wrote:
> The statement "ethanol *instance of* chemical compound" is ontologically
> incorrect. Importantly, it is also incompatible with ChEBI, the most
> widely-used chemistry ontology.
>
> The matter of how to apply *instance of* (P31, rdf:type) and *subclass of*
> (P279, rdfs:subClassOf) on Wikidata in relation to chemical entities has
> been, as Thomas puts it, a long discussion [1-5]. Hopefully with a wider
> audience and experts like Markus Krötzsch and Denny Vrandečić now
> interested, we can come to a resolution at least in the particular domain
> of chemical compounds. Since it concerns interoperability with another
> large Semantic Web project, I have copied Janna Hastings and Alan
> Ruttenberg on this discussion. Janna coordinates ChEBI. Alan coordinates
> BFO, the upper ontology used by ChEBI and many other major ontologies in
> the natural sciences, like Gene Ontology and Disease Ontology.
>
> Denny indicates how the statement "Porsche 356 *instance of* car" would
> be incorrect in Wikidata even though "Porsche 356 *is a* car" is
> acceptable in everyday speech. Similarly, "ethanol *instance of*
> chemical compound" is incorrect in Wikidata even though "ethanol *is a*
> chemical compound" is acceptable in less formal contexts.
>
> A key difference between talk about cars and talk about chemicals is that,
> with cars, we have familiar terms like "car model" that distinguish
> concrete instances (that *particular* car you see on the street) from
> abstract "instances" (i.e. metaclasses, classes that are also instances,
> the *kind* of car that you see on the street). We do not have a
> well-known term like "chemical model" or "chemical compound type" to
> distinguish classes (types) of chemicals and instances (tokens) of
> chemicals. When one speaks of the properties of ethanol or hydrogen, it is
> understood that the subject is *all concrete, particular, spatiotemporal
> tokens, i.e. instances* of ethanol and hydrogen -- not just a specific
> ethanol molecule floating in that container before you on a Saturday with
> friends, but all molecules that we label "ethanol" everywhere.
>
> Thus, in order to formally classify ethanol itself as opposed to some
> particular ethanol molecule, we must say for an item like
> http://www.wikidata.org/wiki/Q153: "ethanol *subclass of* chemical
> compound" and not "ethanol *instance of* chemical compound". (On
> Wikidata, the statement is more precisely "ethanol *subclass of* alcohol",
> but it is entailed from the statements "alcohol *subclass of* organic
> compound" and "organic compound *subclass of* chemical compound" that
> "ethanol *subclass of* chemical compound".)
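
A minimal sketch of the entailment described above, in Python. The three
direct statements come from the paragraph; the dictionary layout and the
function name are hypothetical and only illustrate how following *subclass
of* links transitively yields "ethanol *subclass of* chemical compound". It
is not how Wikidata or any reasoner is actually implemented.

```python
# Direct "subclass of" (P279) statements quoted in the paragraph above.
# The data structure itself is an assumption made for illustration.
SUBCLASS_OF = {
    "ethanol": {"alcohol"},
    "alcohol": {"organic compound"},
    "organic compound": {"chemical compound"},
}


def superclasses(item, direct=SUBCLASS_OF):
    """Return every class reachable from `item` via subclass-of links."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    # "chemical compound" is reached without any direct statement about it.
    print(sorted(superclasses("ethanol")))
```

Only the direct statement needs to be stored on the item; the broader
classification follows from the chain of superclasses.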
>
> A common defense of statements like "ethanol *instance of* chemical
> compound" is that Wikidata will never have items about any concrete
> molecules of ethanol, so, since ethanol is a "leaf node" in our concept
> taxonomy, it makes sense to state that ethanol is an instance. That
> interpretation of "instance" is short-sighted. It precludes us from ever
> talking about particular tokens of ethanol, or particular aggregates of
> such objects, without overhauling our chemistry ontology. Excluding
> consideration of metaclasses like "chemical compound type", the fact that
> an entity is a leaf node in a concept hierarchy is a necessary but not
> sufficient condition for using *instance of*.
>
> Another common suggestion is that we should state something like "ethanol
> *instance of* chemical compound type" and "ethanol *subclass of* chemical
> compound".
>
> To see where that gets us, try wrapping your head around this:
> https://commons.wikimedia.org/wiki/File:Atom_classes.svg. Really, take a
> look. If we want Wikidata's concept hierarchy to be seen as dauntingly
> complex, pervasively applying that kind of three-layer classification
> scheme will do it.
>
> The kind of explicit metamodeling seen when punning things like cars and
> car models, ships and ship classes, biological taxa and organisms, etc.
> works reasonably well in certain domains. But, while we hold that hammer
> in one hand, we should be careful not to see everything as a nail. Outside
> domains that have established vocabulary for metaclasses, imposing explicit
> metamodeling with statements like "ethanol *instance of* chemical
> compound type" or "hydrogen *instance of* atom type" will strike users as
> unduly complex.
>
> Without such metamodeling, though, querying for a list of chemical
> compounds becomes murkier. Surely we would want to return "ethanol" and
> not "organic compound" in such a list. How about "alcohol"? Relatedly, if
> we don't state "oxygen *instance of* chemical element", then how can we
> easily query for all the elements in the Periodic Table of Elements without
> including in the results any potential subclasses of oxygen (e.g.,
> isotopes of oxygen like oxygen-16, oxygen-17, etc.)?
>
> There are ways to achieve that in SPARQL using rdfs:subClassOf / P279 /
> *subclass of*, but they require adhering to certain conventions. When faced with
> requiring many potential query users to learn some Wikidata MetaObject
> Protocol, though, I'm inclined to make some sacrifices for simplicity,
> ontological correctness, and consistency with major existing ontologies.
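
One possible convention of the kind mentioned above, sketched in Python
rather than SPARQL. It assumes (hypothetically; the message does not say
which conventions are meant) that every element is stated as a *direct*
subclass of "chemical element" and every isotope as a subclass of its
element. The sample statements and function names are illustrative only.

```python
# Hypothetical (child, parent) "subclass of" statements. The convention that
# elements are direct subclasses of "chemical element" is an assumption.
SUBCLASS_OF = [
    ("oxygen", "chemical element"),
    ("hydrogen", "chemical element"),
    ("oxygen-16", "oxygen"),
    ("oxygen-17", "oxygen"),
]


def direct_subclasses(cls, statements=SUBCLASS_OF):
    """Classes exactly one subclass-of step below `cls`."""
    return sorted(child for child, parent in statements if parent == cls)


def all_subclasses(cls, statements=SUBCLASS_OF):
    """Classes any number of subclass-of steps below `cls`."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in statements:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return sorted(found)


if __name__ == "__main__":
    print(direct_subclasses("chemical element"))  # elements only
    print(all_subclasses("chemical element"))     # elements and isotopes
```

The one-step query returns the elements without the isotopes, but only as
long as every editor follows the same convention, which is the cost being
weighed here.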
>
> In summary, this ball has been punted for over a year now. Because of the
> impasse in how to classify chemical entities, we now have showcase items
> that have obvious problems like entailing that something is both a class
> and an instance of chemical compound. We need input from a wider group of
> people knowledgeable about ontology or chemistry, ideally both. Hopefully
> with a Wikimedian in Residence at the Royal Society of Chemistry [6] we'll
> get some more focused resources on this. All major scientific ontologies
> use *subclass of* (rdfs:subClassOf), not *instance of* (rdf:type), to
> classify such things. In my opinion, Wikidata should maintain technical
> and philosophical compatibility with ontologies like ChEBI and remove
> statements like "ethanol *instance of* chemical compound". This would
> improve interoperability between Wikidata and the rest of the Semantic Web.
>
> Thanks,
> Eric
>
> https://www.wikidata.org/wiki/User:Emw
>
> [1] https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Forth_and_back_conversions_of_items_between_class_and_instance
> [2] https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/05#chemical_element
> [3] https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry#Germanium_subclass_tree
> [4] https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Subclass_of_two_different_things
> [5] https://www.wikidata.org/wiki/Help_talk:Basic_membership_properties#Proposition_of_definition
> [6] http://pigsonthewing.org.uk/wikimedian-residence-royal-society-chemistry/
>
> _______________________________________________
> Wikidata-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
>
--
Etiamsi omnes, ego non