Hi Eric,

I have been following this issue for a long time, and I haven't found a satisfactory answer from anyone or from any source. Totally disappointed, I decided to scrap everything and start again from the beginning, working on a foundation that would eventually allow representing emergent systems and natural language. Pulling on that thread, I got some interesting insights that I hope you find useful:
https://docs.google.com/document/d/1dc99zyxdPX8t6Ept_JjSxNalij2PNdYt-A3TG2e4cIw/edit?usp=sharing

If you don't have the time or patience to read the whole 27 pages (which is just an introduction that needs refinement and expansion), here are the main points:

- In real life there is no difference between calling something a "metaclass", an "identifier" or "information": they all refer to an observer system encoding an observed signal together with a pattern recognition model.
- In real life, instantiation is an observer system putting a frame on an observed system. This frame might be more or less justified considering the intrinsic qualities of what is placed inside, but in the end it is entirely observer-defined, so it is of little use other than to point to an "agreed or arbitrary bottom concept chosen by the observer".

(I want to emphasize "in real life" because I sometimes notice a focus on devising "mathematically sound" approaches rather than approaches based on reality itself.)

In my opinion the solution for atoms is to separate identifiers from the entity with a new property ("has/identifier of" or "has/manifestation of") and to use instance_of only when really necessary.

I hope we don't have to resort to Captain Metaphysics :)
http://existentialcomics.com/comic/47

Cheers,
Micru

On Fri, Sep 26, 2014 at 2:59 PM, Emw <emw.w...@gmail.com> wrote:
> The statement "ethanol *instance of* chemical compound" is ontologically
> incorrect. Importantly, it is also incompatible with ChEBI, the most
> widely-used chemistry ontology.
>
> The matter of how to apply *instance of* (P31, rdf:type) and *subclass of*
> (P279, rdfs:subClassOf) on Wikidata in relation to chemical entities has
> been, as Thomas puts it, a long discussion [1-5]. Hopefully with a wider
> audience and experts like Markus Krötzsch and Denny Vrandečić now
> interested, we can come to a resolution at least in the particular domain
> of chemical compounds. Since it concerns interoperability with another
> large Semantic Web project, I have copied Janna Hastings and Alan
> Ruttenberg on this discussion. Janna coordinates ChEBI.
> Alan coordinates BFO, the upper ontology used by ChEBI and many other
> major ontologies in the natural sciences, like Gene Ontology and Disease
> Ontology.
>
> Denny indicates how the statement "Porsche 356 *instance of* car" would
> be incorrect in Wikidata even though "Porsche 356 *is a* car" is
> acceptable in everyday speech. Similarly, "ethanol *instance of*
> chemical compound" is incorrect in Wikidata even though "ethanol *is a*
> chemical compound" is acceptable in less formal contexts.
>
> A key difference between talk about cars and talk about chemicals is that,
> with cars, we have familiar terms like "car model" that distinguish
> concrete instances (that *particular* car you see on the street) from
> abstract "instances" (i.e. metaclasses, classes that are also instances,
> the *kind* of car that you see on the street). We do not have a
> well-known term like "chemical model" or "chemical compound type" to
> distinguish classes (types) of chemicals and instances (tokens) of
> chemicals. When one speaks of the properties of ethanol or hydrogen, it is
> understood that the subject is *all concrete, particular, spatiotemporal
> tokens, i.e. instances* of ethanol and hydrogen -- not just a specific
> ethanol molecule floating in that container before you on a Saturday with
> friends, but all molecules that we label "ethanol" everywhere.
>
> Thus, in order to formally classify ethanol itself as opposed to some
> particular ethanol molecule, we must say for an item like
> http://www.wikidata.org/wiki/Q153: "ethanol *subclass of* chemical
> compound" and not "ethanol *instance of* chemical compound". (On
> Wikidata, the statement is more precisely "ethanol *subclass of* alcohol",
> but it is entailed from the statements "alcohol *subclass of* organic
> compound" and "organic compound *subclass of* chemical compound" that
> "ethanol *subclass of* chemical compound".)
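
A minimal sketch of the entailment described in that parenthetical, assuming a toy in-memory set of statements. The dictionary `SUBCLASS_OF` and the helper `entailed_superclasses` are illustrative names only, not Wikidata's data model or API. Because *subclass of* is transitive, the single explicit statement about alcohol is enough to derive the statement about chemical compound:

```python
# Toy "subclass of" (P279) statements, copied from the example in the text.
# This is a hypothetical in-memory structure, not how Wikidata stores claims.
SUBCLASS_OF = {
    "ethanol": {"alcohol"},
    "alcohol": {"organic compound"},
    "organic compound": {"chemical compound"},
}


def entailed_superclasses(item, statements):
    """Return every class reachable from `item` by following subclass-of edges."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for parent in statements.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(entailed_superclasses("ethanol", SUBCLASS_OF)))
# prints ['alcohol', 'chemical compound', 'organic compound']
```

Only "ethanol *subclass of* alcohol" is stated directly here; the other two superclasses come out of the traversal, which is the sense in which the broader statement is entailed.
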
>
> A common defense of statements like "ethanol *instance of* chemical
> compound" is that Wikidata will never have items about any concrete
> molecules of ethanol, so, since ethanol is a "leaf node" in our concept
> taxonomy, it makes sense to state that ethanol is an instance. That
> interpretation of "instance" is short-sighted. It precludes us from ever
> talking about particular tokens of ethanol, or particular aggregates of
> such objects, without overhauling our chemistry ontology. Excluding
> consideration of metaclasses like "chemical compound type", the fact that
> an entity is a leaf node in a concept hierarchy is a necessary but not
> sufficient condition for using *instance of*.
>
> Another common suggestion is that we should state something like "ethanol
> *instance of* chemical compound type" and "ethanol *subclass of* chemical
> compound".
>
> To see where that gets us, try wrapping your head around this:
> https://commons.wikimedia.org/wiki/File:Atom_classes.svg. Really, take a
> look. If we want Wikidata's concept hierarchy to be seen as dauntingly
> complex, pervasively applying that kind of three-layer classification
> scheme will do.
>
> The kind of explicit metamodeling seen when punning things like cars and
> car models, ships and ship classes, biological taxa and organisms, etc.
> works reasonably well in certain domains. But, while we hold that hammer
> in one hand, we should be careful not to see everything as a nail. Outside
> domains that have established vocabulary for metaclasses, imposing explicit
> metamodeling with statements like "ethanol *instance of* chemical
> compound type" or "hydrogen *instance of* atom type" will strike users as
> unduly complex.
>
> Without such metamodeling, though, querying for a list of chemical
> compounds becomes murkier. Surely we would want to return "ethanol" and
> not "organic compound" in such a list. How about "alcohol"?
> Relatedly, if we don't state "oxygen *instance of* chemical element",
> then how can we easily query for all the elements in the Periodic Table
> of Elements without including in the results any potential subclasses of
> oxygen (e.g., isotopes of oxygen like oxygen-16, oxygen-17, etc.)?
>
> There are ways to achieve that in SPARQL using rdfs:subClassOf / P279 /
> *subclass of*, but they require adhering to certain conventions. When
> faced with requiring many potential query users to learn some Wikidata
> MetaObject Protocol, though, I'm inclined to make some sacrifices for
> simplicity, ontological correctness, and consistency with major existing
> ontologies.
>
> In summary, this ball has been punted for over a year now. Because of the
> impasse in how to classify chemical entities, we now have showcase items
> that have obvious problems like entailing that something is both a class
> and an instance of chemical compound. We need input from a wider group of
> people knowledgeable about ontology or chemistry, ideally both. Hopefully
> with a Wikimedian in Residence at the Royal Society of Chemistry [6] we'll
> get some more focused resources on this. All major scientific ontologies
> use *subclass of* (rdfs:subClassOf), not *instance of* (rdf:type), to
> classify such things. In my opinion, Wikidata should maintain technical
> and philosophical compatibility with ontologies like ChEBI and remove
> statements like "ethanol *instance of* chemical compound". This would
> improve interoperability between Wikidata and the rest of the Semantic Web.
>
> Thanks,
> Eric
>
> https://www.wikidata.org/wiki/User:Emw
>
> 1.
> https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Forth_and_back_conversions_of_items_between_class_and_instance
> 2.
> https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/05#chemical_element
> 3.
> https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Chemistry#Germanium_subclass_tree
> 4.
> https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2014/07#Subclass_of_two_different_things
> 5.
> https://www.wikidata.org/wiki/Help_talk:Basic_membership_properties#Proposition_of_definition
> 6.
> http://pigsonthewing.org.uk/wikimedian-residence-royal-society-chemistry/
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>

--
Etiamsi omnes, ego non