Re: RDF Semantics - Intuitive summary needs to be scoped to interpretations (ISSUE-149)

Pat Hayes Wed, 30 Oct 2013 23:34:15 -0700

Hi David

Rather than respond point-by-point, I will again try to summarize. However, 
there are a few responses that are needed first:


> ... at least in principle, anything that can be described in, say, English 
> prose could instead be described in RDF.

Most emphatically, no. Even if you substitute the most expressive formal logic 
available (say, full higher-order modal tense logic), this would not be even 
remotely correct. RDF is so inexpressive that it cannot manage something as 
simple as "Fathers are not mothers."  OWL cannot define the idea of an uncle, 
and full first-order logic cannot define the idea of a natural number. 

> But AFAICT, the trend is inevitably *toward* mismatch as more statements are 
> published, assuming that: (a) parties publish data independently (without 
> knowledge of each other)

But why would you assume this? Usually, if B is publishing data using A's IRIs 
(the only case that is of interest here) then B will have access to *some* 
published information which will help determine what A's intentions were 
regarding A's intended meaning. For example, if you use DBpedia IRIs then there 
are large pages of information available, in multiple languages. The entire 
Semantic Web/linked-data enterprise is predicated on the idea that IRIs both 
denote entities and also provide links to sources of more information about 
those entities (or, if you like, more information about what the IRI is 
intended to denote.) So the idea of two RDF authors using the same IRI but 
without any knowledge of what the owner of the IRI intended it to refer to, is 
SW/LD-pathological.

>   If the problem is disagreement then yes, you would have to choose between 
> the source graphs.  But if the problem is divergence then you have to do some 
> more work -- resource identity splitting -- but can still use both source 
> graphs after splitting.

Changing the IRIs in a graph gives you a different graph. So you would not be 
using both source graphs, but some modification of the source graphs. And you 
would be obliged to *not* use - that is, to reject - at least one of the source 
graphs, when they are mutually inconsistent. 

>  ... RDF data does not generally describe the real world, it describes a 
> particular *conceptualization* of the real world

It describes the world *using* a conceptualization (is there any other way to 
describe anything?). It does not (usually) describe the conceptualization.

>  false graphs aren't very useful, because they entail everything

Just as a technical point: *logically false* graphs – contradictions, false in 
*every* interpretation – entail everything. Mere falsity does not get you 
quodlibet. 

---------

There are two substantive points of disagreement between us, and one complete 
mismatch (divergence?) where I fail to understand what you are saying. Let me 
deal with the two points first. 

1.  The reality thesis: that the real world is one of the satisfying 
interpretations, and data is (usually) about the real world. I find this 
obvious, so obvious indeed that it should not even need to be said. You 
apparently find it either mistaken or meaningless, and in any case think it is 
misleading as a guide to intuition. I am not sure how to persuade you to my way 
of thinking, but let me ask you: if all this linked data is not about reality, 
what do you think it *is* about? And why do we find it useful, if it does not 
provide us with information about the actual world? Are the records of the 
transactions in your bank account about your actual wealth? Would that change 
if the bank started using RDF?

Your objections to the idea include the observation cited above about 
conceptualizations. Yes, of course data is stated *using* a conceptualization, 
just like all assertions in every language or formalism. But that does not make 
it any less about reality. It really is a fact about the real world that Hillary 
and Tenzing climbed Everest in 1953; that we conceptualize the world here in 
terms of people and mountains does not make this any less true. I am not sure 
what the point of your "toucan" example is, but apparently the real world can 
satisfy both the bird assertions and the website assertions, by appropriate 
choice of an interpretation mapping. (If the complete set of assertions is 
inconsistent then of course nothing can satisfy it.) Your third point concerned 
approximations and idealizations, such as the flat-earth geography of road 
maps. But examples like this do not argue against the reality thesis. An 
approximate or idealized description of X is still a description of X. Bear in 
mind that if some RDF can be satisfied by an approximation or simplification of 
the real world, then it can also be satisfied by the more complicated real 
world, since one can add (an infinite amount of) structure to an interpretation 
freely without making any RDF triples false. (This is a consequence of RDF 
being a positive logic without negation.) The map example is quite instructive, 
as quite a lot of geolocation information (e.g. lat/long coordinates) is in fact 
describing spherical space rather than flat space, even though we project it 
onto flat surfaces. 
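
The point about adding structure can be illustrated with a small Python sketch. It assumes a deliberately simplified notion of interpretation (ground triples only, no literals or blank nodes), and the names `satisfies`, `flat_map`, `richer_world` and all `ex:` IRIs and values are hypothetical, invented purely for illustration.

```python
# Simplified RDF-style interpretation (an assumption of this sketch):
#   denote: maps each IRI to a thing in the universe
#   iext:   maps a property-thing to the set of (subject, object) pairs it relates
# Ground triples only; literals and blank nodes are ignored.

def satisfies(denote, iext, graph):
    """True if every triple in `graph` is true under (denote, iext)."""
    return all(
        (denote[s], denote[o]) in iext.get(denote[p], set())
        for s, p, o in graph
    )

# Hypothetical road-map data.
graph = {("ex:townA", "ex:roadTo", "ex:townB")}
denote = {"ex:townA": "A", "ex:townB": "B", "ex:roadTo": "road"}

# An idealized "flat map" world: two towns and one road relation.
flat_map = {"road": {("A", "B")}}

# A richer world: the same facts plus things and relations that the
# graph never mentions (made-up elevations, a third town, another road).
richer_world = {
    "road": {("A", "B"), ("B", "C")},
    "elevation_m": {("A", 12), ("B", 340)},
}

print(satisfies(denote, flat_map, graph))      # True
print(satisfies(denote, richer_world, graph))  # True
```

Nothing in the graph can be made false by enlarging the relation extensions, because no triple asserts the absence of anything.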

To say that some assertion is about the real world, or that it is factual, is 
not to claim that it is in some metaphysical sense the final truth or the 
definitive description, or that it is the last word, or that its truth has to 
have ended science. It is just saying that it is true.

You say: 
> ... The "real world" interpretation is largely irrelevant -- both to the 
> formal semantics and to understanding how the Semantic Web *actually* works.

I strongly disagree. Many IRIs have fixed interpretations in the actual world, 
determined by all kinds of social, technical and linguistic conventions and 
meanings entirely outside RDF. We still want to be able to use RDF to describe 
these referents. For example, I am a consultant on a project 
(http://www.imagesnippets.com/) to add RDF markup to images. These RDF 
descriptions use IRIs which identify (and in the RDF refer to) images, regions 
in the images, people and places and colors and objects described in DBpedia 
and many other real (no scare quotes) things in the real world. None of these 
denotation mappings are specified by RDF descriptions, and most of them could 
not be. Most – I would claim, virtually all – RDF linked data uses IRIs like 
this to refer to real things. It is centrally important that the formal 
semantics works with such identifying IRIs. 

'Edmund Hillary climbed Everest in 1953' says something true about the actual, 
real, world. It expresses a fact. Just a mundane, simple bit of data. So, how 
is the factuality of this fact related to model-theory semantics? By the 
actual, real, world being one of the satisfying interpretations of it. Because 
if the real world was not a satisfying interpretation of this sentence, then it 
*couldn't possibly* be true (in the real world.) 

But we can, if you like, simply agree to disagree about this, as it has no 
direct bearing on the basic point we have been arguing about, which is...  

2.  The idea of an IRI denoting something "in a graph".  Your gloss on this 
phrase, as I now understand it from your email (the first time you have 
explained your intended meaning) is as follows: you take all the 
interpretations which satisfy the graph (and there will be different such sets 
for different graphs, of course) and then you ask, what does the IRI denote in 
those interpretations? And that is what the IRI denotes "in the graph". (Do I 
have that right?)

But that does not define anything, because for any consistent graph G, and any 
IRI U in that graph, there are interpretations which satisfy G and in which U 
denotes things different from what it denotes in other interpretations 
satisfying G. There is no graph which 'pins  down' the interpretations of the 
URIs which occur in it in the way that your definition requires.  (Here is a 
simple proof. Let x be something which is not an IRI. The interpretation I with 
universe {x} and IEXT(x)={<x,x>} and I(u)=x for every URI u, satisfies G. The 
Herbrand interpretation H of G also satisfies G. But H(U) = U =/= x = I(U), by 
construction. QED.) In fact, one can make a stronger statement: truth in an 
interpretation does not depend on the identity of the referents of IRIs *at 
all*, because one can take *any* satisfying interpretation and produce another 
isomorphic one with the identities permuted in any way one likes, as long as 
the IEXT mappings are permuted to match. (In fact, this applies to *any* 
axiomatizable, complete formal logic, no matter how expressive.) In a nutshell: 
model theory does not determine reference. 
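
The proof and the permutation remark can be checked mechanically on a toy case. The following Python sketch assumes a simplified interpretation structure (ground triples only, no literals or blank nodes); the graph `G`, the `ex:` IRIs and all function names are hypothetical and chosen only for illustration.

```python
# Assumption: an interpretation is a pair (denote, iext), where
#   denote: function from IRI to a thing in the universe
#   iext:   dict from a thing to the set of (subject, object) pairs it relates

def satisfies(denote, iext, graph):
    """True if every triple in `graph` is true under (denote, iext)."""
    return all(
        (denote(s), denote(o)) in iext.get(denote(p), set())
        for s, p, o in graph
    )

# Hypothetical consistent graph G, with U one of its IRIs.
G = {("ex:a", "ex:p", "ex:b"), ("ex:b", "ex:q", "ex:a")}
U = "ex:a"

# Interpretation I: one-element universe {x}, where x is not an IRI.
x = object()

def I_denote(iri):
    return x

I_iext = {x: {(x, x)}}

# Herbrand interpretation H: every IRI denotes itself, and IEXT is
# read directly off the triples of G.
def H_denote(iri):
    return iri

H_iext = {}
for s, p, o in G:
    H_iext.setdefault(p, set()).add((s, o))

# Permuted copy P of H: swap the referents of ex:a and ex:b, and permute
# the IEXT mappings to match, giving an isomorphic interpretation.
swap = {"ex:a": "ex:b", "ex:b": "ex:a"}

def perm(thing):
    return swap.get(thing, thing)

def P_denote(iri):
    return perm(H_denote(iri))

P_iext = {
    perm(prop): {(perm(a), perm(b)) for a, b in pairs}
    for prop, pairs in H_iext.items()
}

print(satisfies(I_denote, I_iext, G))  # True
print(satisfies(H_denote, H_iext, G))  # True
print(satisfies(P_denote, P_iext, G))  # True
print(H_denote(U), P_denote(U))        # ex:a ex:b
```

All three interpretations satisfy G, yet U denotes three different things in them: x, ex:a and ex:b.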

This should not be too surprising, actually, if you think about how model 
theory is defined. The very definition of interpretation presumes complete 
referential freedom: any IRI can denote anything. And truth is determined 
solely by how those things stand in relations to one another. The entire 
apparatus of model theory makes no reference to the *actual identity* of the 
things in the universe being described. So creating real constraints on 
reference - attaching, as it were, a name to a thing - has to be done by other 
means. In practice, we rely on notions of naming and reference already in use 
in the larger world (as I did when using "Everest" to refer to the highest 
mountain, and how ImageSnippets does when using 'http://schema.org/Person' to 
refer to the class of human beings) and sometimes on predefined mappings (as we 
do when fixing the referents of literals using datatypes) and perhaps even by 
ostension (arguably, http-range-14 can be seen as declaring HTTP GET/200 to be 
a form of ostension.) And this all works quite nicely (a lot of the time) 
because we can all (more or less) agree on what these referring names actually 
refer to, at least well enough to transfer meanings successfully by using them 
as referring names in sentences. 

So, as I believe I have said several times, phrases such as "interpretation of 
an IRI in a graph" are not meaningful. It is not that this is a different 
perspective on model theory, or an alternative viewpoint. It is that it, quite 
literally, does not mean anything. 

---------

Now the place where I fail to understand what it is you are saying. 

At the end of your email you list all the advantages of an "other way" of 
approaching model theory. But as far as I can tell, this "alternative" is 
simply standard model theory. For example:

> The other way to think of the RDF Semantics is in terms of *multiple* 
> interpretations

This is the only correct way. As I have said to you before, *of course* we 
think in terms of multiple interpretations. That is the entire point of 
defining the notion of interpretation. The very definition of entailment refers 
to multiple interpretations. 
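
To make the role of multiple interpretations in entailment concrete, here is a rough Python sketch that searches for a countermodel: an interpretation that satisfies G but not E. It assumes ground triples only and a small fixed universe, so it can refute an entailment but a result of `None` proves nothing beyond that bounded universe. The function names and `ex:` IRIs are hypothetical.

```python
from itertools import chain, combinations, product

def satisfies(denote, iext, graph):
    """True if every triple in `graph` is true under (denote, iext)."""
    return all(
        (denote[s], denote[o]) in iext[denote[p]]
        for s, p, o in graph
    )

def powerset(items):
    """All subsets of `items`, as tuples."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )

def interpretations(iris, universe):
    """Yield every (denote, iext) pair over a fixed finite universe."""
    pairs = list(product(universe, repeat=2))
    extensions = [frozenset(ps) for ps in powerset(pairs)]
    for referents in product(universe, repeat=len(iris)):
        denote = dict(zip(iris, referents))
        for exts in product(extensions, repeat=len(universe)):
            yield denote, dict(zip(universe, exts))

def find_countermodel(g, e, universe=(0, 1)):
    """Return an interpretation satisfying g but not e, or None if
    no such interpretation exists over `universe`."""
    iris = sorted({term for triple in g | e for term in triple})
    for denote, iext in interpretations(iris, universe):
        if satisfies(denote, iext, g) and not satisfies(denote, iext, e):
            return denote, iext
    return None

G = {("ex:a", "ex:p", "ex:b")}
E = {("ex:b", "ex:p", "ex:a")}

print(find_countermodel(G, G) is None)      # True: no interpretation separates G from itself
print(find_countermodel(G, E) is not None)  # True: G does not entail E
```

The search quantifies over every interpretation in the bounded space; no single interpretation, "real world" or otherwise, plays any privileged role in the check.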

> , instead of attempting to assume or impose a single "real world" 
> interpretation.

Well, it is fine to assume that the real world is *one* interpretation, but 
nobody has ever suggested "imposing" a single interpretation. Certainly, 
nothing in the RDF Semantics document speaks of anything like this.

>  By this I mean, for example, that:
> 
> - Two different graph authors may have different sets of intended 
> interpretations in mind when they publish their RDF graphs, and the same URI 
> may indeed denote different resources in those interpretations.

Different sets of interpretations in mind, yes, of course (standard).  URIs 
denoting different things in different interpretations, yes of course 
(standard). URIs denoting different things in different *sets* of 
interpretations, yes, if we are talking about sets of interpretations an author 
*has in mind*. But URIs denoting things in a set of all interpretations which 
satisfy a given graph? No, for the reasons described above. That idea is 
incoherent. 

> - The most accurate way to understand a graph is to interpret it in the way 
> that the author intended it to be interpreted. Since we have no other 
> reliable way of knowing what that might be, we can assume that the author's 
> intended interpretations for a graph are a subset of the graph's **satisfying 
> interpretations**.  I.e., we take the graph's meaning at face value, rather 
> than attempting to interpret it according to some hidden, assumed "real 
> world" interpretation.

Yes, this is exactly what the RDF model theoretic semantics presumes. Asserting 
a graph effectively claims that interpretations must be such as to make it 
true, i.e. to satisfy it. Each graph makes some claims about how the world is 
structured, and the claims made by multiple graphs are connected by their 
common use of global IRIs. 

> Some benefits of looking at the formal semantics this way

What "way" are you talking about? Look, *of course* each graph has a set of 
satisfying interpretations, and asserting the graph is saying that the world 
being described by the graph is one of those satisfying interpretations. (Or if 
we want to give authors the ability to be vague about exactly what they are 
talking about, then the interpretations of whatever the author had in mind are 
a subset of the satisfying interpretations.) And of course we should take a 
graph at face value, as you put it, as saying exactly this. All this is 
*exactly* what the current semantics itself says (or presumes). As far as I can 
see, you are simply agreeing with standard model-theoretic intuitions here. 

> Is this making any more sense to you?

No. I don't know what the "it", that is supposed to provide all these 
advantages, actually is. If it is the idea that asserting a graph amounts to 
saying that the intended interpretation is one of those satisfying the graph, 
then this is what model theory says already. If it is the idea that an IRI can 
refer to one thing in one graph and a different thing in a different graph, 
then that is false (by definition) but in any case would not provide all these 
claimed advantages that 'it' is supposed to have, even if it could be made 
somehow true. 

>   Have I explained myself in sufficient detail, or do you still think that 
> "David . . . does not properly understand the intuitive foundations of 
> semantics" and my points are mere "inanity", as you previously concluded?

I regret if my usage here seemed impolite, but I do (still) find your posts, 
including this one, to be a strange mixture of basic ideas about model theory 
(re)stated as though they were somehow a new insight or an alternative to the 
standard view (which I referred to as "inanity") and strangely stubborn basic 
mistakes which do, I am afraid, strongly suggest that you have not grokked the 
basic ideas of model theory. 

> And do you *still* think I merely need to go read a book on model theory, or 
> have we now (I hope) got past that?  If not, what aspects of model theory do 
> you still think I misunderstand?

Well, I guess, the basic idea of an interpretation. An RDF interpretation, by 
definition, is a mapping from IRIs to referents. It is not a mapping from 
IRIs-in-graphs or from IRI-occurrences or from IRIs-in-a-context. Ergo, every 
interpretation treats all occurrences of an IRI in the same way, as referring 
to the same thing, regardless of which graphs the IRI happens to occur in. 
Therefore, the notion of what an IRI denotes "in a graph" is meaningless. This 
basic fact – and it is a very basic and foundational point – still seems to 
elude you. To emphasize, this is not a "perspective" which admits alternatives, 
it is simply a fact about how interpretations are defined. 
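
A small Python sketch may make the shape of the definition visible. It assumes a simplified interpretation (ground triples only), and the two graphs, the `ex:` IRIs and the function names are hypothetical. The thing to notice is that the denotation mapping is keyed by the IRI alone: there is no place to supply a graph or an occurrence.

```python
def satisfies(denote, iext, graph):
    """True if every triple in `graph` is true under (denote, iext)."""
    return all(
        (denote[s], denote[o]) in iext.get(denote[p], set())
        for s, p, o in graph
    )

# Two hypothetical graphs from different authors sharing the IRI ex:x.
g1 = {("ex:x", "ex:isA", "ex:Bird")}
g2 = {("ex:x", "ex:isA", "ex:Website")}

# One interpretation: a single mapping from IRIs to things.
denote = {
    "ex:x": "thing",
    "ex:isA": "isA",
    "ex:Bird": "Bird",
    "ex:Website": "Website",
}

# World 1: the referent of ex:x is a bird only.
world1 = {"isA": {("thing", "Bird")}}
# World 2: the one referent of ex:x is both a bird and a website.
world2 = {"isA": {("thing", "Bird"), ("thing", "Website")}}

for world in (world1, world2):
    both = satisfies(denote, world, g1) and satisfies(denote, world, g2)
    union = satisfies(denote, world, g1 | g2)
    print(both, union)
# prints:
# False False
# True True
```

Because the same mapping is applied to every occurrence, an interpretation satisfies the union of two ground graphs exactly when it satisfies each of them separately.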

> The bottom line here is that some of the statements -- and intuition -- in 
> the existing RDF drafts are just plain *wrong* and need to be corrected.  In 
> particular, the statement in RDF Concepts that says "IRIs have global scope: 
> Two different appearances of an IRI denote the same resource" is just 
> factually *wrong*.

It is a presumption of the RDF data model.  The semantics, in particular, is 
based on it. I don't quite see how it can be factually wrong, since RDF 
*defines* the notion of denotation. (If it had said "identifies" then it might 
be factually wrong, but it doesn't.) 

Pat

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
pha...@ihmc.us       http://www.ihmc.us/users/phayes
