Re: Summary of the QName to URI Mapping Problem from Drew McDermott on 2001-08-29 (www-rdf-logic@w3.org from August 2001)

From: Drew McDermott <drew.mcdermott@yale.edu>
Date: Wed, 29 Aug 2001 15:56:23 -0400 (EDT)
To: www-rdf-logic@w3.org
Message-Id: <200108291956.f7TJuNB09975@pantheon-po01.its.yale.edu>
   [Patrick Stickler]
   Perhaps you could do me a favor. It is my understanding that e.g. HTTP 
   URI scheme semantics applicable to the structural components of an 'http:' 
   URL are irrelevant and invisible to an RDF processor which is using that URI
   as the identity of a resource within an RDF graph, and in fact, within RDF 
   space, that URI is used as an opaque symbol to which is attached semantics 
   which is disjunct from any semantics meaningful to or associated with the 
   URI Scheme of the URI, such as the semantics of an 'http:' URL.

   Now, we have here two functional layers: RDF, and HTTP. A SW agent
   may interact with that URI at either level, and the semantics at
   one level does not have significance at the other. When applying
   some axiom or inferring some relation, the HTTP semantics are totally
   irrelevant. When dereferencing that URI for perhaps some auxiliary
   knowledge, the semantics that is defined for the URI in RDF-space is 
   irrelevant to the HTTP server.

   Now, if my understanding of the division of semantics between functional
   layers in such a context is incorrect, I would very much appreciate
   understanding why.

I think it is basically correct.
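
A minimal sketch of that division between layers, in Python.  It is not
taken from the thread: the two example.org URIs are made up, and the
"RDF layer" here is nothing more than string comparison, which is an
assumption about how an RDF processor treats node labels.

```python
from urllib.parse import urlsplit

# Hypothetical URIs; they differ only in the case of the host name.
a = "http://example.org/people#drew"
b = "http://EXAMPLE.org/people#drew"

# RDF layer: a URI is an opaque node label, compared character by character.
rdf_same_node = (a == b)
nodes = {a, b}  # two distinct labels, so two distinct nodes


# HTTP layer: the URI is taken apart according to the 'http:' scheme rules.
def http_target(uri):
    parts = urlsplit(uri)
    # .hostname is lower-cased, since host names are case-insensitive;
    # the fragment is left out because it is not sent to the server.
    return (parts.scheme, parts.hostname, parts.path)


http_same_target = (http_target(a) == http_target(b))

print(rdf_same_node)     # False
print(http_same_target)  # True
```

Neither answer is "wrong"; each layer is applying its own rules to the
same string.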

The problem is statements such as "the semantics at the lower level is
machinery at the higher level."  This is a close relative of
statements of this sort: "RDF ontologies provide a semantics for (pick
one) the Web, XML, ...."  The problem with both of these is that they
confuse notations and algorithms with semantics.

This confusion is so widespread that it might be better to stop
fighting it and find a new word for what "semantics" traditionally
meant.  However, that would make it hard to talk to people (a large
community of people) who still think "semantics" means "what a
notation means."

When we say that RDF or HTTP has a semantics, we mean that for every
expression in either notation there is a well defined and clearly
specified thing that it denotes.  I don't know much about the details
of HTTP, but symbols like GET and POST, and URI specifications, all
have an intended meaning, just as Patrick S. said.  There are W3C
documents that spell the meanings out.  Anyone who implements a
program that uses the HTTP protocol must comply with the semantics as
specified.  It's true that the semantics of HTTP are not expressed in
a formal notation such as set theory, but that's because they don't
need to be, being fairly simple (nonrecursive, for instance).

The problems arise when people confuse the mechanism that complies
with the semantics with the semantics itself.  The most painful
example is the one I mentioned above, where it's casually assumed that
an RDF or DAML ontology specifies the semantics of any database that
uses the symbols introduced by the ontology.  What they have in mind
(I think) is that the ontology will allow inference engines to notice
linkages between expressions that would otherwise be missed.  If the
ontology says that fathers are male and mothers are female, and no one
is both male and female, then an inference engine can raise a flag
when it sees a database that says Amby Secksual is both the mother and
father of little Anxiety Secksual.  However, this inference can occur
in just this way whether the symbols mean what they appear to mean or
not.  That is, the symbol 'father' might mean "favorite cigarette brand,"
and 'mother' might mean "favorite beverage brand," and 'male' might
mean "inhaled," and 'female' might mean "drunk," and our exclusionary
axiom might mean "Nothing is both inhaled and drunk"; and the
inference would still work exactly the same way.
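
To make the point concrete, here is a small sketch of such a checker in
Python.  Everything in it (find_conflicts, rename, the shape of the
facts and axioms) is invented for illustration and is not any real RDF
or DAML engine.  The checker only ever compares symbols for equality, so
substituting the "cigarette" vocabulary for the "family" one changes
nothing in what it flags.

```python
def find_conflicts(facts, implies, disjoint):
    """Return the individuals that land in two mutually exclusive classes.

    facts:    set of (relation, subject, object) triples
    implies:  dict mapping a relation to the class its subject belongs to
    disjoint: set of frozensets, each holding two exclusive class names
    """
    classes = {}
    for relation, subject, _obj in facts:
        if relation in implies:
            classes.setdefault(subject, set()).add(implies[relation])
    return {
        subject
        for subject, found in classes.items()
        for pair in disjoint
        if pair <= found
    }


def rename(symbol_map, facts, implies, disjoint):
    """Consistently replace vocabulary symbols in the facts and axioms."""
    def r(symbol):
        return symbol_map.get(symbol, symbol)

    return (
        {(r(rel), subj, obj) for rel, subj, obj in facts},
        {r(rel): r(cls) for rel, cls in implies.items()},
        {frozenset(r(c) for c in pair) for pair in disjoint},
    )


facts = {("father", "Amby", "Anxiety"), ("mother", "Amby", "Anxiety")}
implies = {"father": "male", "mother": "female"}
disjoint = {frozenset({"male", "female"})}

cigarette_reading = {
    "father": "favorite_cigarette_brand",
    "mother": "favorite_beverage_brand",
    "male": "inhaled",
    "female": "drunk",
}

original = find_conflicts(facts, implies, disjoint)
renamed = find_conflicts(*rename(cigarette_reading, facts, implies, disjoint))

print(original)             # {'Amby'}
print(original == renamed)  # True
```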

When put this way, I know a lot of practical developer types are going
to say, "Who cares about the esoteric meaning of 'semantics'?  The
inferences are where the rubber meets the road."  That's a reasonable
thing to say.  Most of the time semantics is unimportant.  The
exceptions are when people realize they have different ideas of what
a class of expressions means.  Then semantics becomes very important
indeed.  Such conflicts aren't usually about the meaning of "mother";
they're more likely to be about more fundamental things like the meaning
of URIs, or the meaning of a quoted expression, or whether your mother
was your mother before you were conceived.

   My discussion of semantic layers was specifically focused on the fact
   that if a QName in an XML serialization is mapped to a QName URI (not
   a URI following the URI Scheme of the namespace URI, as is now the case),
   the structure of the original XML QName remains explicitly defined in the 
   resultant QName URI, and hence QName semantics can be applied without
   limitation to that URI if and as needed; yet even though the URI Scheme
   maintains the QName structure and hence "preserves" the validity of QName
   semantics, that QName URI does *not* introduce QName semantics into RDF,
   since all URIs in RDF are merely opaque identifiers, to which is attached
   *additional* semantics, and it is only that additional semantics at the
   RDF level that is relevant to RDF and RDF based tools operating within
   the realm of the RDF conceptual graph.

   I.e. No URI Scheme can introduce any semantics into RDF. The use of any
   URI Scheme for resource URIs has no relevance whatsoever to semantics
   associated with an RDF graph. Right?

Right.  Another way of saying it (I hope) is that an XML serialization
of RDF can be looked at through two different syntactic/semantic
lenses, as a piece of XML or a piece of RDF.  Granted, the semantics
of XML are underdeveloped, so that not every expression is assigned a
meaning; but for the sake of argument we can suppose that the URIs
are, and that that meaning is slightly different from the RDF meaning.

Note, however, that part of the ambiguity is syntactic, if I
understand correctly, and not semantic at all.  That is, through the
XML lens one sees *different symbols in different arrangements* than
one sees through the RDF lens.  Isn't this what the QName controversy
is about?  A given expression of the form QName:id is broken down into
components in different ways in XML and RDF.
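
A rough sketch of the two readings, in Python.  The prefixes and the
example.org namespace names are made up, and the "RDF view" assumes the
mapping under discussion, in which the namespace name and the local
part are simply concatenated into one URI.

```python
# Hypothetical prefix bindings, as xmlns declarations would supply them.
bindings = {
    "p": "http://example.org/ab",
    "q": "http://example.org/a",
}


def xml_view(qname):
    """XML reading: a pair (namespace name, local part)."""
    prefix, local = qname.split(":", 1)
    return (bindings[prefix], local)


def rdf_view(qname):
    """RDF reading: one opaque URI, formed by concatenating the pair."""
    namespace, local = xml_view(qname)
    return namespace + local


print(xml_view("p:c") == xml_view("q:bc"))  # False: two different pairs
print(rdf_view("p:c") == rdf_view("q:bc"))  # True: one and the same URI
```

Under these assumptions the pair can always be turned into the URI, but
the URI alone does not say where the pair was split.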

Even so, I think Patrick's point is correct: Nothing prevents a piece
of software from reading QNames in the XML way at the same time it
reads them in the RDF way, and the resulting synergy might be quite
valuable.  

   Secondly, I was referring mostly to semantics associated with ontologies
   and identified by both URIs in the graph and QNames in serializations,
   and not the semantics of RDF itself -- which I see as yet a third
   layer/level of semantics that is disjunct from either URI Scheme
   semantics or specific ontological semantics.

You've lost me here.  I suspect the problem lies in the phrase
"semantics associated with ontologies," which sounds like it's
infected with the confusion I described above.

My guess is that most of the time there is no (new) semantics
associated with an ontology.  That is, it's written in a notation with
a semantics (e.g., RDF), and when we introduce a new symbol (e.g.,
'mother') we do it in a context where it's clear that the symbol is,
e.g., a predicate.  The axioms and other stuff in the ontology
constrain it to denote a predicate in a particular set of possible
predicates, but it's a very big set, and which member it is is not
important, and not accessible to the inference machinery anyway.
If someone points out a bug (e.g., the ontology allows mothers to be
females of different species than their offspring), the bug gets fixed
by adding more axioms, not by posting a notice somewhere that "the
intended meaning is everyday biological motherhood, neglecting mules."
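
A toy illustration of "constraining the set of possible predicates," in
Python.  The three-individual domain and both axioms are invented for
the example; a real ontology would have a far larger space of
candidates.  Each candidate denotation of 'mother' is a set of
(mother, child) pairs, and an axiom is just a test that a candidate
either passes or fails.

```python
from itertools import combinations

# Hypothetical domain: each individual is tagged with its species.
species = {"h1": "horse", "h2": "horse", "d1": "donkey"}
pairs = [(m, c) for m in species for c in species]

# Every subset of pairs is a candidate denotation for 'mother'.
candidates = [
    frozenset(subset)
    for size in range(len(pairs) + 1)
    for subset in combinations(pairs, size)
]


def not_own_mother(relation):
    """Axiom 1: nothing is its own mother."""
    return all(m != c for m, c in relation)


def same_species(relation):
    """Axiom 2, the bug fix: mother and child share a species."""
    return all(species[m] == species[c] for m, c in relation)


before = [r for r in candidates if not_own_mother(r)]
after = [r for r in before if same_species(r)]

print(len(candidates), len(before), len(after))  # 512 64 4
```

Adding the second axiom shrinks the set of admissible denotations, but
nothing in the axioms picks out a single "intended" one.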

   I.e. The semantics associated with a particular ontology which is
   represented by and processed according to the RDF conceptual model does
   not add to the semantics of the RDF conceptual model, and vice versa.
   Both are needed, but depending on perspective and the level at which a
   given operation is being performed, one or the other may be irrelevant.
   The semantics that defines what a resource is, or what a statement is,
   or the relation subPropertyOf, is in no way dependent on, nor modifies
   in any way the semantics associated with a given URI. No? Or have I just
   headed off to la la land?

All these semantic issues are interrelated.  If we spell out formally
what sorts of things a URI denotes, then a given URI must denote one
of those things.

                                             -- Drew McDermott
Received on Wednesday, 29 August 2001 15:56:36 UTC