Re: review of July 15 draft of RDF Semantics document from pat hayes on 2003-07-29 (www-rdf-comments@w3.org from July to September 2003)

From: pat hayes <phayes@ihmc.us>
Date: Tue, 29 Jul 2003 02:31:46 -0500
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: www-rdf-comments@w3.org
Message-Id: <p06001a18bb45de65f265@[10.0.100.23]>
<snip>
>  >
>>  The change list is not part of the document. Please review the document.
>
>I strongly differ.  I was told by Brian McBride, in various messages to me
>also sent to www-rdf-comments@w3.org, that
>http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-mt-20030117/
>is suitable for review, so I did review it.  This document includes a
>section headed ``Change List since Last Call''.  How can this prominent
>section not be part of the document?

Because it is a change list? We must understand 
English differently. If you prefer, please delete 
the grey box labelled 'change list ' and review 
what remains.

<snip>
>  >
>>  >
>>  >Drastic Problem:
>>  >
>>  >There has been a significant conceptual change to simple interpretations.
>>  >IP is not required to be a subset of IR.  This does not appear to be in
>>  >response to any comment to the RDF Core Working Group nor to be in response
>>  >to any problem with the RDF model theory.  This change may have
>>  >consequences for other formalisms, including OWL, but no announcement about
>>  >it has been made.
>>  >
>>
>>  I would not describe this as a significant conceptual change, so much
>>  a small technical improvement to the mathematical machinery. It was
>>  mentioned in an informative email which you received and replied to.
>
>I take it that you are referring to this exchange:


Yes.

>
>** Subject: Re: possible semantic tweak
>** From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
>** To: phayes@ai.uwf.edu
>** Cc: horrocks@cs.man.ac.uk
>** Date: Wed, 04 Jun 2003 06:52:36 -0400 (EDT)
>** X-Mailer: Mew version 2.2 on Emacs 21.1 / Mule 5.0 (SAKAKI)
>**
>** From: pat hayes <phayes@ai.uwf.edu>
>** Subject: possible semantic tweak
>** Date: Mon, 2 Jun 2003 19:02:38 -0500
>**
>** > Guys, some (very) recent work on the SCL project has suggested a
>** > possible semantic modification which may be relevant to the RDF/OWL

<snip>

>** >
>** > My question to you is, SUPPOSE that I were to tweak the RDF MT so as
>** > to allow this; would that possibly simplify the OWL-DL correspondence
>** > theorem? Because as far as I can see it would make no difference at
>** > all to the RDFS documents: the difference between this and the
>** > current MT is invisible in RDFS, as far as I can tell. So if you
>** > think this would be worth doing, let me know (off-list) and I will do
>** > it editorially.
>** >
>** > Pat
>**
>** I don't see that it would make the correspondence theorem significantly
>** simpler.  There already is a similar feature in the RDFS model
>** theory for OWL, and it was very easy to set up. 
>**
>** Further, I am very skeptical that you would get any mileage out of this
>** trick in RDF.  Remember that most elements of the RDF and RDFS vocabulary
>** are subjects or objects of some triple in RDFS models (either rdfs:domain,
>** rdfs:range, rdfs:subClassOf, or rdfs:subPropertyOf) and thus would have to
>** belong to IR. 
>**
>** peter
>
>I don't view this exchange as an announcement or even an indication that
>the change will be made nor do I view my response as either an endorsement
>of the change or an indication that the change would not cause any problems
>for OWL.

I did not say that it was an announcement. And I 
did not ask for, or expect to get, an 
*endorsement*. I asked you if the change, which I 
described as editorial and indicated that I was 
planning to make, would be of help in the OWL 
layering. Ian, the other editor, reacted 
positively. Your reaction, though negative to my 
request, did not indicate that you found the 
change likely to cause problems.

>
>>  It was not made capriciously; it reflects a recent observation that
>>  this slight weakening of the basic (not RDF) graph model theory makes
>>  'layering' of the sort requested by Jeff Pan and others somewhat
>>  easier to achieve, since the basic model theory now allows a
>  > conventional first-order structure of an interpretation of a graph
>>  which satisfies the conventional syntactic layering: that is, if a
>  > URIref occurs in a graph only in predicate position, it is no longer
>>  required to denote something in the universe of quantification.  This
>>  allows the basic model theory to be more conventional, since it no
>>  longer requires the use of non-well-founded structures in all cases.
>>  The credit for this idea is due to Chris Menzel, and it arose as a
>>  consequence of the SCL project working to eliminate the 'Horrocks
>>  sentences' which had different satisfiability conditions in SCL and
>>  FOL; this is of course closely related to the RDF/OWL layering
>>  issues. Using a similar device, SCL has now achieved full FOL
>>  compatibility.
>>
>>  This does not change any RDF or RDFS entailments or semantic
>>  conditions, since these require that IP and IR overlap on the parts
>>  of the RDF and RDFS vocabularies to which semantic conditions apply,
>>  as the text notes; and since it weakens rather than strengthens the
>>  conditions on simple interpretations, I do not believe that it will
>>  have any significant effects on OWL.  Other members of the Webont
>>  working group had reacted favorably to this change.  If you feel that
>>  there are any problems arising from this change, please say what they
>>  are.
>
>The point is that I don't see that there are no problems resulting from
>this change.  This determination could require considerable effort.

Allow me to suggest a way to determine this.  Do 
the OWL specs at any point refer to or rely on 
the structure of simple interpretations which are 
not RDF interpretations?  If not, there is 
nothing further to discuss. If so, do the specs 
at that point rely in any way on IP being a 
subset of IR in such non-RDF simple 
interpretations? I do not believe this is the 
case, although I concede that there may be parts 
of your proof of the correspondence theorem that 
I have not checked in sufficient detail.
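
To make the check concrete, here is a minimal sketch of a simple
interpretation in which IP is not a subset of IR. The names IR, IP, IEXT
and IS follow the Semantics document; the resources, the URIrefs and the
helper triple_is_true are hypothetical, chosen only for illustration, and
literals and blank nodes are left out.

```python
# Toy simple interpretation; every concrete value here is an assumption.
IR = {"a_thing", "b_thing"}                  # universe of resources
IP = {"p_rel"}                               # properties; not a subset of IR here
IEXT = {"p_rel": {("a_thing", "b_thing")}}   # extension of each property
IS = {"ex:a": "a_thing", "ex:b": "b_thing", "ex:p": "p_rel"}  # URIref denotations


def triple_is_true(s, p, o):
    """A ground triple s p o is true iff all three names are in the
    vocabulary, I(p) is in IP, and <I(s), I(o)> is in IEXT(I(p))."""
    if s not in IS or p not in IS or o not in IS:
        return False
    prop = IS[p]
    return prop in IP and (IS[s], IS[o]) in IEXT.get(prop, set())


ip_subset_of_ir = IP <= IR                      # subset test on Python sets
result = triple_is_true("ex:a", "ex:p", "ex:b")
```

In this toy case ex:p occurs only in predicate position, so the truth
condition never needs I(ex:p) to be in IR.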

>
>>  >Problem:
>>  >
>>  >The definition of a proper instance admits a switch of blank nodes in the
>>  >graph, e.g., replacing _:a with _:b and vice versa, as a proper instance,
>>  >but this shouldn't be a proper instance.
>>
>>  It isn't a proper instance according to the definition given:
>>
>>  "A proper instance of a graph is an instance in which a blank node is
>>  mapped to a name or to some other blank node in the graph, so that in
>>  the instance a blank node has been replaced by a name or two blank
>>  nodes in the graph have been identified. "
>>
>>  On re-reading this I see that the comma may be misleading, and have
>>  deleted it.
>
>Removing the comma changes the meaning of a proper instance.

Not in the way I read the English, but OK, the change has been made.
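
As a rough illustration of the intended reading, the sketch below treats
an instance as proper only when some blank node is replaced by a name or
two blank nodes of the graph end up identified. The representation
(triples as tuples, blank nodes as strings beginning with "_:", mappings
whose keys are blank nodes only) and the function names are assumptions
made for this example, not anything taken from the document.

```python
def blank_nodes(graph):
    """Collect the blank nodes appearing as subject or object in the graph."""
    return {t for (s, _p, o) in graph for t in (s, o) if t.startswith("_:")}


def apply_mapping(graph, mapping):
    """Build the instance obtained by replacing blank nodes per the mapping."""
    return {(mapping.get(s, s), p, mapping.get(o, o)) for (s, p, o) in graph}


def is_proper_instance_mapping(graph, mapping):
    """True iff some blank node goes to a name, or two blank nodes of the
    graph are sent to the same node (that is, they are identified)."""
    images = [mapping.get(b, b) for b in blank_nodes(graph)]
    maps_to_name = any(not img.startswith("_:") for img in images)
    identifies_two = len(set(images)) < len(images)
    return maps_to_name or identifies_two


graph = {("_:a", "ex:p", "_:b")}
swap = {"_:a": "_:b", "_:b": "_:a"}   # exchange the two blank nodes
merge = {"_:a": "_:b"}                # identify _:a with _:b
ground = {"_:a": "ex:c"}              # replace _:a by a name
```

On this reading the swap yields an instance but not a proper one, since no
name is introduced and no two blank nodes are identified.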

<snip>

>
>>  >There are other problems in the definition of the merge as well.
>>
>>  I am unable to respond to that.
>
>The definition needs to be rewritten.  I have had to read the defining
>sentence numerous times to figure out just what is going on, often with
>different results, and I'm still not sure that there isn't some problem in
>the definition.

I have rewritten the definition to be, I hope, more coherent.

>  > >Problem:
>>  >
>>  >In Section 1.3 a vocabulary is defined as a ``set of URIrefs''.
>>
>>  It is not defined there; the text refers to such a set as being a
>>  vocabulary, which is correct. However it could be better worded: I
>>  have changed this to "set of names".
>
>OK, Section 1.3 only defines the notion of a vocabulary of an
>interpretation.  However,
>	All interpretations will be relative to a set of URIrefs, called
>	the vocabulary of the interpretation, ...
>defines the vocabulary of an interpretation to be a set of URIrefs, not a
>set of names.

Right; as I say, I have changed that.

>
>>  >However, in the change log and in Section 0.3, a vocabulary is supposed to
>>  >be able to contain typed literals.
>>
>>  A set of URIs without typed literals is a vocabulary, however.
>
>Agreed, but I don't see the point you are trying to make here.
>
>>  >Problem:
>>  >
>>  >There is no definition of a ``literal character string'' or a ``language
>>  >tag'', used in the definition of simple interpretations.
>>
>>    "literal character string"  changed to  "character string".
>  >
>>  Language tag is used in the sense of RFC3066. I have inserted a
>  > reference link to the concepts document
>> 
>>http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-Graph-Literal
>>  which should clarify the intended meaning.
>
>Is this consistent with RDF Concepts?  I should think that you should
>instead defer to this other document.

The link is to that document.

>
>[...]
>
>>  >Problem:
>>  >
>>  >The conditions for denotations should be augmented with more conditions
>>  >like ``if I(p) is in IP''.    I suggest adding as well ``if s, p, and o are
>>  >in V''.
>>
>>  Why do you feel this is necessary? This wording has not changed in
>>  many versions of the document.
>
>Well, you have the one condition, why include it if you don't include the others?

Because it isn't necessary. But....

>
>>  But since you insist, I have added the condition explicitly.
>
>
>
>>  >Problem:
>>  >
>>  >The example in Section 1.4 is incomplete in that it does not define LV.
>>
>>  True; it is only an example. LV can be any suitable set.
>
>The example should say this.

OK, done.

>
>>  >Also, IL is necessarily the empty map as there are no typed literals in the
>>  >vocabulary of the example.
>>
>>  Ah, point taken. I have added "plus all typed literals with one of
>>  these as the type URI"
>>
>>  >  This makes the fourth triple false, not true.
>>  >
>>  >The ``oddity'' of having a typed literal denote a non literal is not ruled
>>  >out in datatyped interpretations.
>>
>>  That isn't what was meant by 'oddity', but I have deleted this comment.
>>
>>  >
>>  >The explanation of why triples involving plain literals are false is
>>  >incomplete, as plain literals do not have to denote character strings.
>>
>>  Changed to "containing a plain literal."
>
>I don't think that this does the trick.  The point is that plain literals
>include those with language tags, which do not denote strings.

Right. Since they are self-denoting (with or 
without language tags) and there are none in the 
universe, all those triples will be false for 
lack of a denotation.

>
>>  >Silliness:
>>  >
>>  >rdf-interpretations do not just ``impose extra semantic conditions on crdfV
>>  >and typed literals with the type rdf:XMLLiteral''.  Why not just say that
>>  >rdf-interpretations impose extra semantic conditions?
>>
>>  Because this draws attention to the fact that they do not impose any
>>  extra conditions on the rest of the RDF vocabulary.
>
>Well, sort of, but I consider the use of crdfV misleading.

I'm afraid I disagree.

>It is true that there are rdf-interpretations that do not impose conditions
>on (the denotation of) rdf:subject.  However, any rdf-interpretation that
>includes rdf:subject in its vocabulary does impose conditions on (the
>denotation of) rdf:subject.

Only those which arise from its being a simple 
interpretation. The text uses the phrase "extra 
semantic conditions" to indicate this distinction.

>Further, not all rdf-interpretations impose
>conditions on every typed literal with the type rdf:XMLLiteral, as not all
>such literals need be in the vocabulary of the interpretation.
>
>So I suggest some different wording here.
>

The text does not say *all* typed literals. I 
think the intended meaning is sufficiently clear 
from the context, since all interpretations, of 
any kind, can impose conditions only on the 
vocabulary of the interpretation. The sentence is 
only intended to be an introductory remark; the 
actual conditions in the formal definition are 
quite precise.


>  > >Problem:
>>  >
>>  >The document states several times that it is agnostic as to whether XML
>>  >literals are strings.
>>
>>  The document refers to XML values, ie whatever it is that XML
>>  literals denote.
>>
>>  >However, the claimed completeness of the RDF entailment
>>  >rules means that XML literals are not strings.
>>
>>  The strings in the actual XML literals themselves are strings, as
>>  clearly stated several times in this and other RDF documents.
>  > Whether or not an XML literal denotes a string is where the
>>  agnosticism comes in.  I am not sure which of these you mean here.
>
>[From a separate email exchange, with new comments
>
>** Subject: Re: pfps-04
>** From: pat hayes <phayes@ihmc.us>
>** To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
>** Date: Wed, 23 Jul 2003 21:55:41 -0500
>**
>** >From: pat hayes <phayes@ihmc.us>
>** >Subject: Re: pfps-04
>** >Date: Wed, 23 Jul 2003 14:56:07 -0500
>[...]
>** >>  >There is still a mismatch between the RDF Entailment Rules, which, if
>** >>  >complete, determine that XML Literals are not the same as strings
>** >>
>** >>  ?? How can they possibly determine that? Please explain.
>** >>
>** >>  Pat
>** >
>** >[The following assumes that certain aspects of the RDF model theory
>** >will be fixed up appropriately.]
>** >
>** >Let x be any well-typed XML literal string.  Then "xx"^^rdf:XMLLiteral,
>** >where xx is the appropriate encoding of x, has the same denotation in every
>** >interpretation of a vocabulary that includes "xx"^^rdf:XMLLiteral, namely
>** >the canonical XML object corresponding to x.
>**
>** That assumes that 'canonical XML object' is uniquely well defined. I
>** don't think it is, as I have observed differences of opinion about
>** what exactly it is. So I don't accept that it does have the same
>** denotation in every interpretation. It denotes whatever someone
>** thinks the specs mean. Opinions seem to differ.
>
>If this is not well defined, then it would not be possible to create an RDF
>datatype for XML Literals, as RDF datatypes require a well-defined mapping
>from lexical to value spaces.

No. If the phrase is not well defined then there 
will be alternative possible interpretations of 
it. For example, one might identify XML objects 
with character strings; another might not. Both 
possibilities are well defined in the mathematical 
sense, of course, but people might legitimately 
disagree about which of them is 'correct', given 
the English text of the various specifications. 
In the meanwhile, the RDF model theory is capable 
of treating either or both of them as possible 
interpretations.

If we - that is, the various WGs - can come to a 
clear determination of this issue then I would be 
happy to state the result in the document. In the 
meantime, however, the only options are to be 
silent on the topic, which I think is not helpful 
to the reader, since some readers will read it 
one way and others read it another; or to be open 
and explicit about the agnosticism being assumed, 
to make it clear that the model theory as stated 
works with either interpretation, which is what I 
tried to do.

In fact, I have removed the 'agnosticism' text 
from the latest draft, but I note that the issue 
seems to still be unresolved. If it stays 
unresolved, then I plan to put the statement of 
explicit agnosticism back into the document, 
possibly with some explanatory text to clarify 
what it means and the reason for stating it.

>
>** >Let this object be x'.
>** >
>** >If x' is a string
>**
>** Nobody said it was a string.  x is a string in a XML literal, ie a
>** syntactic entity.  x' is an XML Literal value, something in an
>** interpretation.  Whether or not those are strings is left open: as
>** far as the MT is concerned, they might not be, ie there are
>** satisfying interpretations where they are not.
>**
>** Pat
>
>Note that the above says that ``*If* x' is a string''.  This does not mean
>that x' has to be a string.  However if x' is a string (of Unicode
>characters), then
>
>[Taken from my first response to Pat.]
>
>* If x' is a string then 
>*	ex:a ex:p x^^rdf:XMLLiteral .
>* rdf-entails
>*	ex:a ex:p "xx'" .
>* where xx' is the n-triple encoding of x'.
>*
>* Therefore for the RDF entailment rules to be complete, no XML Literal can
>* have a character string as its denotation.
>

No, look. That *entailment* holds only if it can 
be shown to follow in *every* interpretation. My 
point, in interjecting that x' might not be a 
string, was (and still is) precisely that this 
question remains unresolved: so there can be 
interpretations satisfying the conditions, as far 
as we know, in which x' is not a string.  If it 
is a string, then indeed *in that interpretation* 
your argument goes through: but one cannot 
justify a conclusion of entailment by considering 
only a subset of the possible interpretations.
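
The point can be put as a toy sketch: an entailment check has to range
over every interpretation that satisfies the premise. The two
interpretations below are hypothetical stand-ins, one in which the XML
value x' is a string and one in which it is not; the class Interp and the
functions premise, conclusion and entails are names invented for this
illustration. A real check would have to consider all interpretations,
not a two-element list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interp:
    xml_value: object   # assumed denotation of the typed XML literal
    p_ext: frozenset    # extension of ex:p as (subject, object) pairs


PLAIN = "xx'"  # a plain literal without language tag denotes itself, a string


def premise(i):
    """ex:a ex:p "xx"^^rdf:XMLLiteral ."""
    return ("a", i.xml_value) in i.p_ext


def conclusion(i):
    """ex:a ex:p "xx'" ."""
    return ("a", PLAIN) in i.p_ext


def entails(interps):
    """The conclusion must hold in every listed interpretation that
    satisfies the premise."""
    return all(conclusion(i) for i in interps if premise(i))


# Interpretation where the XML value happens to be the string itself.
string_case = Interp(xml_value=PLAIN, p_ext=frozenset({("a", PLAIN)}))

# Interpretation where the XML value is some non-string object.
xml_obj = ("xml-object", PLAIN)
non_string_case = Interp(xml_value=xml_obj, p_ext=frozenset({("a", xml_obj)}))

holds_on_subset = entails([string_case])
holds_on_all = entails([string_case, non_string_case])
```

Restricted to the string case the inference looks valid; once the
non-string case is admitted it fails, which is why the restricted
argument does not establish entailment.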


>  > >Problem:
>>  >
>  > >The treatment of quoted strings in LBase is so bad that I can't even begin
>  > >to figure it out.  However, it is definitely the case that the translation
>>  >to LBase changes the denotation of character strings.
>>
>>  Indeed there was an error in the table at this point, left over from
>>  an earlier edit, my apologies.  I also see, on checking, that the
>>  character-escaping convention in the published Lbase note is not in
>>  fact the version I was following when writing the appendix. No wonder
>>  you were unable to follow it.
>>
>>  Let me suggest that I simply ignore all the character-escaping
>>  complexities and insert a remark in the text as follows:
>>
>>  "Note, these translation rules ignore issues of character escaping in
>>  encoding character strings in literals: an implementation based on
>>  these rules might need to use more care with strings containing the
>>  characters ' and \."
>>
>>  The mapping now simply puts single quote marks around the literal
>>  string, with no attempts at character escaping.
>>
>>  I have made these changes.
>>
>>  Bear in mind that, as the text states, this translation is provided
>>  only as an informative alternative for readers who prefer this style.
>>  The Lbase document emphasizes that Lbase is not intended as an
>>  implementation language or for direct use as a SWEL.
>
>As you know, I am unhappy with the presence of this appendix in the
>document.  This new issue only increases my unhappiness.

Well, to be brutally frank, your *happiness* is 
not really at issue here. If you would like us to 
start trading mutual unhappinesses I am willing 
to do so, but I fear the results might be rather 
sordid and not conducive to making progress on 
the various technical issues that we have to try 
to get completed.

>  > I have also weakened the claim in the 5th paragraph of section 0.1
>  > to read:
>>
>>  "The translation technique offers some advantages and may be more
>  > readable, so is described here as a convenience. The axiomatic
>>  semantic description differs slightly from the normative model theory
>>  in the body of the text, as noted in the appendix."
>
>Does this mean that the claims of ``same semantic theory'' and ``exact
>correspondence'' are gone?   If so, what then remains?

What the text states: it is non-normative and 
provided as a convenience for some readers. Like 
it or not, many people do prefer this style of 
reading a semantic specification. I am confident 
that anyone who does prefer it would be capable 
of making it exact enough for their purposes.

>  > >Whether this causes
>>  >problems I cannot determine.
>
>
>
>>  >Problem:
>>  >
>>  >The translation to LBase seems to assume in some places that LBase uses
>>  >URIrefs of some sort, e.g., the expansion of Lbase:String.  However, the
>>  >LBase document itself uses non-URIref names for these things, e.g., String.
>>
>>  Whoops. Sorry, indeed that is a mistake, arising from having too many
>>  versions of the document lying around.  The 'Lbase:' prefixes should
>>  not be there. Fixed.
>
>There are Lbase: prefixes in
>http://www.w3.org/TR/2003/NOTE-lbase-20030123/, which are at best a source
>of confusion.

I see that indeed they are used in the example 
given there. That is ugly, and I apologize for it 
and will contrive to have it corrected in an 
upgrade of the note; but as they are not used in 
the actual text, and the example is stated 
emphatically to only be an illustration, and the 
Semantics appendix is clearly a better example, I 
do not feel that this is a fatal problem.

>  > >Problem:
>>  >
>>  >The translation to LBase ignores some of the aspects of URI references, I
>>  >believe.  In particular, I believe that RDF URI references can include
>>  >whitespace, which is not allowed in LBase names.
>>
>>  Really?? Well, I was unaware of that possibility, I confess. If true,
>>  that would require us to change the Lbase syntax to allow for this
>>  possibility. The intention was always that URIrefs could be used as
>>  Lbase identifiers.
>
>This needs to be investigated, I think.

I will undertake to investigate it and correct 
the Lbase note appropriately if required. The 
intention is that any URIref can be a Lbase 
identifier.

>
>>  >  I note also that LBase
>>  >doesn't even bother to define character strings.
>  >
>>  What would count as a definition? The Lbase document refers to
>>  sequences of Unicode characters.
>
>I just did a search for ``character'' in
>http://www.w3.org/TR/2003/NOTE-lbase-20030123/, which is the document
>referred to by http://www.w3.org/sw/RDFCore/TR/WD-rdf-mt-20030117/, and did
>not find any definition of what a character string is.
>
>A definition would be something like
>	A character string is a finite, possibly empty sequence of Unicode
>	characters [ref].
>The point is that character strings are only defined for some notion of
>characters, and there are quite a few possibilities to choose from.

That document is not written to be a formal spec document.

>
>>  >Problem:
>>  >
>>  >The translation to LBase can be broken by use of suitable URI references in
>>  >the RDF graph.
>>  >  For example the translation of
>>  >
>>  >	ex:a rdf:type LBase:String .
>>  >
>>  >would imply the translation of
>>  >
>>  >	ex:a rdf:type rdfs:Literal .
>>  >
>>  >which is not a valid rdfs-entailment.
>>
>>  The intention was that the Lbase special names cannot be generated
>>  from URIrefs.
>>  This is fixed now, see above, since the corrected special names are
>>  not legal URIs or Qnames.
>
>The document http://www.w3.org/TR/rdf-concepts/, referred to by
>http://www.w3.org/sw/RDFCore/TR/WD-rdf-mt-20030117/ as the definition of
>URIref, only requires that a URIref be a Unicode string that would produce
>a valid URI under a certain encoding. 
>
>This appears to allow for any sort of URI, including relative URIs, which
>could clash with the Lbase special names.
>

This entire issue was beneath the radar when the 
Lbase note was written. I do not consider it to 
be a matter worthy of discussion, since Lbase is 
not intended to be a language for processing by 
machines, and the note says explicitly that the 
exact syntax is not important. If any such 
syntactic accident should arise it can be treated in 
an ad-hoc manner, eg by writing all URIrefs in 
one font and the special names in a different 
font.

Pat
-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 29 July 2003 03:31:51 UTC