Re: Semantics issues from pat hayes on 2003-01-21 (www-rdf-comments@w3.org from January to March 2003)

From: pat hayes <phayes@ai.uwf.edu>
Date: Tue, 21 Jan 2003 12:07:49 -0700
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: bwm@hplb.hpl.hp.com, www-rdf-comments@w3.org, w3c-rdfcore-wg@w3.org
Message-Id: <p05111b0bba5344f44538@[10.0.100.247]>
>From: pat hayes <phayes@ai.uwf.edu>
>Subject: Semantics issues (was:Re: email problems)
>Date: Mon, 20 Jan 2003 10:46:02 -0800
>
>>  >If the current version of the Semantics document is the one pointed to from
>>  >the RDF Core WG home page as the LCC, then there are still lots of errors,
>>  >many significant.
>>  >
>>  >
>>  >From: pat hayes <phayes@ai.uwf.edu>
>>  >Subject: Re: email problems
>>  >Date: Sun, 19 Jan 2003 21:13:56 -0800
>>  >
>>  >[...]
>>  >
>>  >>  >It appears to me that RDF(S) literals are now broken.  (I'm working from
>>  >>  >the LCC candidate at
>>  >>  >http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-mt-20030117/)
>>  >>  >
>>  >>  >The only semantic constraints that mention rdfs:Literal are
>>  >>  >1/ I(rdfs:Literal) is a member of IC
>>  >>  >2/ rdfs:comment rdfs:range rdfs:Literal.
>>  >>  >3/ rdfs:label rdfs:range rdfs:Literal.
>>  >>
>>  >>  Now in addition we have that ICEXT(I(rdfs:Literal) ) is a subset of
>>  >>  LV. Also in datatyped interpretations, all datatype classes are
>>  >>  subclasses of ICEXT(I(rdfs:Literal)). So I think that this covers your
>>  >>  problems, except as noted below.
>>  >
>>  >No.  Now typed literals are handled somewhat better, but untyped literals
>>  >are still rather strange, because the denotation of "a" is not necessarily
>>  >in CEXT(I(rdfs:Literal)).
>>
>>  True, it is not necessarily in that class extension. Why does that
>>  bother you? There is no way in any piece of RDF syntax to assert that
>>  it (or any other particular literal) is or is not in that class, so
>>  to impose a semantic condition which cannot be expressed
>>  axiomatically would serve no purpose other than to guarantee
>>  incompleteness of any inference system.
>
>Well, so what?  Is this any different than the situation with respect to
>RDF datatypes?  How is this any different from the situation with respect
>to rdf:XMLLiteral?  It seems to me that if you can handle rdf:XMLLiteral
>you should be able to handle untyped literals being in
>CEXT(I(rdfs:Literal)).

Well, we could handle them, as you put it, by insisting that 
ICEXT(I(rdfs:Literal)) = LV. I did consider that alternative, and it 
would be easy to make the modification. I presume you would prefer 
that.

>
>>  If you want OWL to have that
>>  extra condition (eg that ICEXT(I(rdfs:Literal)) = LV, say) then OWL
>>  is free to add that as an extra condition. Until RDF syntax provides
>>  some way to have literals as subjects, however, I would recommend
>>  against it. (If RDF did have literals as subjects I would have had
>>  the class rdfs:Literal tightly defined from the very start.)
>>
>>  ....
>
>Right now the situation is very unusual with respect to literals in RDF.
>In RDFS, the following entailment is not valid:
>
>	ex:john ex:foo "a" .
>
>entails
>
>	ex:john ex:foo _:a .
>	_:a rdf:type rdfs:Literal .
>
>but under any datatype entailment that includes a datatype that has the string
>"a" in its value space (even if the lexical-to-value mapping does not map
>the syntactic construct "a" to this string) the entailment follows.

True. I guess I see a kind of sense to this situation, but perhaps I 
was being too finicky. It would certainly be more coherent, as it 
were, if untyped literals were treated similarly to typed literals 
with a kind of built-in trivial datatype whose lexical space was 
literals and whose L2V mapping was identity.

OK, let us assume that change will be made before publication. Then 
all plain literals and all well-formed typed literals denote 
something in rdfs:Literal.
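
A rough sketch of that change, in Python, may make it concrete. It
assumes plain literals are treated as if typed by a built-in trivial
datatype whose L2V mapping is the identity, so that every plain literal
and every well-formed typed literal lands in the set standing in for
ICEXT(I(rdfs:Literal)). The names Datatype, PLAIN, INTEGER, denote and
literal_values are illustrative assumptions, not anything defined in
the Semantics document.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Datatype:
    """Hypothetical model of a datatype: a lexical-space test and an L2V mapping."""
    name: str
    is_lexical_form: Callable[[str], bool]
    l2v: Callable[[str], object]


# Assumed trivial datatype for plain literals: every string is a valid
# lexical form and the lexical-to-value mapping is the identity.
PLAIN = Datatype("plain", lambda s: True, lambda s: s)

# A toy integer datatype, included only for contrast.
INTEGER = Datatype(
    "integer",
    lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    int,
)


def denote(lexical: str, datatype: Optional[Datatype] = None) -> object:
    """Return the value of a literal; plain literals use the trivial datatype."""
    dt = PLAIN if datatype is None else datatype
    if not dt.is_lexical_form(lexical):
        raise ValueError(f"{lexical!r} is not a lexical form of {dt.name}")
    return dt.l2v(lexical)


def literal_values(
    literals: Iterable[Tuple[str, Optional[Datatype]]]
) -> Set[object]:
    """Collect the values of all well-formed literals (a stand-in for
    the class extension of rdfs:Literal); ill-formed ones are skipped."""
    values: Set[object] = set()
    for lexical, datatype in literals:
        try:
            values.add(denote(lexical, datatype))
        except ValueError:
            continue
    return values
```

With this sketch, literal_values([("a", None), ("42", INTEGER),
("x", INTEGER)]) contains the string "a" and the integer 42, and skips
the ill-formed "x" typed as an integer.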

>
>>  >[...]
>>  >
>>  >In the LCC document, datatypes are still broken.  For starters, the
>>  >document is inconsistent with respect to just what is a datatype; sometimes
>>  >it is a member of the domain, sometimes it is a URIref.
>>
>>  I have not detected that inconsistency, can you point me at where it
>>  occurs? The document uses datatype urirefs to refer to datatypes,
>>  which seems consistent to me.
>>
>>  The intent is that recognized datatypes are always members of the domain.
>
>Then how do you justify
>
>      The set of recognized datatypes always includes rdf:XMLLiteral
>
>in Section 3.4?

Er.. What kind of justification do you need? rdf:XMLLiteral *is* a 
member of the domain, that follows from the general assumption that 
URIrefs always denote. The quote that you want justified is supposed 
to be part of the definition of 'recognized', not a lemma that needs 
to be established.

I didn't say 'rdf:XMLLiteral' was a member of the domain, notice 
(though it is, as a matter of fact.)

>
>>  >The treatment of
>>  >rdf:XMLLiteral is very suspect, for example what happens to "a"^^ex:foo if
>>  >I(rdf:XMLLiteral) = I(ex:foo)?
>>
>>  If that identity holds (which would be rather extraordinary) then
>>  ex:foo would in fact be rdf:XMLLiteral, ie
>>
>>  owl:sameIndividualAs ex:foo rdf:XMLLiteral .
>>
>>  would be true. So in that case indeed, the typed literal you have
>>  written would have the same value as "a"^^rdf:XMLLiteral in that
>>  interpretation.
>
>Not according to the RDF model theory.  In this model theory rdf:XMLLiteral
>is handled very differently than other datatypes.

The intention is that it should be handled in exactly the same way (apart 
from the lang tags complication). But I see that there is a crack, 
indeed, since I do not define L2V for XML literals.  See below.

>rdf:XMLLiteral is
>handled (in Section 3.1) by
>
>	if xxx is a well-formed XML document,
>	then IL("xxx"^^rdf:XMLLiteral) is the XML canonical form of xxx
>	...
>
>Other datatypes are (very awkwardly) handled in Section 3.4, by
>
>	... satisfies the following extra conditions on all datatypes other
>	than the built-in datatype:
>	...
>	For any typed literal "sss"^^ddd or "sss"@ttt^^ddd, if I(ddd) is in
>	D and 'sss' is a valid lexical form for I(ddd) then IL("sss"^^ddd)
>	= L2V(ddd)(sss)
>
>(By the way, this appears to imply that the only lexical forms that matter
>are ones that start and stop with a single quote.

Yes, the use of those single quotes is a mistake, I will remove 
those. And you are right, I should have said IL("sss"^^ddd) = 
IL("sss"@ttt^^ddd) = L2V(ddd)(sss), an editorial slip, sorry.
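
The corrected condition can be sketched in Python as follows. This is
only an illustration under stated assumptions: the datatype map D is
keyed by datatype URIref rather than by I(ddd), the single xsd:boolean
entry is a toy example, and the function name IL simply mirrors the
notation above. The point it shows is that the language tag ttt plays
no part in the value.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical datatype map: URIref -> (lexical-form test, L2V mapping).
# Keying by URIref instead of by I(ddd) is a simplification.
D: Dict[str, Tuple[Callable[[str], bool], Callable[[str], object]]] = {
    "xsd:boolean": (
        lambda s: s in ("true", "false", "1", "0"),
        lambda s: s in ("true", "1"),
    ),
}


def IL(sss: str, ddd: str, ttt: Optional[str] = None) -> object:
    """IL("sss"^^ddd) = IL("sss"@ttt^^ddd) = L2V(ddd)(sss).

    The language tag ttt is accepted but deliberately ignored.
    """
    if ddd not in D:
        raise KeyError(f"{ddd} is not a recognized datatype")
    is_lexical_form, l2v = D[ddd]
    if not is_lexical_form(sss):
        raise ValueError(f"{sss!r} is not a valid lexical form for {ddd}")
    return l2v(sss)
```

Here IL("true", "xsd:boolean") and IL("true", "xsd:boolean", "en")
return the same value, which is what the corrected equation requires.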

>  Also, this is missing
>the interpretation of "sss"@ttt^^ddd.)
>
>So "a"^^ex:foo falls through the cracks.  It doesn't match the conditions
>in Section 3.1 because it has the wrong URI ref and it doesn't match the
>conditions in Section 3.4 because they don't hold for I(rdf:XMLLiteral).

They do if the L2V mapping of rdf:XMLLiteral is defined properly. I 
see that there is an editorial crack here, in that I do not mention 
the L2V mapping when describing rdf:XMLLiteral, since the concept of 
a lexical-to-value mapping is not introduced until the section on 
datatyping. This needs to be fixed, indeed, but it's purely an 
expository point.
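
As a hedged illustration of what such an L2V mapping for
rdf:XMLLiteral could look like, here is a short Python sketch. It
assumes the mapping sends a well-formed XML string to a canonical
form, and it uses the standard library's canonicalize function (Python
3.8 and later), which implements C14N 2.0 and may differ in detail
from the canonical form the Semantics document intends. The name
xmlliteral_l2v is hypothetical.

```python
from xml.etree.ElementTree import ParseError, canonicalize


def xmlliteral_l2v(xxx: str) -> str:
    """Assumed L2V for rdf:XMLLiteral: well-formed XML -> canonical form.

    Input that is not well-formed is outside the lexical space, which
    this sketch reports as a ValueError.
    """
    try:
        return canonicalize(xxx)
    except ParseError as exc:
        raise ValueError("not well-formed XML") from exc
```

Two lexically different but equivalent forms, such as '<a b="1"/>' and
'<a   b="1"></a>', then map to the same value, so rdf:XMLLiteral can be
handled by the same IL("sss"^^ddd) = L2V(ddd)(sss) condition as any
other datatype.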

(The reason for this is the decision, taken rather late in the day, 
to treat XML literals as a built-in datatype rather than a 
syntactically distinct form of literal. Unfortunately this screws up 
the expository structure of the document somewhat.)

>
>>  Of course, there is no way in RDF(S) or even OWL-DL to establish such
>>  an identity, so the question only makes sense in OWL-Full.  I have no
>>  problems with this, myself.
>>
>>  >The translation into Lbase is still broken with respect to rdf:XMLLiteral.
>>
>>  Can you be more specific?
>
>TR("sss"@ttt^^rdf:XMLLiteral) = L2V(TR["sss"],TR[ddd])
>which incorrectly ignores the language tag
>
>I have pointed this out at least once before in a message to w3c-rdfcore-wg

Right, sorry. There is at least one version in which this has been 
fixed, but it might have got lost in the last-minute document flurry. 
Mea culpa. The best version of the translation gives the explicit 
translation from Jeremy's document for each distinct type of literal, 
which is obviously what should have been done in the first place.

Pat


>
>>  Pat
>
>Peter F. Patel-Schneider
>Bell Labs Research
>Lucent Technologies


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Friday, 24 January 2003 12:03:06 UTC