changes to RDFS MT post-last-call from pat hayes on 2003-01-28 (www-webont-wg@w3.org from February 2003)

From: pat hayes <phayes@ai.uwf.edu>
Date: Mon, 27 Jan 2003 20:26:06 -0800
To: w3c-rdfcore-wg@w3.org
Cc: pfps@research.bell-labs.com, www-webont-wg@w3.org
Message-Id: <p05111b03ba5a23876a4f@[172.16.1.143]>
(Folks, I apologize for being so net-access-challenged for the last 
few weeks. It is getting harder to access networks remotely, and I 
was not prepared for the technical problems I have been encountering.)

Given the various issues which Peter has raised, I propose to make 
the following changes to the RDF semantics doc before final 
publication. Some of these require help from other members of the WG, 
so this message is partly an appeal for that help.

1. The special role of LV will be eliminated. In itself, this will 
not affect any entailments; it is a matter of mathematical style. It 
will make the MT somewhat more conventional and more uniform, so I 
think it will be considered an improvement.

(The current treatment is in fact a residue of a time when the 
interpretation of literals was far less complex than it has now 
become, and it would be simpler to treat literals as simply another 
kind of name, require them to be interpreted by each interpretation 
in the usual way, and impose the 'global' conditions as semantic 
requirements on interpretation mappings.)

2. The text will make it abundantly clear that a plain literal 
without a language tag both is, and denotes, the same unicode 
character string.

3. The interpretation of rdfs:Literal will always contain all literal 
values of plain literals, with or without language tags. This will 
fix the entailments that Peter feels are required (and which Patrick 
says were already agreed to by the WG: if so, I apologize for the 
editorial oversight in not noticing this earlier.) The net effect 
will be that a plain literal will be treated similarly to a typed 
literal with a 'trivial' datatype whose lexical and value spaces are 
unicode strings plus <string, lang-tag> pairs, and whose L2V mapping 
is identity.
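
As a rough illustration of item 3, the 'trivial' datatype for plain 
literals can be sketched as below. This is only a sketch under stated 
assumptions: the names PlainLiteral, plain_l2v and in_plain_value_space 
are hypothetical and do not appear in the semantics document, and a 
language-tagged literal is modelled as a Python (string, lang-tag) 
tuple.

```python
from typing import Tuple, Union

# Hypothetical model: a plain literal is either a bare unicode string
# or a (string, lang_tag) pair. Lexical space and value space coincide.
PlainLiteral = Union[str, Tuple[str, str]]


def in_plain_value_space(x: object) -> bool:
    """True if x is a unicode string or a (string, lang_tag) pair."""
    if isinstance(x, str):
        return True
    return (
        isinstance(x, tuple)
        and len(x) == 2
        and all(isinstance(part, str) for part in x)
    )


def plain_l2v(lexical: PlainLiteral) -> PlainLiteral:
    """Identity lexical-to-value mapping: the literal denotes itself."""
    if not in_plain_value_space(lexical):
        raise ValueError("not in the lexical space of the trivial datatype")
    return lexical
```

Under this reading, every value returned by plain_l2v would belong to 
the interpretation of rdfs:Literal.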

4. The exact role of XML canonicalization in the built-in datatype 
will be clarified both in the equations and the prose. The intent, 
as I understand it, is that any typed rdf:XMLLiteral, with or without 
a language tag, should denote the result of applying a process called 
'XML canonicalization' to that literal (with the language tag, if 
present, added according to the convention described in Jeremy's 
Concepts document). I do not know the appropriate form of words to 
use to refer to this process or its result: apparently, two forms of 
words which I took to be synonymous may not be. I would appreciate 
any advice on the correct forms of words to use to refer to XML 
canonicalization.

I would LIKE to say the following:  in a literal

"aaa"^^rdf:XMLLiteral

if the aaa IS a unicode string which can be parsed into (? 
represents? encodes? is a lexical form of? is?) a well-formed XML 
document (? expression? structure?), then the value of the literal - 
what it denotes - IS the canonical XML document (? expression? 
structure?) which it parses into (? represents? encodes? is a lexical 
form of?). In other words, the body of the literal (like other 
literals) is a string, but the value is an XML thingie. Notice that 
this deliberately leaves open the question of whether or not XML 
thingies *really are* strings.

If it would be kosher to simply identify XML thingies with unicode 
strings, this could be somewhat simplified, but my past discussions 
with XML experts have left me unsure about whether or not this is 
considered an appropriate assumption. I would be grateful if some XML 
maven could enlighten me.

5. The treatment of XML literals will be aligned exactly with the 
treatment of other datatyped literals, both in the MT and the closure 
rules. Given the above, the net result will be that rdf:XMLLiteral, 
considered as a datatype, has as its lexical space the union of the 
set of all unicode character strings which parse into well-formed XML 
and the set of all pairs of said strings with language tags, as its 
value domain the set of all canonicalized XML documents, and as its 
L2V mapping the XML canonicalization process, with lang tags handled 
as in Jeremy's document.
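
The datatype view of rdf:XMLLiteral described in items 4 and 5 can be 
sketched roughly as follows. The assumptions are loud ones: the 
function name xml_literal_value is hypothetical; Python's 
xml.etree.ElementTree.canonicalize (Python 3.8 or later, implementing 
C14N 2.0) is used only as a stand-in for whatever canonicalization the 
documents finally name; the value is represented as a string purely for 
convenience, which the text above deliberately leaves open; and language 
tags are omitted entirely, since their handling is defined in Jeremy's 
document and not here.

```python
from typing import Optional
import xml.etree.ElementTree as ET


def xml_literal_value(lexical_form: str) -> Optional[str]:
    """Sketch of an L2V mapping for "aaa"^^rdf:XMLLiteral.

    Returns a canonical serialization if lexical_form parses as a
    well-formed XML document, otherwise None (the string is then
    outside the lexical space in this sketch).
    """
    try:
        # Stand-in canonicalizer: C14N 2.0 from the standard library.
        return ET.canonicalize(lexical_form)
    except ET.ParseError:
        return None


# Two different lexical forms that should denote the same value.
same = xml_literal_value("<a b='1'/>") == xml_literal_value('<a b="1"></a>')
```

The point of the sketch is only that distinct strings in the lexical 
space can map to one value, exactly as with other datatyped literals.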

The closure rules for XML literals will be re-stated to handle the 
case noted by Peter regarding canonical forms for lang tags.

6. The translation for XML literals into Lbase will be rendered in 
excruciating detail and aligned with the MT. Readers should however 
note that only the MT is considered to be normative.

7. The incompleteness of the closure rules noted by Peter will be 
fixed by re-defining the notion of rule closure of a graph to allow 
generalization over literals before applying the closure rules. (This 
was a genuine technical slip, and I am grateful to Peter for catching 
it.)
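
One possible reading of "generalization over literals" in item 7 is 
sketched below: before the closure rules are applied, each triple with 
a literal in object position also gets a copy in which the literal is 
replaced by a blank node allocated to that literal. This is an 
assumption about the intended fix, not a statement of the final rule; 
the names Literal, BNode and generalize_literals are hypothetical, and 
triples are modelled as plain Python 3-tuples.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Literal(NamedTuple):
    lexical: str


class BNode(NamedTuple):
    label: str


Triple = Tuple[object, object, object]


def generalize_literals(
    graph: Set[Triple],
) -> Tuple[Set[Triple], Dict[Literal, BNode]]:
    """Add, for each literal object, a triple with a blank node instead.

    The same literal always gets the same blank node, so the mapping
    can be consulted again when closure rules are applied afterwards.
    """
    allocated: Dict[Literal, BNode] = {}
    result = set(graph)
    # Sort for a deterministic allocation order of blank node labels.
    for s, p, o in sorted(graph, key=repr):
        if isinstance(o, Literal):
            if o not in allocated:
                allocated[o] = BNode(f"lit{len(allocated)}")
            result.add((s, p, allocated[o]))
    return result, allocated
```

The closure rules would then be run over the enlarged graph, so that 
rules which only match blank nodes or urirefs can also fire for 
literals.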

8. I will try to clarify the text where it refers to a datatype by 
name, to avoid the potential use/mention misunderstandings which seem 
to have arisen.

---
I expect to have these editorial changes completed before February 7th.
---

The following issues raised by Peter are ones I propose to ignore:

Lbase is irrelevant;
we do not define the denotation mapping between urirefs and datatypes;
some XML datatypes do not provide enough information to enable all 
RDF datatyping inferences to be made.

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola               			(850)202 4440   fax
FL 32501            				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Monday, 3 February 2003 11:21:53 UTC