- From: pat hayes <phayes@ihmc.us>
- Date: Sun, 27 Jul 2003 17:01:26 -0500
- To: Martin Duerst <duerst@w3.org>
- Cc: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org
>Hello Peter,
>
>Many thanks for your very clear and detailed explanations.
>
>At 07:54 03/07/25 -0400, Peter F. Patel-Schneider wrote:
>
>>This question is related to pfps-04 because pfps-04 is concerned with
>>equality between XML literals in RDF.
>>
>>
>>The root of this problem is that a complete treatment of XML literals in
>>RDF needs a complete theory of equality for them. This theory of equality
>>cannot just determine equality between XML literals in RDF but also has to
>>determine equality between XML literals and other objects in the RDF domain
>>of discourse, in particular plain RDF literals and the value space for the
>>XML Schema string datatype.
>
>[This is a very general concern]
>As far as I understand, RDF does not really mention XML Schema datatypes
>in any normative way, so how would it normatively specify equivalences
>to these datatypes? Also, what about other datatype systems that have
>very similar constructs? A lot of datatype systems will have some kind
>of 'string' type, and a lot of such systems will have some kind of
>numeric types (which you mention below). What about these equivalences?
>
>>Some of these answers can (now) be fairly easily determined from a simple
>>perusal of the RDF documents and the canonicalization documents.
>>
>>Two XML literals are (now) equal in RDF precisely when their Exclusive
>>XML Canonicalizations are the same octet sequence.
>
>Okay. The equivalences would stay exactly the same if XML literals
>would be represented as character sequences rather than as octet
>sequences.

'equal' here means 'denote the same thing', not 'is identical to'.
Nobody is suggesting interfering with how literal strings are
represented or encoded. We had to choose some criterion to refer to in
order to establish questions of identity between referents.

>>However other answers are harder to determine.
>>
>>1/ When is an XML literal equal to a plain RDF literal?
>>A plain RDF literal is a Unicode string (sequence of Unicode
>>characters), so this question boils down to whether octets and
>>Unicode characters are disjoint. I found it difficult to answer this
>>question, because of hints in the exclusive canonicalization document
>>that they are not.
>
>Can you point to the places where you saw such hints? If there are
>such hints, then they definitely have to be fixed, and I'll make
>sure that this happens.
>
>Apart from that, it is very important to make sure that the plain
>string "<br/>" (in XML written as "&lt;br/&gt;") is not the
>same as the XML markup "<br/>" (in XML written as "<br/>").
>So it is indeed important to make sure this question can easily
>be answered.

If we were to specify that plain literals and XML literals both
denote Unicode character sequences, then "<br/>" and
"<br/>"^^rdf:XMLLiteral would be equal and neither of them would bear
any RDF relationship to a literal whose character string was
"&lt;br/&gt;".

So it sounds like you want to say that XML values and Unicode
character strings must be distinct; which is the situation we
currently have.

>However, I think it is absolutely inappropriate to solve this
>problem by saying that one of them is characters and the other
>is encoded in octets.

We aren't saying that XML literals denote things that are encoded in
octets: we are saying that XML literals denote the octets themselves.

>If there is no other solution here than
>with some kind of hack, I think it would be preferable to say
>e.g. that characters in plain literals are green, and characters
>representing XML literals are red. (and add a note to clarify
>that green characters and red characters are not the same).

The characters *in* the literals are the same. All literal strings in
RDF are character sequences in one single uniform sense (sequences of
Unicode in normal form C). The discussion is about what the various
kinds of literal *denote*.

The point is, we have a distinction between two kinds of literals.
To put it crudely, a string (the literal string) can be labelled as
'plain' in which case it (rather oddly) denotes itself, or as
'XML-ish', in which case it might denote something else. The question
is, what? The issue is not to do with how the literal itself is
encoded or represented.

Pat

-- 
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
Received on Sunday, 27 July 2003 18:01:27 UTC
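
The distinction drawn in the message above (a plain literal denotes its own character string, while an XML literal denotes the octets of its canonical form) can be sketched roughly in Python. This is only an illustration under stated assumptions: the function names `plain_value`, `xml_value` and `same_value` are hypothetical, and the standard library's `canonicalize` implements C14N 2.0, used here as a stand-in for Exclusive XML Canonicalization. Modelling the two kinds of value as `str` and `bytes` is a simplification, not something taken from the RDF documents.

```python
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+


def plain_value(lexical: str) -> str:
    """A plain literal denotes its own character string."""
    return lexical


def xml_value(lexical: str) -> bytes:
    """An XML literal denotes the octets of its canonical form.

    Assumption: C14N 2.0 stands in for Exclusive XML Canonicalization.
    """
    return canonicalize(lexical).encode("utf-8")


def same_value(x, y) -> bool:
    """Two values are equal only if they are the same kind of thing
    (both character strings or both octet sequences) and match."""
    return type(x) is type(y) and x == y


# Different literal strings, same canonical octets: equal as XML literals.
print(same_value(xml_value("<br/>"), xml_value("<br></br>")))    # True

# Same literal string, different denotation: never equal.
print(same_value(plain_value("<br/>"), xml_value("<br/>")))      # False

# A plain literal denotes itself.
print(same_value(plain_value("<br/>"), plain_value("<br/>")))    # True
```

In this sketch the literal strings are ordinary character sequences in every case; only what they are taken to denote differs, which is the point made in the reply.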