- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 31 Jul 2003 17:57:15 -0400
- To: Graham Klyne <GK-lists@ninebynine.org>, pat hayes <phayes@ihmc.us>
- Cc: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org, w3c-rdf-core-wg@w3.org, reagle@w3.org
At 21:42 03/07/31 +0100, Graham Klyne wrote:
>Martin,
>
>as far as I can tell, you're contradicting the XML canonicalization spec.

No. I'm just saying that it is a bad idea to use XML canonicalization, which was developed for purposes such as parser testing, digital signatures, and encryption, to come up with a proposal for what an XML literal denotes.

>Is canonical XML a sequence of octets or something else?

Canonical XML is a sequence of octets. (Exclusive) canonical XML is a good tool to answer questions about the equivalence of XML fragments. Canonical XML, in any kind of definition currently available 'off-the-shelf', is not a good tool to express what XML Literals denote.

>The XML canonicalization spec, I understand, says it's a sequence of octets.
>
>Maybe, you want to say it's a sequence of octets that is to be interpreted
>in specific way, in which case it's not *just* a sequence of octets, but
>must also carry some distinguishing datum that indicates that this special
>processing is required.

It's not necessarily a requirement. But it's the most usual and appropriate thing to do with an XML Literal. On the other hand, it's a totally arbitrary thing to do with an octet sequence. So yes, the expectations for processing are different.

>Specifically, if I have the values denoted by:
>
>   <eg:bar rdf:parseType="Literal"><br/></eg:bar>
>
>and
>
>   <eg:bar rdf:datatype="http://www.w3.org/2001/XMLSchema#hexBinary"
> >3C62722F3E</eg:bar>
>
>what is it that tells me the first is to be treated as markup, but not the
>second?

The first is markup. The second is a sequence of binary octets. And the two are not equivalent according to RDF. Because the canonicalization of <br/> is <br></br>, the octet sequence for <br/> in hexBinary is 3C62723E3C2F62723E.

<br/>, <br></br>, and 3C62723E3C2F62723E (with the appropriate syntactic decorations) entail each other. They don't entail 3C62722F3E.

There may be some odd cases where 3C62722F3E will be interpreted as XML. The RDF spec would not support that, but it would not prohibit it either. However, the RDF spec (if we agree on your interpretation and make my test case positive) says that 3C62723E3C2F62723E is the same as the XML Literal(s) <br></br> or <br/>. This strikes me as very odd, to say the least.

Regards, Martin.
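
The two octet sequences quoted in this message can be checked with a short sketch. It assumes Python 3.8 or later and uses `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0 rather than the (Exclusive) Canonical XML under discussion in the thread; for a bare `<br/>` with no namespaces or attributes, the two are expected to agree. The helper name `hex_octets` is illustrative only and does not come from any spec.

```python
from xml.etree.ElementTree import canonicalize


def hex_octets(text: str) -> str:
    """Return the UTF-8 octets of `text` as upper-case hexBinary-style digits."""
    return text.encode("utf-8").hex().upper()


raw = "<br/>"

# Assumption: C14N 2.0 stands in here for the canonicalization in the thread.
# Canonical form writes an empty element as an explicit start/end tag pair.
canonical = canonicalize(xml_data=raw)

print(canonical)              # <br></br>
print(hex_octets(raw))        # 3C62722F3E
print(hex_octets(canonical))  # 3C62723E3C2F62723E
```

The uncanonicalized `<br/>` gives the five-octet value from Graham's example, while the canonical form gives the nine-octet value, which is why the two hexBinary literals differ.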
Received on Thursday, 31 July 2003 18:01:45 UTC