- From: Benja Fallenstein <b.fallenstein@gmx.de>
- Date: Thu, 31 Jul 2003 23:20:27 +0200
- To: Graham Klyne <GK-lists@ninebynine.org>
- CC: Martin Duerst <duerst@w3.org>, pat hayes <phayes@ihmc.us>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org, w3c-rdf-core-wg@w3.org
Hi Graham,

Graham Klyne wrote:
> as far as I can tell, you're contradicting the XML canonicalization spec.
>
> Is canonical XML a sequence of octets or something else?
>
> The XML canonicalization spec, I understand, says it's a sequence of
> octets.

I can see what you're saying. The XML c14n spec says that

    The term exclusive canonical XML refers to XML that is in exclusive
    canonical form.
    <http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#def-exclusive-canonical-XML>

which is referred to by

    The lexical-to-value mapping [of XMLLiterals] maps a string to the
    corresponding exclusive Canonical XML (with comments, with empty
    InclusiveNamespaces PrefixList).
    <http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-XMLLiteral>

I think "XML in exclusive canonical form" can indeed only be taken as octets; an abstract XML infoset certainly cannot be in canonical form.

I believe that it is a bad idea to treat XML literals like this, though. Exclusive Canonical XML is a *serialization* of an abstract concept, and IMO the specs say this very clearly:

    It is normal for XML documents and subdocuments which are equivalent
    for the purposes of many applications to differ in their physical
    representation. For example, they may differ in their entity
    structure, attribute ordering, and character encoding. The goal of
    this specification is to establish a method for serializing the
    XPath node-set representation of an XML document or subset [...].
    -- http://www.w3.org/TR/xml-exc-c14n/#sec-Intro

So exclusive canonical XML is a *serialization* for *a representation* of *an XML document*. I think it makes little sense to specify a way to denote serializations -- that's like specifying that

    "254"^^foo:integer

is a literal denoting the string of Unicode characters "FE", which is the hexadecimal serialization of the integer 254; and that therefore, the literal has the same denotation as

    "FE"^^xsd:string

You *can* do it, but it doesn't make a lot of sense.
(And it certainly is surprising given that the data type is called 'foo:integer.')

I agree with Martin that it makes sense for the spec to say that XML documents are an abstract set with equivalence defined by exclusive c14n. If you don't like the abstract set approach, you could also say that XML literals denote XPath node-sets; that would be in keeping with the c14n spec.

(I also agree that it would be good if XML literals without any markup were equivalent to the corresponding plain literals/XSD strings, but that's off the point.)

I understand why it makes sense for the *lexical space* of XML literals to be Exclusive Canonical XML, but I don't understand why the *value space* should be.

Cheers,
- Benja
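
A minimal sketch of the lexical-space versus value-space distinction argued above, in Python. It rests on assumptions not in the original message: the standard library's `xml.etree.ElementTree.canonicalize` implements C14N 2.0, not Exclusive XML Canonicalization 1.0, so it only stands in for the idea of "equivalence defined by canonicalization". The helper name `same_xml_value` and the sample documents are hypothetical.

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+, C14N 2.0 (stand-in for exc-c14n)


def same_xml_value(lexical_a: str, lexical_b: str) -> bool:
    """Treat two XML strings as the same value if their canonical forms match.

    The canonical form is used only as an equivalence test; it is not
    itself taken to be the value.
    """
    return canonicalize(lexical_a) == canonicalize(lexical_b)


# Two physical representations that differ in attribute order and quoting.
doc_a = '<p b="1" a="2">text</p>'
doc_b = "<p a='2'   b='1'>text</p>"

strings_differ = doc_a != doc_b              # compared as character strings
values_match = same_xml_value(doc_a, doc_b)  # compared via canonicalization

# The integer analogy from the message: the value 254 is distinct from the
# string "FE", even though "FE" is one serialization of that value.
value = int("254")
hex_serialization = format(value, "X")
value_is_not_its_serialization = value != hex_serialization

print(strings_differ, values_match, hex_serialization, value_is_not_its_serialization)
```

Under these assumptions, the two documents differ as strings but compare equal once canonicalized, which is the sense in which canonicalization can define equivalence on an abstract set without the canonical octets being the denotation.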
Received on Thursday, 31 July 2003 17:22:17 UTC