- From: Benja Fallenstein <b.fallenstein@gmx.de>
- Date: Thu, 31 Jul 2003 23:20:27 +0200
- To: Graham Klyne <GK-lists@ninebynine.org>
- CC: Martin Duerst <duerst@w3.org>, pat hayes <phayes@ihmc.us>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org, w3c-rdf-core-wg@w3.org
Hi Graham,
Graham Klyne wrote:
> as far as I can tell, you're contradicting the XML canonicalization spec.
>
> Is canonical XML a sequence of octets or something else?
>
> The XML canonicalization spec, I understand, says it's a sequence of
> octets.
I can see what you're saying. The XML c14n spec says that
    The term exclusive canonical XML refers to XML that is in
    exclusive canonical form.
    <http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#def-exclusive-canonical-XML>
which is referred to by
    The lexical-to-value mapping [of XMLLiterals] maps a string to the
    corresponding exclusive Canonical XML (with comments, with empty
    InclusiveNamespaces PrefixList ).
    <http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-XMLLiteral>
I think "XML in exclusive canonical form" can indeed only be taken as
octets; an abstract XML infoset certainly cannot be in canonical form.
I believe that it is a bad idea to treat XML literals like this, though.
Exclusive Canonical XML is a *serialization* of an abstract concept, and
IMO the specs say this very clearly:
    It is normal for XML documents and subdocuments which are equivalent
    for the purposes of many applications to differ in their physical
    representation. For example, they may differ in their entity
    structure, attribute ordering, and character encoding. The goal of
    this specification is to establish a method for serializing the
    XPath node-set representation of an XML document or subset [...].
    -- http://www.w3.org/TR/xml-exc-c14n/#sec-Intro
So exclusive canonical XML is a *serialization* for *a representation*
of *an XML document*. I think it makes little sense to specify a way to
denote serializations; that's like specifying that
    "254"^^foo:integer
is a literal denoting the string of Unicode characters "FE", which is
the hexadecimal serialization of the integer 254, and that therefore the
literal has the same denotation as
    "FE"^^xsd:string
You *can* do it, but it doesn't make a lot of sense. (And it is certainly
surprising given that the data type is called 'foo:integer'.)
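To make the analogy concrete, here is a minimal sketch in Python of the two possible lexical-to-value mappings for the made-up datatype foo:integer. The function names are mine and purely illustrative; nothing here comes from the RDF or c14n specs.

```python
def integer_value(lexical: str) -> int:
    """Conventional mapping: the literal denotes the number itself."""
    return int(lexical, 10)


def hex_serialization_value(lexical: str) -> str:
    """Odd mapping: the literal denotes one particular serialization
    (upper-case hexadecimal) of the number, i.e. a string."""
    return format(int(lexical, 10), "X")


# Under the odd mapping, "254"^^foo:integer would denote the string "FE",
# which is exactly what "FE"^^xsd:string denotes.
print(integer_value("254"))            # 254
print(hex_serialization_value("254"))  # FE
```

The second mapping is well defined, but its value space is a set of strings rather than a set of integers, which is the oddity I am objecting to.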
I agree with Martin that it makes sense for the spec to say that XML
documents are an abstract set with equivalence defined by exclusive
c14n. If you don't like the abstract set approach, you could also say
that XML literals denote XPath node-sets; that would be in keeping
with the c14n spec.
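As a rough sketch of what "equivalence defined by c14n" means in practice: two lexically different strings count as the same value if they canonicalize to the same thing. Note the assumption here: Python's standard library (3.8 and later) implements C14N 2.0, not Exclusive C14N 1.0, so this only stands in for the idea and is not the algorithm the RDF draft names. The helper same_xml_value is a name I made up.

```python
from xml.etree.ElementTree import canonicalize

# Two physical representations that differ in attribute order,
# quoting style and whitespace inside the start tag.
first = '<p b="2" a="1">hi</p>'
second = "<p  a='1'   b='2' >hi</p>"


def same_xml_value(x: str, y: str) -> bool:
    """Treat two XML strings as equal if their canonical forms match.
    Uses C14N 2.0 as a stand-in for Exclusive C14N (an assumption)."""
    return canonicalize(x) == canonicalize(y)


print(first == second)                 # the raw strings differ
print(same_xml_value(first, second))   # the canonical forms are compared
```

On this view the canonical form is only the means of deciding equality; the thing being compared is the XML document (or node-set), not the octets.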
(I also agree that it would be good if XML literals without any markup
were equivalent to the corresponding plain literals/XSD strings, but
that's beside the point.)
I understand why it makes sense for the *lexical space* of XML literals
to be Exclusive Canonical XML, but I don't understand why the *value
space* should be.
Cheers,
- Benja
Received on Thursday, 31 July 2003 17:22:17 UTC