Re: Test case regarding XML Literals and octets from Martin Duerst on 2003-08-03 (w3c-rdfcore-wg@w3.org from August 2003)

From: Martin Duerst <duerst@w3.org>
Date: Sun, 03 Aug 2003 14:41:36 -0400
To: pat hayes <phayes@ihmc.us>, Benja Fallenstein <b.fallenstein@gmx.de>
Cc: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org, w3c-rdfcore-wg@w3.org, reagle@w3.org
Message-Id: <4.2.0.58.J.20030801151419.06a1b290@localhost>
[If the issues addressed in this mail have been changed for the
better in the meantime, please ignore this mail.]


At 08:56 03/08/01 -0500, pat hayes wrote:
>>pat hayes wrote:
>>>>I think "XML in exclusive canonical form" can indeed only be taken as 
>>>>octets; an abstract XML infoset certainly cannot be in canonical form.
>>>>
>>>>I believe that it is a bad idea to treat XML literals like this, though.
>>>
>>>Surely that is a matter to take up with the folk who wrote the XML 
>>>specification? It is not our task to re-write a normative specification 
>>>document written by another working group.

Which specification? XML 1.0? That's *the* XML specification.
I agree that it's not your job to re-write a normative
specification document written by another WG. Nobody asks you
to rewrite that spec, or any other. But what you can and should
do is to use other specifications with care. The recent discussion
with the group (w3c-ietf-xmldsig) responsible for the spec you use
has been summarized by Joseph Reagle in
http://www.w3.org/mid/200307311142.00425.reagle@w3.org:

 >>>>
In this case, I think there was a genuine question of whether Canonical XML
provides or precludes a encoding-less Canonical XML. I hope this discussion
and the text in [1] is sufficient to document that it doesn't provide, nor
does it preclude someone else from simply defining and using such a thing.
I think there's also a conceptual disagreement as to whether one can have
octets in a RDF graph, but I don't think that's our specification's
responsibility to answer.

So for the time being, I'm going to defer on an erratum. If the question
comes up repeatedly, perhaps that will be further evidence that an erratum
is necessary, or, more likely, that there's a requirement for new work.

[1]
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2003JulSep/0039.html
 >>>>

This seems to say:

- The discussion has documented that Canonical XML does not provide
   for a canonical form without an encoding, but that it does not
   preclude anybody from defining or using such a form.
- The authors of the Canonical XML spec do not think it is necessary
   to state this explicitly in their spec; they think that, for the
   moment, documentation in an email thread is good enough. They will
   reconsider this if other people or WGs come up with the same
   question/problem.
- The XML signature people do not think they are the ones to decide
   which form of canonicalization best serves RDF, or how a canonicalization
   might need to be modified to best serve RDF, or what exactly XML
   Literals should denote [i.e. the RDF Core WG is free to decide this].


>>  AFAIK, XML literals are defined in a specification written by RDFCore.
>>
>>(Sorry to use split-hair logic, but it seems to me that you're doing the 
>>same. ;-) )
>>
>>As I have quoted in the part of my mail that you cut, the Exclusive 
>>Canonical XML spec states that Exclusive Canonical XML is a serialization 
>>of an XPath node-set. So an exclusive canonical XML document denotes an 
>>XPath node-set, not an octet sequence.
>>
>>It is your choice that XML literals denote Exclusive Canonical XML (as 
>>opposed to, XPath node-sets, or XML infosets, neither of which are 
>>octets) and in my opinion that's a bad choice. You can disagree, but I 
>>don't think you can claim that the choice isn't yours.
>
>That choice is ours, indeed, but I do not think it was a bad choice. There 
>are good reasons for choosing exclusive canonicalization,

There are very good reasons to choose the relevant aspects of exclusive
canonicalization, relating e.g. to the treatment of namespaces. But these
reasons are not affected by the choice of actual denotation for XML
Literals (a sequence of octets or something more abstract).

I would also like to add that, while some people seem to insist that
the way (exclusive) canonical XML is written forces the denotation of
XML Literals to be an octet sequence, the RDF specs themselves seem
quite able to use a character string as the abstract syntax, and also
a character string (or its representation as glyphs on paper or
screen) for drawing graphs.
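
To make the distinction between a canonical form as a character string
and as an octet sequence concrete, here is a minimal Python sketch. It
assumes Python 3.8 or later and uses the standard library's
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather
than the Exclusive Canonical XML discussed in this thread; the element
name "lit" and the sample content are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same (hypothetical) element; only the
# attribute order and the spacing inside the start tag differ.
doc_a = '<lit y="2" x="1">caf\u00e9</lit>'
doc_b = '<lit x="1"   y="2" >caf\u00e9</lit>'

# Without an `out` argument, canonicalize() returns a str, i.e. a
# sequence of characters for which no encoding has been chosen yet.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# Octets only come into play once an encoding is applied to that string.
octets_utf8 = canon_a.encode("utf-8")
octets_utf16 = canon_a.encode("utf-16-be")

# Both inputs share one canonical character string.
print(canon_a == canon_b)
# The same string yields different octet sequences per encoding.
print(octets_utf8 == octets_utf16)
# Decoding either octet sequence recovers the same character string.
print(octets_utf8.decode("utf-8") == octets_utf16.decode("utf-16-be"))
```

The point of the sketch is only that the canonicalization work
(attribute ordering, whitespace in tags, and so on) can be carried out
on characters, with the choice of octets left as a separate, later step.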


>to do precisely with the issues that we have already dealt with at length 
>concerning inheritance of surrounding XML context.

I'll address this in a separate mail because I feel there might be
some points which are still not totally clear.


>The fact that the document refers the definition of this form of 
>canonicalization to octets rather than nodesets (which does seem odd, now 
>you point it out) is largely irrelevant to us: we just read what is 
>written there and follow it. Take the matter up with the authors of that 
>document.

As you can see from the above quotation, we have indeed done so.
I hope you can take the response into account.


Regards,    Martin.
Received on Sunday, 3 August 2003 16:11:57 UTC