Re: Problem in exclusive canonicalization? encoding underspecified

On Monday 30 June 2003 12:47, Martin Duerst wrote:
> I just have had a look at
> http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718, and found
> two problems, one of them i18n-related.
>
>
> 1) encoding underspecified?
>
> The exclusive canonical form of a document subset is a physical
> representation of the XPath node-set, as an octet sequence, produced by
> the method described in this specification.
>
> This does not at all say what the encoding is. Is this UTF-8? If yes,
> where is this specified? If no, what is the encoding? Is the reader
> supposed to go check elsewhere?

Hey Martin, one has to chase it a bit:

| http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#sec-Specification
| The data model, processing, input parameters, and output data for 
| Exclusive XML Canonicalization are the same as for Canonical 
| XML [XML-C14N] with the following exceptions

| http://www.w3.org/TR/2001/REC-xml-c14n-20010315#ProcessingModel
| The XPath node-set is converted into an octet stream, the canonical 
| form, by generating the representative UCS characters for each node in 
| the node-set in ascending document order, then encoding the result 
| in UTF-8 (without a leading byte order mark).


> 2) what is 'visible'?
>
> The document says "namespace nodes that are not on the
> InclusiveNamespaces PrefixList are expressed only in start tags where
> they are visible and if they are not in effect from an output ancestor of
> that tag."
> The word 'visible' turns up only one more time, again not in a defining
> context. Readers probably can work out what 'visible' is supposed to
> mean from context and examples, but that's not how a spec should work,
> I guess.

I suppose we should've been more careful with those terms and linked to the 
following definition:

| http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#def-visibly-utilizes
| An element E in a document subset visibly utilizes a namespace
| declaration, i.e. a namespace prefix P and bound value V, if E or an
| attribute node in the document subset with parent E has a qualified name
| in which P is the namespace prefix. A similar definition applies for an
| element E in a document subset that visibly utilizes the default namespace
| declaration, which occurs if E has no namespace prefix.

Basically, does any of the subset actually need that namespace?

Received on Monday, 30 June 2003 15:29:30 UTC