RE: Canonical XML 3.6 from John Boyer on 2000-12-13 (w3c-ietf-xmldsig@w3.org from October to December 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Wed, 13 Dec 2000 13:02:09 -0800
To: "Rick Jelliffe" <ricko@gate.sinica.edu.tw>
Cc: <w3c-ietf-xmldsig@w3.org>, <w3c-i18n-ig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKCEPNCGAA.jboyer@PureEdge.com>

Hi Rick

<rick>
I missed it, twice!  Or at least on two separate occasions I read the
text, got identically confused and then later both times figured out what
was going on (only to forget).  In both cases, I think I assumed that
there was a typo and the & had been missed out from the things that looked
like numeric character references.
</rick>

I agree the form better.

The above is an example of what appears to be going on with this example.
It should read 'I don't agree that the suggested form is better', but the
above is how it goes if one doesn't read every second word.

The point is that misunderstandings are bound to occur if one creates one's
own interpretation of how to fix what are initially perceived to be errors.
It is easy to imagine inserting the word 'is' at the second to last
position, but it isn't what I meant.  It is preferable to look at all the
words in the sentence before assessing what I meant.

<rick>
I think it is bad speccing to present something that looks like XML but
isn't.
</rick>

Unfortunately, your suggested fix also looks like XML, and one still has to
read the notes to decipher what is really meant.  I don't think it's a bad
specification because it requires the reader to read the whole example
before trying to implement what the example suggests.

To wit, if I were to change the contents of the canonical form box to the
way you suggested, it is quite conceivable that we would receive letters
saying 'I was confused because I thought the hex dump had to come after the
doc end tag, and I didn't read the notes till later so I had to develop all
of this hex dumping code that I don't really need.  So, the spec should be
changed to say that the canonical form is <doc>#xC2#xA9</doc>, and it should
have a note explaining that the content between the doc tags is two bytes
expressed in hexadecimal'. 8-)
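
To make the notation concrete, here is a minimal sketch in Python.  It
assumes the character in the example is U+00A9 (the two bytes C2 A9 decode
to that character in UTF-8); the helper name hex_notation is hypothetical
and is not part of the c14n specification.  It shows that the actual
canonical form carries two raw bytes between the doc tags, while
#xC2#xA9 is only a printable way of writing those bytes in the spec.

```python
def hex_notation(text: str) -> str:
    """Render each UTF-8 byte of text as #xNN, the notation used in the example box."""
    return "".join(f"#x{byte:02X}" for byte in text.encode("utf-8"))


# Assumption: the character under discussion is U+00A9.
character = "\u00a9"

# What an implementation would actually emit: raw UTF-8 bytes, no '#x' text.
canonical_bytes = b"<doc>" + character.encode("utf-8") + b"</doc>"

# What the example box would show: a readable stand-in for those bytes.
notation = "<doc>" + hex_notation(character) + "</doc>"

print(canonical_bytes)
print(notation)
```

Either way, the reader still has to know that the box shows a notation and
not the literal output, which is why the notes are needed.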

<rick>
Having the explanation after the event is not much help, because the
confusion has already occurred.
</rick>

The explanation has to appear somewhere, and what you see in 3.6 is the
standard format used in all of the examples.

<rick>
I hope you will consider my option.

Cheers
Rick Jelliffe
</rick>

I've done this because it has come up before.  I chose the method you
currently see because it is the method of denoting hex characters in the
output that appears in examples in XML 1.0 second edition (and in the XML
erratum from which the example is drawn).  I agree that the example in c14n
takes a slightly different form, but it was the closest I could find to
having a standard way of expressing a hex character in an example.
Moreover, it seems to me that none of the suggested changes to the contents
of the canonical form box is better, because one always has to read the
notes to realize that what is shown is not the actual canonical form but
rather a notation representing the canonical form.  This was the point of
the (humorous) hypothetical feedback above.

Received on Wednesday, 13 December 2000 16:02:18 UTC