Re: Followup on I18N Last Call comments and disposition from Joseph M. Reagle Jr. on 2000-07-12 (w3c-ietf-xmldsig@w3.org from July to September 2000)

From: Joseph M. Reagle Jr. <reagle@w3.org>
Date: Wed, 12 Jul 2000 18:06:07 -0400
To: "Martin J. Duerst" <duerst@w3.org>
Cc: w3c-ietf-xmldsig@w3.org, "John Boyer" <jboyer@PureEdge.com>, www-international@w3.org
Message-Id: <3.0.5.32.20000712180607.00ab71c0@localhost>
Martin,

I just realized all of the proposals we agreed to in [1], didn't make it
into the spec that was published yesterday[2], however, their disposition is
captured as part of the last call issues [3] and will make it in.

[1] http://www.w3.org/Signature/2000/06/section-8-I18N.html
[2] http://www.w3.org/TR/2000/WD-xmldsig-core-20000711/
[3] http://www.w3.org/Signature/20000228-last-call-issues.html

I think if you are happy with this email, most all of the comments for the
Signature spec will have been addressed. There's one issue Eastlake should
clarify, and there's 2.5 rows that have a 'C14N?' that pertain specifically
to C14N (mostly wording) and I will have to check with Boyer to make sure
they are addressed.


At 18:44 7/10/00 +0900, Martin J. Duerst wrote:
 >>Sorry, I still don't understand. The distinction seems immaterial,
 >>particularly depending on your view of the architecture where the
signature
 >>engine is seperate from the content-content regardless. The content we are
 >>talking about is Signatures, not word processing files and such. Signature
 >>applications create XML Signatures, that's the content they concern
 >>themselves with.
 >
 >Then that should be made clearer.
 
Propose: We RECOMMEND that signature applications create XML content
/+(Signature elements and their descendents/content)+/ in Normalized Form C
[NFC] and check that any XML being consumed is in that form as well (if not,
signatures may consequently fail to validate).

 >>  >>  >    It should be mandated that, when a document is transcoded from
a
 >>  >>  >    non-Unicode encoding to Unicode as part of C14N, normalization
 >>must be
 >>  >>  >    performed (At the bottom of section 2 and also in A.4 in the
June
 >>1st
 >>  >>  >    2000 draft).

Are you suggesting this text be a part of Signature and C14N documents? For
the signature specification I've appended a new sentence to the end of the
proposed text in 6.5:

__

Canonicalization algorithms takes two implicit parameter when they appear as
a CanonicalizationMethod within the SignedInfo element: the content and its
charset. (Note, there may be ambiguities in converting existing charsets to
Unicode, for an example see the XML Japanese Profile [XML-Japanese] NOTE.)
The charset is derived according to the rules of the transport protocols and
media formats (e.g, RFC2376 [XML-MT] defines the media types for XML). This
information is necessary to correctly sign and verify documents and often
requires careful server side configuration. Various canonicalization
algorithms require conversion to [UTF-8]. Where any such algorithm is
REQUIRED or RECOMMENDED the algorithm MUST understand at least [UTF-8] and
[UTF-16] as input encodings. Knowledge of other encodings is OPTIONAL.
Transcodings from a non-Unicode encoding to Unicode as part of
canonicalization SHOULD be performed
__

I say SHOULD because this part of the specification is generic and we do not
exclude algorithms from use, only specific required to implement. If someone
uses some other C14N that doesn't do this, we recommend against it (as does
the Web Character Model) but it doesn't make sense to consider it Signature
non-compliant.

 >>Can you point me to the text (even if a Member URL). You are asking me to
 >>align our text with something I haven't seen (and consequently don't
 >>understand why it is necessary.)
 >
 >See http://www.w3.org/TR/xmlbase#IDwNAq1.
 >Please note that 'crosshatch' should be changed to 'number sign',
 >because this is the official ISO 10646/Unicode name.

Ok, text in the Reference section will say:

... The set of characters for URIs is the same as for XML, namely [Unicode].
However, some Unicode characters are disallowed from URI references: all
non-ASCII characters and the excluded characters listed in Section 2.4 of
[IETF RFC 2396], excluding the number sign (#) and the percent sign (%); and
excluding the square bracket characters re-allowed in [IETF RFC 2732].
Disallowed characters must be escaped as described in section 4.1.1 URI
Reference Encoding and Escaping of [XPtr]:
1. Each disallowed character is converted to UTF-8 [IETF RFC 2279] as one or
more bytes.
2. Any octets corresponding to a disallowed character are escaped with the
URI escaping mechanism (that is, converted to %HH, where HH is the
hexadecimal notation of the byte value).
3. The original character is replaced by the resulting character sequence.


_________________________________________________________
Joseph Reagle Jr.   
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/People/Reagle/
Received on Wednesday, 12 July 2000 18:06:14 UTC