- From: Joseph M. Reagle Jr. <reagle@w3.org>
- Date: Wed, 14 Feb 2001 19:45:33 -0500
- To: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
- Cc: "Martin J. Duerst" <duerst@w3.org>, "John Boyer" <jboyer@PureEdge.com>
Here are my comments that can be combined with others, if made, before forwarding them on to the I18N groups.

__

http://www.w3.org/TR/2001/WD-charmod-20010126/

I'm very glad to see this specification advanced as it is a very useful reference -- and educational tool for myself at least. One would think representing characters is easy, though it's tricky! Consequently, my comments are mostly editorial and relate to any confusions I experienced as a reader that could easily be remedied. A few references are made with respect to sections that relate to XML Signature, but these issues have been largely addressed by the last call of the XML Signature WG's documents: Core and Canonical XML.

>1.1 Goals and Scope
> All W3C specifications have to conform to this document (see section
> [57]2 Conformance). Authors of other specifications (for example, IETF
> specifications) are strongly encouraged to take guidance from it.
As an aside, while I strongly support this goal, this sort of requirement is atypical and maybe should sit somewhere else, in part of the W3C process/guide which is capable of enforcing it?

>3.1.2 Units of a Writing System, and Units of Aural Rendering
Please define phoneme (as distinct from meaning) and syllabaries.

>3.1.3 Units of Visual Rendering
>[Unicode] requires that characters are stored and interchanged in logical
>order.
Please define "logical order" (or cite definition).

>3.1.5 Units of Collation
>Software developers MUST NOT merely use a one-to-one mapping as their
>string-compare function, as in sorting operations.
What are you suggesting they do? Relying upon human context to determine order seems rather haphazard. For instance, how do you sort the words in an English document which contains excerpts from a Spanish document containing sequences such as "ch" and "ll", which are considered atomic collation units in their native document but not in the document in which they appear?

>3.2 Digital Representation of Characters
>3. To enable use in computers, a suitable base datatype is identified (such
>as a byte, a 16-bit wyde or other) and a character encoding form (CEF) is
>used, which encodes the abstract integers of a CCS into sequences of the
>code units of the base datatype.
Note "wyde" typo. Much of this summary is fairly easy to understand and is demonstrated in Appendix A. However, the distinction between CEF and CES is not very clear and might merit an example -- if it can be done simply; getting into endian and BOM might confuse the case...

>3.6.1 Character Encoding Identification
>Because of the layered Web architecture (e.g. formats used over protocols),
>there may be multiple and at times conflicting information about character
>encoding. Specifications MUST define conflict-resolution mechanisms (e.g.
>priorities) for these cases, and implementers and content developers MUST
>follow them carefully.
This requirement can be relevant to dsig in that there is a Type attribute (of type URI) that could identify the encoding of a resource being signed. However, the signature text speaks of dsig types, not MIME types, though MIME types represented as URIs could be included:
>http://www.w3.org/TR/2000/CR-xmldsig-core-20001031/#sec-Reference
>4.3.3 The Reference Element
>. The Type attribute facilitates the processing of referenced data. For
>example, while this specification makes no requirements over external data,
>an application may wish to signal that the referent is a Manifest.
If someone did use this to describe the MIME type, the dsig spec does not address how to resolve conflicting information and leaves it to the application.

>4 Early Uniform Normalization
>4.1 Motivation
>This document also specifies that normalization is to be performed early
>(by the sender) as opposed to late (by the recipient).
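(Illustration only, not part of either specification.) A minimal sketch of what a sender-side check for Normalization Form C might look like, assuming Python's standard unicodedata module; the helper name is_nfc and the sample strings are hypothetical. It also shows why un-normalized text matters to signatures: the two spellings of the same character serialize to different bytes.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text) == text


# "e" followed by COMBINING ACUTE ACCENT: canonically equivalent to U+00E9,
# but not in NFC.
decomposed = "e\u0301"
# The sender normalizes early, before the content is signed or sent.
composed = unicodedata.normalize("NFC", decomposed)

print(is_nfc(decomposed))  # False
print(is_nfc(composed))    # True

# A digest is computed over octets, so the two forms would hash differently.
print(decomposed.encode("utf-8") == composed.encode("utf-8"))  # False
```

A recipient following the early-normalization model would use a check like is_nfc only to detect un-normalized input, not to silently rewrite it.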
Note, the dsig specification RECOMMENDS but does not require the signature be in NFC:
>http://www.w3.org/TR/2000/CR-xmldsig-core-20001031/#sec-XML-Canonicalization
>We RECOMMEND that signature applications create XML content (Signature
>elements and their descendents/content) in Normalization Form C [NFC] and
>check that any XML being consumed is in that form as well (if not,
>signatures may consequently fail to validate).

>4.3 Responsibility for Normalization
>Note: The prohibition of normalization by recipients is necessary for
>consistency, on which security depends.
DSIG is compliant with this:
>http://www.w3.org/TR/2000/CR-xmldsig-core-20001031/#sec-See
>8.1.3 "See" What is Signed
>Consequently, while we RECOMMEND all documents operated upon and generated
>by signature applications be in [NFC] (otherwise intermediate processors
>might unintentionally break the signature) encoding normalizations SHOULD
>NOT be done as part of a signature transform, or (to state it another way)
>if normalization does occur, the application SHOULD always "see" (operate
>over) the normalized form.

>8 Character Encoding in URI References
>This chapter defines how to address this issue in W3C specifications in a
>way consistent with the model defined in this document and with deployed
>practice.
DSIG is compliant with this, see:
>http://www.w3.org/TR/2000/CR-xmldsig-core-20001031/#sec-URI

__
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Wednesday, 14 February 2001 19:45:41 UTC