- From: Ed Simon <ed.simon@entrust.com>
- Date: Fri, 8 Oct 1999 16:06:57 -0400
- To: "'w3c-ietf-xmldsig@w3.org'" <w3c-ietf-xmldsig@w3.org>
Hi Mark (et al), Mark, it looks great. Since I only had a few minor comments, I've gone ahead and copied the archive as you suggested. Just to try and ensure that everyone has a common understanding of what this is about... It looks like there is an early concensus to removing the canonicalization element from <SignatureInfo> and simply requiring a DOM canonicalization instead. However, we still need a minimal canonicalization algorithm for in <Transformations> so the following text is applicable to that. (I wonder if we will want the option of minimal canonicalization for imbedded <Object> elements as well?) Ed -----Original Message----- From: Mark Bartel [mailto:mbartel@thistle.ca] Sent: October 8, 1999 3:36 PM To: 'Ed Simon ' Subject: RE: minimal canonicalization Ed, There does not seem to be any disagreement to removing the attribute. So, we should post a suggested section minimal canonicalization section to the list. May I propose --start section-- 7.5.2 Minimal Canonicalization The algorithm identifier for the minimal canonicalization is http://www.w3.org/1999/10/signature-core/minimal. An example of a minimal canonicalization CanonicalizationAlg element is <CanonicalizationAlg Algorithm="http://www.w3.org/1999/10/signature-core/minimal"/> ***[Ed: should we use the following instead ***<CanonicalizationAlgorithm name="http://www.w3.org/1999/10/signature-core/minimal"/> ***] The minimal canonicalization algorithm: * converts the character encoding to UTF-8, removing the encoding pseudo-attribute * normalizes line endings, changing CR/LF sequences to LF only ***[Ed: We also need to change each CR not followed by an LF. *** eg. change "Blah\rBlah." to "Blah\nBlah" ***] NOTE If possible, characters that are decomposed in the source character set remain decomposed when converted to UTF-8; similarly with composed characters. No Unicode composition/decomposition normalization is done. ; should we instead require Unicode Canonical Composition (Normalization Form C) as specified for W3C Text Normalization (see http://www.w3.org/TR/WD-charmod#TextNormalization)? -MB ***[Ed: I admit I'm somewhat out of my depth here but from my reading of ***"http://www.unicode.org/unicode/reports/tr15/tr15-10.html", I'd ***say keep the NOTE as it is and see if anyone disagrees. ***] This algorithm is only applicable to XML resources. ***[Ed: Well it would be OK for any text resource. ***For example, it might be useful for XPointers. ***] --end section-- If you agree, let me know and I'll post it (or you could just post it if you like). -Mark -----Original Message----- From: Ed Simon To: 'Mark Bartel' Sent: 10/7/99 3:00 PM Subject: RE: minimal canonicalization Hi Mark, I agree with the scenario and that minimal canonicalization should convert the data to UTF-8. I think the only question is whether the "encoding" pseudo-attribute should be changed to "UTF-8". In my original append, I was in favour of changing it but Don made a reasonable argument against that and I now lean somewhat to his viewpoint. On the issue of specifying the encoding, I can see reasons for and against and that's why, in my original append, I asked for debate about this. It is good you are bringing up the question. However, I am quite opinionated regarding whitespace canonicalization. Normalizing character encoding and line breaks is reasonable text-level normalization. If one is talking about whitespace normalization, then I have to assume that we're getting more into XML-aware canonicalization as opposed to just text-aware canonicalization. (Correct me if my assumption is wrong.) Now, I am very much for requiring, or strongly recommending, XML-aware canonicalization for XML objects but it seems to me that if one wants to do XML-aware canonicalization, you have to go the whole way which means normalization of character encoding, newlines, whitespace, namespaces, attribute ordering, attribute specification (explicit with defaults), and other things like those discussed in DOMHASH and Canonical XML. (Note: I oppose the information loss in Canonical XML.) To me, requiring whitespace normalization for minimal canonicalization is unnecessary for text-level canonicalization and insufficient (by itself) for XML-level canonicalization. (I recognize that changing the encoding pseudo-attribute implies some level of XML awareness but I thought it might be a reasonable exception.) Anyway, I've just re-posted Don's commentary of my orignal append. Perhaps you could take a look at that and then send your opinion to the archive. (You might include the text of this note as reference.) Perhaps, you could indicate exactly what is meant by "whitespace normalization". Regards, Ed -----Original Message----- From: Mark Bartel [mailto:mbartel@thistle.ca] Sent: October 7, 1999 1:30 PM To: 'Ed Simon ' Subject: minimal canonicalization Ed, here's the kind of scenario I was thinking of: You create a detached signature over a web page which is in some non-UTF8 character set. You then decide to change the character set to, say, UTF8. I think the signature should still verify if you have chosen minimal canonicalization, because part of the decision to use minimal canonicalization is that the character set encoding doesn't matter. Note that sometimes the character set of web pages gets converted enroute (because the web server is attempting to be smart). I got the impression that you and I didn't actually disagree. Should I post something to the list describing this scenario asking for further comment? Is there a counter-scenario? My proposal for minimal canonicalization section: --start section-- The minimal canonicalization algorithm: * converts the character encoding to UTF-8, removing the encoding pseudo-attribute * normalizes line endings [something about ambiguous code points here, need to look into this... haven't looked at your code]. Line ending normalization will change any CR/LF (0x0d/0x0a) line endings to LF (0x0a). This algorithms in only applicable to text resources. --end section-- -Mark Bartel JetForm
Received on Friday, 8 October 1999 16:07:10 UTC