- From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
- Date: Wed, 30 May 2001 22:55:30 -0400
- To: w3c-ietf-xmldsig@w3.org
It seems to me that we need to come to some sort of conclusion on this... Maybe people think we did and I missed it, in which case I hope they will set me straight. The following proposal by Gregor for encoding STRINGS in DNames does not seem to have been refuted, but do we need to say more about leading and trailing whitespace on the DName as a whole? Thanks, Donald From: "Gregor Karlinger" <gregor.karlinger@iaik.at> To: "Tom Gindin" <tgindin@us.ibm.com>, "Merlin Hughs" <merlin@baltimore.ie>, <duerst@w3.org> Cc: <w3c-ietf-xmldsig@w3.org> Date: Wed, 16 May 2001 15:13:14 +0200 Message-ID: <LBEPJAONIMDADHFHAEAOMELJCFAA.gregor.karlinger@iaik.at> In-reply-to: <OF8536EC68.279166CB-ON85256A4D.005E2175@somers.hqregion.ibm.com> Subject: DName encoding (was:KeyName white space) All, I have cited the most important statements (from my point of view) regarding DName encoding below. I would like to make the following proposal for a guidline that should appear in XML-Signature on how to encode a strings in a DName in XML-Signature relevant structures (IssuerName, SubjectName): o Consider the string as consisting of unicode characters. o Escape occurences of the following special characters by prefixing it with the "\" character: - a "#" character occurring at the beginning of the string - one of the characters ",", "+", """, "\", "<", ">" or ";" o Escape all occurences of ASCII control characters (Unicode range \x00 - \x20) by replacing them with "\x" folloed by a two digit hex number showing its Unicode number. Since a XML document logically consists of characters, not bytes (Thanks Martin for reminding me ;-) the resulting unicode string is finally encoded according to the character encoding used for producing the physical representation of the XML document (But this is not in scope of the guidline that should be taken into the XML-Signature specification ...) Liebe Gruesse/Regards, --------------------------------------------------------------- DI Gregor Karlinger mailto:gregor.karlinger@iaik.at http://www.iaik.at Phone +43 316 873 5541 Institute for Applied Information Processing and Communications Austria --------------------------------------------------------------- ### Tom Gindin ### [...] > The situation is actually even more confusing than that. The rules > seem to be, for HT, LF, FF and VT, something like the following: [...] ### Merlin Hughs ### [...] > The escaping is useful because: > > <DName>CN=foo > </DName> > > According to RFC 2253, this states that the common name is > foo<LF><TAB>. However, if you trim() the text value then you > would get common name foo. Requiring escaping of ASCII controls > would result in CN=foo\0A\09 (if that is what is meant) which > is unambiguous and can be safely indented, whitespace formatted > and trim()ed. It would also eliminate a few other potential > ambiguities: [...] > is significant. If we require that all significant ASCII > controls be escaped, then trimming and formatting involving > newlines and tabs will be safe, and meaningful whitespace in > dnames will be explicit. ### Martin Duerst ### An XML document consists of characters, not bytes. And in content, each character can be escaped by a numeric character reference. Therefore, you can have a document only encoded in us-ascii, but will lots of &#dddd; (or &#xhhhh;); this will transport any character in Unicode, and a browser will display it correctly (assuming it has the right font). > > There's another issue that seems relevant. RFC 2253 states > > that strings must be converted to UTF-8 and then the escaping > > rules must be applied. Do we honour this, or should we UTF-8 > > decode the RFC2253 string before embedding it in the text node. > > > > Essentially, should the final example in RFC 2253 be encoded > > in XML as: > > > > UTF-8 encode and require ASCII escaping of high-bit-set chars: > > SN=Lu\C4\8Di\C4\87 This would be okay. [...] > > De-UTF-8 and embed the Unicode original: > > SN=Lu?i? (where ? is the original character) > > > > The last seems like the best option to me. I agree. But it has to be spelled out clearly, and tested. ############
Received on Wednesday, 30 May 2001 22:56:14 UTC