- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 23 Jan 2002 07:47:14 +0900
- To: "Gregor Karlinger" <gregor.karlinger@iaik.at>, "Joseph M. Reagle Jr." <reagle@w3.org>
- Cc: "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Hello Gregor,

I think you are partially right. The actual resulting text is
bytes in one case, and Unicode characters in another case.
Same for the original. What I meant was that the escaping
procedure can be worded in either terms, and that doesn't
make a difference.

Regards, Martin.

At 14:04 02/01/18 +0100, Gregor Karlinger wrote:
>Martin,
>
> > -----Original Message-----
> > From: Martin Duerst [mailto:duerst@w3.org]
> > Sent: Wednesday, January 16, 2002 2:50 PM
>
>[...]
>
> > I guess I would be the first to jump at anybody who confused
> > bytes and characters in a dangerous way. But here, one or
> > the other will lead to the same result, because we are only
> > looking at both characters as well as bytes with values below
> > 128, and in this case, it doesn't make any difference at all,
> > due to the properties of UTF-8.
>
>I do not think that your statement is true. Consider the following use
>case:
>
>  * A signature application wants to add a X509SubjectName child to
>    the KeyInfo element, containing a DN with a single RDN with a
>    single AVA, for which the attribute type is "CN" and the value
>    should be "Heinrich Muller", whereas "Heinrich Muller" is a unicode
>    string, and the application does not bother how this unicode string
>    is encoded in XMLDSIG or in a X509Certificate.
>
>  * According to the current XMLDSIG draft, the resulting X509SubjectName
>    element is:
>
>    <X509SubjectName>CN=Heinrich Muller</X509SubjectName>
>
>    The text of this element of course does not conform with RFC 2253, since
>    this text still consists of XML (~ unicode) characters, and RFC 2253
>    operates on UTF8-Strings.
>
>Regards, Gregor
>
> >
> > Regards, Martin.
> >
> >
> > >Joseph,
> > >
> > >currently applications conforming with XMLDSIG must encode DNames in
> > >the way described in section 4.4.4 of the current draft [1]:
> > >
> > ><specsnip>
> > >  * Consider the string as consisting of Unicode characters.
> > >
> > >  * Escape occurrences of the following special characters by
> > >    prefixing it with the "\" character:
> > >
> > >    - a "#" character occurring at the beginning of the string
> > >    - one of the characters ",", "+", """, "\", "<", ">" or ";"
> > >
> > >  * Escape all occurrences of ASCII control characters (Unicode range
> > >    \x00 - \x1f) by replacing them with "\" followed by a two digit
> > >    hex number showing its Unicode number.
> > >
> > >  * Escape any trailing white space by replacing "\ " with "\20".
> > >
> > >  * Since a XML document logically consists of characters, not octets,
> > >    the resulting Unicode string is finally encoded according to the
> > >    character encoding used for producing the physical representation
> > >    of the XML document.
> > ></specsnip>
> > >
> > >I think that there are two problems with these instructions:
> > >
> > >(1) We claim that these instructions are conforming with RFC 2253 [2].
> > >    This is currently not true, since RFC 2253 demands the escaping of
> > >    the whitespace character (ASCII code \x20) at the beginning and at
> > >    the end of the string (see section 2.4).
> > >
> > >(2) (a fundamental problem): The instructions in section 2.4 of [2]
> > >    operate on a UTF8-String, i. e. in the octet domain. Our instructions
> > >    operate on a Unicode string, i. e. in the character domain. Therefore
> > >    I consider it useless to try to conform to RFC 2253 with the current
> > >    instructions.
> > >
> > >To solve the problems, I suggest:
> > >
> > >- Do not state that the encoding of DNames conforms with RFC 2253, rather
> > >  state that our instructions are similar to that of RFC 2253 (only
> > >  similar because of the domain difference).
> > >
> > >- Modify the instructions as follows:
> > >
> > >  * Consider the string as consisting of Unicode characters.
> > >
> > >  * Escape occurrences of the following special characters by
> > >    prefixing it with the "\" character:
> > >
> > >    - a "#" occurring at the beginning of the string
> > >    - one of the characters ",", "+", """, "\", "<", ">" or ";"
> > >
> > >  * Escape control characters that are not XML characters (\x00-\x08,
> > >    \x0B-\x0C, \x0E-\x19).
> > >
> > >  This is sufficient in order to produce text that consists of valid
> > >  XML characters, and to be able to reparse the DName string.
> > >
> > >Liebe Gruesse/Regards,
> > >---------------------------------------------------------------
> > >DI Gregor Karlinger
> > >mailto:gregor.karlinger@iaik.at
> > >http://www.iaik.at
> > >Phone +43 316 873 5541
> > >Institute for Applied Information Processing and Communications
> > >Austria
> > >---------------------------------------------------------------
> > >
> > >---
> > >[1]
> > >http://www.w3.org/Signature/Drafts/xmldsig-core/Overview.html#sec-X509Data
> > >[2] http://ietf.org/rfc/rfc2253.txt
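The point under discussion, whether the escaping steps quoted from section 4.4.4 of the draft give the same result when worded in terms of characters or in terms of UTF-8 octets, can be illustrated with a short Python sketch. It is only an illustration under stated assumptions and is taken neither from the XMLDSIG draft nor from RFC 2253: the function names escape_chars and escape_bytes and the sample strings (including the non-ASCII spelling of the name) are made up for the example, and the trailing-white-space rule is read as "each trailing space becomes \20".

```python
# Characters the quoted draft text escapes by prefixing a backslash.
SPECIALS = ',+"\\<>;'
SPECIAL_BYTES = SPECIALS.encode("ascii")


def escape_chars(value: str) -> str:
    """Apply the quoted escaping steps in the character domain."""
    stripped = value.rstrip(" ")
    trailing = len(value) - len(stripped)  # number of trailing spaces
    out = []
    for i, ch in enumerate(stripped):
        if (ch == "#" and i == 0) or ch in SPECIALS:
            out.append("\\" + ch)             # prefix with "\"
        elif ord(ch) < 0x20:
            out.append("\\%02x" % ord(ch))    # control char -> "\" + 2 hex digits
        else:
            out.append(ch)
    out.append("\\20" * trailing)             # assumed reading of the last rule
    return "".join(out)


def escape_bytes(value: bytes) -> bytes:
    """Apply the same steps in the octet domain, on UTF-8 encoded input."""
    stripped = value.rstrip(b" ")
    trailing = len(value) - len(stripped)
    out = bytearray()
    for i, b in enumerate(stripped):
        if (b == 0x23 and i == 0) or b in SPECIAL_BYTES:
            out += b"\\" + bytes([b])
        elif b < 0x20:
            out += b"\\%02x" % b
        else:
            out.append(b)                     # includes every octet >= 0x80
    out += b"\\20" * trailing
    return bytes(out)


# Hypothetical sample values, chosen only to exercise each rule.
samples = ["Heinrich M\u00fcller", "#lead, trail  ", 'a+b"c\\d<e>f;g', "x\ty"]
for s in samples:
    via_chars = escape_chars(s).encode("utf-8")
    via_bytes = escape_bytes(s.encode("utf-8"))
    print(repr(s), via_chars == via_bytes)
```

Under these assumptions the two routes agree octet for octet, because UTF-8 encodes every non-ASCII character using only octets of value 128 and above, so the byte-domain rules never touch them. What the sketch does not change is the type of the result: escape_chars returns a string of Unicode characters, escape_bytes returns octets, which is the distinction raised in the reply above.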
Received on Tuesday, 22 January 2002 18:08:49 UTC