RE: Encoding of Strings in DNames (X509IssuerSerial, X509SubjectName) from Martin Duerst on 2002-01-22 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: Martin Duerst <duerst@w3.org>
Date: Wed, 23 Jan 2002 07:47:14 +0900
To: "Gregor Karlinger" <gregor.karlinger@iaik.at>, "Joseph M. Reagle Jr." <reagle@w3.org>
Cc: "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Message-Id: <4.2.0.58.J.20020123074529.071be030@localhost>
Hello Gregor,

I think you are partially right. The actual resulting
text is bytes in one case and Unicode characters in the
other. The same holds for the original string. What I meant
was that the escaping procedure can be worded in either set
of terms, and that the choice doesn't change the result.
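
A minimal Python sketch of that point (not taken from the draft; the
names escape_chars and escape_bytes and the sample value are made up
for illustration, and only the backslash-prefix step is modelled). It
applies the same rule once to the Unicode string and once to its UTF-8
octets. The two routes should agree, because every octet of a
multi-byte UTF-8 sequence is 0x80 or above and so cannot be mistaken
for an ASCII special character.

```python
# Special characters that get a backslash prefix (from the quoted draft text).
SPECIALS = ',+"\\<>;'


def escape_chars(s: str) -> str:
    """Character domain: prefix each special character with a backslash."""
    out = []
    for i, ch in enumerate(s):
        if ch in SPECIALS or (i == 0 and ch == "#"):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def escape_bytes(b: bytes) -> bytes:
    """Octet domain: the same rule applied to UTF-8 octets."""
    specials = SPECIALS.encode("ascii")
    out = bytearray()
    for i, octet in enumerate(b):
        if octet in specials or (i == 0 and octet == ord("#")):
            out.append(ord("\\"))
        out.append(octet)
    return bytes(out)


# Hypothetical sample value containing a non-ASCII character (u-umlaut).
value = "M\u00fcller, Heinrich <CN>"
via_chars = escape_chars(value).encode("utf-8")   # escape, then encode
via_bytes = escape_bytes(value.encode("utf-8"))   # encode, then escape
print(via_chars == via_bytes)
```

The equality only holds when the target encoding is UTF-8 (or another
ASCII-compatible encoding); the resulting data type still differs
between the two routes.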

Regards,  Martin.

At 14:04 02/01/18 +0100, Gregor Karlinger wrote:
>Martin,
>
> > -----Original Message-----
> > From: Martin Duerst [mailto:duerst@w3.org]
> > Sent: Wednesday, January 16, 2002 2:50 PM
>
>[...]
>
> > I guess I would be the first to jump at anybody who confused
> > bytes and characters in a dangerous way. But here, one or
> > the other will lead to the same result, because we are only
> > looking at both characters as well as bytes with values below
> > 128, and in this case, it doesn't make any difference at all,
> > due to the properties of UTF-8.
>
>I do not think that your statement is true. Consider the following use
>case:
>
>   * A signature application wants to add an X509SubjectName child to
>     the KeyInfo element, containing a DN with a single RDN with a
>     single AVA, for which the attribute type is "CN" and the value
>     should be "Heinrich Muller", where "Heinrich Muller" is a Unicode
>     string, and the application does not care how this Unicode string
>     is encoded in XMLDSIG or in an X509Certificate.
>
>   * According to the current XMLDSIG draft, the resulting X509SubjectName
>     element is:
>
>       <X509SubjectName>CN=Heinrich Muller</X509SubjectName>
>
>     The text of this element of course does not conform to RFC 2253, since
>     this text still consists of XML (~ Unicode) characters, and RFC 2253
>     operates on UTF8-Strings.
>
>Regards, Gregor
>
> >
> > Regards,   Martin.
> >
> >
> > >Joseph,
> > >
> > >currently applications conforming with XMLDSIG must encode DNames in
> > >the way described in section 4.4.4 of the current draft [1]:
> > >
> > ><specsnip>
> > >   * Consider the string as consisting of Unicode characters.
> > >
> > >   * Escape occurrences of the following special characters by
> > >     prefixing it with the "\" character:
> > >
> > >     - a "#" character occurring at the beginning of the string
> > >     - one of the characters ",", "+", """, "\", "<", ">" or ";"
> > >
> > >   * Escape all occurrences of ASCII control characters (Unicode range
> > >     \x00 - \x1f) by replacing them with "\" followed by a two digit
> > >     hex number showing its Unicode number.
> > >
> > >   * Escape any trailing white space by replacing "\ " with "\20".
> > >
> > >   * Since a XML document logically consists of characters, not octets,
> > >     the resulting Unicode string is finally encoded according to the
> > >     character encoding used for producing the physical representation
> > >     of the XML document.
> > ></specsnip>
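
The quoted steps can be sketched in Python roughly as follows. This is
an illustration only, not the draft's reference code: the function name
escape_dname_value is made up, and two readings are assumed here,
namely that every trailing space becomes "\20" and that hex digits are
written in upper case (the quoted text does not say).

```python
def escape_dname_value(value: str) -> str:
    """Escape one attribute value, working on Unicode characters."""
    out = []
    for i, ch in enumerate(value):
        if ch in ',+"\\<>;' or (i == 0 and ch == "#"):
            # Special characters get a backslash prefix.
            out.append("\\" + ch)
        elif ord(ch) <= 0x1F:
            # ASCII control characters become backslash + two hex digits.
            out.append("\\%02X" % ord(ch))
        else:
            out.append(ch)
    escaped = "".join(out)
    # Trailing spaces are written as "\20" (assumed: one per space).
    stripped = escaped.rstrip(" ")
    return stripped + "\\20" * (len(escaped) - len(stripped))


text = escape_dname_value("Doe, John ")
print(text)  # Doe\, John\20

# Last step: the character string is serialized in the document's encoding.
octets = text.encode("utf-8")
```

The function returns a character string; octets only appear in the
final encode call, which is where the character/octet distinction
discussed in this thread comes in.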
> > >
> > >I think that there are two problems with these instructions:
> > >
> > >(1) We claim that these instructions are conforming with RFC 2253 [2].
> > >     This is currently not true, since RFC 2253 demands the escaping of
> > >     the whitespace character (ASCII code \x20) at the beginning and at
> > >     the end of the string (see section 2.4).
> > >
> > >(2) (a fundamental problem): The instructions in section 2.4 of [2]
> > >     operate on a UTF8-String, i. e. in the octet domain. Our
> > >     instructions operate on a Unicode string, i. e. in the character
> > >     domain. Therefore I consider it useless to try to conform to
> > >     RFC 2253 with the current instructions.
> > >
> > >To solve the problems, I suggest:
> > >
> > >- Do not state that the encoding of DNames conforms with RFC 2253, rather
> > >   state that our instructions are similar to that of RFC 2253 (only
> > >   similar because of the domain difference).
> > >
> > >- Modify the instructions as follows:
> > >
> > >   * Consider the string as consisting of Unicode characters.
> > >
> > >   * Escape occurrences of the following special characters by
> > >     prefixing it with the "\" character:
> > >
> > >     - a "#" occurring at the beginning of the string
> > >     - one of the characters ",", "+", """, "\", "<", ">" or ";"
> > >
> > >   * Escape control characters that are not XML characters (\x00-\x08,
> > >     \x0B-\x0C, \x0E-\x1F).
> > >
> > >   This is sufficient in order to produce text that consists of valid
> > >   XML characters, and to be able to reparse the DName string.
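
A rough Python sketch of the modified instructions, for illustration
only. The names is_xml10_char and escape_proposed are made up, and the
escape form for control characters (backslash plus two upper-case hex
digits) is an assumption carried over from the draft text, since the
proposal does not spell it out. The check follows the Char production
of XML 1.0, under which the excluded control characters are \x00-\x08,
\x0B-\x0C and \x0E-\x1F.

```python
SPECIALS = ',+"\\<>;'


def is_xml10_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def escape_proposed(value: str) -> str:
    """Escape specials and only those control characters XML cannot carry."""
    out = []
    for i, ch in enumerate(value):
        if ch in SPECIALS or (i == 0 and ch == "#"):
            out.append("\\" + ch)
        elif ord(ch) < 0x20 and not is_xml10_char(ch):
            # Assumed escape form: backslash + two hex digits.
            out.append("\\%02X" % ord(ch))
        else:
            # Tab, LF and CR are legal XML characters and pass through.
            out.append(ch)
    return "".join(out)


escaped = escape_proposed("a\x0bb\tc,")
print(all(is_xml10_char(c) for c in escaped))
```

Code points above the control range that XML also excludes (for example
U+FFFE and U+FFFF) are outside the scope of the proposal and are not
handled by this sketch.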
> > >
> > >Liebe Gruesse/Regards,
> > >---------------------------------------------------------------
> > >DI Gregor Karlinger
> > >mailto:gregor.karlinger@iaik.at
> > >http://www.iaik.at
> > >Phone +43 316 873 5541
> > >Institute for Applied Information Processing and Communications
> > >Austria
> > >---------------------------------------------------------------
> > >
> > >---
> > >[1] http://www.w3.org/Signature/Drafts/xmldsig-core/Overview.html#sec-X509Data
> > >[2] http://ietf.org/rfc/rfc2253.txt
>
>
>
>
Received on Tuesday, 22 January 2002 18:08:49 UTC