RE: Encoding of Strings in DNames (X509IssuerSerial, X509SubjectName) from Gregor Karlinger on 2002-01-18 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: Gregor Karlinger <gregor.karlinger@iaik.at>
Date: Fri, 18 Jan 2002 14:04:28 +0100
To: "Martin Duerst" <duerst@w3.org>, "Gregor Karlinger" <gregor.karlinger@iaik.at>, "Joseph M. Reagle Jr." <reagle@w3.org>
Cc: "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Message-ID: <LBEPJAONIMDADHFHAEAOAELKCMAA.gregor.karlinger@iaik.at>

Martin,

> -----Original Message-----
> From: Martin Duerst [mailto:duerst@w3.org]
> Sent: Wednesday, January 16, 2002 2:50 PM

[...]

> I guess I would be the first to jump at anybody who confused
> bytes and characters in a dangerous way. But here, one or
> the other will lead to the same result, because we are only
> looking at both characters as well as bytes with values below
> 128, and in this case, it doesn't make any difference at all,
> due to the properties of UTF-8.

I do not think that your statement is true. Consider the following use
case:

  * A signature application wants to add a X509SubjectName child to
    the KeyInfo element, containing a DN with a single RDN with a
    single AVA, for which the attribute type is "CN" and the value
    should be "Heinrich Muller", whereas "Heinrich Muller" is a unicode
    string, and the application does not bother how this unicode string
    is encoded in XMLDSIG or in a X509Certificate.

  * According to the current XMLDSIG draft, the resulting X509SubjectName
    element is:

      <X509SubjectName>CN=Heinrich Muller</X509SubjectName>

    The text of this element of course does not conform with RFC 2253, since
    this text still consists of XML (~ unicode) characters, and RFC 2253
    operates on UTF8-Strings.

Regards, Gregor

>
> Regards,   Martin.
>
>
> >Joseph,
> >
> >currently applications conforming with XMLDSIG must encode DNames in
> >the way described in section 4.4.4 of the current draft [1]:
> >
> ><specsnip>
> >   * Consider the string as consisting of Unicode characters.
> >
> >   * Escape occurrences of the following special characters by
> >     prefixing it with the "\" character:
> >
> >     - a "#" character occurring at the beginning of the string
> >     - one of the characters ",", "+", """, "\", "<", ">" or ";"
> >
> >   * Escape all occurrences of ASCII control characters (Unicode range
> >     \x00 - \x 1f) by replacing them with "\" followed by a two digit
> >     hex number showing its Unicode number.
> >
> >   * Escape any trailing white space by replacing "\ " with "\20".
> >
> >   * Since a XML document logically consists of characters, not octets,
> >     the resulting Unicode string is finally encoded according to the
> >     character encoding used for producing the physical representation
> >     of the XML document.
> ></specsnip>
> >
> >I think that there are two problems with these instructions:
> >
> >(1) We claim that these instructions are conforming with RFC
> 2253 [2]. This
> >     is currently not true, since RFC 2253 demands the escaping of the
> >     whitespace character (ASCII code \x20) at the beginning and at the
> >     end of the string (see section 2.4).
> >
> >(2) (a fundamental problem): The instructions in section 2.4 of
> [2] operate
> >     on a UTF8-String, i. e. in the octet domain. Our
> instructions operate
> >     on a Unicode string, i. e. in the character domain.
> Therefore I consider
> >     it useless to try to conform to RFC 2253 with the current
> instructions.
> >
> >To solve the problems, I suggest:
> >
> >- Do not state that the encoding of DNames conforms with RFC 2253, rather
> >   state that our instructions are similar to that of RFC 2253
> (only similar
> >   because of the domain difference).
> >
> >- Modify the instructions as follows:
> >
> >   * Consider the string as consisting of Unicode characters.
> >
> >   * Escape occurrences of the following special characters by
> >     prefixing it with the "\" character:
> >
> >     - a "#" occurring at the beginning of the string
> >     - one of the characters ",", "+", """, "\", "<", ">" or ";"
> >
> >   * Escape control characters that are not XML characters (\x00-\x08,
> >     \x0B-\x0C, \x0E-\x19).
> >
> >   This is sufficient in order to produce text that consists of valid
> >   XML characters, and to be able to reparse the DName string.
> >
> >Liebe Gruesse/Regards,
> >---------------------------------------------------------------
> >DI Gregor Karlinger
> >mailto:gregor.karlinger@iaik.at
> >http://www.iaik.at
> >Phone +43 316 873 5541
> >Institute for Applied Information Processing and Communications
> >Austria
> >---------------------------------------------------------------
> >
> >---
> >[1]
> >http://www.w3.org/Signature/Drafts/xmldsig-core/Overview.html#sec
-X509Data
>[2] http://ietf.org/rfc/rfc2253.txt

Received on Friday, 18 January 2002 08:05:46 UTC