Re: Encoding of Strings in DNames (X509IssuerSerial, X509SubjectName) from Martin Duerst on 2002-01-16 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: Martin Duerst <duerst@w3.org>
Date: Wed, 16 Jan 2002 22:50:25 +0900
To: "Gregor Karlinger" <gregor.karlinger@iaik.at>, "Joseph M. Reagle Jr." <reagle@w3.org>
Cc: "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Message-Id: <4.2.0.58.J.20020116224805.00a7d688@localhost>

At 14:21 02/01/16 +0100, Gregor Karlinger wrote:

>(2) (a fundamental problem): The instructions in section 2.4 of [2] operate
>     on a UTF8-String, i. e. in the octet domain. Our instructions operate
>     on a Unicode string, i. e. in the character domain. Therefore I consider
>     it useless to try to conform to RFC 2253 with the current instructions.

I guess I would be the first to jump at anybody who confused
bytes and characters in a dangerous way. But here, either
approach leads to the same result, because we are only looking
at characters and bytes with values below 128. For those it
makes no difference at all, due to the properties of UTF-8:
each such character is encoded as the single byte of the same
value, and bytes below 128 never occur inside a multi-byte
sequence.
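
A minimal Python sketch of this point (not part of the original exchange): it applies the backslash escaping once in the character domain and once in the octet domain, then checks that the UTF-8 results agree. The names SPECIALS, escape_chars, and escape_octets and the sample DName are illustrative assumptions, and only the backslash-prefix rule is covered.

```python
# Special characters from the draft's second rule; all are ASCII (below 128).
SPECIALS = ',+"\\<>;'


def escape_chars(s: str) -> str:
    """Character domain: prefix each special character with a backslash."""
    return "".join("\\" + c if c in SPECIALS else c for c in s)


def escape_octets(b: bytes) -> bytes:
    """Octet domain: prefix each special octet with a backslash octet."""
    specials = SPECIALS.encode("ascii")
    out = bytearray()
    for octet in b:
        if octet in specials:
            out.append(0x5C)  # backslash
        out.append(octet)
    return bytes(out)


# Hypothetical sample value containing a non-ASCII character.
name = 'CN=Dürst, Martin+O="W3C"'

# Escape characters then encode, versus encode then escape octets.
via_chars = escape_chars(name).encode("utf-8")
via_octets = escape_octets(name.encode("utf-8"))
assert via_chars == via_octets
```

The two paths agree because every octet of a multi-byte UTF-8 sequence is 128 or above, so the octet-level scan cannot mistake part of a non-ASCII character for one of the special characters.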

Regards,   Martin.


>Joseph,
>
>currently applications conforming with XMLDSIG must encode DNames in
>the way described in section 4.4.4 of the current draft [1]:
>
><specsnip>
>   * Consider the string as consisting of Unicode characters.
>
>   * Escape occurrences of the following special characters by
>     prefixing it with the "\" character:
>
>     - a "#" character occurring at the beginning of the string
>     - one of the characters ",", "+", """, "\", "<", ">" or ";"
>
>   * Escape all occurrences of ASCII control characters (Unicode range
>     \x00 - \x1f) by replacing them with "\" followed by a two digit
>     hex number showing its Unicode number.
>
>   * Escape any trailing white space by replacing "\ " with "\20".
>
>   * Since a XML document logically consists of characters, not octets,
>     the resulting Unicode string is finally encoded according to the
>     character encoding used for producing the physical representation
>     of the XML document.
></specsnip>
>
>I think that there are two problems with these instructions:
>
>(1) We claim that these instructions are conforming with RFC 2253 [2]. This
>     is currently not true, since RFC 2253 demands the escaping of the
>     whitespace character (ASCII code \x20) at the beginning and at the
>     end of the string (see section 2.4).
>
>(2) (a fundamental problem): The instructions in section 2.4 of [2] operate
>     on a UTF8-String, i. e. in the octet domain. Our instructions operate
>     on a Unicode string, i. e. in the character domain. Therefore I consider
>     it useless to try to conform to RFC 2253 with the current instructions.
>
>To solve the problems, I suggest:
>
>- Do not state that the encoding of DNames conforms with RFC 2253, rather
>   state that our instructions are similar to those of RFC 2253 (only similar
>   because of the domain difference).
>
>- Modify the instructions as follows:
>
>   * Consider the string as consisting of Unicode characters.
>
>   * Escape occurrences of the following special characters by
>     prefixing it with the "\" character:
>
>     - a "#" occurring at the beginning of the string
>     - one of the characters ",", "+", """, "\", "<", ">" or ";"
>
>   * Escape control characters that are not XML characters (\x00-\x08,
>     \x0B-\x0C, \x0E-\x1F).
>
>   This is sufficient in order to produce text that consists of valid
>   XML characters, and to be able to reparse the DName string.
>
>Liebe Gruesse/Regards,
>---------------------------------------------------------------
>DI Gregor Karlinger
>mailto:gregor.karlinger@iaik.at
>http://www.iaik.at
>Phone +43 316 873 5541
>Institute for Applied Information Processing and Communications
>Austria
>---------------------------------------------------------------
>
>---
>[1]
>http://www.w3.org/Signature/Drafts/xmldsig-core/Overview.html#sec-X509Data
>[2] http://ietf.org/rfc/rfc2253.txt
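
The escaping steps quoted from section 4.4.4 of the draft can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not text from the specification: the function name escape_dname_value is hypothetical, it is applied to a single attribute value rather than a whole DName, "trailing white space" is taken to mean trailing U+0020 characters (each emitted as "\20"), and hex digits are written in lower case. The final step, encoding into the document's character encoding, is left to the XML serializer.

```python
# Special characters that the draft escapes with a backslash prefix.
SPECIALS = ',+"\\<>;'


def escape_dname_value(value: str) -> str:
    """Escape one attribute value following the quoted draft steps (sketch)."""
    # Assumption: trailing white space means trailing U+0020 characters.
    stripped = value.rstrip(" ")
    trailing = len(value) - len(stripped)

    out = []
    for i, ch in enumerate(stripped):
        if i == 0 and ch == "#":
            # "#" is escaped only at the beginning of the string.
            out.append("\\#")
        elif ch in SPECIALS:
            out.append("\\" + ch)
        elif ord(ch) <= 0x1F:
            # ASCII control character: backslash plus two hex digits.
            out.append("\\%02x" % ord(ch))
        else:
            out.append(ch)

    # Each trailing space becomes "\20".
    out.append("\\20" * trailing)
    return "".join(out)
```

As point (1) of the quoted message notes, RFC 2253 also requires escaping a space at the beginning of the string; this sketch follows the draft wording and leaves leading spaces untouched.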

Received on Wednesday, 16 January 2002 19:56:04 UTC