- From: Konrad Lanz <Konrad.Lanz@iaik.tugraz.at>
- Date: Mon, 04 Jun 2007 20:52:05 +0200
- To: public-xmlsec-maintwg@w3.org
- Message-ID: <46645F55.5080305@iaik.tugraz.at>
Dear all,

TLR summarized the outcome of the discussion about strings in XML that we had just after the last conference call here:
http://lists.w3.org/Archives/Public/public-xmlsec-maintwg/2007May/0048.html

I already mentioned there, after taking a quick look at the c14n spec, that canonicalization would heal differences with respect to "\<" and "&" in the string representation of a DName, and we agreed on that ...

So with respect to "\<" and "&" in a DName string representation, this action is discharged ... and this is a non-issue also from my perspective.

With respect to the current text, some other things still remain when looking at the latest red line document:

####

The text says: "At least one element, from the following ... "

So the bullet points will still have to enumerate the choice of elements within the content of |X509Data|, which is not the case in the current red line document ...

The text for the first two bullet points will have to read something like this:

  * The |X509IssuerSerial| element, which contains an X.509 issuer distinguished name/serial number pair. The distinguished name SHOULD be compliant with the DNAME encoding rules at the end of this section and the serial number is represented as a decimal integer,
  * The |X509SubjectName| element, which contains an X.509 subject distinguished name that SHOULD be compliant with the DNAME encoding rules at the end of this section,

####

The so-called "DNAME encoding rules at the end of section 4.4.4" are still not entirely clear to me.

First, I'd like to mention that one can argue whether the corner cases affected by those rules appear at all in real-life scenarios, and hence whether they are relevant. Nonetheless I'd like to discuss this further, as it should not be too hard to reach a clear set of rules.
So let's first have a look at the current text and discuss it a little:

> Also, strings in DNames (|X509IssuerSerial|,|X509SubjectName|, and
> |KeyName| if appropriate) should be encoded in accordance with RFC2253
> [LDAP-DN] except for the encoding of string values within a DName:
> %%E01 2002-01-28%%as follows:
>
>   * Consider the string as consisting of Unicode characters.
>   * Escape occurrences of the following special characters by
>     prefixing it with the "\" character:
>       o a "#" character occurring at the beginning of the string

What happens to a leading space " " in an AttributeValue (AVA value)? According to RFC 2253 it would have to be escaped as "\ ", but that is not mentioned here. I assume that leading spaces were simply forgotten in the first sub-point of the second bullet point. This position is also supported by the examples given in http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2002JanMar/0246.html .

>       o one of the characters ",", "+", """, "\", "<", ">" or ";"
>   * Escape all occurrences of ASCII control characters (Unicode
>     range \x00 - \x1f) by replacing them with "\" followed by a two
>     digit hex number showing its Unicode number.
>   * Escape any trailing white space by replacing "\ " with "\20".

Could anyone from the original working group shed some light on why the last space should (note the lowercase "should" in the first sentence of the rules) be escaped using "\20" instead of "\ "? I doubt that this is RFC 2253 compliant. RFC 2253 explicitly states that the last space in an attribute value has to be escaped using "\ " and not "\20", since the hex form of escaping is only valid for characters other than those mentioned in section 2.4.

I also wonder what the rationale would be for treating a leading space differently from a trailing one?
>   * Since a XML document logically consists of characters, not
>     octets, the resulting Unicode string is finally encoded
>     according to the character encoding used for producing the
>     physical representation of the XML document.

Last but not least, I'd like to mention that http://www.w3.org/Signature/2001/04/05-xmldsig-interop.html#DNAME refers to the link above and even talks about "The following example set contains test vectors for the OPTIONAL DNAME encoding", which clearly indicates to me that the so-called "DNAME encoding rules at the end of section 4.4.4" are optional and non-normative.

Summarizing, I get the impression that the processing in the so-called "DNAME encoding rules at the end of section 4.4.4" contradicts RFC 2253 and is non-normative.

Nevertheless, I can see some value in escaping control characters and spaces to protect them from being modified inside XML. However, I would argue that the protection of spaces is already sufficiently covered by RFC 2253 and does not need any additional treatment within XMLDSig. (Interestingly, RFC 2253 only asks for escaping the first leading and the last trailing space, and hence allows escaped spaces to be mixed with non-escaped ones, which may be considered ugly but is clearly out of our scope to decide.)

The situation is different for control characters, which need protection that RFC 2253 does not provide. Line breaks in string representations, for example, are changed in XML (cf. http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends or the note in http://www.w3.org/TR/2006/REC-xml-20060816/#NT-S).
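To make the line-break point concrete, here is a minimal Python sketch; it is an illustration only and not part of any spec text. It assumes a conforming XML parser (here Python's xml.etree.ElementTree) and introduces two hypothetical helpers, escape_control_chars and roundtrip. The uppercase hex digits are an assumption, since the rule only says "two digit hex number". The sketch puts a DName string containing CR LF into element content, parses it back, and compares that with the escaped form.

```python
import xml.etree.ElementTree as ET


def escape_control_chars(value: str) -> str:
    """Replace each control character (U+0000 to U+001F) with a
    backslash followed by two hex digits (uppercase is an assumption)."""
    return "".join(
        f"\\{ord(ch):02X}" if ord(ch) <= 0x1F else ch
        for ch in value
    )


def roundtrip(text: str) -> str:
    """Embed text as element content in a tiny XML document and parse
    it back. The text must not contain "<" or "&" (hypothetical helper,
    no XML escaping is done here)."""
    doc = "<X509SubjectName>" + text + "</X509SubjectName>"
    return ET.fromstring(doc).text


# A DName string whose attribute value contains a CR LF line break.
raw = "CN=Line1\r\nLine2"
escaped = escape_control_chars(raw)

# Unescaped, the parser normalizes the line end, so the value changes.
print(repr(raw), "->", repr(roundtrip(raw)))

# Escaped, only printable characters remain, so the value survives.
print(repr(escaped), "->", repr(roundtrip(escaped)))
```

Comparing the two printed pairs shows why control characters need protection beyond what RFC 2253 requires, whereas spaces inside element content are not affected in this way.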
In conclusion, I think the required rules could be expressed more clearly, and I would hence suggest something like the following text specifying how to accommodate DNames inside an XML document.

A quick proposal to serve as a basis for further discussion:

> DNames (X509IssuerSerial,X509SubjectName, and KeyName) MUST be
> represented in accordance with RFC2253 [LDAP-DN] with the difference
> that they will obviously have to have the same encoding (UTF-8,
> UTF-16, UTF-32, ISO-8859-1, etc ...) as the XML document and are not
> limited to UTF-8.
> The AttributeValues within a DName are escaped according to the rules
> laid out in RFC2253 with the additional requirement that escaping of
> control characters MUST be performed as follows:
>
>   * Escape all occurrences of control characters (Unicode range x00
>     - x1f) by replacing them with "\" followed by a two digit hex
>     number showing its Unicode number.
>
> Note: According to RFC2253 it is valid to also escape other
> characters, which is not changed by this additional requirement.

The other requirement is, from my point of view, not RFC 2253 compliant and confusing, and hence should be reviewed and then potentially removed:

>   * Escape any trailing white space by replacing "\ " with "\20".

regards
Konrad

--
Konrad Lanz, IAIK/SIC - Graz University of Technology
Inffeldgasse 16a, 8010 Graz, Austria
Tel: +43 316 873 5547
Fax: +43 316 873 5520
https://www.iaik.tugraz.at/aboutus/people/lanz
http://jce.iaik.tugraz.at

Certificate chain (including the EuroPKI root certificate):
https://europki.iaik.at/ca/europki-at/cert_download.htm
Received on Monday, 4 June 2007 18:52:20 UTC