- From: Konrad Lanz <Konrad.Lanz@iaik.tugraz.at>
- Date: Mon, 04 Jun 2007 20:52:05 +0200
- To: public-xmlsec-maintwg@w3.org
- Message-ID: <46645F55.5080305@iaik.tugraz.at>
Dear all,
TLR summarized the outcome of the discussion we had just after the last
conference call about strings in XML here:
http://lists.w3.org/Archives/Public/public-xmlsec-maintwg/2007May/0048.html
I already mentioned there, after taking a quick look at the c14n spec,
that canonicalization would heal differences with respect to "<" and
"&" in the string representation of a DName, and we agreed on that ...
So with respect to "<" and "&" in a DName string representation, this
action is discharged ... and this is a non-issue also from my perspective.
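To illustrate the canonicalization point, here is a minimal Python sketch.
It assumes Python 3.8 or later and uses the standard library function
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather than
the Canonical XML 1.0 that XMLDSig references; for "<" and "&" in text
nodes both produce the same escaping. The sample DName value is made up.

```python
import xml.etree.ElementTree as ET

# Three physical spellings of the same (made-up) DName string  CN=a\<b & c
variants = [
    r"<X509SubjectName>CN=a\&lt;b &amp; c</X509SubjectName>",
    r"<X509SubjectName>CN=a\&#60;b &#38; c</X509SubjectName>",
    r"<X509SubjectName><![CDATA[CN=a\<b & c]]></X509SubjectName>",
]

# Canonicalization maps all spellings to one serialized form.
canonical = {ET.canonicalize(v) for v in variants}

for form in canonical:
    print(form)  # <X509SubjectName>CN=a\&lt;b &amp; c</X509SubjectName>
```

All three inputs collapse to a single canonical form, so the choice between
entity references, character references and CDATA does not affect the
signed octets.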
With respect to the current text there are still some other things
remaining when looking at the latest red line document:
####
The text says:
"At least one element, from the following ... "
So the bullet points will still have to enumerate the choice of
elements within the content of |X509Data|, which is not the case in the
current red line document ...
The text for the first two bullet points will have to read something
like this:
* The |X509IssuerSerial| element, which contains an X.509
issuer distinguished name/serial number pair. The distinguished
name SHOULD be compliant with the DNAME
encoding rules at the end of this section and the serial
number is represented as a decimal integer,
* The |X509SubjectName| element, which contains an X.509
subject distinguished name that SHOULD be compliant with the
DNAME encoding rules at the end of this section,
####
The so called "DNAME encoding rules at the end of section 4.4.4" are
still not entirely clear to me.
First I'd like to mention that it is debatable whether the corner cases
affected by those rules appear at all in real-life scenarios; they may
hence be irrelevant.
Nonetheless I'd like to discuss this further, as it should not be too hard
to reach a clear set of rules.
So let's first have a look at the current text and discuss it a little:
>
> Also, strings in DNames (|X509IssuerSerial|,|X509SubjectName|, and
> |KeyName| if appropriate) should be encoded in accordance with RFC2253
> [LDAP-DN] except for the encoding of string values within a DName:
> %%E01 2002-01-28%%as follows:
>
> * Consider the string as consisting of Unicode characters.
> * Escape occurrences of the following special characters by
> prefixing it with the "\" character:
> o a "#" character occurring at the beginning of the string
>
What happens to a leading space " " in an AttributeValue (AVA value)?
According to RFC 2253 it would have to be escaped as "\ ", but that is
not mentioned here.
I assume that leading spaces were simply forgotten in the first
sub-point of the second bullet point.
This position is also supported by the examples given in
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2002JanMar/0246.html .
> *
> o one of the characters ",", "+", """, "\", "<", ">" or ";"
> * Escape all occurrences of ASCII control characters (Unicode
> range \x00 - \x1f) by replacing them with "\" followed by a two
> digit hex number showing its Unicode number.
> * Escape any trailing white space by replacing "\ " with "\20".
>
Could anyone from the original working group shed some light on why the
last space should (note the lowercase "should" in the first sentence of
the rules) be escaped using "\20" instead of "\ "?
I doubt that this is RFC 2253 compliant. RFC 2253 explicitly states
that a trailing space in an attribute value has to be escaped using
"\ " and not "\20", as the hex escaping is only valid for characters other
than those listed in section 2.4.
I also wonder what the rationale would be for treating a leading space
differently from a trailing one?
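To make the difference concrete, here is a rough Python sketch of both
readings. The function names are mine, and both functions are only my
interpretation: escape_rfc2253 follows the list in RFC 2253 section 2.4,
and escape_dsig_444 follows the quoted section 4.4.4 rules read literally
(no escaping of a leading space, every trailing space becomes "\20").
Using uppercase hex digits for control characters is an assumption.

```python
# Characters that are always escaped with a backslash prefix
# (the list from RFC 2253 section 2.4, repeated in the quoted rules).
SPECIALS = set(',+"\\<>;')


def escape_rfc2253(value: str) -> str:
    """Escape an attribute value as I read RFC 2253, section 2.4."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        leading = i == 0 and ch in " #"
        trailing = i == last and ch == " "
        if ch in SPECIALS or leading or trailing:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def escape_dsig_444(value: str) -> str:
    """Escape as the quoted section 4.4.4 rules read literally (assumption)."""
    body = value.rstrip(" ")
    trailing_spaces = len(value) - len(body)
    out = []
    for i, ch in enumerate(body):
        if ch in SPECIALS or (i == 0 and ch == "#"):
            out.append("\\" + ch)
        elif ord(ch) <= 0x1F:
            out.append("\\%02X" % ord(ch))  # control character as hex pair
        else:
            out.append(ch)
    return "".join(out) + "\\20" * trailing_spaces


value = " Lanz "
print("[" + escape_rfc2253(value) + "]")   # [\ Lanz\ ]
print("[" + escape_dsig_444(value) + "]")  # [ Lanz\20]
```

With the same input, the first reading protects both the leading and the
trailing space with "\ ", while the second leaves the leading space
unprotected and writes the trailing one as "\20".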
>
> * Since a XML document logically consists of characters, not
> octets, the resulting Unicode string is finally encoded
> according to the character encoding used for producing the
> physical representation of the XML document.
>
Last but not least I'd like to mention that
http://www.w3.org/Signature/2001/04/05-xmldsig-interop.html#DNAME refers
to the link above and even talks about "The following example set
contains test vectors for the OPTIONAL DNAME encoding", which clearly
indicates to me that the so-called "DNAME encoding rules at the end of
section 4.4.4" are optional and non-normative.
Summarizing, I get the impression that the processing in the so-called
"DNAME encoding rules at the end of section 4.4.4" contradicts
RFC 2253 and is non-normative.
Nevertheless I can see some value in escaping control characters and
spaces to protect them from being modified inside XML. However I would
argue that the protection of spaces is sufficiently covered in RFC 2253
already and does not need any additional treatment within XMLDSig.
(Interestingly, RFC 2253 only asks for escaping the first leading and the
last trailing space, and hence allows escaped spaces to be mixed with
non-escaped spaces, which may be considered ugly but is clearly outside
our scope to decide.)
The situation is different for control characters, which need protection
that RFC 2253 does not provide.
Line breaks in string representations for example are changed in XML
(cf. http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends or the
note in http://www.w3.org/TR/2006/REC-xml-20060816/#NT-S).
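A small Python sketch of the line-break problem, using the standard
library parser (which is built on expat). The element content is a made-up
example, and the "\0D\0A" spelling is just the hex-pair escaping discussed
in this mail applied by hand.

```python
import xml.etree.ElementTree as ET

# A DName string containing a raw CR LF pair, placed directly in the XML.
raw = "CN=line1\r\nline2"
parsed_raw = ET.fromstring(
    "<X509SubjectName>" + raw + "</X509SubjectName>"
).text

# The same value with the control characters written as backslash hex pairs.
escaped = "CN=line1\\0D\\0Aline2"
parsed_escaped = ET.fromstring(
    "<X509SubjectName>" + escaped + "</X509SubjectName>"
).text

print(parsed_raw == raw)          # False: the parser turned CR LF into LF
print(parsed_escaped == escaped)  # True: the escaped form is untouched
```

The raw line break does not survive parsing unchanged, whereas the escaped
representation consists of ordinary characters only and is passed through
as is.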
In conclusion, I think the required rules could be expressed in a clearer
fashion, and hence I would suggest something like the following text
specifying how to accommodate DNames inside an XML document:
A quick proposal to serve as a basis for further discussion:
> DNames (X509IssuerSerial,X509SubjectName, and KeyName) MUST be
> represented in accordance with RFC2253 [LDAP-DN] with the difference
> that they will obviously have to have the same encoding (UTF-8,
> UTF-16, UTF-32, ISO-8859-1, etc ...) as the XML document and are not
> limited to UTF-8.
> The AttributeValues within a DName are escaped according to the rules
> laid out in RFC2253 with the additional requirement that escaping of
> control characters MUST be performed as follows:
>
> * Escape all occurrences of control characters (Unicode range x00
> - x1f) by replacing them with "\" followed by a two digit hex
> number showing its Unicode number.
>
> Note: According to RFC2253 it is valid to also escape other
> characters, which is not changed by this additional requirement.
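As a basis for discussion, the proposal above could be sketched in Python
roughly as follows. The function name escape_proposed is hypothetical, the
RFC 2253 part reflects my reading of its section 2.4, and the choice of
uppercase hex digits is an assumption (the text does not fix the case).

```python
# Characters that RFC 2253 section 2.4 escapes with a backslash prefix.
SPECIALS = set(',+"\\<>;')


def escape_proposed(value: str) -> str:
    """RFC 2253 escaping plus hex escaping of control characters x00 - x1f."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ord(ch) <= 0x1F:
            # Additional requirement: "\" followed by a two digit hex number.
            out.append("\\%02X" % ord(ch))
        elif (
            ch in SPECIALS
            or (i == 0 and ch in " #")     # leading space or "#"
            or (i == last and ch == " ")   # trailing space
        ):
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


print(escape_proposed("#Graz\tAT, EU"))  # \#Graz\09AT\, EU
```

The resulting string would then simply be written in whatever character
encoding the XML document itself uses.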
The other requirement is, from my point of view, not RFC 2253 compliant
and confusing, and hence should be reviewed and then potentially removed.
>
> * Escape any trailing white space by replacing "\ " with "\20".
>
regards
Konrad
--
Konrad Lanz, IAIK/SIC - Graz University of Technology
Inffeldgasse 16a, 8010 Graz, Austria
Tel: +43 316 873 5547
Fax: +43 316 873 5520
https://www.iaik.tugraz.at/aboutus/people/lanz
http://jce.iaik.tugraz.at
Certificate chain (including the EuroPKI root certificate):
https://europki.iaik.at/ca/europki-at/cert_download.htm
Received on Monday, 4 June 2007 18:52:20 UTC