Re: DName encoding (was:KeyName white space) from Joseph M. Reagle Jr. on 2001-06-12 (w3c-ietf-xmldsig@w3.org from April to June 2001)

From: Joseph M. Reagle Jr. <reagle@w3.org>
Date: Tue, 12 Jun 2001 16:43:34 -0400
To: "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>
Cc: w3c-ietf-xmldsig@w3.org
Message-Id: <4.3.2.7.2.20010612164111.00baf318@localhost>
Which element types does this apply to: only the RFC 2253 X509IssuerSerial 
and X509SubjectName, or KeyName as well (which can be anything, *including* 
a DName)? I assume MgmtData and PGPKeyID are left as strings without further 
specification.


At 15:27 6/12/2001, Donald E. Eastlake 3rd wrote:

>Since no one has responded to my message below, I assume that we have
>adopted Gregor's proposal and that, except as may be provided in RFC
>2253, whitespace is significant.
>
>Donald
>
>From:  "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>
>Message-Id:  <200105310255.WAA0000031182@torque.pothole.com>
>To:  w3c-ietf-xmldsig@w3.org
>Date:  Wed, 30 May 2001 22:55:30 -0400
>
> >It seems to me that we need to come to some sort of conclusion on
> >this... Maybe people think we did and I missed it, in which case I
> >hope they will set me straight.
> >
> >The following proposal by Gregor for encoding STRINGS in DNames does
> >not seem to have been refuted, but do we need to say more about
> >leading and trailing whitespace on the DName as a whole?
> >
> >Thanks,
> >Donald
> >
> >From:  "Gregor Karlinger" <gregor.karlinger@iaik.at>
> >To:  "Tom Gindin" <tgindin@us.ibm.com>, "Merlin Hughs" <merlin@baltimore.ie>,
> >            <duerst@w3.org>
> >Cc:  <w3c-ietf-xmldsig@w3.org>
> >Date:  Wed, 16 May 2001 15:13:14 +0200
> >Message-ID:  <LBEPJAONIMDADHFHAEAOMELJCFAA.gregor.karlinger@iaik.at>
> >In-reply-to:  <OF8536EC68.279166CB-ON85256A4D.005E2175@somers.hqregion.ibm.com>
> >Subject:  DName encoding (was:KeyName white space)
> >
> >All,
> >
> >I have cited the most important statements (from my point of view)
> >regarding DName encoding below. I would like to make the following
> >proposal for a guideline that should appear in XML-Signature on how
> >to encode a string in a DName in XML-Signature relevant structures
> >(IssuerName, SubjectName):
> >
> >o Consider the string as consisting of unicode characters.
> >
> >o Escape occurrences of the following special characters by prefixing
> >  each with the "\" character:
> >
> >    - a "#" character occurring at the beginning of the string
> >    - one of the characters ",", "+", """, "\", "<", ">" or ";"
> >
> >o Escape all occurrences of ASCII control characters (Unicode range
> >  \x00 - \x20) by replacing each with "\x" followed by a two-digit
> >  hex number showing its Unicode number.
> >
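[Illustrative note] The three bullets above can be sketched in Python as
follows. This is a minimal sketch under stated assumptions, not spec text:
the function name escape_dname_string and its parameters are hypothetical;
the "\x" prefix and the inclusive upper bound \x20 are taken literally from
the proposal as written (so a space is escaped too); uppercase hex digits
are an assumption. Passing hex_prefix="\\" yields the RFC 2253 style form
(\0A, \09) that appears in Merlin's quoted example further down.

```python
def escape_dname_string(value: str,
                        hex_prefix: str = "\\x",
                        control_max: int = 0x20) -> str:
    """Escape one string of a DName per the three bullets of the proposal.

    hex_prefix and control_max are hypothetical knobs; their defaults follow
    the literal wording of the proposal ("\\x" prefix, range \\x00 - \\x20).
    """
    # Characters that are always escaped with a leading backslash.
    special = set(',+"\\<>;')
    out = []
    for index, ch in enumerate(value):
        code = ord(ch)
        if code <= control_max:
            # Control character: replace with prefix + two hex digits.
            out.append(f"{hex_prefix}{code:02X}")
        elif ch in special or (ch == "#" and index == 0):
            # Special character, or "#" at the very beginning of the string.
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_dname_string("#1, Main St"))
    print(escape_dname_string("foo\n\t", hex_prefix="\\"))
```

Whether the bare "\" or the "\x" prefix is intended, and whether space
belongs in the escaped range, is exactly the kind of detail the final text
would need to pin down.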
> >Since an XML document logically consists of characters, not bytes
> >(Thanks Martin for reminding me ;-) the resulting Unicode string is
> >finally encoded according to the character encoding used for producing
> >the physical representation of the XML document (But this is not in
> >scope of the guideline that should be taken into the XML-Signature
> >specification ...)
> >
> >Liebe Gruesse/Regards,
> >---------------------------------------------------------------
> >DI Gregor Karlinger
> >mailto:gregor.karlinger@iaik.at
> >http://www.iaik.at
> >Phone +43 316 873 5541
> >Institute for Applied Information Processing and Communications
> >Austria
> >---------------------------------------------------------------
> >
> >
> >### Tom Gindin ###
> >
> >[...]
> >>      The situation is actually even more confusing than that.  The rules
> >> seem to be, for HT, LF, FF and VT, something like the following:
> >[...]
> >
> >### Merlin Hughs ###
> >
> >[...]
> >> The escaping is useful because:
> >>
> >>            <DName>CN=foo
> >>            </DName>
> >>
> >> According to RFC 2253, this states that the common name is
> >> foo<LF><TAB>. However, if you trim() the text value then you
> >> would get common name foo. Requiring escaping of ASCII controls
> >> would result in CN=foo\0A\09 (if that is what is meant) which
> >> is unambiguous and can be safely indented, whitespace formatted
> >> and trim()ed. It would also eliminate a few other potential
> >> ambiguities:
> >[...]
> >> is significant. If we require that all significant ASCII
> >> controls be escaped, then trimming and formatting involving
> >> newlines and tabs will be safe, and meaningful whitespace in
> >> dnames will be explicit.
> >
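[Illustrative note] Merlin's point about trimming can be seen in a few
lines of Python. This is only a sketch: the variable names are
hypothetical, Python's str.strip() stands in for the trim() he mentions,
and the escaped form uses the \0A\09 notation from his example.

```python
# Text node as it would appear when the closing tag is indented:
# the LF and TAB are part of the element content.
raw_text = "CN=foo\n\t"

# Same name with the significant controls escaped, followed by
# formatting whitespace that carries no meaning.
escaped_text = "CN=foo\\0A\\09\n\t"

# Trimming the raw form silently changes the common name ...
trimmed_raw = raw_text.strip()

# ... while trimming the escaped form only removes formatting.
trimmed_escaped = escaped_text.strip()

if __name__ == "__main__":
    print(repr(trimmed_raw))
    print(repr(trimmed_escaped))
```

In the first case the LF and TAB are lost; in the second they survive as
explicit escapes, so indentation and trimming no longer affect the name.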
> >### Martin Duerst ###
> >
> >An XML document consists of characters, not bytes. And in
> >content, each character can be escaped by a numeric character
> >reference. Therefore, you can have a document only encoded
> >in us-ascii, but with lots of &#dddd; (or &#xhhhh;); this
> >will transport any character in Unicode, and a browser
> >will display it correctly (assuming it has the right font).
> >
> >> > There's another issue that seems relevant. RFC 2253 states
> >> > that strings must be converted to UTF-8 and then the escaping
> >> > rules must be applied. Do we honour this, or should we UTF-8
> >> > decode the RFC2253 string before embedding it in the text node?
> >> >
> >> > Essentially, should the final example in RFC 2253 be encoded
> >> > in XML as:
> >> >
> >> > UTF-8 encode and require ASCII escaping of high-bit-set chars:
> >> >   SN=Lu\C4\8Di\C4\87
> >
> >This would be okay.
> >
> >[...]
> >> > De-UTF-8 and embed the Unicode original:
> >> >   SN=Lu?i? (where ? is the original character)
> >> >
> >> > The last seems like the best option to me.
> >
> >I agree. But it has to be spelled out clearly, and tested.
> >
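[Illustrative note] The options discussed above can be compared side by
side. This is a rough Python sketch: the function names are hypothetical,
the surname is recovered by decoding the bytes of the quoted
SN=Lu\C4\8Di\C4\87 example, and the character-reference form follows
Martin's remark about &#xhhhh; in a us-ascii document.

```python
def utf8_bytes_escaped(value: str) -> str:
    """UTF-8 encode, then backslash-escape every high-bit-set byte."""
    out = []
    for byte in value.encode("utf-8"):
        out.append(f"\\{byte:02X}" if byte >= 0x80 else chr(byte))
    return "".join(out)


def ascii_with_char_refs(value: str) -> str:
    """Keep the Unicode characters, serialised as numeric character refs."""
    return "".join(f"&#x{ord(c):X};" if ord(c) > 0x7F else c for c in value)


# Decode the bytes from the quoted example to get the original characters.
surname = bytes.fromhex("4C75C48D69C487").decode("utf-8")

if __name__ == "__main__":
    print("SN=" + utf8_bytes_escaped(surname))    # byte-level escaping
    print("SN=" + ascii_with_char_refs(surname))  # character-level, ASCII-only file
    print("SN=" + surname)                        # character-level, needs UTF-8 output
```

The last two lines denote the same character string to an XML parser;
only the first changes what the text node itself contains.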
> >############
> >


--
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Tuesday, 12 June 2001 16:43:54 UTC