- From: Martin Gudgin <mgudgin@microsoft.com>
- Date: Wed, 26 Mar 2003 10:54:38 -0800
- To: "Amelia A. Lewis" <alewis@tibco.com>
- Cc: <Marc.Hadley@sun.com>, <xml-dist-app@w3.org>
> -----Original Message-----
> From: Amelia A. Lewis [mailto:alewis@tibco.com]
> Sent: 26 March 2003 10:24
> To: Martin Gudgin
> Cc: Marc.Hadley@sun.com; xml-dist-app@w3.org
>
> On Wed, 26 Mar 2003 09:36:47 -0800
> "Martin Gudgin" <mgudgin@microsoft.com> wrote:
> > Perhaps mandating the canonical rep defined by XML
> > Schema[1] would help.
>
> Has it been approved? The supplied reference requires a
> login (which I could do, but is there a version approved by
> the schema WG?).

I understood that it was going to be part of the 2nd edition.

> The supplied errata has at least three problems that I can see.
>
> 1) it states that processors cannot enforce line-length
> limits, and then gives productions that require enforcement of
> line-length limits.
>
> 2) it states that whitespace is permitted, but only LF is
> included in the productions; therefore, a processor
> implementor could assume that no other whitespace is permitted.
>
> 3) it is not clear what the length calculation at the bottom
> is for, why one would perform it, or who cares (is it
> facet-related? Why would the length facet care about the
> length of the decoded stream, which this algorithm seems to require?).
>
> oh, and 4) there's commentary, at the end, that mentions that
> RFC 2045 explicitly calls out ASCII as the encoding, but the
> RFC explicitly states that any encoding that includes the 65
> letters/symbols in its dictionary, plus space, CR, and LF,
> can use base64 encoding (notably including EBCDIC). The
> statement that "decoding of base64binary data in an XML
> entity is to be performed on the [US-ASCII-compatible]
> Unicode characters obtained after character encoding
> processing as specified by XML 1.0."
> is wrong-headed, at best
> (it requires the transcoding step, which may be highly
> inappropriate; a number of processors will then "inflate" the
> information into characters defined as sixteen-bit entities,
> and will then proceed to throw away 5/8 of the memory
> allocated, not 1/4).

Should we send this input to the Schema WG?

> > The current C14N algorithms for xmldsig all assume a UTF-8
> > encoding ( AFAIR ) so some of the above concerns are mitigated, I think.
>
> Assume, or require?

They all work on UTF-8 WRT XML. So if you want to compute a dsig of a
UTF-16 doc, the sig still needs to be over the UTF-8 form.

> > Agreed. We need to be more specific.
>
> Please.

Is this a plea to update the document? Or just that we add this to a list
of issues and resolve it? ( I don't really mind which, just wondering ).

Cheers

Gudge
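[The whitespace-tolerance and decoded-length points (2 and 3 above) can be sketched in a few lines of Python. This is an illustrative sketch, not text from the errata; the function names `decode_lenient`, `canonical`, and `decoded_length` are invented for the example.]

```python
import base64
import re

def decode_lenient(text: str) -> bytes:
    """RFC 2045-style decoding: whitespace (space, CR, LF, tab) inside
    the base64 text is ignored before decoding."""
    return base64.b64decode(re.sub(r"\s+", "", text))

def canonical(data: bytes) -> str:
    """One possible canonical rep: a single line, no whitespace,
    with '=' padding retained."""
    return base64.b64encode(data).decode("ascii")

def decoded_length(text: str) -> int:
    """The 'length calculation': octet count of the decoded stream,
    computed from the lexical form without actually decoding --
    3 octets per 4-character group, minus one per '=' pad."""
    s = re.sub(r"\s+", "", text)
    return 3 * (len(s) // 4) - s.count("=")

# Line-wrapped input (as RFC 2045 permits) and its canonical form
wrapped = "aGVs\nbG8s\r\nIHdvcmxk\n"
assert decode_lenient(wrapped) == b"hello, world"
assert canonical(b"hello, world") == "aGVsbG8sIHdvcmxk"
assert decoded_length(wrapped) == 12
```

A decoder written only from the errata's productions (LF-only whitespace) would reject the `\r\n` in `wrapped`, which is exactly the interoperability gap point 2 complains about.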
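[The point that a dsig over a UTF-16 document must still be computed over the UTF-8 form can be illustrated as follows. This is a hypothetical sketch hashing the serialized octets directly; the full Canonical XML algorithm does considerably more than re-encode.]

```python
import hashlib

doc = "<e>data</e>"
utf16_octets = doc.encode("utf-16")
utf8_octets = doc.encode("utf-8")

# Digests over the two serializations of the same infoset differ...
assert (hashlib.sha1(utf16_octets).hexdigest()
        != hashlib.sha1(utf8_octets).hexdigest())

# ...so signer and verifier can only agree by fixing one form.
# Because C14N emits UTF-8, the digest is taken over the UTF-8
# re-encoding, whatever encoding the document arrived in.
received = utf16_octets.decode("utf-16")   # the parse/decode step
digest = hashlib.sha1(received.encode("utf-8")).hexdigest()
assert digest == hashlib.sha1(utf8_octets).hexdigest()
```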
Received on Wednesday, 26 March 2003 13:54:49 UTC