- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 29 Mar 2004 11:00:14 -0500
- To: "Martin Gudgin" <mgudgin@microsoft.com>
- Cc: "Jacek Kopecky" <jacek.kopecky@systinet.com>, "XMLP Dist App" <xml-dist-app@w3.org>
I respectfully disagree with this analysis. My understanding is that whitespace handling and facets play no role in canonical lexical forms at all. " abcd " is not a canonical form for base64Binary.

The whitespace facet in schema is a very strange beast (dare I say kludge?). It's a facet that can be declared on a simple type but plays no direct role in simple type validation per the Part 2 datatypes spec. Rather, it is a hint to users of the datatypes that it might be interesting to manipulate the whitespace as a preliminary to creating a lexical form for datatypes validation. The schema structures spec is one such user of datatypes, and it does indeed do such preparation.[1]

Canonical forms have nothing to do with this. So, " 4" is not a canonical integer, and " abcd " is not a canonical base64Binary, at least IMO. What is true is that " abcd " will validate and will map to the same point in the value space as "abcd" if you were to try validation via schema structures (which we don't).

Note that if another spec, say RDF, chooses to use the datatypes recommendation then it is RDF's business whether or not to honor the whitespace facet. This stands in sharp contrast to most every other facet, as the rest are pretty much universally enforced for all users of datatypes.

Thanks!
Noah

[1] http://www.w3.org/TR/xmlschema-1/#section-White-Space-Normalization-during-Validation

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

"Martin Gudgin" <mgudgin@microsoft.com>
Sent by: xml-dist-app-request@w3.org
03/29/2004 07:50 AM

To: "Jacek Kopecky" <jacek.kopecky@systinet.com>
cc: "XMLP Dist App" <xml-dist-app@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: RE: Evaluation of XML Schema Part 2 PER base64Binary type

Well, if we say that elements have content whose characters are in the
canonical lexical form of xs:base64Binary then the leading/trailing
whitespace is stripped/ignored. I can see us calling this out in the XOP
spec, essentially saying that " abcd " is treated as "abcd".

Gudge

> -----Original Message-----
> From: Jacek Kopecky [mailto:jacek.kopecky@systinet.com]
> Sent: 29 March 2004 13:46
> To: Martin Gudgin
> Cc: XMLP Dist App
> Subject: RE: Evaluation of XML Schema Part 2 PER base64Binary type
>
> So can we guarantee to transfer the infoset with fidelity? Or do we have
> to restrict the canonical form to that with no leading and trailing
> whitespace in XOP?
>
> Jacek
>
> On Mon, 2004-03-29 at 14:21, Martin Gudgin wrote:
> > Yup, because at the schema level it's actually "abcd"
> >
> > Gudge
> >
> > > -----Original Message-----
> > > From: xml-dist-app-request@w3.org
> > > [mailto:xml-dist-app-request@w3.org] On Behalf Of Jacek Kopecky
> > > Sent: 29 March 2004 13:16
> > > To: Martin Gudgin
> > > Cc: XMLP Dist App
> > > Subject: Re: Evaluation of XML Schema Part 2 PER base64Binary type
> > >
> > >
> > > Gudge, does the whitespace stripping rule mean that " abcd" is also in
> > > canonical form?
> > >
> > > Jacek
> > >
> > > On Mon, 2004-03-29 at 12:24, Martin Gudgin wrote:
> > > > Dear XMLPers,
> > > >
> > > > I took an action on last weeks call to take a look at the proposed
> > > > edited recommendation of XML Schema Part 2[1] WRT the base64Binary
> > > > type[2].
> > > >
> > > > The description of the base64Binary type now contains a BNF and a
> > > > canonical lexical form. The canonical lexical form contains no
> > > > whitespace characters within the stream of base64 characters. Whitespace
> > > > characters at the beginning and/or end of the stream of base64
> > > > characters are stripped due to the whitespace facet of the type having a
> > > > value of collapse. Thus any canonical lexical form of base64Binary is
> > > > one line of base64 characters.
> > > >
> > > > I believe that the addition of a canonical lexical form satisfies our
> > > > requirements WRT XOP/MTOM.
> > > >
> > > > Regards
> > > >
> > > > Gudge
> > > >
> > > > [1] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/
> > > > [2] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/#base64Binary
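
The distinction drawn at the top of this message, between whitespace preparation and canonical lexical forms, can be sketched roughly in Python. This is only an illustration and not text from any spec: the function names `collapse_whitespace`, `to_value` and `is_canonical_base64` are hypothetical, and the canonical-form test assumes that "canonical" means "identical to the standard base64 re-encoding of the decoded bytes, with no whitespace anywhere".

```python
import base64
import binascii
import re


def collapse_whitespace(raw: str) -> str:
    """Sketch of whiteSpace=collapse as a preparation step:
    tab/LF/CR become spaces, runs of spaces shrink to one, ends are trimmed."""
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    collapsed = re.sub(r" +", " ", replaced)
    return collapsed.strip(" ")


def to_value(raw: str) -> bytes:
    """What a user of the datatypes (e.g. schema structures) might do:
    prepare the whitespace first, then map the lexical form to a value.
    Assumption: any remaining single spaces are dropped before decoding."""
    lexical = collapse_whitespace(raw).replace(" ", "")
    return base64.b64decode(lexical, validate=True)


def is_canonical_base64(text: str) -> bool:
    """Canonical-form check applied to the characters as given,
    with no whitespace preparation at all (assumed definition, see lead-in)."""
    try:
        decoded = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False  # whitespace or other non-alphabet characters present
    return base64.b64encode(decoded).decode("ascii") == text


# " abcd " and "abcd" map to the same point in the value space...
same_value = to_value(" abcd ") == to_value("abcd")
# ...but only "abcd" passes the canonical-form check.
canonical_plain = is_canonical_base64("abcd")
canonical_padded = is_canonical_base64(" abcd ")
```

Under these assumptions, `same_value` and `canonical_plain` are true while `canonical_padded` is false, which is the point being argued: whether whitespace gets collapsed is a choice made by whoever calls `to_value`, not a property of the canonical form itself.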
Received on Monday, 29 March 2004 11:02:45 UTC