
RE: Evaluation of XML Schema Part 2 PER base64Binary type

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 29 Mar 2004 11:00:14 -0500
To: "Martin Gudgin" <mgudgin@microsoft.com>
Cc: "Jacek Kopecky" <jacek.kopecky@systinet.com>, "XMLP Dist App" <xml-dist-app@w3.org>
Message-ID: <OFF8652A01.9A5388CC-ON85256E66.005757FE@lotus.com>

I respectfully disagree with this analysis.  My understanding is that 
whitespace handling and facets play no role in canonical lexical forms at 
all.  "   abcd  " is not a canonical form for base64Binary.

The whitespace facet in schema is a very strange beast (dare I say 
kludge?).  It's a facet that can be declared on a simple type but plays no 
direct role in simple type validation >per the part 2< datatypes spec. 
Rather, it is a hint to users of the datatypes that it might be 
interesting to manipulate the whitespace >as a preliminary to creating a 
lexical form for datatypes validation.<  The schema structures spec is one 
such user of datatypes, and it does indeed do such preparation.[1] 
Canonical forms have nothing to do with this.  So, "   4" is not a 
canonical integer, and "   abcd " is not a canonical base64Binary, at 
least IMO.

What is true is that "   abcd " will validate and will map to the same 
point in the value space as "abcd" if you were to try validation via 
schema structures (which we don't).  Note that if another spec, say RDF, 
chooses to use the datatypes recommendation then it is RDF's business 
whether or not to honor the whitespace facet.  This stands in sharp 
contrast to most every other facet, as the rest are pretty much 
universally enforced for all users of datatypes.
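
To make the layering concrete, here is a rough Python sketch of the 
distinction.  The function names are made up for illustration, and the 
canonical check is an approximation (decode strictly, re-encode, compare), 
not text taken from the datatypes spec.  It assumes the collapse rule as 
described in schema structures[1].

```python
import base64
import re


def collapse_whitespace(raw: str) -> str:
    """Preliminary step a *user* of datatypes may apply (whiteSpace=collapse):
    map tab/LF/CR to space, squeeze runs of spaces, trim both ends."""
    replaced = re.sub(r"[\t\n\r]", " ", raw)
    return re.sub(r" +", " ", replaced).strip(" ")


def is_canonical_base64(lexical: str) -> bool:
    """Approximate canonical-form test applied to the string exactly as given.
    No whitespace handling happens here, so padded input is rejected."""
    try:
        value = base64.b64decode(lexical, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return base64.b64encode(value).decode("ascii") == lexical


def to_value(raw: str) -> bytes:
    """Map a literal to the value space the way a schema validator would:
    collapse first, drop the single spaces allowed between groups, decode."""
    lexical = collapse_whitespace(raw).replace(" ", "")
    return base64.b64decode(lexical, validate=True)
```

With these definitions, is_canonical_base64("   abcd ") is false while 
to_value("   abcd ") equals to_value("abcd"): same point in the value 
space, but only one of the two strings is canonical.  A user of datatypes 
that skips collapse_whitespace would simply reject the padded string.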

Thanks!

Noah

[1] http://www.w3.org/TR/xmlschema-1/#section-White-Space-Normalization-during-Validation

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

"Martin Gudgin" <mgudgin@microsoft.com>
Sent by: xml-dist-app-request@w3.org
03/29/2004 07:50 AM

 
        To:     "Jacek Kopecky" <jacek.kopecky@systinet.com>
        cc:     "XMLP Dist App" <xml-dist-app@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        RE: Evaluation of XML Schema Part 2 PER base64Binary type



Well, if we say that elements have content whose characters are in the
canonical lexical form of xs:base64Binary then the leading/trailing
whitespace is stripped/ignored. I can see us calling this out in the XOP
spec, essentially saying that "   abcd   " is treated as "abcd".

Gudge

> -----Original Message-----
> From: Jacek Kopecky [mailto:jacek.kopecky@systinet.com] 
> Sent: 29 March 2004 13:46
> To: Martin Gudgin
> Cc: XMLP Dist App
> Subject: RE: Evaluation of XML Schema Part 2 PER base64Binary type
> 
> So can we guarantee to transfer the infoset with fidelity? Or do we have
> to restrict the canonical form to that with no leading and trailing
> whitespace in XOP?
> 
> Jacek
> 
> On Mon, 2004-03-29 at 14:21, Martin Gudgin wrote:
> > Yup, because at the schema level it's actually "abcd"
> > 
> > Gudge 
> > 
> > > -----Original Message-----
> > > From: xml-dist-app-request@w3.org 
> > > [mailto:xml-dist-app-request@w3.org] On Behalf Of Jacek Kopecky
> > > Sent: 29 March 2004 13:16
> > > To: Martin Gudgin
> > > Cc: XMLP Dist App
> > > Subject: Re: Evaluation of XML Schema Part 2 PER base64Binary type
> > > 
> > > 
> > > Gudge, does the whitespace stripping rule mean that "  abcd" is also
> > > in canonical form?
> > > 
> > > Jacek
> > > 
> > > On Mon, 2004-03-29 at 12:24, Martin Gudgin wrote:
> > > > Dear XMLPers,
> > > > 
> > > > I took an action on last week's call to take a look at the proposed
> > > > edited recommendation of XML Schema Part 2[1] WRT the base64Binary
> > > > type[2]. 
> > > > 
> > > > The description of the base64Binary type now contains a BNF and a
> > > > canonical lexical form. The canonical lexical form contains no
> > > > whitespace characters within the stream of base64 characters.
> > > > Whitespace characters at the beginning and/or end of the stream of
> > > > base64 characters are stripped due to the whitespace facet of the
> > > > type having a value of collapse. Thus any canonical lexical form of
> > > > base64Binary is one line of base64 characters.
> > > > 
> > > > I believe that the addition of a canonical lexical form satisfies our
> > > > requirements WRT XOP/MTOM.
> > > > 
> > > > Regards
> > > > 
> > > > Gudge
> > > > 
> > > > [1] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/
> > > > [2] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/#base64Binary
> > > > 
> > > 
> > > 
> 
> 
Received on Monday, 29 March 2004 11:02:45 GMT
