
RE: Evaluation of XML Schema Part 2 PER base64Binary type

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 29 Mar 2004 12:33:26 -0500
To: "Martin Gudgin" <mgudgin@microsoft.com>
Cc: "Jacek Kopecky" <jacek.kopecky@systinet.com>, "XMLP Dist App" <xml-dist-app@w3.org>
Message-ID: <OF28233ADE.8B8B2D7F-ON85256E66.005FC6D8@lotus.com>

Martin Gudgin writes:

>> Fine, let's just say that the base64 string MUST NOT contain any
>> whitespace chars, preceding, inline or following. At which point, I'm
>> not sure why we even care what the Schema datatypes PER says.

OK, no problem at all.  This is at worst redundant with saying that it 
must be a canonical form.  I can easily live with either of the following 
(neither of which is wordsmithed.)  The first is intended to be exactly 
what you've proposed, the second a slight variation.

* To be optimized, the characters comprising the [children] MUST be in the 
canonical form of xsd:base64Binary and MUST NOT contain any whitespace 
chars, preceding, inline with or following the non-whitespace content.

-or-

* To be optimized, the characters comprising the [children] MUST be in the 
canonical form of xsd:base64Binary.  Note: this implies that there must 
not be any whitespace chars, preceding, inline with or following the 
non-whitespace content.

The former has the advantage of closing off any possible risk that we 
haven't been clear in our spec, but with the modest risk of (correctly) 
restating the normative rules of schema datatypes.  The latter runs the 
risk that I have misinterpreted datatypes, and that we are therefore 
leaving open some unintentional wiggle room.  As I say, I can quite 
happily live with either, maybe slight preference for the latter. 
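
To make the distinction concrete, here is a rough sketch (not proposed spec 
text) of the check either wording implies. The function names are 
hypothetical, and the round-trip test is only an assumed approximation of 
the canonical form in the PER: a string passes if it contains no whitespace 
and re-encoding its decoded octets reproduces it exactly. The collapse 
helper mimics the preparation that schema structures performs before 
datatype validation.

```python
import base64
import binascii
import re

# Whitespace characters as defined by XML: space, tab, LF, CR.
_XML_WS = " \t\n\r"


def is_canonical_base64(text: str) -> bool:
    """Approximate canonical-form check (an assumption, see lead-in)."""
    # Any whitespace, leading, inline or trailing, disqualifies the string.
    if any(ch in _XML_WS for ch in text):
        return False
    try:
        decoded = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    # Re-encoding must give back exactly the same characters.
    return base64.b64encode(decoded).decode("ascii") == text


def collapse(text: str) -> str:
    """Whitespace facet 'collapse': fold runs of whitespace, trim the ends."""
    return re.sub(r"[ \t\n\r]+", " ", text).strip(" ")


def value_of(text: str) -> bytes:
    """Map a lexical form to its octets after collapse-style preparation."""
    return base64.b64decode(collapse(text).replace(" ", ""), validate=True)
```

With these, is_canonical_base64("   abcd  ") is false even though 
value_of("   abcd  ") and value_of("abcd") yield the same octets, which is 
the point at issue in this thread.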

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








"Martin Gudgin" <mgudgin@microsoft.com>
03/29/2004 11:29 AM

 
        To:     <noah_mendelsohn@us.ibm.com>
        cc:     "Jacek Kopecky" <jacek.kopecky@systinet.com>, "XMLP Dist App" <xml-dist-app@w3.org>
        Subject:        RE: Evaluation of XML Schema Part 2 PER base64Binary type


Noah,

Fine, let's just say that the base64 string MUST NOT contain any
whitespace chars, preceding, inline or following. At which point, I'm
not sure why we even care what the Schema datatypes PER says.

Gudge

> -----Original Message-----
> From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com] 
> Sent: 29 March 2004 17:00
> To: Martin Gudgin
> Cc: Jacek Kopecky; XMLP Dist App
> Subject: RE: Evaluation of XML Schema Part 2 PER base64Binary type
> 
> I respectfully disagree with this analysis.  My understanding is that
> whitespace handling and facets play no role in canonical lexical forms at
> all.  "   abcd  " is not for base64Binary a canonical form.
> 
> The whitespace facet in schema is a very strange beast (dare I say
> kludge?).  It's a facet that can be declared on a simple type but plays no
> direct role in simple type validation >per the part 2< datatypes spec.
> Rather, it is a hint to users of the datatypes that it might be
> interesting to manipulate the whitespace >as a preliminary to creating a
> lexical form for datatypes validation.<  The schema structures spec is one
> such user of datatypes, and it does indeed do such preparation.[1]
> Canonical forms have nothing to do with this.  So, "   4" is not a
> canonical integer, and "   abcd " is not a canonical base64Binary, at
> least IMO.
> 
> What is true is that "   abcd " will validate and will map to the same
> point in the value space as "abcd" if you were to try validation via
> schema structures (which we don't).  Note that if another spec, say RDF,
> chooses to use the datatypes recommendation then it is RDF's business
> whether or not to honor the whitespace facet.  This stands in sharp
> contrast to most every other facet, as the rest are pretty much
> universally enforced for all users of datatypes.
> 
> Thanks!
> 
> Noah
> 
> [1] http://www.w3.org/TR/xmlschema-1/#section-White-Space-Normalization-during-Validation
> 
> --------------------------------------
> Noah Mendelsohn 
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 
> 
> 
> 
> 
> 
> 
> 
> "Martin Gudgin" <mgudgin@microsoft.com>
> Sent by: xml-dist-app-request@w3.org
> 03/29/2004 07:50 AM
> 
> 
>         To:     "Jacek Kopecky" <jacek.kopecky@systinet.com>
>         cc:     "XMLP Dist App" <xml-dist-app@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
>         Subject:        RE: Evaluation of XML Schema Part 2 PER base64Binary type
> 
> 
> 
> Well, if we say that elements have content whose characters are in the
> canonical lexical form of xs:base64Binary then the leading/trailing
> whitespace is stripped/ignored. I can see us calling this out in the XOP
> spec, essentially saying that "   abcd   " is treated as "abcd".
> 
> Gudge
> 
> > -----Original Message-----
> > From: Jacek Kopecky [mailto:jacek.kopecky@systinet.com] 
> > Sent: 29 March 2004 13:46
> > To: Martin Gudgin
> > Cc: XMLP Dist App
> > Subject: RE: Evaluation of XML Schema Part 2 PER base64Binary type
> > 
> > So can we guarantee to transfer the infoset with fidelity? Or do we have
> > to restrict the canonical form to that with no leading and trailing
> > whitespace in XOP?
> > 
> > Jacek
> > 
> > On Mon, 2004-03-29 at 14:21, Martin Gudgin wrote:
> > > Yup, because at the schema level it's actually "abcd"
> > > 
> > > Gudge 
> > > 
> > > > -----Original Message-----
> > > > From: xml-dist-app-request@w3.org 
> > > > [mailto:xml-dist-app-request@w3.org] On Behalf Of Jacek Kopecky
> > > > Sent: 29 March 2004 13:16
> > > > To: Martin Gudgin
> > > > Cc: XMLP Dist App
> > > > Subject: Re: Evaluation of XML Schema Part 2 PER base64Binary type
> > > > 
> > > > 
> > > > Gudge, does the whitespace stripping rule mean that "  abcd" is also in
> > > > canonical form?
> > > > 
> > > > Jacek
> > > > 
> > > > On Mon, 2004-03-29 at 12:24, Martin Gudgin wrote:
> > > > > Dear XMLPers,
> > > > > 
> > > > > I took an action on last week's call to take a look at the proposed
> > > > > edited recommendation of XML Schema Part 2[1] WRT the base64Binary
> > > > > type[2].
> > > > > 
> > > > > The description of the base64Binary type now contains a BNF and a
> > > > > canonical lexical form. The canonical lexical form contains no
> > > > > whitespace characters within the stream of base64 characters. Whitespace
> > > > > characters at the beginning and/or end of the stream of base64
> > > > > characters are stripped due to the whitespace facet of the type having a
> > > > > value of collapse. Thus any canonical lexical form of base64Binary is
> > > > > one line of base64 characters.
> > > > > 
> > > > > I believe that the addition of a canonical lexical form satisfies our
> > > > > requirements WRT XOP/MTOM.
> > > > > 
> > > > > Regards
> > > > > 
> > > > > Gudge
> > > > > 
> > > > > [1] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/
> > > > > [2] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/#base64Binary
> > > > > 
> > > > 
> > > > 
> > 
> > 
> 
> 
> 
> 
> 
Received on Monday, 29 March 2004 12:36:53 GMT
