Re: PASWA, Include and Protocol Bindings

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 5 May 2003 16:54:33 -0400
To: "John J. Barton" <John_Barton@hpl.hp.com>
Cc: xml-dist-app@w3.org
Message-ID: <OF6AE31421.743D3187-ON85256D1D.0073822A@lotus.com>

John Barton asks:

> However, there is one important issue in your note
> below that I'd like to understand better.  As I read
> the infoset docs, the bit representation of components
> of the infoset are *not* defined.  That is, infoset as
> I understand it is silent on binary vs base64 vs plain
> text or whatever.  So Infoset is about the data
> structure not the data representation.  Is this correct?

Well, let's be careful not to confuse what is modeled with how it is 
represented.  The Infoset IS very definitely about characters.  There is 
absolutely no question that, in the Infoset, the following are different:

<e>123</e>
<e>00123</e>

The content of the first is three character information items as children; 
the second has five.  In fact, even the following are different in 
the Infoset (though not necessarily in the new XQuery/XPath data model):

<f xsi:type="xsd:integer">123</f>
<f xsi:type="xsd:integer">00123</f>
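
As a rough illustration of that distinction, here is a small Python sketch. 
It is only an approximation: Python's xml.etree.ElementTree is not an 
Infoset API, and treating each character of .text as one character 
information item is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Parse the two example elements from the message.
first = ET.fromstring("<e>123</e>")
second = ET.fromstring("<e>00123</e>")

# Assumption: each character of .text stands in for one character
# information item child of the element.
chars_first = list(first.text)
chars_second = list(second.text)

# The character content differs (three children versus five) ...
content_differs = chars_first != chars_second

# ... even though an integer-aware data model would see the same value.
same_value = int(first.text) == int(second.text)
```

Here chars_first has three entries and chars_second has five, while both 
strings denote the same integer.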

What's confusing you is that in another sense you are absolutely correct: 
the infoset does not tell you how to represent those characters.  Indeed, 
the very "trick" at the heart of PASWA is that one way to optimize the 
first of each of these pairs is to make a note that the characters are 
what the schema recommendation calls the canonical lexical representation 
(no leading zeros...you can use a single bit isCanonical/isNotCanonical to 
signal that the optimization has triggered), and then to store the actual 
integer 123 for the value.  Note that the infoset is still unambiguously 
characters.  If you are asked for the content of that first element e you 
must come up with the three characters "1", "2", "3".  The trick in PASWA 
is that, in most cases, the application won't actually ask for the infoset 
content, but for something derived from it (i.e., the actual binary value).
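
The idea can be sketched in Python as follows. This is a minimal 
illustration under stated assumptions, not PASWA's actual encoding: the 
names OptimizedInteger, encode and characters are hypothetical, and keeping 
the original characters when the form is not canonical is my own choice of 
fallback for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Lexical form of xsd:integer: optional sign followed by decimal digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


@dataclass(frozen=True)
class OptimizedInteger:
    """Hypothetical stored form of an integer-typed element's content."""
    value: int              # the actual integer value
    is_canonical: bool      # the single isCanonical/isNotCanonical bit
    lexical: Optional[str]  # original characters, kept only if not canonical


def encode(lexical: str) -> OptimizedInteger:
    """Store the value; keep the characters only when they are not canonical."""
    if not _INTEGER_LEXICAL.fullmatch(lexical):
        raise ValueError(f"not an integer lexical form: {lexical!r}")
    value = int(lexical)
    # str() of a Python int has no leading zeros and no '+' sign, which is
    # used here as the canonical lexical representation.
    is_canonical = str(value) == lexical
    return OptimizedInteger(value, is_canonical, None if is_canonical else lexical)


def characters(item: OptimizedInteger) -> List[str]:
    """Reconstruct the character children the Infoset says are there."""
    text = str(item.value) if item.is_canonical else item.lexical
    return list(text)


optimized = encode("123")     # canonical: only the value and the bit are stored
fallback = encode("00123")    # not canonical: the characters are retained
```

An application that wants the value reads optimized.value directly; one 
that asks for the Infoset content calls characters(optimized) and still 
gets "1", "2", "3".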

Note that I've used integers for ease of illustration;  PASWA focuses 
primarily on the base64 type.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Monday, 5 May 2003 17:03:36 GMT