RE: EXI LC Comments from FABLET Youenn on 2009-01-08 (public-exi@w3.org from January 2009)

From: FABLET Youenn <Youenn.Fablet@crf.canon.fr>
Date: Thu, 8 Jan 2009 10:49:31 +0100
To: Tatu Saloranta <tsaloranta@gmail.com>, Daniel Peintner <daniel.peintner@gmail.com>
CC: "public-exi@w3.org" <public-exi@w3.org>, Efficient XML Interchange WG <member-exi-wg@w3.org>
Message-ID: <C1797CB6A125334AB23C5A0A160944AD2C39E284B1@cressida.crf.canon.fr>
Building on the stax2 API example Tatu brought up, it seems reasonable to expect that base64 data passed to an EXI processor through the stax2 typed binary API would be encoded as binary blobs. MTOM supports this, as does Fast Infoset, IIRC.

A schema-less EXI processor may be able to partially support this use case by automatically inserting an xsi:type attribute, for instance in the following example:
        writer.writeStartElement("foo");
        writer.writeBinary(blob,...);
        writer.writeEndElement();
But the same EXI processor could not use xsi:type to optimise the following use case, since setting a wrong xsi:type may (would?) break interoperability:
        writer.writeStartElement("foo");
        writer.writeAttribute("id","1");
        writer.writeBinary(blob,...);
        writer.writeEndElement();
This behaviour is not intuitive to me and I am unsure of its actual usability.
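
A minimal Java sketch of the rule implied by the two cases above, as I read them: the encoder can annotate the element only when the annotation would not misdescribe it. The class and method names are hypothetical; they are not part of stax2 or of any EXI implementation, and a real encoder may apply a different rule.

```java
/** Hypothetical sketch; not part of stax2 or of any EXI implementation. */
final class XsiTypeHeuristic {

    /**
     * Decides whether an encoder could add xsi:type="xsd:base64Binary" by itself.
     *
     * @param applicationAttributeCount attributes the application wrote on the element
     * @param binaryIsOnlyContent       true if the binary value is the element's only content
     */
    static boolean canAutoInsertBase64Type(int applicationAttributeCount,
                                           boolean binaryIsOnlyContent) {
        // xsd:base64Binary is a simple type, and an element typed with a
        // simple type cannot carry application attributes.
        return applicationAttributeCount == 0 && binaryIsOnlyContent;
    }
}
```

Under this assumption the first example (no attributes) qualifies, while the second (with the id attribute) does not.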

Even the first case seems questionable in terms of interoperability.
The generated infosets would be different when using EXI and textual XML writers.
Also, if validation occurs after decoding, the xsi:type value must also be compatible with the schema or validation will fail.
It seems that it is the application's responsibility, not the encoder's, to decide whether to insert the @xsi:type.
This may be an issue since the application may not know which types the encoder knows.
Any confirmation, clarification, or further information would be appreciated.
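
To make the infoset difference concrete, here is a small Java sketch using the standard StAX writer (javax.xml.stream), not stax2 and not an EXI library. It serializes the same blob twice: once as a plain textual writer would, and once as the textual equivalent of what a decoder might surface if the encoder had inserted the xsi:type on its own. The class name, the method name, and the xsd/xsi prefixes are assumptions made for illustration.

```java
import java.io.StringWriter;
import java.util.Base64;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical sketch; emulates both outcomes with a plain textual StAX writer. */
final class InfosetDifferenceSketch {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XSD = "http://www.w3.org/2001/XMLSchema";

    /** Writes <foo> with base64 content, optionally annotated with xsi:type. */
    static String serialize(byte[] blob, boolean annotate) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("foo");
            if (annotate) {
                // Declare both prefixes: xsi for the attribute, xsd for its QName value.
                w.writeNamespace("xsi", XSI);
                w.writeNamespace("xsd", XSD);
                w.writeAttribute("xsi", XSI, "type", "xsd:base64Binary");
            }
            w.writeCharacters(Base64.getEncoder().encodeToString(blob));
            w.writeEndElement();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling serialize(blob, false) and serialize(blob, true) yields two documents that differ by one attribute and two namespace declarations, which is exactly what a validator or an infoset comparison downstream would see.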

Another alternative would be to use a datatype representation map to override the encoding of strings, but then you leave the open/interoperable EXI world.

Going back to the stax2 typed API, integrating it with EXI may not yield large processing-efficiency gains unless the EXI processor is fed the right schema, which reduces the overall flexibility of the system.

Concerning support for lists of floats or integers, it would require the definition of a few new complex types.
In terms of implementation, however, the cost of adding these types is almost nil for pure schema-less EXI processors.

        youenn

-----Original Message-----
From: Tatu Saloranta [mailto:tsaloranta@gmail.com]
Sent: Monday, 29 December 2008 19:52
To: Daniel Peintner
Cc: FABLET Youenn; public-exi@w3.org; Efficient XML Interchange WG
Subject: Re: EXI LC Comments

On Mon, Dec 29, 2008 at 1:23 AM, Daniel Peintner
<daniel.peintner@gmail.com> wrote:
>
...
> We feel that EXI should fit smoothly into the XML stack. To avoid any
> problems EXI supports type information in the same fashion as XML
> does. The group's intention is not to super-set XML. XML allows one to
> specify typed-values with the attribute xsi:type. EXI provides the
> same mechanism and both work on elements only.

While this is true for a fully data-based (schema-first, xml-centric
etc) approach, it is not the only way to handle typing. It is easy to
infer expected type from application access; that is, if application
requests content as being of certain datatype (which generally needs
to be of recognized and pre-defined types, that is, one of XSD types),
it can be accessed as such.
There are other ways too, such as inferring schema binding from object
types, but that only works for statically typed languages.

In case of textual xml for example, this just means that textual
content is transferred as text, but parser/generator can deal with
conversions to and from external types: the obvious example is that of
automated base64 encoding (when writing) and decoding (when reading).
This can be thought of as a variation (or application) of "duck typing":
instead of requiring a static schema definition, equivalent
information (or at least a subset of it) can be provided by access through
the interface (I realize EXI does not specify an API, just encoding
aspects).
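
A minimal Java sketch of this kind of typed access over textual XML, using only the standard StAX interfaces (javax.xml.stream) and java.util.Base64. The helper names writeBinary and readBinary are hypothetical stand-ins and do not reproduce the stax2 signatures; the point is only that the application deals in bytes while base64 appears on the wire.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical sketch of typed binary access on top of plain StAX. */
final class TypedAccessSketch {

    // The caller hands over raw bytes; the writer converts them to base64 text.
    static void writeBinary(XMLStreamWriter writer, byte[] data) throws XMLStreamException {
        writer.writeCharacters(Base64.getEncoder().encodeToString(data));
    }

    // The caller asks for bytes, so the element text is decoded as base64.
    // The MIME decoder tolerates line breaks inside the text content.
    static byte[] readBinary(XMLStreamReader reader) throws XMLStreamException {
        return Base64.getMimeDecoder().decode(reader.getElementText());
    }

    static String write(byte[] blob) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("foo");
            writeBinary(writer, blob);
            writer.writeEndElement();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] read(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // advance from the document start to <foo>
            byte[] data = readBinary(reader);
            reader.close();
            return data;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The type information here comes entirely from which accessor the application calls; nothing in the document itself says that the content of foo is binary.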

For what it is worth, this is the approach taken with the "typed access
API" of the stax2 extension API.
For EXI perhaps it could be used as a way to generate necessary schema
information when generating (writing) content; and on receiving end,
figuring out which conversions, if any, are needed between the raw/native
data format and the format the application wants to use. I don't know if
this would even fall within the scope of what EXI tries to do.

I don't claim that it would be feasible to consider such an approach
for EXI (both due to timing and since EXI does not define API level
aspects), I just wanted to mention that there's more than one way to
slice the cake. Not everything has to be handled at XML level.

-+ Tatu +-
Received on Thursday, 8 January 2009 09:50:14 UTC