RE: EXI LC Comments

Hi Youenn,

I agree with you that it ought to be the application program that adds the
xsi:type annotation to the infoset. The snippet you provided would therefore
be better written like this:

  writer.writeStartElement("foo");
  // the application, not the encoder, declares the type
  writer.writeAttribute("http://.../XMLSchema-instance", "type", "base64Binary");
  writer.writeBinary(blob,...);
  writer.writeEndElement();
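
For comparison, here is a minimal sketch of the same idea against the plain
javax.xml.stream API with a textual XML writer. It is only an illustration:
the class and method names are hypothetical, the "xsd:" prefixed type value is
my assumption about how the QName would be spelled, and since the standard
XMLStreamWriter has no writeBinary call, the blob is base64-encoded by hand.

```java
import java.io.StringWriter;
import java.util.Base64;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper; not part of stax2 or of any EXI implementation.
public class XsiTypeWriterSketch {

    /** Writes one element whose xsi:type is added explicitly by the application. */
    static String writeTypedBinary(String name, byte[] blob) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement(name);
            // Declare the prefixes used by the attribute name and its QName value.
            writer.writeNamespace("xsi", XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI);
            writer.writeNamespace("xsd", XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // The application states the type; the writer does not guess it.
            writer.writeAttribute("xsi", XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI,
                                  "type", "xsd:base64Binary");
            // Textual XML carries the blob in its base64 lexical form.
            writer.writeCharacters(Base64.getEncoder().encodeToString(blob));
            writer.writeEndElement();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeTypedBinary("foo", new byte[] {1, 2, 3}));
    }
}
```

Because the xsi:type attribute is part of what the application writes, a
textual writer and an EXI writer would see the same infoset.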

Regarding the second example you provided, which involves attributes, I
suggest defining a type such as the one below for each built-in datatype and
sharing these definitions in advance of the communication.

  <complexType name="base64BinaryItemType" final="extension">
    <simpleContent>
      <extension base="base64Binary">
        <anyAttribute namespace="##any" processContents="lax" />
      </extension>
    </simpleContent>
  </complexType>

You can then use such basic types in your EXI instances.

  writer.writeStartElement("foo");
  writer.writeAttribute("unn:foo", "type", "base64BinaryItemType");
  writer.writeAttribute("id", "1");
  writer.writeBinary(blob,...);
  writer.writeEndElement();

This approach still has a limitation with regard to the typing of attributes:
attributes need to be typed in the schema, since xsi:type is not applicable to
attributes themselves.

-taki


-----Original Message-----
From: public-exi-request@w3.org [mailto:public-exi-request@w3.org] On Behalf Of FABLET Youenn
Sent: Thursday, January 08, 2009 1:50 AM
To: Tatu Saloranta; Daniel Peintner
Cc: public-exi@w3.org; Efficient XML Interchange WG
Subject: RE: EXI LC Comments


Building on the stax2 API example Tatu brought up, it would seem reasonable to expect that base64 data sent to an EXI processor
through the stax2 typed binary API is encoded as binary blobs. MTOM is able to support that, as is Fast Infoset, IIRC.

A schema-less EXI processor may be able to partially support this use case by automatically inserting an xsi:type, for instance in
the following example:
        writer.writeStartElement("foo");
        writer.writeBinary(blob,...);
        writer.writeEndElement();
But the same EXI processor would fail to optimise the following use case using xsi:type, since setting a wrong xsi:type may (would?)
break interoperability:
        writer.writeStartElement("foo");
        writer.writeAttribute("id", "1");
        writer.writeBinary(blob,...);
        writer.writeEndElement();
This behaviour is not intuitive to me and I am unsure of its actual usability.

Even the first case seems questionable in terms of interoperability.
The generated infosets would be different when using EXI and textual XML writers.
Also, if validation occurs after decoding, the xsi:type value must be compatible with the schema, or validation will fail.
It seems that it is the application's responsibility, not the encoder's, to decide whether to insert the @xsi:type.
This may be an issue, since the application may not know which types the encoder actually knows.
Any confirmation, clarification, or further information would be appreciated here.

Another alternative would be to use a datatype representation map to override the encoding of strings, but then you leave the
open/interoperable EXI world.

Going back to the stax2 typed API: integrating it with EXI may not yield great processing-efficiency gains unless
the EXI processor is fed the right schema, which reduces the overall flexibility of the system.

Supporting lists of floats or integers would require the definition of a few new complex types.
In terms of implementation, however, the cost of adding these types is almost nil for pure schema-less EXI processors.

        youenn

-----Original Message-----
From: Tatu Saloranta [mailto:tsaloranta@gmail.com]
Sent: Monday, December 29, 2008 19:52
To: Daniel Peintner
Cc: FABLET Youenn; public-exi@w3.org; Efficient XML Interchange WG
Subject: Re: EXI LC Comments

On Mon, Dec 29, 2008 at 1:23 AM, Daniel Peintner
<daniel.peintner@gmail.com> wrote:
>
...
> We feel that EXI should fit smoothly into the XML stack. To avoid any
> problems EXI supports type information in the same fashion as XML
> does. The group's intention is not to super-set XML. XML allows one to
> specify typed-values with the attribute xsi:type. EXI provides the
> same mechanism and both work on elements only.

While this is true for a fully data-based (schema-first, XML-centric,
etc.) approach, it is not the only way to handle typing. It is easy to
infer the expected type from application access; that is, if the
application requests content as a certain datatype (which generally
needs to be a recognized, pre-defined type, that is, one of the XSD
types), it can be accessed as such.
There are other ways too, such as inferring schema binding from object
types, but that only works for statically typed languages.

In the case of textual XML, for example, this just means that textual
content is transferred as text, but the parser/generator can deal with
conversions to and from external types: the obvious example is that of
automated base64 encoding (when writing) and decoding (when reading).
This can be thought of as a variation (or application) of "duck typing":
instead of requiring a static schema definition, equivalent
information (or at least a subset of it) can be provided by access
through the interface (I realize EXI does not specify an API, just
encoding aspects).
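
To make the idea concrete, here is a rough sketch of access-driven typing
on the reading side, written against the plain javax.xml.stream API rather
than stax2 itself. The class and method names are made up for
illustration; the point is only that the conversion is chosen by which
accessor the application calls, not by a schema or an xsi:type.

```java
import java.io.StringReader;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper; it only mimics the spirit of a typed access API.
public class TypedAccessSketch {

    /** The caller asks for bytes, so the element text is decoded as base64. */
    static byte[] readElementAsBinary(String xml) {
        return Base64.getDecoder().decode(readElementText(xml).trim());
    }

    /** The caller asks for an int, so the same kind of text is parsed as a number. */
    static int readElementAsInt(String xml) {
        return Integer.parseInt(readElementText(xml).trim());
    }

    /** Returns the text content of the document's root element. */
    private static String readElementText(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            reader.nextTag();                        // move to the root start tag
            String text = reader.getElementText();   // text-only content expected
            reader.close();
            return text;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = readElementAsBinary("<foo>AQID</foo>");
        System.out.println(blob.length + " bytes");
        System.out.println(readElementAsInt("<count> 42 </count>"));
    }
}
```

A binary format could keep the same accessors and skip the text round
trip when the stored representation already matches the requested type.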

For what it is worth, this is the approach taken with the "typed access
API" of the stax2 extension API.
For EXI, perhaps it could be used as a way to generate the necessary
schema information when generating (writing) content and, on the
receiving end, to figure out which conversions, if any, are needed
between the raw/native data format and the format the application wants
to use. I don't know if this would even fall within the scope of what
EXI tries to do.

I don't claim that it would be feasible to consider such an approach
for EXI (both due to timing and since EXI does not define API level
aspects), I just wanted to mention that there's more than one way to
slice the cake. Not everything has to be handled at XML level.

-+ Tatu +-

Received on Thursday, 8 January 2009 21:05:01 UTC