RE: Constraint on string tables from Stanley A. Klein on 2007-08-07 (public-exi@w3.org from August 2007)

From: Stanley A. Klein <sklein@cpcug.org>
Date: Tue, 7 Aug 2007 16:37:13 -0400 (EDT)
To: "Vogelheim, Daniel" <daniel.vogelheim@siemens.com>
Cc: public-exi@w3.org
Message-ID: <9874.207.188.248.157.1186519033.squirrel@www.cpcug.org>
Daniel -

Thanks for responding.

I understand the concern of the working group regarding negotiation and
its complexities.  However, you do define string tables with some initial
entry strings that you happen to know would be very generally useful for
XML.  It might be good to simply allow a user community to agree off-line
on which strings would be useful in their particular use case and context,
and to allow those strings to be defined as additional initial entries as an
extension.  No on-line negotiation would be involved, just the knowledge that
the particular implementation is intended for that particular use case.
Everything else could be the same, except perhaps for a binary flag in the
header indicating that this is happening and a name for the use case
somewhere in the header.
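
To illustrate the idea, here is a minimal Python sketch.  It is not the EXI
algorithm or any EXI library's API; the class name, the profile name
"example-telemetry", the object names, and the short list of XML entries are
all assumptions made up for illustration.

```python
# Hypothetical sketch of a string table with optional use-case entries.
# Nothing here is taken from the EXI draft; all names are illustrative.

# Entries every processor would start with (illustrative values only).
XML_INITIAL_ENTRIES = [
    "http://www.w3.org/XML/1998/namespace",
    "http://www.w3.org/2001/XMLSchema-instance",
]

# Profiles agreed off-line by a user community (made-up example).
USE_CASE_ENTRIES = {
    "example-telemetry": ["plant/pump1/flow", "plant/pump1/pressure"],
}


class StringTable:
    def __init__(self, use_case=None):
        self.strings = []   # id -> string
        self.index = {}     # string -> id
        for s in XML_INITIAL_ENTRIES:
            self.add(s)
        # The named use case only adds more initial entries; nothing else changes.
        if use_case is not None:
            for s in USE_CASE_ENTRIES[use_case]:
                self.add(s)

    def add(self, s):
        if s not in self.index:
            self.index[s] = len(self.strings)
            self.strings.append(s)
        return self.index[s]

    def encode(self, s):
        """Return ('hit', id) for a known string, else ('miss', s) and learn it."""
        if s in self.index:
            return ("hit", self.index[s])
        self.add(s)
        return ("miss", s)
```

With the profile named, the first occurrence of an agreed string is already a
table hit; without it, the string has to be sent literally once.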

You already have something called a pluggable codec, which I don't
understand.  This would be a pluggable string table initial entry
extension, treated in a comparable manner (i.e., indicated as an option
and named).

Regarding the option of using enumerations in the schema, this would be
difficult for my use case.  The messages generally consist of object names
and object values.  The object names are constructed of standardized
sub-strings concatenated in a manner similar to file naming.  The actual
construction of the object names is difficult to describe in a schema and
leads to numerous complications, so it is best to just define the names as
strings.  The schema can be of some help regarding strings in element and
attribute names, but not the object names.

Regarding the option of using fragments: this would be useful, as long as
the EXI document/fragment itself is a fictional construct that never
actually exists and whose history except for string tables and some other
state information can be forgotten after each message is processed.  If
the incremental processing has to build the incrementally received nodes
into a document and retain the document, that would not work.  The
document would grow with each message and eventually become unwieldy.
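
A minimal Python sketch of the behavior I have in mind follows.  It is an
assumption-laden illustration, not the EXI fragment mechanism: the
SessionEncoder class, its output tuples, and the object name are hypothetical.
The only state carried from one message to the next is the string table; the
encoded nodes themselves are handed to the caller and forgotten.

```python
# Hypothetical sketch: per-session string state, no retained document.
class SessionEncoder:
    def __init__(self):
        # string -> compact id; this is the only state that persists.
        self.index = {}

    def encode_message(self, pairs):
        """Encode one message given as (object_name, value) pairs."""
        out = []
        for name, value in pairs:
            if name in self.index:
                # Name seen earlier in the session: send its id only.
                out.append(("id", self.index[name], value))
            else:
                # First occurrence: send the literal and remember it.
                self.index[name] = len(self.index)
                out.append(("literal", name, value))
        # The encoded nodes are returned and not stored anywhere, so memory
        # grows with the number of distinct strings, not with message count.
        return out
```

For example, sending the same object name in two successive messages yields a
literal the first time and a compact id the second time, while the encoder
holds no record of the first message's nodes.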

I hope this clarifies my concerns.


Stan Klein



On Tue, August 7, 2007 7:25 am, Vogelheim, Daniel wrote:
> Hello Stanley,
>
> Thanks for your interest in EXI!
>
> Sorry that it took me so long; but let me try to give you an answer,
> after discussion with the EXI WG:
>
>> Section 7.3 specifies that:
>>
>> "The life cycle of a string table spans the processing of a single EXI
>> stream. String tables are not represented in an EXI stream or exchanged
>> between EXI processors. A string table cannot be reused across multiple
>> EXI streams; therefore, EXI processors MUST use a string table that is
>> equivalent to the one that would have been newly created and pre-populated
>> with initial values for processing each EXI stream."
>>
>> Why is this constraint included?  It appears to imply that the string
>> table must be rebuilt for each message in a communications context.  If a
>> number of strings are known to occur repeatedly in a particular
>> communications context, why not allow the string table to be optionally
>> pre-populated for that context, just as schemas are permitted to be
>> optionally specified?
>
> First, your understanding of the spec's intention is correct. :-)
>
> Second, why was this done? EXI works more efficiently by having more
> knowledge about the data at hand. In the current design, all such
> a-priori knowledge is encoded in XML Schema (other than the generic
> knowledge of XML / XML Namespaces itself). Adding additional sources of
> such knowledge has a number of far reaching consequences on design
> complexity and especially for deployment. For example, content
> negotiation becomes significantly more complex if there are multiple
> items that need to be negotiated before an EXI document could be
> understood. The current feeling of the group is that the added
> complexity would be rather detrimental.
>
> Third, what other options are there? There are two alternatives I'd like
> you to consider:
>
> If the known repeated strings occur in particular contexts, you might
> consider writing a schema for encoding purposes that would use
> enumerations in those particular places. Note that due to full support
> for schema deviations, you can still encode arbitrary XML documents.
> It's just that by supplying an appropriately modified schema with
> enumerations, you can inform the EXI processor about exactly which
> strings are expected in which places.
>
> In case you are more worried about a sequence of messages and wish to
> preserve state across transmissions, you might consider encoding the
> entire message sequence as a single EXI fragment. That is, from the EXI
> point of view the entire exchange is a single EXI document. The root of
> each individual message would then essentially add one node to the
> fragment. Once each such message/fragment node is encoded, it could be
> incrementally transmitted in order to achieve the actual message
> exchange. That would preserve all encoding state throughout the
> entire session, and fully within the framework of the current EXI draft.
>
>
> Stan, I hope this answers your question. If not, please let us know what
> you think!
>
>
> Sincerely,
> Daniel Vogelheim
>


--
Received on Tuesday, 7 August 2007 20:18:41 UTC