RE: [EXI] Partial flush

Hi Jochen,

EXI provides a facility called "self-contained elements" that can be
used to enable functions such as indexing and skipping.

Self-contained elements are independent of the rest of the document.
A self-contained element uses its own string table, dynamic grammars
and namespace prefixes, and is therefore immediately decodable without
the surrounding context.
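
To illustrate why a private string table makes an element decodable on
its own, here is a minimal sketch in Python. It is a toy model, not the
EXI wire format: the token shapes ("new"/"ref") and the names encode,
decode and can_decode_alone are hypothetical.

```python
def encode(values, table):
    """Emit ("new", s) the first time a string is seen, ("ref", i) afterwards."""
    out = []
    for s in values:
        if s in table:
            out.append(("ref", table.index(s)))
        else:
            table.append(s)
            out.append(("new", s))
    return out


def decode(tokens, table):
    """Rebuild the strings; a "ref" fails if the table lacks the entry."""
    out = []
    for kind, payload in tokens:
        if kind == "new":
            table.append(payload)
            out.append(payload)
        else:
            out.append(table[payload])
    return out


def can_decode_alone(chunk):
    """Try to decode a chunk with an empty table, i.e. without its context."""
    try:
        decode(chunk, [])
        return True
    except IndexError:
        return False


elements = [["alice", "online"], ["bob", "online"]]

# One table shared by the whole document: the second element refers back
# to "online", which was added while encoding the first element.
shared_table = []
shared_chunks = [encode(e, shared_table) for e in elements]

# Toy stand-in for self-contained elements: a fresh table per element.
sc_chunks = [encode(e, []) for e in elements]

print(can_decode_alone(shared_chunks[1]))  # False
print(can_decode_alone(sc_chunks[1]))      # True
```

The price of the independence is visible in the output: the second
self-contained chunk has to carry "online" in full again.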

The semantics of the SC event, which represents a self-contained
element, are described in detail in section 8.5.4.4.1 "Adding
Productions when Strict is False".

To use self-contained elements, the "selfContained" option in the
header needs to be turned on. Note that this option is mutually
exclusive with the "compression" option: self-contained elements can
be used only in non-compression mode.
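
An implementation would typically reject the invalid combination when
the options are set. A minimal sketch in Python, assuming a
hypothetical ExiOptions holder (not the API of any real EXI library):

```python
from dataclasses import dataclass


@dataclass
class ExiOptions:
    """Hypothetical holder for two EXI header options."""
    self_contained: bool = False
    compression: bool = False


def validate(options):
    """Reject the combination described above: selfContained with compression."""
    if options.self_contained and options.compression:
        raise ValueError(
            "selfContained is mutually exclusive with compression")
    return options


validate(ExiOptions(self_contained=True))   # accepted
validate(ExiOptions(compression=True))      # accepted

try:
    validate(ExiOptions(self_contained=True, compression=True))
    rejected = False
except ValueError:
    rejected = True
```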

Please direct any further comments about the EXI specification, in
response to this message, to the following mailing list:
public-exi-comments@w3.org.

Hope it helps,

-taki


-----Original Message-----
From: public-exi-request@w3.org [mailto:public-exi-request@w3.org] On Behalf Of Jochen Darley
Sent: Thursday, September 25, 2008 1:07 PM
To: Yuri Delendik
Cc: Daniel Peintner; public-exi@w3.org
Subject: Re: [EXI] Partial flush


Hello Yuri & Daniel,

Yuri Delendik wrote:
> By buffer I meant 1 byte (8 bits). Currently there is no way for an EXI encoder to send an incomplete byte to the underlying channel.
>
> Let's use the XMPP example. XML stream data will be sent in the following manner without closing the underlying channel: XML, {time delay}, XML, {time delay}, ..., XML.
>
> For example:
> <stream>
> {some time delay}
> <stanza1>.</stanza1>
> {some time delay}
> <stanza2>.</stanza2>
> <stanza1>.</stanza1>
> {some time delay}
> ..
> </stream>
>
> An EXI encoder may hold bits after <stream> or </stanza1> and not send them to the channel, but the other party expects that data as soon as possible; those stanzas can be time-critical material.
>
> ZLIB has partial flush functionality that aligns bits to a byte boundary and sends the compressed data to the channel, making it available to the other party, without terminating the compression process.
>
> Compression's blockSize will not help in this case either.
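
For reference, the zlib behaviour Yuri describes can be sketched with
Python's standard zlib module. The sketch uses Z_SYNC_FLUSH, which pads
the output to a byte boundary without ending the stream; that this is
the flush mode Yuri has in mind is an assumption, and the stanza
contents are made up.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

# Made-up stanzas standing in for the XMPP traffic in the example above.
stanzas = [b"<stream>", b"<stanza1>a</stanza1>", b"<stanza2>b</stanza2>"]

received = []
for stanza in stanzas:
    # Compress one stanza, then flush so that everything fed so far is
    # byte-aligned and can be sent now. The stream stays open.
    chunk = compressor.compress(stanza) + compressor.flush(zlib.Z_SYNC_FLUSH)
    # The receiver can recover the stanza from this chunk alone, using
    # the state it built up from the earlier chunks.
    received.append(decompressor.decompress(chunk))

print(received == stanzas)  # True
```

Each stanza becomes available to the receiver as soon as its chunk
arrives, at the cost of a few extra bytes per flush.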

This seems to be a valid point. I also wonder why a block can't hold
fewer than blockSize AT/CH values. The specification states that every
block has to contain "...the minimum set of consecutive events that have
_exactly blockSize_ Attribute (AT) and Character (CH) values..." [3].
If I understood that chapter correctly, EXI does not allow "flushing"
partial blocks, because such a block has fewer than "_exactly
blockSize_" AT and CH values.
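
The effect of this reading of [3] can be sketched in Python. This is
only a model of the rule as quoted above, with hypothetical event
tuples and a hypothetical split_into_blocks function; it does not model
what happens at the end of the stream.

```python
def split_into_blocks(events, block_size):
    """Close a block as soon as it holds exactly block_size AT/CH values.

    Returns (complete_blocks, pending_events). Pending events are those
    that cannot be emitted yet under the quoted rule.
    """
    complete, pending, values = [], [], 0
    for event in events:
        pending.append(event)
        if event[0] in ("AT", "CH"):
            values += 1
            if values == block_size:
                complete.append(pending)
                pending, values = [], 0
    return complete, pending


# Two small stanzas: start element, character data, end element.
events = [
    ("SE", "stanza1"), ("CH", "hi"), ("EE",),
    ("SE", "stanza2"), ("CH", "there"), ("EE",),
]

complete, pending = split_into_blocks(events, block_size=3)
print(len(complete), len(pending))  # 0 6
```

With only two values seen and a blockSize of 3, no block is complete,
so both stanzas stay buffered in the encoder.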

Yuri used a good example with common use cases:
1) XML LOG streams
2) XML based news feeds
3) XMPP messages (over an uninterrupted HTTP connection)
....

With the current format limitations the data will be delayed by "hours"
and/or be incomplete, unless I missed something.

I've been wondering whether skipping could be supported by EXI. XML
does not support skipping, but does not prevent it either. For example,
after adding an attribute holding the (byte) size of the element's
content, the content can be skipped during reading.
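
The same idea can be sketched in Python with a binary length prefix
instead of an XML attribute (a swapped-in analogue, chosen to keep the
example short). The 4-byte big-endian prefix and the function names are
assumptions for illustration only.

```python
import struct


def write_records(payloads):
    """Prefix each payload with its byte size (4 bytes, big-endian)."""
    out = bytearray()
    for payload in payloads:
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def read_record(data, index):
    """Return record number `index`, skipping earlier ones unread."""
    offset = 0
    for _ in range(index):
        (size,) = struct.unpack_from(">I", data, offset)
        offset += 4 + size  # jump over the content using only its size
    (size,) = struct.unpack_from(">I", data, offset)
    return data[offset + 4 : offset + 4 + size]


data = write_records([b"<a/>", b"<bb/>", b"<ccc/>"])
print(read_record(data, 2))  # b'<ccc/>'
```

Skipping works here only because each record can be understood without
the records before it.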

So I thought about the dependencies of blocks:

What do I need to read a block in the middle of the stream?
1) some global dependencies: Schema, Compression Options
2) incremental String Table (depends on many/all previous blocks?)
3) ???

If I understood the spec correctly there is no block dependency for
classic compression (e.g. deflate), as each block is compressed separately.
As far as I can tell, EXI does _prevent skipping_, as the string table
cannot be reset or written to the stream.

Is it possible for EXI to have fewer dependencies between blocks?
Compressing each block (packet?) as much as possible, while keeping
blocks as independent as possible, would help to support skipping.

Thanks,
Jochen Darley

> [1] http://www.w3.org/TR/exi/#compression
> [2] http://www.w3.org/TR/exi/#key-blockSizeOption
[3] http://www.w3.org/TR/exi/#blocks

Received on Thursday, 30 October 2008 01:30:20 UTC