RE: "7.3.3 Partitions Optimized for Frequent use of String Literals"

Hi Arman,

> So a value is getting added to the local value partition when it is
> absent in both the local and the global value partitions. This
> implicitly defines that the value shall be added to only one of the
> "local" value partitions.

Your understanding is correct.

>  I think it would be great if the spec would state so explicitly.

We believe that the specification already states this characteristic. Nevertheless, we will look into clarifying this property in a future version of the specification.

> I wonder why the group settled on such a design for the
> local value table. It is not unusual for the same value to
> repeat in the content of elements with different names. 

This characteristic of the string table design is intentional. It facilitates optimized string table implementations with better processing efficiency, because an encoder only needs to look up a string in a single table, and it greatly simplifies the implementation of bounded value partitions.

Although more elaborate designs are conceivable, the WG concluded that they are not very attractive, considering their negative impact on processing efficiency and the marginal gain in compactness that they would bring.
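
As an illustration only, here is a simplified Python sketch of why the "one local partition per string" rule allows a single lookup. It is not the normative algorithm from the specification: the class name ValueTable, the method names, the returned tuples and the round-robin reuse of global identifiers are assumptions made for this example. Because every string has exactly one owning qualified name, one dictionary keyed by the string can answer both the local and the global question, and evicting a string from a bounded table touches exactly one entry.

```python
class ValueTable:
    """Toy model of EXI value partitions (hypothetical, for illustration)."""

    def __init__(self, capacity=None):
        self.capacity = capacity   # None means unbounded (assumption of this sketch)
        self.entries = {}          # string -> (owner qname, local id, global id)
        self.slots = []            # global id -> string currently stored there
        self.local_sizes = {}      # qname -> number of local ids handed out so far
        self.next_global = 0       # next global id to assign or reuse

    def encode(self, qname, value):
        """Classify a value as a local hit, a global hit, or a miss."""
        entry = self.entries.get(value)   # the only table lookup needed
        if entry is not None:
            owner, local_id, global_id = entry
            if owner == qname:
                return ("local-hit", local_id)
            return ("global-hit", global_id)
        self._add(qname, value)
        return ("miss", value)

    def _add(self, qname, value):
        """Add a missed value to the global partition and ONE local partition."""
        if self.capacity == 0:
            return                         # nothing is ever stored
        gid = self.next_global
        if gid < len(self.slots):
            # Bounded table has wrapped around: drop the old string. Removing
            # its single entry removes it from the global partition and from
            # its one local partition at the same time.
            del self.entries[self.slots[gid]]
            self.slots[gid] = value
        else:
            self.slots.append(value)
        lid = self.local_sizes.get(qname, 0)
        self.local_sizes[qname] = lid + 1  # local ids are not renumbered
        self.entries[value] = (qname, lid, gid)
        self.next_global = gid + 1
        if self.capacity is not None and self.next_global == self.capacity:
            self.next_global = 0


table = ValueTable()
results = [
    table.encode("hairColor", "Black"),
    table.encode("skinColor", "Black"),
    table.encode("boxColor", "Red"),
    table.encode("hairColor", "Black"),
]
print(results)
```

With the element names from the example in the original question, "Black" is a miss under hairColor, a global hit under skinColor, and a local hit when it appears under hairColor again. In each case the encoder performs one dictionary lookup, regardless of how many qualified names exist.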

Hope this helps to explain our design decision,

Thanks,

-- Daniel


________________________________________
From: Arman Djusupov [arman@noemax.com]
Sent: Thursday, 24 February 2011 13:30
To: Peintner, Daniel
Cc: public-exi@w3.org; 'Efficient XML Interchange WG'
Subject: Re: "7.3.3 Partitions Optimized for Frequent use of String Literals"

Ah, I see.

So a value is getting added to the local value partition when it is absent in both the local and the global value partitions. This implicitly defines that the value shall be added to only one of the "local" value partitions. I think it would be great if the spec would state so explicitly.

I wonder why the group settled on such a design for the local value table. It is not unusual for the same value to repeat in the content of elements with different names. For example:

<barbie>
     <hairColor>Black</hairColor>
     <skinColor>Black</skinColor>
     <eyeColor>Black</eyeColor>
     <barbieCarColor>Red</barbieCarColor>
     <boxColor>Red</boxColor>
</barbie>

While in the first case ("hairColor") the EXI encoder will have to check against only a single value in the table, in the second case it will have to find the value in the global value table (which might have a few hundred thousand strings already) just because "Black" can only be added to a single "local" table.

Or did I miss something?

With best regards,
Arman



On 2/24/2011 1:33 PM, Peintner, Daniel wrote:
> Hi Arman,
>
> Thank you for your feedback on the EXI specification.
>
>> As far as I understand the "local" value partition is associated to the
>> current qualified name in the scope. So the value *V* that is getting
>> removed from the "local" partition may also be present in the "local"
>> partitions associated to other qualified names. Does this mean that value
>> *V* should be removed only from the "local" partition of the qualified
>> name in the current scope or from all "local" partitions that include this
>> value?
>
> In EXI a string optimized for frequent use is assigned to two partitions, a "local" value partition and the global value partition. Once a string has been added to a local partition it is not possible to add the same string to any other local value partition (unless it has been removed again) [1].
>
> This implies that removing a string value, due to string table bounds, means removing the appearance of the string in the global partition and the "one" associated local value partition.
>
> Hope this helps,
>
> -- Daniel
>
> [1] http://www.w3.org/TR/exi/#encodingOptimizedForMisses
>

Received on Wednesday, 2 March 2011 18:34:51 UTC