- From: Taki Kamiya <tkamiya@us.fujitsu.com>
- Date: Wed, 31 Mar 2010 00:21:24 -0700
- To: "'FABLET Youenn'" <Youenn.Fablet@crf.canon.fr>, <public-exi-comments@w3.org>
- Message-ID: <A62E0F8E9D9243869063A2F5AFCE83A7@homunculus>
Hi Youenn,

The strict schema-informed mode has been formulated to be competitive with some existing highly optimized formats. Indeed, it would not be an overstatement to say that this is the primary purpose of the strict schema-informed mode -- to be competitive in situations where every bit counts.

Please also note that such occurrences of xsi:type are recoverable at the receiver's end. Since a decoder knows which elements in an EXI stream use their natural global types, some decoder implementations may want to decorate such elements with xsi:type information when it serves applications accessing the data.

Hope this helps,

-taki

_____

From: public-exi-comments-request@w3.org [mailto:public-exi-comments-request@w3.org] On Behalf Of FABLET Youenn
Sent: Friday, January 29, 2010 6:14 AM
To: public-exi-comments@w3.org
Subject: xsi:type in strict mode

Dear all,

This is feedback related to the EXI specification.

Currently, according to http://www.w3.org/TR/exi/#addingProductionsStrict, a @xsi:type production is added only when named subtypes are known to the EXI processor. The intention, AIUI, is that as few @xsi:type productions as possible are actually added to grammars so as to get some compression gain.

I see a practical drawback with this current approach. Some XML documents, valid according to the XML schema components used to generate the grammars, will not be encodable by EXI encoders. Given the following schema and instance:

    <xs:schema>
      <xs:element name="test" type="xs:base64Binary"/>
    </xs:schema>

    <test xsi:type="xs:base64Binary" ./>

The instance is valid as per the schema but, according to our current interpretation, is not encodable in strict mode. If that is not correct, could you clarify the specification? If that is correct, this is clearly not practical, since one of the strict mode design goals was to encode at least all XML schema valid documents.

This case does not happen often currently. However, applications may start adding @xsi:type information more often to ensure value typing, even in schemaless mode.

The additional issue is that, in some environments, it may be tempting to use a global schema at the application level and a subset for the EXI transmission (to fit the lightest devices). In those cases, this issue may cause trouble, since perfectly application-schema-valid documents will not be encodable in strict mode, depending of course on the exact EXI schema subset. It could be expected that documents valid according to a particular schema could be encoded in strict mode with a subset of that schema. Also, adding new grammars to a given grammar set would potentially require modifications of the grammars themselves.

Getting back to the actual compression gain of the current approach, the benefit is not high. At most, it saves 1 bit per element (in bit-packed mode only; there is no difference in compression mode). In practice, I even doubt that the compression gain is that high:

- the @xsi:type production is often added to simple typed elements ("xs:string" typed elements, e.g.)
- the @xsi:type production has a minor impact on the code length as soon as the element content has optional attribute/child items

At the very least, the rule could be changed so that a @xsi:type production is added for all elements whose content is defined by a global type definition. This seems more in line with the XML Schema specification. In addition, schema writers who want to squeeze out as many bits as possible could inline their schema type definitions.

Regards,

youenn
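
The "at most 1 bit per element" estimate in the quoted message can be checked with a small sketch. It assumes bit-packed alignment, where a one-part event code chosen among m productions takes ceil(log2(m)) bits, and it assumes the extra @xsi:type production is added as one more single-part alternative in the element's first grammar state. The function names `event_code_bits` and `xsi_type_cost` are hypothetical and are not taken from any EXI implementation.

```python
def event_code_bits(production_count: int) -> int:
    """Bits needed for a one-part event code among `production_count`
    alternatives, i.e. ceil(log2(production_count)), in bit-packed mode."""
    if production_count < 1:
        raise ValueError("a grammar state needs at least one production")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (production_count - 1).bit_length()


def xsi_type_cost(production_count: int) -> int:
    """Extra bits per element if one more production (assumed here to be
    the @xsi:type production) joins `production_count` existing ones."""
    return event_code_bits(production_count + 1) - event_code_bits(production_count)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, event_code_bits(n), xsi_type_cost(n))
```

Under these assumptions the cost is 1 bit only when the existing production count is a power of two (1, 2, 4, 8, ...) and 0 bits otherwise, which is consistent with the remark that the impact shrinks once an element has optional attribute or child items.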
Received on Wednesday, 31 March 2010 07:22:11 UTC