- From: FABLET Youenn <Youenn.Fablet@crf.canon.fr>
- Date: Thu, 23 Sep 2010 10:38:27 +0200
- To: "public-exi-comments@w3.org" <public-exi-comments@w3.org>, FABLET Youenn <Youenn.Fablet@crf.canon.fr>
- Message-ID: <52D873DC3403944C840CF3063ABE43A701932E2604@Nina.crf.canon.fr>
Thanks for the comment. The WG reviewed your comment and decided to keep the current consensus.

The WG would first like to state that one important goal of the schema-informed strict mode is to maximize compression for most schema-valid documents. Some decisions were made to favor common documents at the expense of less common ones. In this particular case, the compression benefit was judged to outweigh the ability to encode in strict mode some uncommon schema-valid documents.

The WG would also like to note that the default schema-informed mode works well for these documents and will generally achieve very good compression results.

Regards,
Youenn

From: FABLET Youenn
Sent: Friday, 29 January 2010 15:14
To: 'public-exi-comments@w3.org'
Subject: xsi:type in strict mode

Dear all,

This is feedback related to the EXI specification.

Currently, according to http://www.w3.org/TR/exi/#addingProductionsStrict, a @xsi:type production is added only when named subtypes are known to the EXI processor. The intention, AIUI, is to add as few @xsi:type productions as possible to the grammars so as to gain some compression.

I see a practical drawback with the current approach. Some XML documents that are valid according to the XML schema components used to generate the grammars will not be encodable by EXI encoders. Given the following schema and instance:

  <xs:schema>
    <xs:element name="test" type="xs:base64Binary"/>
  </xs:schema>

  <test xsi:type="xs:base64Binary" .../>

The instance is valid as per the schema but, according to our current interpretation, is not encodable in strict mode. If that is not correct, could you clarify the specification? If that is correct, this is clearly not practical, since one of the strict mode design goals was to encode at least all XML schema valid documents. This case does not happen often currently.
However, applications may come to add @xsi:type information more often to ensure value typing, even in schemaless mode.

An additional issue is that, in some environments, it may be tempting to use a global schema at the application level and a subset of it for the EXI transmission (to fit the lightest devices). In those cases, this issue may cause trouble, since documents that are perfectly valid against the application schema will not be encodable in strict mode, depending of course on the exact EXI schema subset. One could expect that documents valid according to a particular schema can be encoded in strict mode with a subset of that schema. Also, adding new grammars to a given grammar set may require modifying the existing grammars themselves.

Coming back to the actual compression gain of the current approach, the benefit is not high. At most, it gains 1 bit per element (in bit-packed mode only; there is no difference in compression mode). In practice, I doubt that the compression gain is even that high:

- A @xsi:type production is often added to simple-typed elements (e.g. "xs:string"-typed elements).
- A @xsi:type production has a minor impact on the code length as soon as the element content has optional attribute/child items.

At the very least, the rule could be changed so that a @xsi:type production is added for all elements whose content is defined by a global type definition. This seems more in line with the XML Schema specification. In addition, schema writers who want to squeeze out as many bits as possible could inline their schema type definitions.

Regards,
youenn
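
To make the "at most 1 bit" estimate concrete, here is a rough sketch of the arithmetic. It assumes a simplified model in which one part of a bit-packed event code is written as an n-bit unsigned integer with n = ceil(log2(m)) bits for m distinct values, and in which adding a @xsi:type production raises m by exactly one. The function names are hypothetical and not taken from the specification or from any EXI implementation.

```python
def code_bits(distinct_values: int) -> int:
    """Bits needed for one event-code part with `distinct_values` alternatives.

    Assumption: the part is an n-bit unsigned integer with
    n = ceil(log2(distinct_values)), computed here with integer arithmetic.
    """
    if distinct_values < 1:
        raise ValueError("a grammar state needs at least one production")
    return (distinct_values - 1).bit_length()


def extra_bits_for_one_more(distinct_values: int) -> int:
    """Extra bits paid when one more alternative (e.g. @xsi:type) is added."""
    return code_bits(distinct_values + 1) - code_bits(distinct_values)


if __name__ == "__main__":
    # Show the cost of one additional alternative for small grammar states.
    for m in range(1, 9):
        print(m, code_bits(m), extra_bits_for_one_more(m))
```

Under this model the extra alternative costs a bit only when the previous count was a power of two (1, 2, 4, 8, ...); for other counts the code width is unchanged, which is consistent with the point that the impact is minor once the element content already has optional attribute/child items.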
Received on Thursday, 23 September 2010 08:39:04 UTC