RE: EXI LC Comments

Hi Youenn,

Thanks much for your question and continued interest in EXI.

>(2) Guidelines for schema modeling
>Is there any guideline regarding the relationship between EXI and schema modeling?
>Guidelines would be useful to understand the impact of some schema modeling decisions
>on EXI encoding/decoding in terms of efficiency and compression.
>For instance, it seems that the more global constructs (elements, types, attributes),
>the bigger will be the generated grammars since all global schema constructs need to be
>kept (right?), having a lot of xs:all or maxOccurs="999" may also hurt efficiency.
>See also question 3)


It seems that there are two aspects to your question.  One pertains to design issues involved in creating XML Schemas for optimizing EXI encoding compactness.  The other relates to the internal representation that an EXI implementation might use to perform schema-based encoding.

Concerning designing a schema to get more compact EXI encodings, the general rule would be, "More precise content constraints and more deterministic structure lead to more compact EXI encodings."  Examples of applying this general rule are:

-Carefully think about minOccurs and maxOccurs of particles, and requirement (vs. optionality) of attributes. The more you exploit those properties, the more compactness you can achieve.

-Use substitutionGroup, instead of depending on the xsi:type mechanism, whenever possible.

-Use wildcards only when absolutely necessary. Use concrete elements and attributes if possible.

The above are simple guidelines.  However, some questions related to the general rule are not so simple.  One example: where is the right location for a required element in a model group?  In the first schema fragment below, the required element "D" comes first.  Placing "D" after "C" (as shown in the second schema fragment) would increase determinism.  The reason relates to the number of possible alternatives for elements which may follow "B".  In the first fragment, any of the elements "C", "W", "X", "Y" and "Z" may appear after "B".  In the second fragment, only "C" or "D" may follow "B".  The second fragment is more deterministic and should result in a more compact encoding.

<sequence>
  <element name="D" />
  <element name="B" minOccurs="0" />
  <element name="C" minOccurs="0" />
  <choice>
    <element name="W" />
    <element name="X" />
    <element name="Y" />
    <element name="Z" />
  </choice>
</sequence>

<sequence>
  <element name="B" minOccurs="0" />
  <element name="C" minOccurs="0" />
  <element name="D" />
  <choice>
    <element name="W" />
    <element name="X" />
    <element name="Y" />
    <element name="Z" />
  </choice>
</sequence>
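
To make the comparison concrete, here is a rough sketch (not part of the EXI specification or any implementation) that lists which elements may directly follow "B" in each fragment and estimates the event code width at that point.  It assumes every particle has maxOccurs="1", as in the fragments above, and it assumes a bit-packed, strict schema-informed encoding in which a grammar state with m alternatives is coded in ceil(log2(m)) bits.  The names FIRST, SECOND, followers and event_code_bits are made up for this illustration.

```python
from math import ceil, log2

# Each particle is (element names, minOccurs); a choice lists several names.
# Simplifying assumption: maxOccurs is always 1, as in the two fragments.
FIRST = [(["D"], 1), (["B"], 0), (["C"], 0), (["W", "X", "Y", "Z"], 1)]
SECOND = [(["B"], 0), (["C"], 0), (["D"], 1), (["W", "X", "Y", "Z"], 1)]


def followers(particles, name):
    """Return the names that may appear immediately after `name`."""
    index = next(i for i, (names, _) in enumerate(particles) if name in names)
    result = []
    for names, min_occurs in particles[index + 1:]:
        result.extend(names)
        if min_occurs > 0:  # a required particle cannot be skipped
            return result
    # Only optional particles remained, so the end of the element is possible.
    result.append("EE")
    return result


def event_code_bits(alternatives):
    """Bits needed to pick one alternative (assumed ceil(log2(m)) rule)."""
    count = len(alternatives)
    return ceil(log2(count)) if count > 1 else 0


for label, particles in (("first", FIRST), ("second", SECOND)):
    alternatives = followers(particles, "B")
    print(label, alternatives, event_code_bits(alternatives))
```

Under these assumptions the first fragment offers five alternatives after "B" (3 bits) and the second offers two, "C" or "D" (1 bit).  Real encoders may differ, for example in non-strict mode where additional productions are present in each grammar state.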


Regarding the grammar size relative to schema size (which is one of the issues we believe you raise), this is the nature of schema processing.  In this respect EXI is no different from schema validation as performed by XML parsers using XML Schemas.  In principle, grammar size is proportional to the number of definitions and constructs used in the schema.  This should be a consideration when generating schema-informed EXI encodings.  The question of how much schema information should be stored and applied is important, because in some cases there are two conflicting goals.  The first is the desire to limit grammar footprint and the second is the desire to increase determinism of the grammar.

So this is a practical issue that each use case has to cope with, rather than an inherent issue with EXI.  We have not, as a group, discussed this, nor was it in our plans to develop these kinds of implementation guidelines.  While some schemas may cause footprint issues for a straightforward implementation of the specification, implementation reports from within the WG indicate that there are implementation techniques that resolve such footprint issues.

We may look into schema modeling guidelines, but only after other charter-driven tasks are completed, and if time allows.  It seems that guidelines addressing issues in implementing an XML schema validation parser would be pertinent to your questions (especially with respect to your question about grammar size).  Considering this, the XML Schema group may have better insight than the EXI group can provide at this time.

Thanks again,

Mike Cokus (for the EXI working group)

Mike Cokus
The MITRE Corporation
757-896-8553; 757-826-8316 (fax)
903 Enterprise Parkway
Hampton, VA 23666

Received on Monday, 2 March 2009 16:05:55 UTC