- From: Daniel Peintner <daniel.peintner@gmail.com>
- Date: Thu, 26 Feb 2009 10:39:33 +0100
- To: FABLET Youenn <Youenn.Fablet@crf.canon.fr>
- Cc: public-exi-comments@w3.org, Efficient XML Interchange WG <member-exi-wg@w3.org>
Hello Youenn,

thank you very much for your comment.

> 5) EXI schema-less/schema-informed modes
> Based on internal discussions and internal feedback, there is a
> general assumption that the EXI specification somehow defines
> two separate modes (schema-less and schema-informed).
>
> While this is clearly stated in the specification that both modes
> easily coexist in a single EXI stream, additional advertisement
> (maybe in the primer) of that feature may be good for adoption.
>
> The latest published primer (dec 2007) could maybe be improved with
> that respect.

You are right in assuming that schema-informed and built-in grammars
may coexist in the same EXI stream. The first published EXI primer
document [1] uses incorrect terminology in that regard. A revised
version will be available soon and will integrate your comments,
besides other improvements and fixes for spec consistency issues.

Thanks again for pointing us to the problem,

-- Daniel

[1] http://www.w3.org/TR/2007/WD-exi-primer-20071219/

On Thu, Nov 6, 2008 at 5:15 PM, FABLET Youenn <Youenn.Fablet@crf.canon.fr> wrote:
> Dear EXI WG,
>
> please find below some comments and questions regarding EXI specification
> last call working draft.
>
> Regards,
>
> youenn fablet
>
> 1) Some facets are supported like minInclusive or maxExclusive.
>
> What about the support of the length, minLength and maxLength facets which
> could be useful to better encode string or list sizes.
>
> It should not be too difficult to support them based on current facet
> support.
>
> Is there a rationale to not include these facets?
>
> 2) Guidelines for schema modeling
>
> Is there any guideline regarding the relationship between EXI and schema
> modeling?
>
> Guidelines would be useful to understand the impact of some schema modeling
> decisions on EXI encoding/decoding in terms of efficiency and compression.
>
> For instance, it seems that the more global constructs (elements, types,
> attributes), the bigger will be the generated grammars since all global
> schema constructs need to be kept (right?),
>
> having a lot of xs:all or maxOccurs="999" may also hurt efficiency.
>
> See also question 3)
>
> 3) DataTypeRepresentationType question
>
> I would like a confirmation of the current DataTypeRepresentationType
> behaviour.
> Let's have a schema with the following attribute definition:
>
> <xs:attribute name="test" type="xs:string"/>
>
> In that case, the only way to change the encoding for @test values with the
> DataTypeRepresentationType feature
>
> is to redefine xs:string which may have great impact.
>
> If we only want to change the @test values with the
> DataTypeRepresentationType feature, we would need to
>
> change the schema as follows:
>
> <xs:simpleType name="mystring">
>   <xs:restriction base="xs:string"/>
> </xs:simpleType>
> <xs:attribute name="test" type="mystring"/>
>
> DataTypeRepresentationType could then be used to redefine mystring.
>
> Is it correct?
>
> If so, the interoperability will generally be lost, since interoperable
> DataTypeRepresentationType use is currently limited to XML Schema part 2
> predefined types redefinition (end of section 7.4).
>
> What about extending that behaviour to all simple types that have been
> gathered by consuming the schema in use?
>
> Is there any rationale behind that specific constraint?
>
> 4) Typed encoding in schema-less mode
>
> EXI enables limited typed encoding support in schema-less encoding.
> Since only predefined types are supported, xsi:type seems mainly useful to
> encode base64 chunks with the binary encoding.
>
> Even in that case, the usability is not so good: in some cases, elements
> whose content is base64 also have attributes. For instance ds:SignatureValue
> has an optional ID attribute.
>
> Of course, one could still use xsi:type=base64Binary in deviation mode but
> interoperability may be pretty bad and putting a wrong xsi:type for the
> purpose of compression seems broken.
>
> Also to be noted that:
>
> - Attribute values cannot be typed encoded with schema-less
>   grammars.
>
> - Other useful types like "list of float", "list of integers"
>   cannot be used without external schema knowledge.
>
> Improved out-of-the-box support of this use case would be very helpful.
>
> 5) EXI schema-less/schema-informed modes
>
> Based on internal discussions and internal feedback, there is a general
> assumption that the EXI specification somehow defines two separate modes
> (schema-less and schema-informed).
>
> While this is clearly stated in the specification that both modes easily
> coexist in a single EXI stream,
>
> additional advertisement (maybe in the primer) of that feature may be good
> for adoption.
>
> The latest published primer (dec 2007) could maybe be improved with that
> respect.
>
> Additionally, while EXI provides great flexibility in the amount of schema
> put in grammars,
>
> the schemaID mechanism seems very minimal.
>
> It seems that interoperable uses of schema-informed EXI will greatly
> restrain the use of this flexibility.
>
> Is there some additional work in that area that could or will be further
> conducted?
>
> 6) Is it conformant to not follow the attribute order in the case of a
> schema-informed grammar encoded element in deviation mode?
>
> As stated in section 6, it seems not conformant.
>
> In some cases, grammars can support attributes in no particular order, such
> as the example below (correct me if I got something wrong).
>
> <xs:complexType name="test">
>   <xs:attribute name="name" type="xs:string"/>
>   <xs:anyAttribute namespace="##any"/>
> </xs:complexType>
> <xs:element name="test" type="test"/>
>
> While the benefit of ordering the attributes at the grammar level and the
> general compression benefit for encoders to follow the given order are
> obvious, I do not see compelling reasons for including this constraint in the
> format itself.
>
> At the encoder side, the encoder may decide to order attributes or not.
>
> If encoding fails due to bad ordering (in strict mode) or if the compression
> ratio is bad, the encoder can always decide to order the attributes.
> At the decoder side, the decoder is only following the grammars so it does
> not really care about the ordering.
>
> There is even a drawback as this is one (major?) difference between
> schema-informed and schema-less processing.
>
> Am I missing something obvious?
>
> 7) RDF/XMP use case
>
> This is more a general comment on specific XML/EXI use cases, notably RDF or
> XMP documents where
>
> no standard, well defined XML schemas are available.
>
> These documents generally have some defined structures and types (RDF
> schema, XMP schemas…) but no
>
> well defined XML schemas.
>
> What would be the recommendation from the WG to enable good interoperable
> EXI compression? Stick with schema-less encoding? Create an XML schema,
> publish it and use it?
>
> 8) Through careful checking of published EXI encoded streams
>
> (Thanks again for the publication of these encoded examples by the way!),
>
> Herve found some potential differences between the streams and the
> specifications (see below).
>
> 9)
> Section 8.5.4.4.1:
>
> When adding production:
>
> AT (qname) [schema-invalid value] Element?,?
>
> to Elementi,j
>
> Which next Symbol should be used?
>
> Spec says Elementi,j
>
> It would be more logical to use the symbol from the production:
>
> AT (qname) [schema-valid value] Elementi,k
>
> 10)
> Section 9.3
>
> "Value channels that contain no more than 100 values" seems to mean: with
> *strictly* less than 100 values.
>
> In this paragraph, all comparisons should be made clearer using 'greater or
> equal' and 'strictly greater'.
>
> 11)
> Section 8.4.3
>
> In Schema-less mode, EE productions should be promoted to event code 0 when
> used (if no EE production with an event code length of 1 already exists).
>
> 12)
> Section 8.4.3
>
> In Schema-less mode, when using the SE(*) production, should the creation of
> the SE(qname) production be done before the evaluation of the element
> content?
>
> In most cases, this has no impact. In case of recursive elements, this leads
> to better compaction.
>
> Moreover, in case of recursive elements, the current specification seems to
> imply creating several SE(qname) productions.
>
> 13)
> Section 8.4.3
>
> xsi:schemaLocation attributes seem to be removed from the infoset before
> encoding in agile delta streams.
>
> Is it by design or is it implementation related?
>
> 14)
> Section 7.3.3
>
> Empty strings can occur as attribute values.
>
> Section 7.3.3 suggests that these empty strings are to be added in indexing
> tables.
> The current literal EXI encoding being compact enough, it is reasonable
> not to add them in the table.
Received on Thursday, 26 February 2009 09:40:13 UTC