- From: Cokus, Michael S. <msc@mitre.org>
- Date: Thu, 24 Sep 2009 16:32:39 -0400
- To: Paul Pierce <prp@teleport.com>, EXI Comments <public-exi-comments@w3.org>
- CC: "public-xml-core-wg@w3.org" <public-xml-core-wg@w3.org>
Hello Paul,

Back in March, I promised an update on the EXI WG actions in response to comments from XML Core. These are discussed below. The original comments from XML Core are quoted, followed by the updated information. I apologize for not following up on this sooner.

> 0) The Core XML WG remains concerned about the whole concept of
> EXI as an alternative representation of XML infosets, but does
> not have consensus about whether it is a Good Thing, a Bad Thing,
> or a Neutral Thing. Further comment on this fundamental point may
> be forthcoming later.

The EXI group's initial response indicated that EXI is intended to be used as an "opt-in" technology. Additionally, we wanted to note that we believe EXI is a necessary thing. Without a common standard, a number of diverging approaches would create serious interoperability issues.

> 1) We find the draft somewhat hard to follow; in particular,
> the unusual and non-standard grammar notation is not easy to
> grasp at a glance; the explanation of compression should be
> postponed to after the grammars section; the explanation of
> event codes is very hard to follow.

In our initial response, we reported that work was underway to address this comment. We have revised the specification accordingly. The compression section has been moved as recommended. The use of "event code" is now linked to its definition, and the definition of event code has been expanded and clarified.

In addition, we would like to clarify that the grammar notation in the EXI specification is based on common conventions used to describe Java, C#, and JavaScript. The primary difference is that we have added annotations to signify EXI event codes used to represent non-terminals in the grammar. We trust that keeping this difference in mind will make it easier to interpret the grammar notation, as the rest of the conventions are commonly used in practice.

> 2) We believe it is essential to provide (as called out in
> an editorial note) a better magic number for EXI. The current
> magic number is only 2 bits long, and serves to discriminate
> between EXI and XML, but not between EXI and other formats.
> This should be fixed by using a 3-4 byte magic number.

In the initial EXI working group response we reported that this topic was already under discussion. Since then, a magic number has been added to the EXI format. The change has been noted in http://www.w3.org/TR/2008/WD-exi-20080919/#changes2 .

> 3) We believe that an XML document containing xsi:type
> attributes should be treated as a schema-informed document
> rather than a schemaless document. This allows processes
> that create a single XML document to decorate it with
> xsi:type attributes and then get good compression from
> an EXI encoder following in the pipeline.

In our initial response, we suggested a temporary workaround to address this. Since then, a more elegant means has been devised to achieve the effect described by XML Core and we have revised the specification. Providing an empty schemaID indicates the EXI encoding is schema-informed, but uses no user-defined types (i.e., uses only built-in XSD types which may be referenced using xsi:type).

> 4) Reversing the digits when representing decimal fractions
> (and fractions of seconds in the date-time datatypes) is
> very unnatural. We think it is better to use a (total digits,
> scale factor) pair. Thus instead of representing 12.345 as
> (12,543) it would be (12345,3). This is one byte longer,
> but much easier to decode properly.

In our initial response we stated that our intention was to employ simpler techniques when performance did not differ significantly. In this case, though, the difference in compaction is quite significant. In addition, we believe that writing the digits from right to left is really no more complex than writing from left to right.

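To make the two alternatives in comment 4 concrete, here is a minimal Python sketch of both digit pairs for the example value 12.345. It only illustrates the (12,543) and (12345,3) pairs quoted above; it is not the EXI wire encoding, it handles non-negative values only, and all function names are hypothetical.

```python
from decimal import Decimal
from typing import Tuple


def to_reversed_pair(text: str) -> Tuple[int, int]:
    """'12.345' -> (12, 543): integral part, fraction digits reversed."""
    integral, _, fraction = text.partition(".")
    # Reversing keeps leading fraction zeros ('05' -> 50) when stored as an integer.
    return int(integral), int(fraction[::-1] or "0")


def from_reversed_pair(integral: int, reversed_fraction: int) -> Decimal:
    """(12, 543) -> Decimal('12.345')."""
    fraction = str(reversed_fraction)[::-1]
    return Decimal(f"{integral}.{fraction}")


def to_scaled_pair(text: str) -> Tuple[int, int]:
    """'12.345' -> (12345, 3): all digits, then the scale factor."""
    integral, _, fraction = text.partition(".")
    return int(integral + fraction), len(fraction)


def from_scaled_pair(digits: int, scale: int) -> Decimal:
    """(12345, 3) -> Decimal('12.345')."""
    return Decimal(digits).scaleb(-scale)
```

Both decoders are a few lines long in this sketch; the size difference between the two pairs is not measured here.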
So we have decided to keep this approach (reversed digits) because it provides better efficiency and does not increase complexity.

> 5) IEEE float representation is better on all counts than
> the EXI-specific representation. It's true that some hardwares
> can't process it directly, but *no* hardware can process
> the current EXI representation.

The issue of floating point encoding has involved much good discussion, and we appreciate all the comments we have received on this important issue. The EXI group has run several tests and found that for the majority of EXI use cases, EXI float has significant advantages over IEEE float. We plan to share some test results with the public shortly.

> 6) The current date-time representation expresses a date as
> ((years-2000), (month*31+day), hour*1440+minute*60+seconds,
> reversed fractional second). However, logically years and
> months can be reduced to months, and days can be reduced to
> seconds, since leap seconds are ignored. We therefore propose
> the following triple: ((year-2000)*12+month,
> day*86400+hour*1440+minute*60+seconds scaled, scale factor). If
> fraction scaling is rejected, this would become ((year-2000)*12+month,
> day*86400+hour*1440+minute*60+seconds, reversed fractional second).

In our initial response we explained that the EXI date-time encoding was modeled after the various XML Schema date and time related simple types. Our analysis shows that the two representations are comparable regarding size. When representing the full dateTime type, the two approaches differ by 4 bits, with the current EXI format being smaller in the majority of cases. So we have decided to retain the original date-time representation for EXI, because it is "closer" to XML Schema and has comparable (or slightly better) size performance.

> 7) We believe that the current representation of strings has no
> material advantage over UTF-8, since although it uses at most 3 bytes
> per character, 4-byte UTF characters are very rare except in documents
> written in obsolete scripts.

In our initial response we noted that a number of languages in common use are represented in UTF using 4 bytes. So we concluded that the EXI design (which uses at most 3 bytes per character) would result in significant savings in size. To our knowledge, there were no further questions or responses concerning this comment.

> 8) We are strongly concerned about the concept of pluggable
> codecs as a barrier to interoperability, and believe that the
> draft should contain a strong health warning about the use of
> these: they should be used only in cases where there is explicit
> agreement between the communicating parties, and never for
> documents intended for consumption by a general audience.

We agree and said as much in our initial response. A note has been placed in section 7.4 "Data Representation Map" to address this: http://www.w3.org/TR/2008/WD-exi-20080919/#datatypeRepresentationMap

Thanks for your interest and comments. We hope this response has adequately explained the working group's activities undertaken to address the comments from XML Core. Please let us know if you have additional questions.

Mike Cokus (for the EXI Working Group)
The MITRE Corporation
757-896-8553; 757-826-8316 (fax)
903 Enterprise Parkway
Hampton, VA 23666

>-----Original Message-----
>From: public-exi-comments-request@w3.org [mailto:public-exi-comments-request@w3.org] On Behalf Of
>Cokus, Michael S.
>Sent: Sunday, March 15, 2009 5:16 PM
>To: Paul Pierce; EXI Comments
>Subject: RE: "Request for response to original XML Core WG comments"
>
>Hello Paul,
>
>Thanks much for your comments!
>
>The EXI Working Group responded publicly to the XML Core comments in January of last year [1].
>But as you noted, the group has indeed taken further action since then to address their comments.
>We are working on an update to our original response to clarify the group's actions/resolutions
>(including any changes made to the EXI specification) to address the comments from XML Core. We
>expect to post the update within the next couple of weeks.
>
>Thanks again,
>
>Mike Cokus (for the EXI Working Group)
>
>[1] http://lists.w3.org/Archives/Public/public-exi/2008Jan/0003.html
>
>Mike Cokus
>The MITRE Corporation
>757-896-8553; 757-826-8316 (fax)
>903 Enterprise Parkway
>Hampton, VA 23666
>
>>-----Original Message-----
>>From: public-exi-comments-request@w3.org [mailto:public-exi-comments-
>>request@w3.org] On Behalf Of Paul Pierce
>>Sent: Friday, February 27, 2009 3:56 PM
>>To: EXI Comments
>>Subject: "Request for response to original XML Core WG comments"
>>
>>These original comments on the first draft were acknowleged but never
>>publicly responded to as far as I can tell, so I'm incorporating them
>>here in the comments list by reference:
>>
>>XML Core WG review of Efficient XML Interchange (EXI) Format 1.0, draft
>>of 2007-07-16
>>http://lists.w3.org/Archives/Public/public-exi/2007Oct/0005.html
>>
>>I know the EXI WG discussed these long ago and incorporated a few into
>>the spec, but these comments are very important and must have a
>>substantial public response.
>>
>>In my opinion comments 2, 4, 5, 7, 8 must be more carefully considered
>>for inclusion; 2 and 8 as mandatory rather than options.
>>
>>Paul Pierce
Received on Thursday, 24 September 2009 20:33:20 UTC