- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Mon, 20 Aug 2007 15:10:56 -0400
- To: <public-xml-core-wg@w3.org>
Interesting and most likely worthwhile comments.

But what about the bigger picture. I assume EXI is a (different
from XML 1.x) way to serialize an XML infoset--is this correct?
Does EXI describe the same set of infosets as XML 1.x? Is EXI
something the XML Core WG should support, allow, or argue against?

paul

> -----Original Message-----
> From: public-xml-core-wg-request@w3.org
> [mailto:public-xml-core-wg-request@w3.org] On Behalf Of John Cowan
> Sent: Monday, 2007 August 20 12:39
> To: public-xml-core-wg@w3.org
> Subject: DRAFT XML Core WG review of Efficient XML
> Interchange (EXI) Format 1.0
>
> [I have written this draft in the first person plural, but its current
> state reflects John Cowan's views only.]
>
> This is the XML Core WG's review of EXI WD1 (2007-07-16). Items are
> mostly in the order they appear in the draft, and do not appear in
> priority order.
>
> 1) We find the draft somewhat hard to follow; in particular, the unusual
> and non-standard grammar notation is not easy to grasp at a glance;
> the explanation of compression should be postponed to after the grammars
> section; the explanation of event codes is very hard to follow.
>
> 2) We believe it is essential to provide (as called out in an editorial
> note) a better magic number for EXI. The current magic number is only
> 2 bits long, and serves to discriminate between EXI and XML, but not
> between EXI and other formats. This should be fixed by using a 3-4 byte
> magic number.
>
> 3) We believe that an XML document containing xsi:type attributes
> should be treated as a schema-informed document rather than a schemaless
> document. This allows processes that create a single XML document to
> decorate it with xsi:type attributes and then get good results from an
> EXI encoder following in the pipeline.
>
> 4) Reversing the digits when representing decimal fractions (and
> fractions of seconds in the date-time datatypes) is very unnatural.
> We think it is better to use a (total digits, scale factor) pair.
> Thus instead of representing 12.345 as (12,543) it would be (12345,3).
> This is one byte longer, but much easier to decode properly.
>
> 5) IEEE float representation is better on all counts than the specialized
> representation. It's true that some hardwares can't process it directly,
> but *no* hardware can process the current EXI representation.
>
> 6) The current date-time representation expresses a date as ((years-2000),
> (month*31+day), hour*1440+minute*60+seconds, reversed fractional
> second). However, logically years and months can be reduced to months,
> and days can be reduced to seconds, since leap seconds are ignored.
> We therefore propose the following triple: ((year-2000)*12+month,
> day*86400+hour*1440+minute*60+seconds scaled, scale factor). If
> fraction scaling is rejected, this would become ((year-2000)*12+month,
> day*86400+hour*1440+minute*60+seconds, reversed fractional second).
>
> 7) We believe that the current representation of strings has no
> material advantage over UTF-8, since although it uses at most 3 bytes
> per character, 4-byte UTF characters are very rare except in documents
> written in obsolete scripts.
>
> [This discharges my action.]
>
> --
> Híggledy-pìggledy / XML programmers            John Cowan
> Try to escape those / I-eighteen-N woes;       http://www.ccil.org/~cowan
> Incontrovertibly / What we need more of is     cowan@ccil.org
> Unicode weenies and / François Yergeaus.
>
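For readers comparing the two encodings discussed in items 4 and 6 of the quoted review, here is a minimal Python sketch. It is an illustration only, not text from the EXI draft or from the review. All function names are hypothetical, inputs are assumed to be non-negative decimal strings, and months and days are assumed to be 1-based. The review writes hour*1440 in the seconds term; the sketch assumes hour*3600 (seconds per hour) was intended, and it reads "seconds scaled" as the whole seconds count multiplied by 10^scale.

```python
from decimal import Decimal


def encode_reversed(text):
    """Item 4, current form: (integral part, fraction digits reversed)."""
    integral, _, fraction = text.partition(".")
    return int(integral), int(fraction[::-1] or "0")


def decode_reversed(integral, reversed_fraction):
    """Undo encode_reversed by reversing the fraction digits back."""
    fraction = str(reversed_fraction)[::-1]
    return Decimal(f"{integral}.{fraction}")


def encode_scaled(text):
    """Item 4, proposed form: (all digits as one integer, scale factor)."""
    integral, _, fraction = text.partition(".")
    return int(integral + fraction), len(fraction)


def decode_scaled(unscaled, scale):
    """Undo encode_scaled: shift the decimal point left by `scale` digits."""
    return Decimal(unscaled).scaleb(-scale)


def encode_datetime(year, month, day, hour, minute, seconds):
    """Item 6, proposed triple: (months since 2000, scaled seconds, scale).

    `seconds` is a decimal string such as "56.25".
    Assumption: hour*3600 is used where the review writes hour*1440.
    """
    months = (year - 2000) * 12 + month
    whole, _, fraction = seconds.partition(".")
    scale = len(fraction)
    total_seconds = day * 86400 + hour * 3600 + minute * 60 + int(whole)
    scaled = total_seconds * 10**scale + int(fraction or "0")
    return months, scaled, scale


print(encode_reversed("12.345"))   # (12, 543)
print(encode_scaled("12.345"))     # (12345, 3)
print(encode_datetime(2007, 8, 20, 15, 10, "56.25"))
```

Decoding the scaled pair is a single shift of the decimal point, whereas the reversed form needs a digit-reversal step, which is the decoding difficulty the review points to.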
Received on Monday, 20 August 2007 19:11:14 UTC