- From: John Cowan <cowan@ccil.org>
- Date: Mon, 20 Aug 2007 13:38:41 -0400
- To: public-xml-core-wg@w3.org
[I have written this draft in the first person plural, but its current state reflects John Cowan's views only.]

This is the XML Core WG's review of EXI WD1 (2007-07-16). Items are mostly in the order they appear in the draft, and do not appear in priority order.

1) We find the draft somewhat hard to follow. In particular, the non-standard grammar notation is not easy to grasp at a glance; the explanation of compression should be postponed until after the grammars section; and the explanation of event codes is very hard to follow.

2) We believe it is essential to provide (as called out in an editorial note) a better magic number for EXI. The current magic number is only 2 bits long, and serves to discriminate between EXI and XML, but not between EXI and other formats. This should be fixed by using a 3-4 byte magic number.

3) We believe that an XML document containing xsi:type attributes should be treated as a schema-informed document rather than a schemaless document. This allows processes that create a single XML document to decorate it with xsi:type attributes and then get good results from an EXI encoder following in the pipeline.

4) Reversing the digits when representing decimal fractions (and fractions of seconds in the date-time datatypes) is very unnatural. We think it is better to use a (total digits, scale factor) pair. Thus instead of representing 12.345 as (12,543) it would be (12345,3). This is one byte longer, but much easier to decode properly.

5) IEEE float representation is better on all counts than the specialized representation. It's true that some hardware can't process it directly, but *no* hardware can process the current EXI representation.

6) The current date-time representation expresses a date as ((year-2000), (month*31+day), hour*3600+minute*60+seconds, reversed fractional second). However, logically years and months can be reduced to months, and days can be reduced to seconds, since leap seconds are ignored.
We therefore propose the following triple:

    ((year-2000)*12+month, day*86400+hour*3600+minute*60+seconds scaled, scale factor)

If fraction scaling is rejected, this would become:

    ((year-2000)*12+month, day*86400+hour*3600+minute*60+seconds, reversed fractional second)

7) We believe that the current representation of strings has no material advantage over UTF-8. Although it uses at most 3 bytes per character, characters that need 4 bytes in UTF-8 are very rare except in documents written in obsolete scripts.

[This discharges my action.]

--
Híggledy-pìggledy / XML programmers            John Cowan
Try to escape those / I-eighteen-N woes;       http://www.ccil.org/~cowan
Incontrovertibly / What we need more of is     cowan@ccil.org
Unicode weenies and / François Yergeaus.
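
As an illustration of items 4 and 6 above, here is a minimal Python sketch comparing the reversed-fraction pair with the proposed (total digits, scale factor) pair, and building the proposed date-time triple. The function names are our own and hypothetical; nothing here is taken from the EXI draft. It handles non-negative values only, takes the seconds field as a decimal string, and converts hours to seconds with a factor of 3600.

```python
from decimal import Decimal


def encode_reversed(text):
    """Reversed-fraction pair: (integer part, fraction digits reversed)."""
    whole, _, frac = text.partition(".")
    return int(whole), int(frac[::-1] or "0")


def decode_reversed(whole, rev_frac):
    """Undo encode_reversed by reversing the fraction digits again."""
    return Decimal(f"{whole}.{str(rev_frac)[::-1]}")


def encode_scaled(text):
    """Proposed pair: (all digits as one integer, count of fraction digits)."""
    whole, _, frac = text.partition(".")
    return int(whole + frac), len(frac)


def decode_scaled(digits, scale):
    """Undo encode_scaled by shifting the decimal point left by `scale`."""
    return Decimal(digits).scaleb(-scale)


def encode_datetime(year, month, day, hour, minute, seconds_text):
    """Proposed triple: (months since 2000, scaled seconds within month, scale)."""
    sec_digits, scale = encode_scaled(seconds_text)
    months = (year - 2000) * 12 + month
    whole_seconds = day * 86400 + hour * 3600 + minute * 60
    return months, whole_seconds * 10 ** scale + sec_digits, scale


if __name__ == "__main__":
    print(encode_reversed("12.345"))   # pair as in item 4: integer part, reversed fraction
    print(encode_scaled("12.345"))     # pair as in item 4: total digits, scale factor
    print(encode_datetime(2007, 8, 20, 13, 38, "41.5"))
```

Decoding the scaled pair is a single shift of the decimal point, whereas decoding the reversed pair requires going back through the digit string, which is the point of item 4.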
Received on Monday, 20 August 2007 17:38:58 UTC