- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Mon, 9 Nov 2009 14:45:06 -0500
- To: <public-xml-core-wg@w3.org>
John et al.,

Any comments on this response from the EXI WG to our comments?

paul

> -----Original Message-----
> From: public-xml-core-wg-request@w3.org
> [mailto:public-xml-core-wg-request@w3.org] On Behalf Of Cokus, Michael S.
> Sent: Monday, 2009 November 02 9:02
> To: Paul Pierce; EXI Comments
> Cc: public-xml-core-wg@w3.org
> Subject: RE: "RE: "Request for response to original XML Core WG comments""
>
> > Michael,
>
> Hi Paul,
>
> > Thank you. I would like further discussion on the following two (in
> > addition to the ongoing IEEE floating point discussion, where I look
> > forward to seeing the relevant test results.)
>
> Please find our responses below, in line with your comments/questions.
>
> Just as a cross-reference, the floating point test data is discussed in
> a recent posting to the EXI public comments list:
> http://lists.w3.org/Archives/Public/public-exi-comments/2009Oct/0000.html
>
> > > > 7) We believe that the current representation of strings has no
> > > > material advantage over UTF-8, since although it uses at most 3
> > > > bytes per character, 4-byte UTF characters are very rare except
> > > > in documents written in obsolete scripts.
> > >
> > > In our initial response we noted that a number of languages in
> > > common use are represented in UTF using 4 bytes. So we concluded
> > > that the EXI design (which uses 3 bytes) would result in
> > > significant savings in size. To our knowledge, there were no
> > > further questions/responses concerning this comment.
> >
> > Is it possible that the languages that are inefficiently coded in
> > UTF-8 work better in UTF-16? A lot of XML documents are coded in
> > either UTF-8 or UTF-16, plus some heavily used programming languages
> > use UTF string encoding natively. It would be very cool if EXI
> > processors could move character data straight across. EXI could have
> > a single bit to indicate either UTF-8 or UTF-16, corresponding to
> > this common subset of the XML encoding declaration.
> >
> > If UTF-16 isn't good enough, is there another relatively simple way
> > to import a subset of the XML encoding declaration into EXI in such
> > a way that most characters can travel between EXI and XML or across
> > API's without translation?
>
> We thought John Cowan's comments on this topic were quite good:
> http://lists.w3.org/Archives/Public/public-exi-comments/2009Sep/0004.html
>
> > > > 8) We are strongly concerned about the concept of pluggable
> > > > codecs as a barrier to interoperability, and believe that the
> > > > draft should contain a strong health warning about the use of
> > > > these: they should be used only in cases where there is explicit
> > > > agreement between the communicating parties, and never for
> > > > documents intended for consumption by a general audience.
> > >
> > > We agree and said as much in our initial response. A note has been
> > > placed in section 7.4 "Data Representation Map" to address this:
> > >
> > > http://www.w3.org/TR/2008/WD-exi-20080919/#datatypeRepresentationMap
> >
> > I would very much like to see this pluggable codec/user datatype
> > feature disappear altogether. It is already effectively present in
> > schema and need not be duplicated in EXI. Leaving it to schema would
> > make EXI more like XML and would be, I think, better design in
> > having good separation of function.
>
> The purpose of the Datatype Representation Map is distinct from that of
> the user-defined datatype feature in XML Schema. The former provides
> the capability to associate a user-defined datatype *representation*
> with a given datatype. In other words, Datatype Representation Map
> allows the user to pick how data of a given datatype is represented
> (i.e., "written") in the EXI stream. It is the mechanism by which a
> user can tell an EXI processor that he wants a particular
> encoding/compression method employed for certain types of data.
>
> > So I guess I'm asking for a robust case for its existence, beyond
> > the few cases already discussed (e.g. floating point) where the
> > standard leans on user datatypes to support a standard
> > representation in lieu of the default EXI specific representation.
> > What use cases require user datatypes and why can't they use schema?
> > Are there other considerations?
>
> Another good example is found in the visualization domain (X3D). An
> optimized serialization of X3D data can be achieved by employing
> application-specific compression (e.g. combining coplanar polygons,
> quantizing colors). In some cases lossy compression is an acceptable
> option. Supporting these types of compression goes beyond the
> capabilities of XML Schema.
>
> > Paul
>
> Hope this addresses your questions, Paul.
>
> Thanks,
>
> --mike
>
> Mike Cokus
> The MITRE Corporation
> 757-896-8553; 757-826-8316 (fax)
> 903 Enterprise Parkway, Hampton, VA
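
For the byte counts at issue in comment 7, here is a minimal sketch that is not taken from the thread. It compares per-character sizes in UTF-8 and UTF-16 against a hypothetical `varint_len` helper, which models the "at most 3 bytes per character" figure by assuming each character is stored as its Unicode code point in a variable-length integer carrying 7 value bits per byte. That model is an assumption about the EXI draft, and the sample characters are arbitrary.

```python
def varint_len(code_point: int) -> int:
    """Bytes needed for a variable-length integer with 7 value bits per byte.

    Assumed model of the per-character cost discussed in the thread,
    not a quotation from the EXI specification.
    """
    n = 1
    while code_point >= 0x80:
        code_point >>= 7
        n += 1
    return n


def sizes(ch: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes, UTF-16 bytes, assumed varint bytes) for one character."""
    utf8 = len(ch.encode("utf-8"))
    utf16 = len(ch.encode("utf-16-le"))  # "-le" avoids counting a byte-order mark
    return utf8, utf16, varint_len(ord(ch))


if __name__ == "__main__":
    # Arbitrary samples: ASCII, Latin-1 range, a CJK ideograph, a supplementary-plane ideograph
    for ch in ["A", "\u00e9", "\u4e2d", "\U00020000"]:
        utf8, utf16, varint = sizes(ch)
        print(f"U+{ord(ch):04X}: utf-8={utf8} utf-16={utf16} varint={varint}")
```

Under this assumed model the largest code point, U+10FFFF, needs 21 bits and so fits in 3 bytes, while code points from U+4000 up to U+FFFF cost 3 bytes where UTF-16 would use 2.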
Received on Monday, 9 November 2009 19:46:35 UTC