RE: False Distinguishing Bits claim

From: Vogelheim, Daniel <daniel.vogelheim@siemens.com>
Date: Tue, 7 Aug 2007 13:41:21 +0200
Message-ID: <D720D1DD80557241B9452630315560C30163D6BC@MCHP7IEA.ww002.siemens.net>
To: "Bjoern Hoehrmann" <derhoermi@gmx.net>, <public-exi@w3.org>

Hello Björn,

You wrote:
> Dear Efficient XML Interchange Working Group,
> http://www.w3.org/TR/2007/WD-exi-20070716/#DistinguishingBits claims:
>   This bit sequence cannot occur as the first two bits of a well-formed
>   XML document and represents the minimum length EXI document prefix
>   required to distinguish EXI documents from XML documents.
> This is false, the XML Recommendations do not place any restrictions on
> the binary representation of well-formed XML documents, I might well
> come up with a character encoding where any sequence of bytes maps to
> <?xml version='1.0' encoding='x-myencoding'?><x/>. This would not
> violate the XML Recommendation's requirements in any way. The paragraph
> after the one quoted above is more accurate, I would suggest to remove
> the former.

Thanks. Yes, you are technically correct and we will fix the description!

I must say that I am personally a bit unhappy with the whole issue. On a theoretical level such encodings could indeed come into existence. Meanwhile, in the real world, the whole character encoding craze has (thankfully!) died down, and recent character encodings (such as GB 18030) generally try to maintain at least some level of compatibility with ASCII and/or Unicode, which usually makes them very well behaved. And if someone really were to create a sufficiently funny encoding, they would have a lot more problems than just distinguishing it from EXI. So for all practical purposes the description says exactly what it should say. We still need to find a way to describe it that is both theoretically correct and does not obscure the actual meaning of the paragraph.
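
A rough sketch of the practical point, assuming the distinguishing bits are `10` (the value used in the EXI specification; it is not stated in this message). The sample document, the list of encodings, and the helper names `first_two_bits` and `could_be_exi` are illustrative choices only. The sketch encodes one small XML document in several common encodings and checks the two most significant bits of the first byte:

```python
import codecs

# Assumption: the EXI distinguishing bits are "10" (not stated in this message).
EXI_DISTINGUISHING_BITS = 0b10


def first_two_bits(data: bytes) -> int:
    """Return the two most significant bits of the first byte."""
    return data[0] >> 6


def could_be_exi(data: bytes) -> bool:
    """True if the stream starts with the assumed distinguishing bits."""
    return first_two_bits(data) == EXI_DISTINGUISHING_BITS


# Illustrative document; any document starting with "<" behaves the same way.
DOC = "<?xml version='1.0'?><x/>"

# The same document in a few real-world encodings, with and without a BOM.
SAMPLES = {
    "UTF-8": DOC.encode("utf-8"),
    "UTF-8 with BOM": codecs.BOM_UTF8 + DOC.encode("utf-8"),
    "UTF-16BE with BOM": codecs.BOM_UTF16_BE + DOC.encode("utf-16-be"),
    "UTF-16LE with BOM": codecs.BOM_UTF16_LE + DOC.encode("utf-16-le"),
    "UTF-16BE without BOM": DOC.encode("utf-16-be"),
    "UTF-16LE without BOM": DOC.encode("utf-16-le"),
    "GB 18030": DOC.encode("gb18030"),
    "EBCDIC (cp037)": DOC.encode("cp037"),
}

if __name__ == "__main__":
    for name, data in SAMPLES.items():
        bits = format(first_two_bits(data), "02b")
        print(f"{name:22} first byte 0x{data[0]:02X}, first two bits {bits}, "
              f"could be EXI: {could_be_exi(data)}")
```

None of these samples starts with `10`. The theoretical objection still stands: a made-up encoding is free to map a first byte such as 0x80 to `<`, and `could_be_exi` would then report a match for a well-formed XML document.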

Daniel Vogelheim
Received on Tuesday, 7 August 2007 11:42:21 UTC
