
Re: Encoding issues

From: Daniel Veillard <daniel@veillard.com>
Date: Fri, 10 Sep 2004 19:24:04 +0200
To: Norman Walsh <Norman.Walsh@sun.com>
Cc: public-xml-core-wg@w3.org
Message-ID: <20040910172404.GM17755@daniel.veillard.com>

On Fri, Sep 10, 2004 at 12:31:13PM -0400, Norman Walsh wrote:
> I helped Alejandro work on this essay before he published it. I'd
> appreciate any feedback that you might have.
> 
>   http://blogs.sun.com/roller/page/tucu/20040909#detecting_xml_charset_encoding_getting

  if BOM is NULL and XMLEnc is NULL
      if  XMLGuessEnc is ('UTF-16BE' or 'UTF-16LE')
          ERROR (encoding mismatch)                                    [#1.0]
      else
          encoding is 'UTF-8'                                          [#1.1]
  if BOM is NULL and XMLEnc is ('UTF-8' or 'UTF-16BE' or 'UTF-16LE')
      ERROR (XML requires BOM for 'UTF-16*' charsets)                 [#1.2]

  That sounds bogus to me. Appendix F, "Without a Byte Order Mark:"
  http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info
indicates that a BOM is not needed. It seems to me a BOM is never required.
I don't know how the "Following Section 4.3.3 and Appendix F.1 of the
XML 1.0 specification" dependency was used to produce #1.0, #1.1 or #1.2,
but from the very start this does not seem to me to reflect Appendix F.
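
To make the point concrete, here is a rough sketch of the "Without a Byte
Order Mark" rows of Appendix F.1, which classify an entity from its first
four bytes alone. The function name, the table name and the label strings
are illustrative assumptions, not taken from the essay or from any library,
and the sketch only picks an encoding family; reading the encoding
declaration afterwards is left out.

```python
# Sketch of the "Without a Byte Order Mark" rows of XML 1.0 Appendix F.1.
# NO_BOM_PATTERNS and guess_family_without_bom are hypothetical names.

NO_BOM_PATTERNS = [
    (b"\x00\x00\x00\x3c", "UCS-4 big-endian (1234)"),
    (b"\x3c\x00\x00\x00", "UCS-4 little-endian (4321)"),
    (b"\x00\x00\x3c\x00", "UCS-4 unusual order (2143)"),
    (b"\x00\x3c\x00\x00", "UCS-4 unusual order (3412)"),
    (b"\x00\x3c\x00\x3f", "16-bit big-endian (e.g. UTF-16BE)"),
    (b"\x3c\x00\x3f\x00", "16-bit little-endian (e.g. UTF-16LE)"),
    (b"\x3c\x3f\x78\x6d", "ASCII-compatible (UTF-8, ISO 8859, ...)"),
    (b"\x4c\x6f\xa7\x94", "EBCDIC"),
]


def guess_family_without_bom(data: bytes) -> str:
    """Classify an entity with no BOM by the byte pattern of '<?xm' or '<?'."""
    head = data[:4]
    for pattern, family in NO_BOM_PATTERNS:
        if head == pattern:
            return family
    # Appendix F: anything else is treated as UTF-8 without an encoding
    # declaration (or the data is corrupt or wrapped in some way).
    return "UTF-8 (default when nothing else matches)"
```

For example, a declaration encoded with Python's "utf-16-be" codec, which
writes no BOM, falls into the 16-bit big-endian row rather than producing
an error.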

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | 
Received on Friday, 10 September 2004 17:24:10 GMT
