
Does UTF-16 require a BOM? (was: Re: SVG 1.2 Comment: image/svg+xml;charset="")

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Thu, 25 Nov 2004 14:47:32 +0100
To: Chris Lilley <chris@w3.org>
Cc: www-svg@w3.org
Message-ID: <41bfde8c.291334515@smtp.bjoern.hoehrmann.de>

* Chris Lilley wrote:
>AvK> That's not true. You can have UTF-16 or UTF-8 content for that matter
>AvK> without a BOM.
>
>Um, leaving aside UTF-8, and noting that UTF-16 is not the same as
>UTF-16BE and UTF-16LE, please justify this statement with reference to a
>named portion of a specification.

That should be obvious from RFC 2781; section 3.2, for example, notes
that "the
character 0xFEFF in the first position of a stream MAY be interpreted
as a zero-width non-breaking space, and is not always a byte-order
mark". In XML 1.0, entities encoded in UTF-16 are required to start
with a byte order mark, but omitting it is only an error, not a fatal
error. For example (all of the following examples are UTF-16 encoded
and have no BOM):

  Content-Type: application/xml

  <?xml version="1.0"?>

this would be a fatal error ("it is a fatal error [...] for an entity
which begins with neither a Byte Order Mark nor an encoding declaration
to use an encoding other than UTF-8.") but

  Content-Type: application/xml

  <?xml version="1.0" encoding="UTF-16"?>

would not be a fatal error, especially when using big-endian order.
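
To make the distinction concrete, here is a rough Python sketch of the
reading given above. It is not taken from any specification or parser:
the function name classify_utf16_entity and the three result strings are
made up for illustration, the byte-order guess follows the non-normative
autodetection in XML 1.0 Appendix F, and external information such as a
charset parameter on Content-Type is ignored. It also assumes the input
really is UTF-16.

```python
import re


def classify_utf16_entity(data: bytes) -> str:
    """Classify a UTF-16 encoded XML entity as "ok", "error" or "fatal".

    Hypothetical helper: it mirrors the argument in this message and is
    not a conformance check.
    """
    # A leading byte order mark, in either byte order, meets the requirement.
    if data[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return "ok"

    # No BOM: guess the byte order from the leading "<?" (Appendix F style).
    if data[:4] == b"\x00<\x00?":
        codec = "utf-16-be"
    elif data[:4] == b"<\x00?\x00":
        codec = "utf-16-le"
    else:
        # Neither a BOM nor an XML declaration, yet the encoding is not UTF-8.
        return "fatal"

    text = data.decode(codec, errors="replace")
    decl = re.match(r"<\?xml[^>]*?\?>", text)
    has_encoding = bool(decl and re.search(r"\bencoding\s*=", decl.group(0)))

    # With an encoding declaration the missing BOM is an error, but not a
    # fatal one; without a declaration it is the fatal case quoted above.
    return "error" if has_encoding else "fatal"
```

Fed the two examples above encoded as big-endian UTF-16 without a BOM,
this returns "fatal" for the first and "error" for the second; the same
bytes prefixed with FE FF come back as "ok".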
Received on Thursday, 25 November 2004 13:48:04 UTC
