CSS validator mystery symbol: ÿþ, UTF-16 BOM

From: Etan Wexler <ewexler@stickdog.com>
Date: Thu, 07 Jul 2005 00:36:16 -0400
Message-ID: <42CCB140.4050004@stickdog.com>
To: W3C CSS-validator list <www-validator-css@w3.org>
CC: Paul Coombe <pcwm@sympatico.ca>

Paul Coombe wrote to the W3C CSS-validator list 
<mailto:www-validator-css@w3.org> on 6 December 2004 in “css validator 
mystery symbol” (<mid:41B4E781.2060605@sympatico.ca>, 
<http://www.w3.org/mid/41B4E781.2060605@sympatico.ca>):

> When using the W3C CSS validator I got an error that showed a y with 2 
> dots above and a combination pb. This is the closest I could come to a 
> reproduction.

Then came a graphic depicting glyphs for the character sequence “ÿþ” 
(<Latin small letter y with diaeresis, Latin small letter thorn>, 
<U+00FF U+00FE>).

> What does it mean?

It was almost certainly supposed to be an encoding signature, flagging 
the encoding of the text as little-endian UTF-16.

Interpreted under an encoding that represents each character with a 
single eight-bit byte (ISO-8859-1, for example), the characters “ÿþ” 
correspond to the bytes <FF FE>. The character zero width no-break 
space (U+FEFF), when it appears at the start of text, has the semantics 
of a byte-order mark (known as “BOM”), or encoding signature. When 
serialized as little-endian UTF-16, the BOM yields the bytes <FF FE>.
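
The round trip described above can be checked with a few lines of 
Python. This is only a minimal sketch: the function names are 
hypothetical, and the choice of ISO-8859-1 ("latin-1") as the 
single-byte encoding is an assumption, since the message does not say 
which encoding the validator actually applied.

```python
import codecs


def bom_as_utf16_le() -> bytes:
    """Serialize U+FEFF as little-endian UTF-16 (no extra BOM is added)."""
    return "\ufeff".encode("utf-16-le")


def misread_as_single_byte(data: bytes) -> str:
    """Decode bytes one per character, assuming ISO-8859-1 (an assumption)."""
    return data.decode("latin-1")


if __name__ == "__main__":
    raw = bom_as_utf16_le()
    # The standard library exposes the same two bytes as a constant.
    assert raw == codecs.BOM_UTF16_LE
    print(raw.hex(" "))                 # ff fe
    print(misread_as_single_byte(raw))  # ÿþ
```

A UTF-16-aware reader would consume those two bytes as a signature; a 
reader that assumes one byte per character shows them as text.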

The solution that first comes to mind is to use better authoring 
software. A good text editor will let the author choose the encoding in 
which to save and, in the case of the UTF encodings, whether to ensure 
the presence of a BOM at the start of text.
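
Where changing editors is not an option, a style sheet that has already 
been saved with a signature could be re-encoded after the fact. The 
following is a rough sketch, not a recommendation from the validator's 
maintainers: the function name is hypothetical, the target of UTF-8 
without a BOM is an assumption, and UTF-32 input (whose little-endian 
signature also begins with <FF FE>) is deliberately not handled.

```python
import codecs


def transcode_to_utf8(data: bytes) -> bytes:
    """Return the text of `data` as UTF-8 without a BOM.

    Only UTF-16 and UTF-8 signatures are recognised; anything else is
    returned unchanged. UTF-32 is not handled (an assumption of this sketch).
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM to pick the byte
        # order and drops it from the decoded text.
        text = data.decode("utf-16")
    elif data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips a leading UTF-8 signature.
        text = data.decode("utf-8-sig")
    else:
        return data
    return text.encode("utf-8")


if __name__ == "__main__":
    sheet = codecs.BOM_UTF16_LE + "body { color: red }".encode("utf-16-le")
    print(transcode_to_utf8(sheet))
```

Whether the result should carry a signature at all depends on what the 
consuming software expects, which is the choice a good editor exposes.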

-- 
Etan Wexler.
Received on Thursday, 7 July 2005 04:33:18 GMT
