
UTF-8 with Unicode line separator and BOM

From: Christian Ottosson <christian.ottosson@kurir.net>
Date: Mon, 16 Oct 2000 15:22:02 +0200
Message-Id: <a05001903b61096fdb104@[195.67.253.177]>
To: plh@w3.org, www-validator@w3.org
Hi!

On www-validator@w3.org I found a discussion about U+FEFF ZERO WIDTH 
NO-BREAK SPACE, the BOM (byte order mark) character, at the beginning 
of UTF-16 files, but nothing about the Unicode line separator 
character.

After converting my web files to UTF-8, I have run into problems with 
both the W3C CSS validator and the W3C HTML validator. The files begin 
with the BOM, used as a UTF-8 signature, and I also use Unicode line 
separators. Neither validator accepts either character, so my files 
do not validate. When I remove the BOM and change the line separators 
to LF (Unix line endings), my files validate.
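
The workaround described above (dropping the BOM and switching the line 
separators to LF) could be sketched roughly as follows. This is only an 
illustration, not what the validators do internally: the function name 
normalize_for_validator is hypothetical, and mapping the paragraph 
separator (U+2029) to LF as well is an assumption, since only the line 
separator was changed by hand in the files mentioned here.

```python
import codecs

LINE_SEPARATOR = "\u2028"       # Unicode LINE SEPARATOR
PARAGRAPH_SEPARATOR = "\u2029"  # Unicode PARAGRAPH SEPARATOR


def normalize_for_validator(raw: bytes) -> bytes:
    """Strip a leading UTF-8 BOM and turn Unicode separators into LF.

    Hypothetical helper: the name and the choice to map U+2029 to LF
    are assumptions made for this sketch.
    """
    # Remove the UTF-8 signature (bytes EF BB BF) if the file starts with it.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]

    # Decode first so the separators are replaced as characters, not bytes.
    text = raw.decode("utf-8")
    text = text.replace(LINE_SEPARATOR, "\n")
    text = text.replace(PARAGRAPH_SEPARATOR, "\n")
    return text.encode("utf-8")
```

Running the file contents through such a function before uploading 
would give the validators plain LF-terminated UTF-8 without a signature.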

For examples, see:
http://www.bromma.kfuk-kfum.se/tmp/ served as 'text/html; charset=UTF-8'
http://www.bromma.kfuk-kfum.se/tmp/standard served as 'text/css; charset=UTF-8'

At a minimum, I would expect the Unicode line and paragraph separators 
(U+2028 and U+2029) to be recognized as white space. Shouldn't they be? 
Also, do you recommend using the BOM as a UTF-8 signature, or should it 
be omitted?

Kind regards,
-- 
Christian Ottosson
http://www.sbc.su.se/~christian/
Received on Monday, 16 October 2000 09:28:47 GMT
