
RE: Strange advice re BOM and UTF-8

From: Richard Ishida <ishida@w3.org>
Date: Wed, 6 Dec 2006 15:43:33 -0000
To: "'Richard Ishida'" <ishida@w3.org>, "'Chris Lilley'" <chris@w3.org>, <www-validator@w3.org>
Cc: <www-international@w3.org>
Message-ID: <012601c7194d$4a03fce0$6501a8c0@w3cishida>

I just checked, and the blank line in the PHP file appears in IE7 too.
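
For anyone who wants to check whether an included file carries the signature, here is a minimal sketch in Python 3. The function names and the sample bytes are made up for illustration and are not taken from the test pages. It detects the three signature bytes (EF BB BF), strips them, and shows what they look like when read as Windows-1252 text, which is one way the unexpected characters can end up on screen.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 signature (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_signature(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 signature, if one is present."""
    if has_utf8_signature(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical contents of an included file that was saved with the signature.
included = codecs.BOM_UTF8 + b"<p>hello</p>\n"

print(has_utf8_signature(included))
print(strip_utf8_signature(included).decode("utf-8"), end="")

# The signature bytes as they appear when misread as Windows-1252 text.
print(included[:3].decode("cp1252"))
```

Stripping the signature from included fragments only, and leaving it on standalone files if wanted, is one possible approach; whether it is appropriate depends on how the files are produced and served.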

RI


============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/
 

> -----Original Message-----
> From: www-international-request@w3.org 
> [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> Sent: 06 December 2006 15:39
> To: 'Chris Lilley'; www-validator@w3.org
> Cc: www-international@w3.org
> Subject: RE: Strange advice re BOM and UTF-8
> 
> 
> None of the things you say are incorrect, and it would be 
> nice to be able to say that it's ok to use the utf-8 
> signature. However, some applications - such as a text editor 
> or a browser - have been known to display the BOM as an extra 
> line in the file; others will display unexpected characters, 
> such as ï»¿.
> 
> Note that the wording refers to problems caused by user 
> agents when displaying text with a signature.  
> 
> It might be worth testing whether this is still generally the 
> case, however, or whether applications have indeed improved 
> significantly in the last year or so.
> 
> I have a test at
> http://www.w3.org/International/tests/sec-utf8-signature-1.html
> which seems to indicate that the latest versions of IE, Firefox and
> Opera on Windows cope ok with the utf-8 signature 
> in embedded files.  I have seen this problem recently, 
> however, in files included into PHP that have the signature.  
> I have temporarily created an example at 
> http://www.w3.org/International/questions/qa-css-charset.vi.php
> (I will fix this tomorrow.)  Look at it in Firefox, and it is fine;
> look at it in IE6, and there's a blank line at the top 
> of the page. (Compare the IE page with one of the other 
> translations of the same article.) (The BOM is in an included file.)
> 
> RI
> 
> 
> 
> 
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
> 
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>  
> 
> > -----Original Message-----
> > From: www-international-request@w3.org 
> > [mailto:www-international-request@w3.org] On Behalf Of Chris Lilley
> > Sent: 06 December 2006 14:35
> > To: www-validator@w3.org
> > Cc: www-international@w3.org
> > Subject: Strange advice re BOM and UTF-8
> > 
> > 
> > Hello www-validator,
> > 
> > I was surprised to see, on the W3C DTD validator, the following 
> > advice:
> > 
> >   The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to
> >   cause problems for some text editors and older browsers. You may
> >   want to consider avoiding its use until it is better supported.
> > 
> > This is odd because the use of a BOM with UTF-8 files is
> > 
> > a) standards compliant, to Unicode and to XML and to CSS
> > b) common practice
> > c) allows text editors to auto-detect the encoding of a plain text 
> > document.
> > 
> > I believe therefore that the advice is incorrect and indeed 
> > potentially damaging.
> > 
> > 
> > -- 
> >  Chris Lilley                    mailto:chris@w3.org
> >  Interaction Domain Leader
> >  Co-Chair, W3C SVG Working Group
> >  W3C Graphics Activity Lead
> >  Co-Chair, W3C Hypertext CG
> > 
> > 
> 
> 
Received on Wednesday, 6 December 2006 15:43:42 GMT
