- From: McDonald, Ira <imcdonald@sharplabs.com>
- Date: Wed, 6 Dec 2006 09:00:31 -0800
- To: 'Richard Ishida' <ishida@w3.org>, 'Chris Lilley' <chris@w3.org>, www-validator@w3.org
- Cc: www-international@w3.org
Hi,

FWIW - the IETF's formal definition of UTF-8 (RFC 3629) recommends
very strongly AGAINST the use of BOM in UTF-8 in all IETF protocols
because:

(a) it's useless as a signature (a small fragment of UTF-8 can be
    reliably auto-detected without BOM);
(b) it's dangerous because it breaks string concatenation.

Cheers,
- Ira

Ira McDonald (Musician / Software Architect)
Chair - FSG Open Printing Steering Committee
Blue Roof Music / High North Inc
PO Box 221 Grand Marais, MI 49839
phone: +1-906-494-2434
email: imcdonald@sharplabs.com

-----Original Message-----
From: www-international-request@w3.org
[mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
Sent: Wednesday, December 06, 2006 10:44 AM
To: 'Richard Ishida'; 'Chris Lilley'; www-validator@w3.org
Cc: www-international@w3.org
Subject: RE: Strange advice re BOM and UTF-8

I just checked, and the blank line in the PHP file appears in IE7 too.

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> Sent: 06 December 2006 15:39
> To: 'Chris Lilley'; www-validator@w3.org
> Cc: www-international@w3.org
> Subject: RE: Strange advice re BOM and UTF-8
>
>
> None of the things you say are incorrect, and it would be
> nice to be able to say that it's ok to use the utf-8
> signature, however, some applications - such as a text editor
> or a browser - have been known to display the BOM as an extra
> line in the file, others will display unexpected characters,
> such as ï»¿.
>
> Note that the wording refers to problems caused by user
> agents when displaying text with signature.
>
> It might be worth testing whether this is still generally the
> case, however, or whether applications have indeed improved
> significantly in the last year or so.
>
> I have a test at
> http://www.w3.org/International/tests/sec-utf8-signature-1.html
> which seems to indicate that the latest versions of IE,
> Firefox and Opera on Windows cope ok with the utf-8 signature
> in embedded files. I have seen this problem recently,
> however, in files included into PHP that have the signature.
> I have temporarily created an example at
> http://www.w3.org/International/questions/qa-css-charset.vi.php
> (I will fix this tomorrow.) Look at it in Firefox, and it is
> fine - look at it in IE6, and there's a blank line at the top
> of the page. (Compare the IE page with one of the other
> translations of the same article.) (The BOM is in an included file.)
>
> RI
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of Chris Lilley
> > Sent: 06 December 2006 14:35
> > To: www-validator@w3.org
> > Cc: www-international@w3.org
> > Subject: Strange advice re BOM and UTF-8
> >
> >
> > Hello www-validator,
> >
> > I was surprised to see, on the W3C DTD validator, the following
> > advice:
> >
> >   The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is
> >   known to cause problems for some text editors and older
> >   browsers. You may want to consider avoiding its use until it
> >   is better supported.
> >
> > This is odd because the use of a BOM with UTF-8 files is
> >
> > a) standards compliant, to Unicode and to XML and to CSS
> > b) common practice
> > c) allows text editors to auto-detect the encoding of a plain text
> >    document.
> >
> > I believe therefore that the advice is incorrect and indeed
> > potentially damaging.
> >
> >
> > --
> > Chris Lilley mailto:chris@w3.org
> > Interaction Domain Leader
> > Co-Chair, W3C SVG Working Group
> > W3C Graphics Activity Lead
> > Co-Chair, W3C Hypertext CG
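
To make points (a) and (b) above concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from RFC 3629 or from the validator: the helper names (`with_signature`, `looks_like_utf8`, `join_without_signatures`) are hypothetical, and the "auto-detection" shown is just a strict trial decode, which is a heuristic (pure ASCII input passes it too).

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def with_signature(text: str) -> bytes:
    """Encode text as UTF-8 with a leading signature (hypothetical helper)."""
    return BOM + text.encode("utf-8")


def looks_like_utf8(data: bytes) -> bool:
    """Heuristic for point (a): does the data decode as strict UTF-8?"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def join_without_signatures(parts) -> str:
    """Strip the signature from every part before joining."""
    return "".join(p.decode("utf-8-sig") for p in parts)


part_a = with_signature("Hello, ")
part_b = with_signature("world")

# Point (b): naive byte concatenation, as a file include would do.
# "utf-8-sig" removes only the leading signature, so the second one
# survives as U+FEFF in the middle of the string.
naive = (part_a + part_b).decode("utf-8-sig")

# Joining after stripping each part's signature avoids the stray character.
clean = join_without_signatures([part_a, part_b])

# A consumer that reads the signature bytes as Latin-1 shows three
# unexpected characters instead of nothing.
misread = BOM.decode("latin-1")

# Point (a): non-ASCII text in a legacy encoding fails the trial decode,
# while the same text in UTF-8 passes without any signature.
legacy_bytes = "café".encode("latin-1")
utf8_bytes = "café".encode("utf-8")

print(repr(naive), repr(clean), repr(misread))
print(looks_like_utf8(legacy_bytes), looks_like_utf8(utf8_bytes))
```

The stray U+FEFF in `naive` is the kind of character that can surface as a blank line or as odd glyphs when a file carrying the signature is included into another file.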
Received on Wednesday, 6 December 2006 17:01:03 UTC