- From: Richard Ishida <ishida@w3.org>
- Date: Wed, 6 Dec 2006 15:39:21 -0000
- To: "'Chris Lilley'" <chris@w3.org>, <www-validator@w3.org>
- Cc: <www-international@w3.org>
None of the things you say are incorrect, and it would be nice to be able to say that it's ok to use the utf-8 signature. However, some applications - such as a text editor or a browser - have been known to display the BOM as an extra line in the file, and others will display unexpected characters, such as ï»¿. Note that the wording refers to problems caused by user agents when displaying text with a signature.

It might be worth testing whether this is still generally the case, however, or whether applications have indeed improved significantly in the last year or so. I have a test at http://www.w3.org/International/tests/sec-utf8-signature-1.html which seems to indicate that the latest versions of IE, Firefox and Opera on Windows cope ok with the utf-8 signature in embedded files.

I have seen this problem recently, however, in files included into PHP that have the signature. I have temporarily created an example at http://www.w3.org/International/questions/qa-css-charset.vi.php (I will fix this tomorrow.) Look at it in Firefox, and it is fine - look at it in IE6, and there's a blank line at the top of the page. (Compare the IE page with one of the other translations of the same article.) (The BOM is in an included file.)

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org] On Behalf Of Chris Lilley
> Sent: 06 December 2006 14:35
> To: www-validator@w3.org
> Cc: www-international@w3.org
> Subject: Strange advice re BOM and UTF-8
>
> Hello www-validator,
>
> I was surprised to see, on the W3C DTD validator, the following advice:
>
>   The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to
>   cause problems for some text editors and older browsers. You may
>   want to consider avoiding its use until it is better supported.
>
> This is odd because the use of a BOM with UTF-8 files is
>
> a) standards compliant, to Unicode and to XML and to CSS
> b) common practice
> c) allows text editors to auto-detect the encoding of a plain
>    text document.
>
> I believe therefore that the advice is incorrect and indeed
> potentially damaging.
>
> --
> Chris Lilley mailto:chris@w3.org
> Interaction Domain Leader
> Co-Chair, W3C SVG Working Group
> W3C Graphics Activity Lead
> Co-Chair, W3C Hypertext CG
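
For anyone wanting to check their own included files for the signature discussed in this thread, the following is a minimal sketch in Python, not anything taken from the validator or from the test pages. The helper names `has_utf8_signature` and `strip_utf8_signature` and the sample bytes are made up for illustration; only `codecs.BOM_UTF8` comes from the standard library.

```python
import codecs

# codecs.BOM_UTF8 is the three-byte UTF-8 signature: EF BB BF.

def has_utf8_signature(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 signature."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_signature(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 signature, if one is present."""
    if has_utf8_signature(data):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical content of an included file that was saved with a signature.
sample = codecs.BOM_UTF8 + "<?php echo 'hi'; ?>".encode("utf-8")

print(has_utf8_signature(sample))                    # signature detected
print(strip_utf8_signature(sample).decode("utf-8"))  # content without it

# Reading the signature bytes as Latin-1 gives the stray characters that
# some applications display at the top of the page.
print(codecs.BOM_UTF8.decode("latin-1"))
```

Working on raw bytes rather than decoded text keeps the check independent of whatever encoding the application assumes for the file.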
Received on Wednesday, 6 December 2006 15:39:37 UTC