- From: McDonald, Ira <imcdonald@sharplabs.com>
- Date: Wed, 6 Dec 2006 09:00:31 -0800
- To: 'Richard Ishida' <ishida@w3.org>, 'Chris Lilley' <chris@w3.org>, www-validator@w3.org
- Cc: www-international@w3.org
Hi,
FWIW - the IETF's formal definition of UTF-8 (RFC 3629)
recommends very strongly AGAINST the use of BOM in UTF-8
in all IETF protocols because:
(a) it's useless as a signature (a small fragment of
UTF-8 can be reliably auto-detected without BOM);
(b) it's dangerous because it breaks string concatenation.
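
To make points (a) and (b) concrete, here is a minimal Python sketch.
It is an illustration only, not something taken from RFC 3629. The
sample strings and the helper name looks_like_utf8 are hypothetical,
and strict decoding is just one simple heuristic for detecting UTF-8
without a signature (note that pure ASCII also passes it).

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def looks_like_utf8(data: bytes) -> bool:
    """Hypothetical helper: treat data as UTF-8 if it decodes strictly."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Two hypothetical fragments, each saved with a UTF-8 signature.
# \u00e9 is e-acute and \u00f6 is o-umlaut.
part_a = BOM + "h\u00e9llo ".encode("utf-8")
part_b = BOM + "w\u00f6rld".encode("utf-8")

# (b) Naive byte concatenation: the "utf-8-sig" codec strips only the
# leading signature, so the second BOM survives as U+FEFF mid-string.
joined = (part_a + part_b).decode("utf-8-sig")
stray_bom_count = joined.count("\ufeff")

# (a) Detection without a signature: the same text encoded as Latin-1
# is not valid UTF-8, so strict decoding tells the two apart.
plain_utf8 = "h\u00e9llo w\u00f6rld".encode("utf-8")
plain_latin1 = "h\u00e9llo w\u00f6rld".encode("latin-1")

if __name__ == "__main__":
    print(repr(joined))
    print(stray_bom_count)
    print(looks_like_utf8(plain_utf8), looks_like_utf8(plain_latin1))
```

The stray U+FEFF in the joined string is invisible in most displays
but still takes part in comparisons, lengths and searches, which is
the concatenation hazard described in (b).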
Cheers,
- Ira
Ira McDonald (Musician / Software Architect)
Chair - FSG Open Printing Steering Committee
Blue Roof Music / High North Inc
PO Box 221 Grand Marais, MI 49839
phone: +1-906-494-2434
email: imcdonald@sharplabs.com
-----Original Message-----
From: www-international-request@w3.org
[mailto:www-international-request@w3.org]On Behalf Of Richard Ishida
Sent: Wednesday, December 06, 2006 10:44 AM
To: 'Richard Ishida'; 'Chris Lilley'; www-validator@w3.org
Cc: www-international@w3.org
Subject: RE: Strange advice re BOM and UTF-8
I just checked, and the blank line in the PHP file appears in IE7 too.
RI
============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)
http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/
> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> Sent: 06 December 2006 15:39
> To: 'Chris Lilley'; www-validator@w3.org
> Cc: www-international@w3.org
> Subject: RE: Strange advice re BOM and UTF-8
>
>
> None of the things you say are incorrect, and it would be
> nice to be able to say that it's ok to use the utf-8
> signature. However, some applications - such as a text editor
> or a browser - have been known to display the BOM as an extra
> line in the file; others will display unexpected characters,
> such as ï»¿.
>
> Note that the wording refers to problems caused by user
> agents when displaying text with signature.
>
> It might be worth testing whether this is still generally the
> case, however, or whether applications have indeed improved
> significantly in the last year or so.
>
> I have a test at
> http://www.w3.org/International/tests/sec-utf8-signature-1.html
> which seems to indicate that the latest versions of IE, Firefox and
> Opera on Windows cope ok with the utf-8 signature
> in embedded files. I have seen this problem recently,
> however, in files included into PHP that have the signature.
> I have temporarily created an example at
> http://www.w3.org/International/questions/qa-css-charset.vi.php
> (I will fix this tomorrow.) Look at it in Firefox, and it is fine -
> look at it in IE6, and there's a blank line at the top
> of the page. (compare the IE page with one of the other
> translations of the same article) (The bom is in an included file.)
>
> RI
>
>
>
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>
>
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of Chris Lilley
> > Sent: 06 December 2006 14:35
> > To: www-validator@w3.org
> > Cc: www-international@w3.org
> > Subject: Strange advice re BOM and UTF-8
> >
> >
> > Hello www-validator,
> >
> > I was surprised to see, on the W3C DTD validator, the following
> > advice:
> >
> > The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to
> > cause problems for some text editors and older browsers. You may
> > want to consider avoiding its use until it is better supported.
> >
> > This is odd because the use of a BOM with UTF-8 files is
> >
> > a) standards compliant, to Unicode and to XML and to CSS
> > b) common practice
> > c) allows text editors to auto-detect the encoding of a plain text
> > document.
> >
> > I believe therefore that the advice is incorrect and indeed
> > potentially damaging.
> >
> >
> > --
> > Chris Lilley mailto:chris@w3.org
> > Interaction Domain Leader
> > Co-Chair, W3C SVG Working Group
> > W3C Graphics Activity Lead
> > Co-Chair, W3C Hypertext CG
> >
> >
>
>
Received on Wednesday, 6 December 2006 17:01:03 UTC