RE: Comments on draft-yergeau-rfc2279bis-00.txt

Ian,

> 
> I think we need to carefully distinguish that while Unicode 3.2
> and ISO 10646:2000 allow (and seem to encourage) leading BOM
> in UTF-8, 

The UTC does not encourage a leading BOM in UTF-8, but I agree
that the wording to date has not been very clear. The Unicode 4.0
text, when it is finally published, will take a much clearer
stance. It will state that a leading BOM in UTF-8 is "neither
required nor recommended", but that its presence does not make
UTF-8 non-conformant. The reason is round-trip conversion between
UTF-8 and UTF-16 or UTF-32, where a BOM can make more sense.
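
To make the round-trip point concrete, here is a minimal Python
sketch. It assumes a UTF-16 stream that starts with U+FEFF and is
transcoded to UTF-8 character by character; the helper name
strip_utf8_bom is hypothetical and only for illustration. It shows
one way the BOM can end up in UTF-8 data and how a receiver might
tolerate it, not how any particular protocol must behave.

```python
import codecs

# U+FEFF encoded in UTF-8 is the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_utf8_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop a leading UTF-8 BOM if one is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


# A UTF-16 (big-endian) stream that begins with a BOM (FE FF).
utf16 = "\ufeffhello".encode("utf-16-be")

# Transcoding every character, including U+FEFF, carries the BOM
# over into the UTF-8 output.
utf8 = utf16.decode("utf-16-be").encode("utf-8")

# A receiver that does not want the BOM can remove it; input
# without a BOM passes through unchanged.
payload = strip_utf8_bom(utf8)
```

Python's built-in "utf-8-sig" codec does the same stripping on
decode, so the helper is only needed when working on raw bytes.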

> an IETF 'standards track' RFC that describes UTF-8
> usage _for_Internet_protocols_ should preferably say:
> 
> 1)  Historically, leading BOM usage in the UTF-8 encoding
>     has been allowed by ISO 10646.
> 2)  All Internet protocols SHOULD NOT specify or encourage
>     leading BOM usage in the UTF-8 encoding.

I agree that this is the general stance that the RFC should
take.

--Ken

> 
> (the above wording obviously can be improved - Martin probably
> said it better already - if I could only find his note...)
> 

Received on Wednesday, 2 October 2002 22:36:04 UTC