RE: Comments on draft-yergeau-rfc2279bis-00.txt

Hi,

I can't find Martin Duerst's suggested revisions but...

This IETF standard should NOT encourage the use of leading BOM in
streams of UTF-8 text.  The optional use of leading BOM in UTF-8 (as
I know Martin said) destroys the crucial property that US-ASCII
is a perfect subset of UTF-8 and that US-ASCII can pass _without
harm_ through UTF-8 handling software libraries.
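
As an illustration of the subset property (a minimal sketch, not taken
from the draft; the sample string and the helper is_us_ascii are
hypothetical), pure US-ASCII text encodes to the same bytes under
US-ASCII and UTF-8, but prepending the UTF-8 BOM (EF BB BF) yields
bytes that are no longer valid US-ASCII:

```python
import codecs


def is_us_ascii(data: bytes) -> bool:
    """Return True if every byte is in the US-ASCII range 0x00-0x7F."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


text = "printer-name"                 # hypothetical sample value
plain = text.encode("utf-8")          # no BOM
with_bom = codecs.BOM_UTF8 + plain    # leading EF BB BF added

# Without a BOM the UTF-8 bytes are exactly the US-ASCII bytes.
print(plain == text.encode("ascii"))
print(is_us_ascii(plain))

# With a BOM an ASCII-only consumer can no longer accept the data.
print(is_us_ascii(with_bom))
```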

Specifically, in the printer industry, the optional presence of
leading BOM in UTF-8 attribute string values sent over-the-wire
in the Internet Printing Protocol/1.1 (IPP/1.1, RFC 2910)
has caused bugs, but has _never_ provided any utility.

Software that guesses the charset encoding of arbitrary text by
detecting a leading BOM is pernicious and dangerous.

UTF-8 never needs a 'byte-order' signature.  The concatenation and
substring extraction bugs inherent in allowing/encouraging leading
BOM in UTF-8 are serious issues.
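
The concatenation and substring problems can be sketched as follows
(an illustration only, assuming Python's standard "utf-8-sig" codec as
a stand-in for BOM-aware software; the strings and the helper add_bom
are hypothetical). A BOM-aware decoder strips only one leading BOM, so
joining two BOM-prefixed strings leaves a stray U+FEFF in the middle,
and byte offsets computed for BOM-less text are shifted by three bytes:

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def add_bom(text: str) -> bytes:
    """Encode text as UTF-8 with a leading BOM (hypothetical helper)."""
    return BOM + text.encode("utf-8")


# Concatenation: each piece carries its own BOM.
first = add_bom("job-")
second = add_bom("name")
joined = (first + second).decode("utf-8-sig")  # strips the first BOM only

print(joined == "job-name")   # the second BOM survives as U+FEFF
print("\ufeff" in joined)

# Substring extraction: offsets valid for BOM-less bytes are now wrong.
without_bom = "job-name".encode("utf-8")
payload = add_bom("job-name")
print(without_bom[4:8])       # the intended slice
print(payload[4:8])           # shifted by the 3-byte BOM
```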

Cheers,
- Ira McDonald (co-editor of Printer MIB v2)
  High North Inc


-----Original Message-----
From: Patrik Fältström [mailto:paf@cisco.com]
Sent: Wednesday, October 02, 2002 5:35 PM
To: Francois Yergeau
Cc: ietf-charsets@iana.org; Bert Wijnen
Subject: Re: Comments on draft-yergeau-rfc2279bis-00.txt


On Thursday, September 19, 2002, at 06:49 AM, Francois Yergeau wrote:

> I think I have covered most outstanding comments, with the notable
> exception of the BOM issue raised by Martin Dürst. This one is neither
> trivial nor uncontroversial, and I have not seen anything resembling a
> consensus, so it remains open (no changes to the draft).

[2 weeks have passed again, and I have not seen any comments on this 
list on this]

If anyone agrees with Martin's changes and thinks text about the BOM
issue _IS_ needed, let me know no later than one week from now (i.e.
October 9).
If I don't see anyone screaming, I declare consensus for this draft, 
and I'll take over from here.

     Thanks to all of you for all your help!

         paf

Received on Wednesday, 2 October 2002 17:57:45 UTC