Re: UTF-8 and BOM

     Why do we warn people about BOM but not about surrogates, anyway?  One
is no more appropriate than the other in canonicalized UTF-8.
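
     For illustration only, here is a minimal Python sketch of checking a
byte stream for both problems. The function name check_canonical_utf8 is
hypothetical and not taken from any spec. It assumes that "surrogates" means
the three-byte sequences that would encode U+D800..U+DFFF, i.e. a lead byte
0xED followed by a byte in 0xA0..0xBF.

```python
import codecs


def check_canonical_utf8(data: bytes) -> list:
    """Return a list of problems found; an empty list means none detected.

    Hypothetical helper: flags a leading BOM and the first encoded surrogate.
    """
    problems = []

    # A UTF-8 BOM is U+FEFF encoded as the three bytes EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        problems.append("leading BOM")

    # 0xED is only ever a lead byte. In well-formed UTF-8 it is followed by
    # 0x80..0x9F; a following byte in 0xA0..0xBF would encode U+D800..U+DFFF.
    for i in range(len(data) - 1):
        if data[i] == 0xED and 0xA0 <= data[i + 1] <= 0xBF:
            problems.append(f"encoded surrogate at byte offset {i}")
            break

    return problems
```

Python's strict "utf-8" decoder already rejects encoded surrogates but
accepts a leading BOM as an ordinary U+FEFF character, so only the first
check is strictly needed on top of a decode.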

          Tom Gindin

"Joseph M. Reagle Jr." <reagle@w3.org>@w3.org on 08/22/2000 04:56:47 PM

Sent by:  w3c-ietf-xmldsig-request@w3.org


To:   "John Boyer" <jboyer@PureEdge.com>
cc:   "XML DSig" <w3c-ietf-xmldsig@w3.org>
Subject:  Re: UTF-8 and BOM



At 12:53 8/22/2000 -0700, John Boyer wrote:
 >After recently reading thoroughly over the latest DSig spec, I noticed
 >several places where we have qualified UTF-8 with the parenthetic "without
 >byte order mark" or words to that effect.
 >
 >I'm still unsure why one would ever need a BOM for UTF-8.  I thought the
 >point of UTF-8 was to have a format that could provide lots of Unicode/UCS
 >characters but not be subject to the endian disease.
 >
 >Still, I'm sure there is a reason, so could someone please explain it?

[0] was in response to [1,2]. It isn't supposed to imply that a BOM is
useful, just that it isn't done; there might be a better way to do this.
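
As a rough illustration of what "without a BOM" means at the byte level, here
is a short Python sketch. The sample text and the helper name strip_utf8_bom
are made up for this example. It relies on Python's standard "utf-8" codec,
which writes no BOM, and "utf-8-sig", which prepends one.

```python
import codecs

text = "<doc/>"  # sample input, chosen only for illustration

plain = text.encode("utf-8")         # no BOM is written
with_sig = text.encode("utf-8-sig")  # prepends the bytes EF BB BF


def strip_utf8_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop one leading UTF-8 BOM if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# U+FEFF has a single UTF-8 byte sequence, so it carries no byte-order
# information there; in UTF-16 the two byte orders differ.
bom_utf8 = "\ufeff".encode("utf-8")
bom_utf16_le = "\ufeff".encode("utf-16-le")
bom_utf16_be = "\ufeff".encode("utf-16-be")
```

The two encodings of the same text differ only in the three leading bytes,
which is enough to change a digest computed over them.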

__

[0]
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000JulSep/0058.html
Ok, I sprinkled two:
UTF-8 /+ (without a byte ordering mark (BOM)) +/
into the Signature spec (6.5.1: minimal C14N) and (7.0: XML Canonicalization
and Syntax Constraint Considerations) but it obviously needs to go in
xml-c14n.

[1]
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000AprJun/0287.html
"Adding a sentence saying that the UTF-8 produced does not start
with a BOM may be a good idea for a clarification."
[2]
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000JulSep/0045.html
"We should use the name 'UTF-8' in the specification but I hope adding
short note about no-BOM to the specification."


_________________________________________________________
Joseph Reagle Jr.
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/People/Reagle/

Received on Tuesday, 22 August 2000 17:41:54 UTC