- From: Martin Duerst <duerst@w3.org>
- Date: Sat, 05 Oct 2002 15:09:28 +0900
- To: Francois Yergeau <FYergeau@alis.com>, ietf-charsets@iana.org
At 14:11 02/10/03 -0400, Francois Yergeau wrote:
>Martin Duerst wrote:
> > Therefore, senders SHOULD NOT use the BOM in larger, usually
> > labeled, pieces of text (e.g. MIME entities), and MUST NOT
> > use it in smaller protocol elements (usually with a fixed
> > encoding). Receivers SHOULD recognize and remove the BOM
> > in larger, usually labeled, pieces of text (e.g. MIME entities).
>
>This is a far cry from banning the BOM outright and the distinction between
>larger pieces of text and smaller protocol elements seems like a useful one
>(but perhaps not worded optimally yet).
>
>Some thoughts:
>
>- Perhaps the distinction is less between larger and smaller pieces of text
>than between payloads and protocol elements proper.

I think this is another way to explain the distinction.
I suggest you add these words, but keep the others.

>- I think it would be better for *this* RFC to refrain from telling senders
>and receivers what to do with the BOM, but to offer advice to protocol
>designers. It is specific protocols that should know better where the BOM
>should be banned or allowed.

Protocols often have to work together. And protocol designers
often don't understand all the issues in i18n. So the clearer
and the more direct, the better. But if you have alternative
wording, please feel free to propose.

>- There seems to be some confusion over what stripping the BOM means in
>practice. 'Stripping' should be more like 'ignoring at appropriate times'.
>Example: my web browser gets a BOM-bearing UTF-8 page through HTTP. Whether
>or not it uses the BOM to determine that the page is in UTF-8, the browser
>should ignore it when displaying the page to me, but it should certainly not
>strip it out when I ask it to save the page to my disk (which is exactly the
>point where the BOM becomes useful, as my file system will not preserve as
>metadata the fact that this page is in UTF-8).

Please feel free to work on the exact wording.

Regards, Martin.
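
A minimal sketch of the "ignoring at appropriate times" reading of BOM handling from the browser example above, assuming a receiver that still holds the raw bytes of a UTF-8 payload. The function names `decode_for_display` and `bytes_for_saving` are hypothetical and come from no protocol or library; only `codecs.BOM_UTF8` and `bytes.decode` are standard Python.

```python
import codecs


def decode_for_display(payload: bytes) -> str:
    """Decode a UTF-8 payload for display, skipping a leading BOM if present."""
    if payload.startswith(codecs.BOM_UTF8):
        # Ignore the BOM for display purposes only; the caller's bytes are untouched.
        payload = payload[len(codecs.BOM_UTF8):]
    return payload.decode("utf-8")


def bytes_for_saving(payload: bytes) -> bytes:
    """Return the payload unchanged, so a leading BOM survives on disk."""
    return payload


# Hypothetical BOM-bearing page as it might arrive over HTTP.
page = codecs.BOM_UTF8 + "Bonjour".encode("utf-8")

shown = decode_for_display(page)   # BOM ignored for display
saved = bytes_for_saving(page)     # BOM kept for the file system
```

The point of keeping two separate paths is that the BOM is never removed from the stored bytes; it is only skipped where the text is presented to the user.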
Received on Saturday, 5 October 2002 02:17:06 UTC