- From: Martin Duerst <duerst@w3.org>
- Date: Fri, 21 May 2004 17:08:52 +0900
- To: "Michael Kay" <mhk@mhk.me.uk>, "'Henry Zongaro'" <zongaro@ca.ibm.com>, <w3c-i18n-ig@w3.org>
- Cc: <public-qt-comments@w3.org>
Hello Michael, The I18N WG (Core TF) has discussed your mail, and has asked me to reply. I'm sorry for the delay. At 17:52 04/05/06 +0100, Michael Kay wrote: > > > > I worry that we will get many complaints from users who > > are misusing > > > > these codepoints if we do this. > > > > How are they misusing these code points? The case we know is that > > bytes in the rage 0x80-0x9F are used e.g. in iso-8859-1 but with > > the intent of giving them the windows-1252 semantics. > >This was the case I had in mind. People create documents in cp1252 and >declare them as iso-8859-1. And it all works, because the errors cancel each >other out. If we oblige processors to detect this situation we will be >asking users to pay for the extra processing cost, and in return the >application that worked before will stop working. Will they thank us? >Because if they won't, we shouldn't do it. Some users will be very thankful, others won't. The users that will be thankful will be those that care about data integrity and interoperability worldwide and in the long term. They will be able to fix a problem in their data that they otherwise might not have found. As a result, they will not only produce correct, valid output, but will also make sure that their input data will work well in other circumstances, such as searching, sorting, and any kind of other processing. Not the least, with the introduction of XML 1.1, there are also such issues as the confusion betwen NEL and the three-dot elipsis. There was a time when the mentality on the Web was 'everything goes', which lead to the slippery slope of bugwards compatibility. We have learned, with great pain, that this is a dead end, and we don't want to go there anymore. XML is the clearest example of how this can be done better. And I sincerely hope that XSLT will not be tempted to go down the bugwards compatibility slope. The C1 area is forbidden in HTML exactly because it is a very easy and cheap way to help people check and (if necessary) clean up their data. RFC 2070 (http://www.ietf.org/rfc/rfc2070.txt) was written almost 10 years ago. That C1 is allowed in XML is, according to James Clark, an oversight. XML 1.1 has corrected it. > > In some way just a detail, but: There is currently no XSLT 2.0 > > code that will stop working. XSTL 1.0 doesn't have the XHTML > > output method. > >I may have lost the thread, but I thought we were discussing the HTML output >method? Okay, sorry. There is still no XSLT 2.0 code that will stop working, even for the HTML output method. And because the XHTML output method is supposed to work according to the compatibility guidelines, it of course also should forbid producing C1 character output. Regards, Martin. > > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain > >characters, > >Michael Kay
Received on Friday, 21 May 2004 04:09:36 UTC