- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Wed, 07 Dec 2011 08:48:17 +0200
2011-12-07 2:36, Leif Halvard Silli wrote:

> This entire thread started with a user problem.

As far as I can see, the problem presented was: “I still frequently break websites and webapps simply by entering my name (Faruk Ateş).”

To fix such issues, sites and applications need to be modified to _deal with_ any characters, which means that they minimally need to _parse_ input data as UTF-8 encoded. Of course their authors need to specify that the form data is to be submitted as UTF-8 encoded, normally by making the page UTF-8 encoded and declaring it as such. This is surely the most trivial side of the matter.

Pages that currently cannot handle the letter “ş” in input data would not behave any better if browsers started treating them as UTF-8 encoded, which is what the proposed change would mean. On the contrary, they would work worse. They probably currently work for some set of characters outside ASCII, such as those in ISO-8859-1, and the change would stop that, as those letters would now be transmitted as UTF-8 encoded but the form handler implies another encoding and sees the data as something completely different.

> But with the proposed change, then even users *outside* the locales that
> share the default encoding of the sloppy author's locale, would benefit.

Exactly how would _any_ user benefit from the proposed change? I have shown that for the form data issue presented, the change would create serious problems, not solve any, except in the rather theoretical case where form data processing is based on UTF-8, the page is actually UTF-8 encoded but its encoding is not declared in any way (any examples of such pages around?) and the user’s browser implies an encoding other than UTF-8. In this theoretical case, the error correction principle I’ve suggested (don’t just apply an encoding if it turns out that the page cannot be in that encoding) would probably fix the problem if the page contains non-ASCII characters.

Yucca
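
The mismatch described in the message can be illustrated with a minimal Python sketch. This is not part of the original post: the function names `submit_form` and `legacy_handler` are hypothetical, the choice of “é” as the test letter is an assumption, and the handler is assumed to decode every request as ISO-8859-1.

```python
def submit_form(text: str, page_encoding: str) -> bytes:
    """Hypothetical browser side: encode form data in the page's encoding."""
    return text.encode(page_encoding)


def legacy_handler(data: bytes) -> str:
    """Hypothetical form handler that assumes ISO-8859-1 input (an assumption)."""
    return data.decode("iso-8859-1")


# Today: the undeclared page is treated as ISO-8859-1, matching the handler.
ok = legacy_handler(submit_form("é", "iso-8859-1"))

# After the proposed change: the browser treats the page as UTF-8 and
# submits two bytes, which the unchanged handler reads as two characters.
broken = legacy_handler(submit_form("é", "utf-8"))

print(repr(ok), repr(broken))
```

In this sketch the first round trip returns the letter unchanged, while the second yields two unrelated characters, which is the "something completely different" the handler would see.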
Received on Tuesday, 6 December 2011 22:48:17 UTC