
Re: Dangers of non-UTF-8 Re: Details on internal encoding declarations

From: Alexey Proskuryakov <ap@webkit.org>
Date: Fri, 23 May 2008 14:49:12 +0400
Cc: Ian Hickson <ian@hixie.ch>, HTML WG <public-html@w3.org>
Message-Id: <2F5B7D51-FC4A-4D02-997B-6BE48F5AE5F7@webkit.org>
To: Henri Sivonen <hsivonen@iki.fi>


On May 23, 2008, at 1:15 PM, Henri Sivonen wrote:

> Note: When the document is not encoded as UTF-8, IRIs are not
> converted to URIs properly and data loss happens in form
> submissions when the user enters characters that cannot be mapped to
> bytes using the encoding of the document.


FWIW, Firefox and Safari (not sure about IE) encode such characters in
form data as numeric character references, so no data is lost. Not all
servers handle this correctly, but some do (e.g.
http://www.google.ru/search?q=%26%231090%3B%26%231077%3B%26%231089%3B%26%231090%3B).
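
To make the behaviour concrete, here is a rough Python sketch of what the
query string in that URL looks like when built by hand. It is not how any
browser implements this; the function names and the choice of iso-8859-1 as
the page encoding are assumptions made purely for illustration.

```python
from html import unescape
from urllib.parse import quote_plus, unquote_plus


def encode_form_value(value: str, page_encoding: str) -> str:
    """Sketch of the fallback described above (hypothetical helper)."""
    # Characters the page encoding cannot represent become &#NNNN; references.
    raw = value.encode(page_encoding, errors="xmlcharrefreplace")
    # Percent-encode the resulting bytes for use in a query string.
    return quote_plus(raw)


def decode_form_value(encoded: str, page_encoding: str) -> str:
    """Hypothetical server-side counterpart that undoes both steps."""
    raw = unquote_plus(encoded, encoding=page_encoding)
    # Turn numeric character references back into the original characters.
    return unescape(raw)


if __name__ == "__main__":
    # Assumed page encoding: iso-8859-1, which has no Cyrillic letters.
    print(encode_form_value("тест", "iso-8859-1"))
```

A server that decodes this way cannot tell a browser-generated reference
from a user who literally typed "&#1090;" into the field, which may be one
reason why not every server bothers.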

- WBR, Alexey Proskuryakov
Received on Friday, 23 May 2008 10:49:59 UTC
