Re: Dangers of non-UTF-8 Re: Details on internal encoding declarations

On May 23, 2008, at 13:49, Alexey Proskuryakov wrote:

> On May 23, 2008, at 1:15 PM, Henri Sivonen wrote:
>
>> Note: When the document is not encoded as UTF-8, IRIs are not  
>> converted to URIs properly, and data loss happens in form  
>> submissions when the user enters characters that cannot be mapped  
>> to bytes using the encoding of the document.
>
>
> FWIW, Firefox and Safari (not sure about IE) encode form data using  
> numeric entities in this case, so data loss doesn't happen.


I am aware of this. The server cannot tell whether the user typed the  
character itself or a literal string that looks like an NCR, so I think  
this is still data loss in the strict sense.
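
To make the ambiguity concrete, here is a rough Python sketch. It only
approximates the browser behavior described above: encode_form_value is a
hypothetical helper, ISO-8859-1 stands in for "some non-UTF-8 document
encoding", and Python's xmlcharrefreplace error handler is used to imitate
the NCR fallback. It is not how any particular browser is implemented.

```python
from urllib.parse import quote_from_bytes


def encode_form_value(text: str, charset: str) -> str:
    """Hypothetical stand-in for a browser encoding one form field value.

    Characters that the charset cannot represent are replaced with decimal
    numeric character references (&#N;), then the bytes are percent-encoded.
    """
    raw = text.encode(charset, errors="xmlcharrefreplace")
    return quote_from_bytes(raw)


# Case 1: the user types U+2603 SNOWMAN, which ISO-8859-1 cannot represent.
typed_character = encode_form_value("\u2603", "iso-8859-1")

# Case 2: the user literally types the seven characters "&#9731;".
typed_ncr_text = encode_form_value("&#9731;", "iso-8859-1")

# Both submissions are byte-for-byte identical on the wire.
print(typed_character == typed_ncr_text)

# With a UTF-8 document the two inputs stay distinguishable.
utf8_character = encode_form_value("\u2603", "utf-8")
utf8_ncr_text = encode_form_value("&#9731;", "utf-8")
print(utf8_character != utf8_ncr_text)
```

In the ISO-8859-1 case the server receives the same bytes for both inputs,
so it has to guess which one the user meant.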

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 23 May 2008 11:03:30 UTC