Re: UTF-8 encoding error

On 29/05/2023 17:11, Norm Tovey-Walsh wrote:
>> I got the first submission to my processor this week with a UTF-8
>> encoding error, which managed to hang the processor.
> Curiously, I have no trouble with the grammar. But I also haven’t
> provided any way for the user to specify an encoding, so I’m not sure
> what Java is doing.

My processor seems to have substituted the replacement character
(U+FFFD, decimal 65533) for the #b7.

The file I got through email contains the incorrect byte sequence (i.e.
no #c2 before the #b7), but it looks as if, when it is injected into the
browser context (using JavaScript FileReader.readAsText(file, 'UTF-8')),
the errant byte is converted to the replacement character.
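
The substitution can be reproduced outside a browser with a minimal sketch like the one below. It uses the standard TextDecoder API rather than FileReader, on the assumption that both apply the same lenient UTF-8 decoding. The byte arrays and the helper names decodeLenient and isWellFormedUtf8 are illustrative only and are not taken from either processor.

```javascript
// A lone 0xB7 is a UTF-8 continuation byte with no lead byte before it,
// so it is malformed. The correct encoding of U+00B7 is 0xC2 0xB7.
const badBytes = new Uint8Array([0x61, 0xb7, 0x62]);        // "a", stray #b7, "b"
const goodBytes = new Uint8Array([0x61, 0xc2, 0xb7, 0x62]); // "a", #c2 #b7, "b"

// Default TextDecoder behaviour: malformed input becomes U+FFFD.
const lenientDecoder = new TextDecoder('utf-8');
// With fatal: true, malformed input throws a TypeError instead.
const strictDecoder = new TextDecoder('utf-8', { fatal: true });

// Hypothetical helper: decode bytes, replacing anything malformed.
function decodeLenient(bytes) {
  return lenientDecoder.decode(bytes);
}

// Hypothetical helper: report whether the bytes are well-formed UTF-8.
function isWellFormedUtf8(bytes) {
  try {
    strictDecoder.decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// Code point at index 1 of each decoded string.
const badCodePoint = decodeLenient(badBytes).codePointAt(1);
const goodCodePoint = decodeLenient(goodBytes).codePointAt(1);
console.log(badCodePoint, goodCodePoint, isWellFormedUtf8(badBytes));
```

A processor that would rather reject such input than carry U+FFFD through to the parser could, under the same assumption, run a strict check like isWellFormedUtf8 on the raw bytes first.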


John Lumley

Received on Tuesday, 30 May 2023 10:24:18 UTC