[Bug 23646] "us-ascii" should not be an alias for "windows-1252"

https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646

--- Comment #11 from Jirka Kosek <jirka@kosek.cz> ---
(In reply to Anne from comment #10)
> Why would the XML parser result in an error? Surely it should use the same
> encoding layer.

Because 0xA9 is an invalid byte in a 7-bit encoding. I have tried two randomly
chosen XML parsers and both choke on this example:

$ cat test.xml
<?xml version="1.0" encoding="us-ascii"?>
<test>©</test>

$ xmllint --noout test.xml 
I/O error : encoder error
test.xml:2: parser error : Premature end of data in tag test line 2
<test>
      ^

$ xjparse test.xml
Attempting validating, namespace-aware parse
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Byte
"169" is not a member of the (7-bit) ASCII character set.

So in my opinion the Encoding spec breaks compatibility with existing content
and implementations with regard to the "us-ascii" encoding.


Received on Saturday, 28 June 2014 10:47:43 UTC