Re: SVG 1.2 Comment: image/svg+xml;charset=""

Bjoern Hoehrmann wrote:
> * Robin Berjon wrote:
>>If you disagree, take it to the TAG because we're sure sticking to their 
>>recommendation:
>>
>>  http://www.w3.org/TR/webarch/Overview.html#xml-media-types
> 
> <http://www.imc.org/ietf-xml-mime/mail-archive/msg00984.html>:
> 
> [...]
>   I do not see any way to justify removing the 'charset' parameter
>   based on 'good practice' advice in the Web Architecture document
> [...]

I am aware of Martin's arguments, but I can't say that I agree. The 
charset parameter allows one to make broken XML "right" using external 
information, which is a direct violation of XML's draconian design.
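
To make that concrete, here is a rough Python sketch (my own 
illustration, not taken from any spec). The sample document and the 
helper names parses_as_declared and parse_with_external_charset are 
made up for the example. The bytes carry no encoding declaration, so 
XML says they must be UTF-8 or UTF-16; they are really Shift_JIS, 
which makes the entity not well-formed until something outside the 
document says otherwise.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three Japanese characters (U+65E5 U+672C U+8A9E)
# serialized as Shift_JIS with NO encoding declaration, so an XML
# processor has to assume UTF-8 (there is no BOM either).
TEXT = "\u65e5\u672c\u8a9e"
RAW = ("<foo>" + TEXT + "</foo>").encode("shift_jis")


def parses_as_declared(data: bytes) -> bool:
    """Return True if the bytes are well-formed XML on their own terms."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def parse_with_external_charset(data: bytes, charset: str) -> ET.Element:
    """Decode with an externally supplied charset, then parse.

    This mimics a consumer that lets the media type's charset parameter
    override what the XML entity says about itself.
    """
    return ET.fromstring(data.decode(charset))


if __name__ == "__main__":
    print(parses_as_declared(RAW))
    print(parse_with_external_charset(RAW, "shift_jis").text == TEXT)
```

The same bytes are an error for one consumer and a perfectly good 
document for another, depending only on whether the external label is 
honoured.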

I also fail to see many use cases that would justify the cost in 
awkwardness and potential for conflict in the charset parameter.

Finally, if experience with RSS means anything, it's quite clear that 
this is a feature that is highly likely to be misused. Most web servers 
nowadays provide a default charset parameter to avoid XSS attacks, and 
asking users to turn it off (or, most of the time, to have someone else 
turn it off) is unrealistic. What we'll get is broken content that UAs 
will accept silently anyway. Let's just not go there.

> SVG processors and general purpose XML processors need to determine the
> same character encoding for image/svg+xml;charset=... resources, nothing
> in RFC3023 or the current RFC3023bis ensures that if the SVG 1.2 spec
> requires implementations to ignore the charset parameter.

There's an excellent way for them to determine the same character 
encoding: don't use charset. XML did a brilliant job of getting this 
mostly right; all that charset does on XML media types is break it. 
Using charset with XML media types (or other horrors such as text/xml) 
leads to everything but interoperability.

Take for instance:

[~]$ HEAD http://expway.com/robin/foo.xml.sjis | grep Content-Type
Content-Type: application/xml; charset=shift_jis
[~]$ xmllint http://expway.com/robin/foo.xml.sjis
<?xml version="1.0" encoding="UTF-8"?>
<foo>יגגיייי</foo>

Is that conformant? What do you think most XML parsers do?
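
For anyone who wants to check their own content for this kind of 
mismatch, a minimal Python sketch follows. It is an illustration under 
stated assumptions, not a conformance checker: the helper names are 
hypothetical, the XML declaration is read with a simple regular 
expression, and BOMs and UTF-16 entities are deliberately ignored.

```python
import codecs
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of an ASCII-compatible entity. Simplification: no BOM or UTF-16
# handling.
XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("charset")


def declared_encoding(body):
    """Return the encoding named in the XML declaration, or None."""
    match = XML_DECL_ENCODING.match(body)
    return match.group(1).decode("ascii") if match else None


def encodings_conflict(content_type, body):
    """True when both labels are present and name different encodings.

    codecs.lookup() normalizes aliases (e.g. "utf8" vs "UTF-8") and raises
    LookupError for names Python does not know.
    """
    external = header_charset(content_type)
    internal = declared_encoding(body)
    if external is None or internal is None:
        return False
    return codecs.lookup(external).name != codecs.lookup(internal).name


if __name__ == "__main__":
    body = b'<?xml version="1.0" encoding="UTF-8"?>\n<foo/>'
    print(encodings_conflict("application/xml; charset=shift_jis", body))
```

Run against the headers and declaration from the transcript above, it 
reports a conflict; it says nothing about which of the two labels a 
given parser will actually follow, which is exactly the problem.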

Part of the purpose of registering a media type is that you get to 
register its parameters. Dropping charset for image/svg+xml certainly 
matches my reading of the AWWW, and while it removes some very minor 
uses, it makes the whole system far more robust. If that leaves 
inconsistencies with 'bis, then fix the latter :)

> Using the parameter and specifying processing when it is used are quite
> different matters. I for example could live with registration text in
> SVG 1.2 that has a charset parameter but states that using it is
> STRONGLY DISCOURAGED or something.

If it's strongly discouraged, it shouldn't be made possible at all.

-- 
Robin Berjon

Received on Tuesday, 23 November 2004 16:24:44 UTC