- From: Chris Lilley <chris@w3.org>
- Date: Sat, 09 Jun 2001 00:03:48 +0200
- To: "McDonald, Ira" <imcdonald@sharplabs.com>
- CC: John Cowan <cowan@mercury.ccil.org>, Bjoern Hoehrmann <derhoermi@gmx.net>, www-international@w3.org, phoffman@imc.org
"McDonald, Ira" wrote: > > Hi Chris, > > I hope you meant to say, XML which is encoded in UTF-16 should > not be served as "text/xml". No, unfortunately, that was not the logical conclusion that I was able to draw. > XML which is encoded in UTF-8 is > perfectly safe to serve as "text/xml" and SHOULD be. XML which is encoded in UTF-8 and XML which is encoded in UTF-16 can both omit the encoding declaration. The server will have a hard time telling them apart. In addition, since both UTF-8 and UTF-16 are required to be supported, software might well convert between these two encodings based on, for example, whichever gives the smaller file size. > > Oddly, RFC 3023 (XML Media Types) actually discusses using > "text/xml" with UTF-16 encoding ONLY over HTTP transport > (how this could be safe for the receiver AFTER the resource > is moved by HTTP transport is not explained in RFC 3023). Yes, exactly. But the point is not the handling when text/xml is recognised. The point is that text/* has a lot of (IMHO) unfortunate rules which apply to the entire text/* hierarchy, and one of those is the requirement to be able to blindly assume things about end of line markers. So proxies, middleware, mail-to-we gateways and so forth are unfortunately allowed to wreak havoc on XML files based on some rather dated assumptions. Thus, the safe way to ship XML and ensure end-to-end integrity is to use a non-text type such as application/xml or soem more specific type such as image/svg+xml, application/xhtml+xml, and so forth. -- Chris
Received on Friday, 8 June 2001 18:04:40 UTC