- From: Ian Hickson <ian@hixie.ch>
- Date: Thu, 1 Mar 2007 01:58:31 +0000 (UTC)
On Sat, 9 Apr 2005, Lachlan Hunt wrote:
>
> In the current draft, for specifying the character encoding [1], it is
> stated:
>
> | In XHTML, the XML declaration should be used for inline character
> | encoding information.
> |
> | Authors should avoid including inline character encoding information.
> | Character encoding information should instead be included at the
> | transport level (e.g. using the HTTP Content-Type header).
>
> The second paragraph should only apply to HTML using the meta element,
> not XHTML using the XML declaration.

I don't understand why it would be ok for one and not the other.

> For X(HT)ML, according to the Architecture of the World Wide Web, Volume
> One - Media types for XML [2]:
> [2] http://www.w3.org/TR/2004/REC-webarch-20041215/#xml-media-types
>
> | In general, a representation provider SHOULD NOT specify the character
> | encoding for XML data in protocol headers since the data is
> | self-describing.

I personally disagree with the arguments above (transcoding proxies mean
that the content really can't know what its content is, and therefore it
shouldn't be saying what its encoding is).

I could see an argument for removing the advice from the HTML5 spec
altogether, though. What do you think?

> I think it should also be noted that authors who omit the XML
> declaration (or include it but don't specify the encoding attribute)
> *must* use UTF-8 or UTF-16, as described in the XML recommendation.

If you specify the HTTP headers, you could use anything, even, say,
GSM03.38 or UTF-EBCDIC.

On Sat, 9 Apr 2005, Anne van Kesteren wrote:
>
> Why? If people are still using text/xml for example you really want them
> to use the HTTP Content-Type header. Otherwise it's US-ASCII.

Right.

> > I think it should also be noted that authors who omit the XML
> > declaration (or include it but don't specify the encoding attribute)
> > *must* use UTF-8 or UTF-16, as described in the XML recommendation.
>
> Where did you read that in the XML specification? You can always specify
> encoding using the 'charset' parameter. That it is not recommended
> because "webarch" thinks documents should be self-describing doesn't
> matter. Also note that when the document is served using text/xml they
> could use UTF-8 but it wouldn't work.

Exactly.

On Sat, 9 Apr 2005, Lachlan Hunt wrote:
>
> I didn't consider text/xml because the current draft states in the
> conformance requirements.
>
> | XML documents [...] that are served over the wire (e.g. by HTTP) must
> | be sent using an XML MIME type such as application/xml or
> | application/xhtml+xml...
>
> I had initially interpreted that as meaning authors must use
> application/*+xml and must not use text/xml; however, that
> interpretation may be incorrect. Perhaps it should be explicitly stated
> that text/xml should not be used, with a reference to the webarch
> recommendation.

I never did understand why people don't like text/*. It's nice and short
and all these types are text, so...

I've made no changes to the spec, but let me know if you think something
should change.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
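The precedence discussed in this thread (an HTTP charset parameter wins; text/xml without one falls back to US-ASCII; application/xml and application/xhtml+xml defer to the document itself) can be sketched roughly as follows. This is a minimal illustration in Python, assuming the RFC 3023 defaults for the XML media types. The function name effective_xml_encoding and the simplified byte-order-mark handling are hypothetical and are not taken from the thread or from any specification text.

```python
import re

# Hypothetical helper for illustration only. It matches an encoding
# pseudo-attribute in an XML declaration at the very start of the body.
_XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_xml_encoding(content_type, body):
    """Return the encoding a consumer would assume for an XML response.

    content_type is the raw HTTP Content-Type header value (str);
    body is the raw response body (bytes).
    """
    parts = [p.strip() for p in content_type.split(";")]
    media_type = parts[0].lower()

    # 1. An explicit charset parameter at the transport level wins.
    for param in parts[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value:
            return value.strip().strip('"').lower()

    # 2. text/xml with no charset is treated as US-ASCII, regardless of
    #    what the document says about itself.
    if media_type == "text/xml":
        return "us-ascii"

    # 3. Otherwise (application/xml, application/xhtml+xml, ...) look at
    #    the document: byte-order mark first, then the XML declaration.
    #    Simplification: UTF-16 without a BOM is not detected here.
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    match = _XML_DECL_ENCODING.match(body)
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. No declaration and no BOM: XML requires UTF-8 (or UTF-16).
    return "utf-8"
```

Under these assumptions, a document that declares encoding="utf-8" internally but is served as plain text/xml is still read as US-ASCII, which is the "they could use UTF-8 but it wouldn't work" case from the thread.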
Received on Wednesday, 28 February 2007 17:58:31 UTC