- From: Martin J. Duerst <duerst@w3.org>
- Date: Mon, 13 Mar 2000 11:50:23 +0900
- To: xml-editor@w3.org
I wrote this up for a different purpose, but Dan Connolly
suggested that it might fit into the XML spec
(http://lists.w3.org/Archives/Member/w3c-html-cg/2000JanMar/0133.html).
----
There are three basic situations:
- XML sent (e.g. by mail or HTTP) as text/xml (or equivalent, e.g. text/vnd.wap.wml):
  - Charset parameter is strongly recommended.
  - If there is no charset parameter, the default is US-ASCII. The default
    of iso-8859-1 in HTTP is explicitly overridden in the specification of
    the charset parameter in section 3.1 "Text/xml Registration" of RFC 2376
    (http://www.ietf.org/rfc/rfc2376.txt).
  - No error handling provisions.
  - An encoding declaration, if present, is irrelevant, but when saving a
    received resource as a file, the correct encoding declaration should
    be inserted.
- XML sent as application/xml (or equivalent):
  - Charset parameter is strongly recommended, and if present,
    it takes precedence.
  - If the charset parameter is omitted, the rules for XML in static storage
    are followed (see below).
- XML in static storage without external metainformation (e.g. file):
  - Default is UTF-8, or UTF-16 if there is a BOM.
  - For any other encoding, there has to be an encoding declaration.
  - There is some provision for 'error recovery'. What exactly this
    means is currently under discussion in the XML Core WG, so that
    it can be clarified.
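
The three situations above can be read as a small decision procedure. The following is only a rough sketch in Python, not a normative algorithm: the function names (pick_encoding, sniff_stored_encoding) are hypothetical, treating every "text/" media type as equivalent to text/xml is an assumption made for brevity, and the declaration sniffing only handles ASCII-compatible encodings (no EBCDIC, no UTF-16 without a BOM, no UTF-32).

```python
import codecs
import re
from typing import Optional

# Encoding pseudo-attribute of an XML declaration at the very start of the data.
_ENCODING_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_stored_encoding(data: bytes) -> str:
    """Rules for XML in static storage without external metainformation."""
    # A UTF-16 byte order mark means UTF-16.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Skip a UTF-8 signature, if any, before looking for the declaration.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Any other encoding has to be named in an encoding declaration.
    match = _ENCODING_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no declaration: the default is UTF-8.
    return "utf-8"


def pick_encoding(media_type: Optional[str],
                  charset: Optional[str],
                  data: bytes) -> str:
    """Return the encoding name implied by the rules in this message.

    media_type is None for a plain file; charset is the value of the
    charset parameter, or None if the parameter is absent.
    """
    if media_type is not None:
        # For both text/xml and application/xml, a charset parameter wins.
        if charset:
            return charset.lower()
        # Assumption: any text/* type behaves like text/xml, so the
        # default is US-ASCII and the encoding declaration is ignored.
        if media_type.lower().startswith("text/"):
            return "us-ascii"
    # application/xml without charset, or a file: static storage rules.
    return sniff_stored_encoding(data)
```

For example, pick_encoding("text/xml", None, data) ignores any encoding declaration in data and yields "us-ascii", whereas the same call with "application/xml" falls through to the BOM and declaration checks. Error recovery is deliberately left out, since its meaning is still under discussion.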
----
Regards, Martin.
#-#-# Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
#-#-# mailto:duerst@w3.org http://www.w3.org/People/D%C3%BCrst
Received on Sunday, 12 March 2000 21:51:43 UTC