- From: Gavin Nicol <gtn@ebt.com>
- Date: Thu, 17 Oct 1996 14:40:06 -0400
- To: U35395@UICVM.CC.UIC.EDU
- CC: w3c-sgml-wg@w3.org
> - the character repertoire of XML documents is that of ISO 10646

Good.

> - conforming XML documents may be in UTF-8 or UCS-2 form

Good.

> - all XML processors must accept documents in UTF-8 and UCS-2 (or
>   optionally UTF-16) form

I'm not a great fan of UTF-16, and am worried about the connotations
of "accept". Does that mean parse, process, or just accept and die?

> - XML processor may provide a user option which causes them to accept
>   documents in other coded character sets (e.g. ISO 8859 or JIS 0208)
>   or other encodings of 10646 or other coded character sets (e.g.
>   Extended Unix Code) -- this behavior must be optional (i.e. the user
>   must be able to turn it off, so that documents not in UTF-8 or
>   UCS-2 raise errors).

OK. I can live with this, but am not overly happy about the "must be
optional" clause.

>Still open: details of the mechanism to be used for signaling the
>encoding and/or coded character set in use.

3 methods:

 1) MIME headers for HTTP/email/filesystem (via *.mim)
 2) FSI attributes
 3) Catalog parameters
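
The "must be optional" rule quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything proposed in the thread: it assumes the encoding label has already been obtained by one of the signaling methods (which one is still open), the names `decode_document`, `allow_other_encodings`, and `EncodingNotAccepted` are hypothetical, and UCS-2 is approximated with Python's `utf-16` codec because Python ships no separate UCS-2 codec.

```python
import codecs

# Encodings every processor must accept under the proposal.
# Assumption: UCS-2 is handled by the utf-16 codec (UCS-2 is its
# surrogate-free subset), so both labels map to the same codec.
REQUIRED = {"utf-8", "utf-16"}
UCS2_LABELS = {"ucs-2", "iso-10646-ucs-2"}


class EncodingNotAccepted(Exception):
    """Raised when a document uses an encoding the processor refuses."""


def decode_document(data, declared, allow_other_encodings=False):
    """Decode raw bytes according to an externally declared encoding.

    With allow_other_encodings=False (the strict default), anything
    other than UTF-8 or UCS-2/UTF-16 raises EncodingNotAccepted.
    """
    label = declared.strip().lower()
    if label in UCS2_LABELS:
        label = "utf-16"
    # codecs.lookup normalizes aliases and raises LookupError
    # for labels Python does not know at all.
    name = codecs.lookup(label).name
    if name not in REQUIRED and not allow_other_encodings:
        raise EncodingNotAccepted(declared)
    return data.decode(name)
```

With the option off, `decode_document(data, "ISO-8859-1")` raises an error; with `allow_other_encodings=True` the same call decodes the bytes. Note that this sketch takes "accept" to mean "decode successfully", which is only one of the readings questioned above.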
Received on Thursday, 17 October 1996 14:41:51 UTC