- From: Mike Brown <mbrown@webb.net>
- Date: Fri, 27 Apr 2001 12:05:06 -0600
- To: "'duerst@w3.org'" <duerst@w3.org>, Mike Brown <mbrown@webb.net>, "'unicode@unicode.org'" <unicode@unicode.org>
- Cc: www-i18n-comments@w3.org
I asserted, referring to section 4.2.2 of the XML spec:

>> <!ENTITY greeting SYSTEM
>> "http://somewhere/getgreeting?lang=es&name=C%C3%A9sar">
>> ]>
>>
>> The name Ce'sar is represented here as C%C3%A9sar in the
>> UTF-8 based escaping, as per the XML requirement.

You replied:

> What the XML spec (and all the others mentioned above) say is
> something different.
>
> - If you use non-ASCII characters directly in a system id,
>   they're converted using UTF-8.
> - If you want anything else, use exactly the %-escapes you
>   want. You won't get the benefit of using the actual
>   character in the source document.

OK, I can now see how this is the same as in HTML, where the spec is saying what a document processor should do when it encounters malformed URI references. The way it is worded in the main spec, it looks to me like it is telling a document author how to go about writing a URI reference. However, I am willing to admit I am wrong. In my own paper I even mentioned the erratum to the XML spec that changes the wording to indicate that this section is in fact intended for an XML processor. Yeesh.

My statement about conflict with HTTP stems from my incomplete understanding of HTTP's iso-8859-1 legacy. Never mind.
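For illustration only, here is a rough Python sketch of the processor-side behaviour described in the quoted reply: non-ASCII characters found directly in a system id are converted to UTF-8 and %-escaped, while anything the author already escaped is left exactly as written. The function name escape_system_id is hypothetical, and the sketch assumes only non-ASCII characters need handling; it does not attempt to cover everything the spec or its errata require.

```python
from urllib.parse import quote

def escape_system_id(system_id: str) -> str:
    """Hypothetical helper: %-escape non-ASCII characters as UTF-8 bytes.

    ASCII characters, including any %-escapes the author already wrote,
    are passed through unchanged.
    """
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="", encoding="utf-8")
        for ch in system_id
    )

# The author typed the actual character in the source document.
raw = "http://somewhere/getgreeting?lang=es&name=C\u00e9sar"
print(escape_system_id(raw))

# The author wrote the escapes by hand; the processor leaves them alone.
already_escaped = "http://somewhere/getgreeting?lang=es&name=C%C3%A9sar"
print(escape_system_id(already_escaped))
```

Under these assumptions both calls yield the same URI reference. The difference is only in who did the escaping: the processor in the first case, the document author in the second.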
Received on Friday, 27 April 2001 14:04:14 UTC