- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 18 Jun 2001 16:49:06 +0900
- To: Paul Grosso <pgrosso@arbortext.com>, Francois Yergeau <FYergeau@alis.com>
- Cc: xml-editor@w3.org, w3c-xml-core-wg@w3.org, w3c-i18n-ig@w3.org, connolly@w3.org
At 09:14 01/06/14 -0500, Paul Grosso wrote:
>At 13:14 2001 06 14 +0900, Martin Duerst wrote:
> >Dear XML core WG,
> >
> >By chance, I just discovered Proposed Erratum 71:
> >
> >http://www.w3.org/XML/Group/2000/10/proposed-xml10-2e-errata#PE71
> >
> >It is true that this is a bit vague in not saying who is
> >responsible for the escaping, but this has been fixed by
> >PE 51/E4 to say that the XML processor is responsible:
> >
> >http://www.w3.org/XML/Group/2000/10/proposed-xml10-2e-errata#PE51
> >http://www.w3.org/XML/xml-V10-2e-errata#E4
> >
>Right, but I think this erratum is wrong, so I'm asking to
>reopen this issue.

If you think erratum http://www.w3.org/XML/xml-V10-2e-errata#E4
is wrong, then the problem is not with that erratum but with
the XML Rec as it came out in Feb 1998:

http://www.w3.org/TR/1998/REC-xml-19980210#sec-external-ent:

   An XML processor should handle a non-ASCII character in a URI
   by representing the character in UTF-8 as one or more bytes,
   and then escaping these bytes with the URI escaping mechanism
   (i.e., by converting each byte to %HH, where HH is the
   hexadecimal notation of the byte value).

As you see, it starts with "An XML processor". Erratum E4 just
restored that wording, which had been lost when the details of
the conversion were worked out for the second edition.

>A system id should be a string to the
>XML processor, and that's what production 11 makes clear.

Yes, the XML processor sees this as a string according to [11].
But the question is what the XML processor does with it.

>Escaping may be necessary before doing something URI-ish
>with the string, but that should be done by the process
>doing something URI-ish, not the XML processor. Norm
>explains how an entity resolution process is one example
>of why the XML processor should not to the escaping.

The XML System Identifier *is* a URI (modulo some syntactical
differences that are dealt with as described). Doing something
'uri-ish' with it just means dealing with it according to its
nature, so the term 'uri-ish' is not appropriate. It would be
much better to say that you want to do something 'catalog-ish'
with the XML System Identifier URI. In this respect, XML is
definitely different from SGML. Changing that would change the
nature of XML quite a bit.

Also, the process that does URI resolution with a system
identifier has to be given a URI that stays within strict
syntax limitations, or it may cause an error. That's why the
XML processor does the conversion before handing it off.

I agree with you that catalog-ish resolution doesn't have to
do the escaping, but it's the business of the catalog spec to
deal with that, not XML.

Regards, Martin.
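For illustration only, the escaping step quoted above from the 1998 Rec can be sketched in a few lines of Python. This is a minimal sketch, not text from any spec: the function name escape_non_ascii and the example.org system identifier are hypothetical, and the sketch handles only non-ASCII characters, as the quoted sentence does. ASCII characters are passed through unchanged, even those that would not be legal in a URI.

```python
def escape_non_ascii(system_id: str) -> str:
    """Escape non-ASCII characters as %HH over their UTF-8 bytes.

    Hypothetical helper: ASCII characters are copied through as-is,
    because the quoted sentence only covers non-ASCII characters.
    """
    parts = []
    for ch in system_id:
        if ord(ch) < 128:
            # ASCII: leave untouched.
            parts.append(ch)
        else:
            # Non-ASCII: represent in UTF-8, then escape each byte as %HH.
            parts.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(parts)


# Hypothetical system identifier containing U+00E9 (UTF-8 bytes C3 A9).
print(escape_non_ascii("http://example.org/r\u00e9sum\u00e9.xml"))
```

Under this reading, the XML processor would apply such a step before handing the system identifier to whatever performs URI resolution.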
Received on Monday, 18 June 2001 03:56:50 UTC