- From: Richard Tobin <richard@inf.ed.ac.uk>
- Date: Tue, 5 Dec 2006 16:05:42 +0000 (GMT)
- To: <public-xml-core-wg@w3.org>
Here are my suggested responses, interleaved with Michael's comments.

1. The rules on returning xml:base unescaped seem to have changed too radically for an erratum: this needs the spec to be versioned.

As we discussed, Norm will try to talk to Michael about this.

2. There are several deficiencies in the existing spec that aren't addressed:

2a. When the spec says that the xml:base attribute "may be used", it should make it clear that the attribute has no special status as far as DTD or XML Schema validity checking is concerned: it may be used only if permitted by the DTD or schema.

I don't see any problem with adding a note to this effect.

2b. The spec doesn't say which relative URIs in a document are affected by xml:base. Possible positions on this are (i) no relative URI is affected by xml:base unless the relevant specification says it is affected (ii) relative URIs should be assumed to be affected unless the relevant specification says otherwise (iii) relative URIs are affected if and only if they are dereferenced. (This is a real issue: there have been disagreements, for example, over whether xml:base should affect the interpretation of schemaLocation in XML Schema 1.0.)

RFC 3986 has a section "5.1. Establishing a Base URI". One of the ways a base URI can be established is by a "base URI embedded in content". XML Base describes the syntax for embedding a base URI in an XML document.

Given that it's working in this general framework, it doesn't seem appropriate for XML Base to say which relative URIs it affects. Rather, it sets the base URI for a part of the document, and any strings within that part which are defined (by some spec) as URI references should be interpreted according to RFC 3986, which means that they use XML Base when they are resolved. That leaves the question of which relative references are resolved, and that seems to be an issue for whichever spec defines them as URI references.
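To make that reading of 2b concrete, here is a rough sketch (mine, not anything from the spec) of xml:base setting the base URI for a part of a document, with references inside that part resolved against it. The `href` attribute, the example.org/example.net URIs and the `collect_hrefs` helper are all assumptions for illustration, and Python's `urljoin` is only standing in for the RFC 3986 resolution algorithm.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def collect_hrefs(elem, base, out):
    """Walk the tree, tracking the base URI in scope for each element."""
    # An xml:base value is itself a URI reference: it is resolved against
    # the base URI inherited from the parent (or from the document).
    own = elem.get(XML_BASE)
    if own is not None:
        base = urljoin(base, own)
    # "href" is a hypothetical attribute that some spec defines as a URI
    # reference; XML Base itself does not say which attributes qualify.
    href = elem.get("href")
    if href is not None:
        out.append(urljoin(base, href))
    for child in elem:
        collect_hrefs(child, base, out)
    return out


DOC = """<doc xml:base="http://example.org/a/">
  <item href="one.xml"/>
  <sec xml:base="b/">
    <item href="two.xml"/>
    <item href="/top.xml"/>
  </sec>
</doc>"""

# The base URI of the document entity is supplied by the application here.
resolved = collect_hrefs(ET.fromstring(DOC), "http://example.net/doc.xml", [])
for uri in resolved:
    print(uri)
```

Note that the sketch resolves every `href` it finds; whether a given reference is actually resolved at all remains a matter for the spec that defines it.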
What I didn't mention in the first paragraph is that RFC 3986 says that it's the media type of the document that determines the syntax used for embedding base URIs. The new XML media type draft http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-02.html (is that the latest version?) does point to XML Base for this.

2c. The spec says nothing about leading and trailing spaces in the xml:base attribute value.

Since it says nothing, I think we must assume that no normalisation is done as part of XML Base processing (we had agreement on that on the last WG telcon, I think). Any normalisation implied by DTD attribute declarations is done as part of parsing, before the attribute is interpreted. The question of schema normalisation seems harder: clearly if XML Base is used for schemaLocations, it can't take account of any normalisation implied by the type assigned to it in the as-yet unfetched schema.

2d. The spec says nothing useful about the situation where the base URI of the document entity is unknown. (should be OK if xml:base is absolute)

According to RFC 3986, if none of the usual mechanisms for determining the base URI apply, the base URI is application dependent (rather than there not being a base URI). So a relative xml:base is resolved against an application-dependent URI, and the result will of course be application dependent. I think this follows without XML Base having to say anything about it.

3. (Comment on XLink v1.1 5.4.1) the spec says that to convert an XML resource identifier to an IRI Reference, the character #0 must be escaped. This implies that the character #0 can exist in unescaped form; but it can't.

I think we agreed that XLink should note this. It's conceivable that XML Resource Identifiers could come from some source other than the text of an XML document, so we shouldn't remove #x0 from the description.

4.
It would be useful if we could all converge on the term "percent-encoding" as used in the RFCs, rather than "escaping" which is a much less specific term.

XML Namespaces uses the term "%-escaping". I'm not sure which is best but I will change it to one or the other.

-- Richard
Received on Tuesday, 5 December 2006 16:05:02 UTC