- From: Richard Tobin <richard@inf.ed.ac.uk>
- Date: Fri, 8 Dec 2006 16:34:41 +0000 (GMT)
- To: Michael Kay <mike@saxonica.com>
- Cc: www-xml-linking-comments@w3.org
Thank you for your comments on the public XML Base draft. The XML Core WG
has considered your comments and come to the following conclusions:

> 1. The rules on returning xml:base unescaped seem to have changed too
> radically for an erratum: this needs the spec to be versioned.

We don't regard this as a change, rather as something that was not
specified in the original. As you say, the phrase "processors must encode
and escape these characters ..." could be taken as implying that the XML
Base processor must do this before returning a value, but the sentence
continues "... to obtain a valid URI reference" and this could simply be a
statement of what must be done in order to use the value for retrieval.

At least some specifications referring to XML Base already expect to be
able to obtain unescaped values, in particular the Infoset, which says:

    These (i.e. base URI properties) are computed according to [XML
    Base] ... The value of these properties does not reflect any URI
    escaping that may be required for retrieval of the resource

and the XSLT2 family of specs, whose base-uri property "may contain Unicode
characters that are not allowed in URIs" (Data Model, 6.1.3).

Finally, the rule is a "should" rather than a "must", so existing
implementations may consider that backward-compatibility is a good enough
reason to disobey it.

> 2. There are several deficiencies in the existing spec that aren't
> addressed:
>
> 2a. When the spec says that the xml:base attribute "may be used", it should
> make it clear that the attribute has no special status as far as DTD or XML
> Schema validity checking is concerned: it may be used only if permitted by
> the DTD or schema.

We have added a note as follows:

    This specification does not give the xml:base attribute any special
    status as far as XML validity is concerned. In a valid document the
    attribute must be declared in the DTD, and similar considerations
    apply to other schema languages.

> 2b. The spec doesn't say which relative URIs in a document are affected by
> xml:base. Possible positions on this are
>
> (i) no relative URI is affected by xml:base unless the relevant
> specification says it is affected
>
> (ii) relative URIs should be assumed to be affected unless the relevant
> specification says otherwise
>
> (iii) relative URIs are affected if and only if they are dereferenced.
>
> (This is a real issue, there have been disagreements for example over
> whether xml:base should affect the interpretation of schemaLocation in XML
> Schema 1.0).

RFC 3986 has a section "5.1. Establishing a Base URI". One of the ways a
base URI can be established is by a "base URI embedded in content". XML
Base fits into this framework by describing the syntax for embedding a base
URI in an XML document.

In this view, XML Base sets the base URI for a part of the document, and
any strings within that part which are defined (by some spec) as URI
references should be interpreted according to RFC 3986, which means that
they use XML Base when they are resolved. That leaves the question of which
relative references are resolved, and that seems to be an issue for
whichever spec defines them as URI references.

RFC 3986 says that it's the media type of the document that determines the
syntax used for embedding base URIs. The new XML media type draft

  http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-02.html

does point to XML Base for this, and we have added a sentence to the
introduction noting this:

    It is expected that a future RFC for XML Media Types will specify
    XML Base as the mechanism for establishing base URIs in the media
    types it defines.

> 2c. The spec says nothing about leading and trailing spaces in the xml:base
> attribute value.

We concluded that since XML Base says nothing about this, we must assume
that no normalisation is done as part of XML Base processing. But any
normalisation implied by DTD attribute declarations is done as part of
parsing, before the attribute is interpreted.

There is an issue with schema languages that may normalise attribute
values. For example, if XML Base is used for schemaLocations, it can't take
account of any normalisation implied by the type assigned to it in the
as-yet unfetched schema.

We don't propose to make any changes in XML Base on this subject before
issuing a PER.

> 2d. The spec says nothing useful about the situation where the base URI of
> the document entity is unknown. (should be OK if xml:base is absolute)

According to RFC 3986, if none of the usual mechanisms for determining the
base URI apply, the base URI is application dependent (rather than there
not being a base URI). So a relative xml:base is resolved against an
application-dependent URI, and the result will of course be application
dependent. I think this follows without XML Base having to say anything
about it.

> 3 (Comment on XLink v1.1 5.4.1) the spec says that to convert an XML
> resource identifier to an IRI Reference, the character #0 must be escaped.
> This implies that the character #0 can exist in unescaped form; but it
> can't.

We will treat this as a comment on XLink. It's conceivable that XML
Resource Identifiers could come from some source other than the text of an
XML document, so we probably won't remove #x0 from the description, but
will perhaps add a note about it.

> 4. It would be useful if we could all converge on the term
> "percent-encoding" as used in the RFCs, rather than "escaping" which is a
> much less specific term.

The paragraph in 3.1 has been changed to read:

    The value of an xml:base attribute is an XML Resource Identifier,
    and may contain characters not allowed in URIs. These characters
    must be escaped by percent-encoding as described in [XLink11]
    before the value is used for retrieval of a resource. In accordance
    with the principle that this percent-encoding must occur as late as
    possible in the processing chain, applications which provide access
    to the base URI of an element should calculate and return the value
    without escaping.

Please let us know whether you are satisfied with our responses.

-- Richard Tobin, on behalf of the XML Core WG
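
To illustrate the revised 3.1 wording quoted in the reply to comment 4, here is a minimal sketch in Python of keeping the base URI unescaped until retrieval. It is not taken from XML Base or XLink 1.1: the helper names `resolve_base` and `to_retrieval_uri`, the example URI, and the choice of `urllib.parse` are all assumptions, and the safe-character set only approximates the escaping rules in [XLink11].

```python
from urllib.parse import quote, urljoin

# Assumption of this sketch: leave RFC 3986 reserved and unreserved
# characters alone, plus "%" so existing percent-escapes are not
# encoded a second time. Everything else gets percent-encoded.
_SAFE = "-._~:/?#[]@!$&'()*+,;=%"


def resolve_base(parent_base: str, xml_base: str) -> str:
    """Hypothetical helper: resolve an xml:base value against the base
    URI in scope and return the result WITHOUT percent-encoding."""
    return urljoin(parent_base, xml_base)


def to_retrieval_uri(base: str) -> str:
    """Hypothetical helper: percent-encode disallowed characters (as
    UTF-8) only at the point where the value is used for retrieval."""
    return quote(base, safe=_SAFE)


# The value an application would report as the element's base URI.
base = resolve_base("http://example.org/docs/", "résumé files/")
print(base)

# The value handed to the retrieval layer, escaped as late as possible.
print(to_retrieval_uri(base))
```

The split into two functions mirrors the distinction drawn in the reply to comment 1: the value returned to callers stays unescaped, and escaping is a separate step applied only when a valid URI is needed.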
Received on Friday, 8 December 2006 16:33:49 UTC