- From: Michael Kay <mike@saxonica.com>
- Date: Fri, 8 Dec 2006 16:46:54 -0000
- To: "'Richard Tobin'" <richard@inf.ed.ac.uk>
- Cc: <www-xml-linking-comments@w3.org>
Thank you for this detailed response, which is entirely acceptable.

Michael Kay

> -----Original Message-----
> From: Richard Tobin [mailto:richard@inf.ed.ac.uk]
> Sent: 08 December 2006 16:35
> To: Michael Kay
> Cc: www-xml-linking-comments@w3.org
> Subject: Re: Comments on new XML Base draft
>
> Thank you for your comments on the public XML Base draft.
>
> The XML Core WG has considered your comments and come to the following
> conclusions:
>
> > 1. The rules on returning xml:base unescaped seem to have changed too
> > radically for an erratum: this needs the spec to be versioned.
>
> We don't regard this as a change, rather as something that was not
> specified in the original. As you say, the phrase "processors must
> encode and escape these characters ..." could be taken as implying
> that the XML Base processor must do this before returning a value,
> but the sentence continues "... to obtain a valid URI reference" and
> this could simply be a statement of what must be done in order to use
> the value for retrieval. At least some specifications referring to
> XML Base already expect to be able to obtain unescaped values, in
> particular the Infoset, which says:
>
>   These (i.e. base URI properties) are computed according to
>   [XML Base] ... The value of these properties does not reflect any
>   URI escaping that may be required for retrieval of the resource
>
> and the XSLT2 family of specs, whose base-uri property "may contain
> Unicode characters that are not allowed in URIs" (Data Model, 6.1.3).
>
> Finally, the rule is a "should" rather than a "must" so existing
> implementations may consider that backward-compatibility is a good
> enough reason to disobey it.
>
> > 2. There are several deficiencies in the existing spec that aren't
> > addressed:
> >
> > 2a. When the spec says that the xml:base attribute "may be used", it
> > should make it clear that the attribute has no special status as far
> > as DTD or XML Schema validity checking is concerned: it may be used
> > only if permitted by the DTD or schema.
>
> We have added a note as follows:
>
>   This specification does not give the xml:base attribute any special
>   status as far as XML validity is concerned. In a valid document the
>   attribute must be declared in the DTD, and similar considerations
>   apply to other schema languages.
>
> > 2b. The spec doesn't say which relative URIs in a document are
> > affected by xml:base. Possible positions on this are
> >
> > (i) no relative URI is affected by xml:base unless the relevant
> > specification says it is affected
> >
> > (ii) relative URIs should be assumed to be affected unless the
> > relevant specification says otherwise
> >
> > (iii) relative URIs are affected if and only if they are
> > dereferenced.
> >
> > (This is a real issue, there have been disagreements for example over
> > whether xml:base should affect the interpretation of schemaLocation in
> > XML Schema 1.0).
>
> RFC 3986 has a section "5.1. Establishing a Base URI". One of the
> ways a base URI can be established is by a "base URI embedded in
> content". XML Base fits into this framework by describing the syntax
> for embedding a base URI in an XML document.
>
> In this view, XML Base sets the base URI for a part of the document,
> and any strings within that part which are defined (by some spec) as
> URI references should be interpreted according to RFC 3986, which
> means that they use XML Base when they are resolved.
>
> That leaves the question of which relative references are resolved,
> and that seems to be an issue for whichever spec defines them as URI
> references.
>
> RFC 3986 says that it's the media type of the document that
> determines the syntax used for embedding base URIs. The new XML media
> type draft
>
>   http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-02.html
>
> does point to XML Base for this, and we have added a sentence to the
> introduction noting this:
>
>   It is expected that a future RFC for XML Media Types will specify
>   XML Base as the mechanism for establishing base URIs in the media
>   types it defines.
>
> > 2c. The spec says nothing about leading and trailing spaces in the
> > xml:base attribute value.
>
> We concluded that since XML Base says nothing about this, we must
> assume that no normalisation is done as part of XML Base processing.
> But any normalisation implied by DTD attribute declarations is done
> as part of parsing, before the attribute is interpreted.
>
> There is an issue with schema languages that may normalise attribute
> values. For example, if XML Base is used for schemaLocations, it
> can't take account of any normalisation implied by the type assigned
> to it in the as-yet unfetched schema.
>
> We don't propose to make any changes in XML Base on this subject
> before issuing a PER.
>
> > 2d. The spec says nothing useful about the situation where the base
> > URI of the document entity is unknown. (should be OK if xml:base is
> > absolute)
>
> According to RFC 3986, if none of the usual mechanisms for
> determining the base URI apply, the base URI is application dependent
> (rather than there not being a base URI). So a relative xml:base is
> resolved against an application-dependent URI, and the result will of
> course be application dependent. I think this follows without XML
> Base having to say anything about it.
>
> > 3 (Comment on XLink v1.1 5.4.1) the spec says that to convert an XML
> > resource identifier to an IRI Reference, the character #0 must be
> > escaped. This implies that the character #0 can exist in unescaped
> > form; but it can't.
>
> We will treat this as a comment on XLink. It's conceivable that XML
> Resource Identifiers could come from some source other than the text
> of an XML document, so we probably won't remove #x0 from the
> description, but will perhaps add a note about it.
>
> > 4. It would be useful if we could all converge on the term
> > "percent-encoding" as used in the RFCs, rather than "escaping" which
> > is a much less specific term.
>
> The paragraph in 3.1 has been changed to read:
>
>   The value of an xml:base attribute is an XML Resource Identifier,
>   and may contain characters not allowed in URIs. These characters
>   must be escaped by percent-encoding as described in [XLink11]
>   before the value is used for retrieval of a resource. In accordance
>   with the principle that this percent-encoding must occur as late as
>   possible in the processing chain, applications which provide access
>   to the base URI of an element should calculate and return the value
>   without escaping.
>
> Please let us know whether you are satisfied with our responses.
>
> -- Richard Tobin, on behalf of the XML Core WG
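
The behaviour discussed under points 1, 2d and 4 (compute and return the base URI unescaped, resolve a relative xml:base against whatever base the application supplies, and percent-encode only at retrieval time) could be sketched roughly as below. This is an illustrative sketch, not text from the draft or from any implementation: the function names base_uris and for_retrieval are hypothetical, Python's urljoin stands in for RFC 3986 reference resolution, and the safe-character set passed to quote is an assumption that only approximates the escaping rules in XLink 1.1.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(root, document_base):
    """Return a dict mapping each element to its base URI, left unescaped.

    document_base is the application-supplied base of the document entity
    (hypothetical parameter; its origin is application dependent).
    """
    result = {}

    def walk(elem, inherited):
        value = elem.get(XML_BASE)
        # An xml:base value is resolved against the base in scope at the
        # parent; an element without xml:base inherits that base unchanged.
        base = urljoin(inherited, value) if value is not None else inherited
        result[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, document_base)
    return result


def for_retrieval(uri):
    """Percent-encode as late as possible, just before dereferencing.

    Assumption: reserved and unreserved URI characters, plus '%' so that
    existing escapes are not double-encoded, are left alone; everything
    else is encoded as UTF-8 octets.
    """
    return quote(uri, safe=":/?#[]@!$&'()*+,;=%-._~")
```

A caller would keep the unescaped values from base_uris for anything that reports the base URI (as the Infoset and XSLT 2.0 data model expect) and call for_retrieval only at the point where the resource is actually fetched.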
Received on Friday, 8 December 2006 16:47:27 UTC