- From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
- Date: Tue, 8 Aug 2000 14:42:31 +1200 (NZST)
- To: www-xml-linking-comments@w3.org
Comments on XML Base WD 07-June-2000.

Section 1 states that the purpose of XBase is to "provid[e] base URI services to XLink, but as a modular specification so that other XML applications benefitting from additional control over relative URIs but not built upon XLink can also make use of it."

The single new attribute 'xml:base' proposed is not sufficient.

Section 4 states that

A. "A relative URI appearing in text content is resolved against the base URI described by the xml:base attribute of the nearest ancestor element having an xml:base attribute".

How is an XML processor to discern which portions of text are appearances of relative URIs? If, for argument's sake, {x} appearing inside <e> should be so interpreted, what about <e>{<![CDATA[x]]>}</e> or <e><i>{</i><b>x</b><i>}</i></e>?

B. "A relative URI appearing in an attribute value is resolved against the base specified in the xml:base attribute appearing on the element owning the attribute, if one exists, otherwise the xml:base attribute of the nearest ancestor of the owning element having an xml:base attribute. Note that this applies to xml:base attributes themselves."

- The last sentence CANNOT be true; it is a vicious circle. Presumably the xml:base attribute is resolved against the base specified in the xml:base attribute of the nearest PROPER ancestor having such an attribute, not against itself!
- How is an XML processor to determine WHICH attributes contain relative URIs? For example, if we have 'temp="98.6"', that has the right form to be a URI, so when EXACTLY is it to be resolved as a URI and when is it to be left alone as a number?

C. "A relative URI appearing in the content of a processing instruction is resolved against the base URI described by the xml:base attribute of the nearest ancestor element having an xml:base attribute."

This does not say
- what is to be done if there is no such ancestor element (e.g., in a PI occurring before or after the root element).
- how relative URIs appearing in the content of a processing instruction are to be discerned. Considered as a string, the relative URI x.pi occurs twice in <?x.pi uri='x.pi'?>; are both affected?

Each of these cases is misleading, because the rule for determining the applicable base is NOT the stated rule. Two of the additional rules that are needed, and that do apply in these cases, are

"2. The base URI is that of the encapsulating entity (message, document, or none)."

What is an "encapsulating entity", exactly? The term is not defined in the XML 1.0 recommendation. What _is_ the base URI of a "none"?

"3. The base URI is that of the URI used to retrieve the entity."

But WHICH entity is "the" entity? Is it the entity that contains the root element? Is it the external entity that directly contains the point in question? Suppose we have

In file /a.xml:

    <?xml version="1.0"?>
    <!DOCTYPE root [
    <!ENTITY e "<foo my-uri='x.xml'/>">
    <!ENTITY f SYSTEM "b.xml">
    <!ELEMENT root (foo)>
    <!ELEMENT foo EMPTY>
    <!ATTLIST foo my-uri CDATA #REQUIRED>
    ]>
    <root>&f;</root>

In file /b.xml:

    &e;

When an XML processor is resolving my-uri of <foo>, which is "the" entity? If "the" entity is the innermost entity containing the relevant point, it's e, and the URI used to retrieve e is /a.xml. But if it is the innermost EXTERNAL entity containing the relevant point, it's f, and the URI used to retrieve f is /b.xml.

One interpretation of the rules for determining the applicable base can be clarified by stating them as follows:

Every XML document has an "element" structure and an "entity" structure. Section 4.3.2 "Well-Formed Parsed Entities" of the XML 1.0 specification guarantees that these two structures are compatible.

The context of the application determines a default base URI.

Within the scope of an external entity or a document entity that was retrieved using a URI, that URI is the base URI.

Within the scope of an element having an xml:base attribute, the value of that attribute is the base URI.

Inner scopes of either kind take precedence over outer scopes of either kind.

The URI value of an xml:base attribute is resolved in the context just outside its owning element, so does not depend on itself.

The URI value of any other attribute is resolved in the context just inside its owning element, so does depend on that element's xml:base attribute if it has one.

A relative URI appearing in a processing instruction is resolved in the context immediately containing that PI. The means by which such appearances are discerned is outside the scope of this recommendation.

A relative URI appearing in text content is resolved in the context immediately containing that text content. The means by which such appearances are discerned is outside the scope of this recommendation.

Section 4.1 appears to mean that a URI as notated in an XML document may use disallowed characters, but that an XML processor must convert URI values to the proper form. But when, exactly? Before the URI is used as a URI, or before any other code, including the application, sees the text?

The major unsolved problem in this draft of XBase is "How does an XML processor know which strings are URIs?" In particular, how do XML processors that do not support XSchema or XLink know which strings are URIs? I propose the following solution for attributes only.

2.5 xml:uri Attribute.

The attribute xml:uri may be inserted in XML documents to specify which attributes of an element are to be interpreted as URIs and so resolved according to the rules in section 3. The value of an xml:uri attribute must match the Names production in the XML recommendation. Each attribute of an element whose name appears in the value of an xml:uri attribute owned by the same element is to be processed as a URI.

Example.

    <nav xml:uri='first last prev next'
         first='slide001.xml'
         last='slide024.xml'
         prev='slide023.xml'/>

As the example shows, the presence of a name in the value of an xml:uri attribute does not mean that such an attribute MUST appear, only that IF it does, it has a URI as value.

Example:

    <!ELEMENT nav EMPTY>
    <!ATTLIST nav
        xml:uri NMTOKENS #FIXED 'first last prev next'
        first   CDATA    #REQUIRED
        last    CDATA    #REQUIRED
        prev    CDATA    #IMPLIED
        next    CDATA    #IMPLIED>

Section C leaves another major question open. Does an application get a resolved URI *as well as* the text it would have got without XBase, or *instead of* that text? This has a major effect on the XML Infoset and Document Object Model.

The whole specification leaves it unclear just which component of an XML-aware application is responsible for applying the XBase rules. Suppose we have a parser communicating with an application using something like SAX. When the XBase draft says that "These URI references [in HTML beyond those expressible in XLink] might be resolved BY AN APPLICATION relative to the base URI defined by XML Base", is that a hint that URI resolution in general is the responsibility of an application, and that an XBase-conforming parser need only provide the information from which resolution could be done, rather than doing such resolution itself?
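
To make the element-scope rules and the xml:uri proposal above concrete, here is a minimal Python sketch of how an application might apply them. Everything named here is an assumption made for illustration: the function resolve_uri_attributes, the <deck> wrapper element, and the example.org retrieval URI are hypothetical, and xml:uri is only the attribute proposed in this message, not part of any specification. Entity scopes are not modelled, because ElementTree does not expose entity boundaries, and prefixed attribute names listed in xml:uri are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree reports xml:-prefixed attributes under the XML namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def resolve_uri_attributes(elem, outer_base, out=None):
    """Collect (tag, attribute, absolute URI) for attributes named in xml:uri.

    outer_base is the base URI in force just outside `elem`.
    """
    if out is None:
        out = []
    # xml:base is resolved in the context just OUTSIDE its owning element,
    # so it never depends on itself.
    own_base = elem.get(XML_NS + "base")
    inner_base = outer_base if own_base is None else urljoin(outer_base, own_base)
    # Any other attribute is resolved just INSIDE its owning element, but only
    # if the (proposed, hypothetical) xml:uri attribute names it.
    for name in elem.get(XML_NS + "uri", "").split():
        value = elem.get(name)
        if value is not None:  # a listed attribute need not be present
            out.append((elem.tag, name, urljoin(inner_base, value)))
    for child in elem:
        resolve_uri_attributes(child, inner_base, out)
    return out


doc = ET.fromstring(
    "<deck xml:base='slides/'>"
    "<nav xml:base='part2/' xml:uri='first last prev next' "
    "first='slide001.xml' last='slide024.xml' prev='slide023.xml' "
    "temp='98.6'/>"
    "</deck>"
)
# The second argument stands in for the URI used to retrieve the document.
resolved = resolve_uri_attributes(doc, "http://example.org/talk/index.xml")
for tag, name, uri in resolved:
    print(tag, name, uri)
```

In this sketch temp='98.6' is left alone because xml:uri does not list it, and the missing 'next' attribute is simply skipped. Whether the application should receive the resolved URI in addition to the original text or in place of it is exactly the question left open above; the sketch returns the resolved values separately and does not modify the tree.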
Received on Monday, 7 August 2000 22:42:37 UTC