Re: Notes on XBase 07-June-2000

Thanks for the comments, Richard.  They touch on nearly all the significant
issues already reported during the last call period, so I think we've dealt
with each of them, as I describe below:

You wrote:

>Section 4 states that
>
>    A.  "A relative URI appearing in text content is resolved against
>	the base URI described by the xml:base attribute of the
>	nearest ancestor element having an xml:base attribute".
>
>	How is an XML processor to discern which portions of text
>	are appearances of relative URIs?  If, for argument's sake,
>	{x} appearing inside <e> should be so interpreted, what
>	about <e>{<![CDATA[x}]]></e> or <e><i>{</i><b>x</b><i>}</i></e>?

It doesn't.  The next draft clarifies that applications or specs which do
identify URIs can refer to xml:base as part of their handling.  See XLink,
for instance.

>    B.	"A relative URI appearing in an attribute value is resolved
>	against the base specified in the xml:base attribute appearing
>	on the element owning the attribute, if one exists, otherwise
>	the xml:base attribute of the nearest ancestor of the owning
>	element having an xml:base attribute.  Note that this applies
>	to xml:base attributes themselves."
>
>	- The last sentence CANNOT be true; it is a vicious circle.
>	  Presumably the xml:base attribute is resolved against the
>	  base specified in the xml:base attribute of the nearest
>	  PROPER ancestor having such an attribute, not against itself!

Yes, this has been pointed out before and fixed.

>	- How is an XML processor to determine WHICH attributes contain
>	  relative URIs?  For example, if we have 'temp="98.6"', that
>	  has the right form to be a URI, so when EXACTLY is it to be
>	  resolved as a URI and when is it to be left alone as a number?

Again, this is up to the application.

>    C.	"A relative URI appearing in the content of a processing
>	instruction is resolved against the base URI described by the
>	xml:base attribute of the nearest ancestor element having an
>	xml:base attribute."
>
>	This does not say
>	- what is to be done if there is no such ancestor element
>	  (e.g., in a PI occurring before or after the root element).

This has been fixed.

>	- how relative URIs appearing in the content of a processing
>	  instruction are to be discerned.  Considered as a string,
>	  the relative URI x.pi occurs twice in <?x.pi uri='x.pi'?>;
>	  are both affected?

Again, this is up to the application.  XML Base simply describes what the
base URI for the PI is.  If an application wants to treat part of the PI
content as a URI reference, and combine it with the base URI, that's up to
it.

>Each of these cases is misleading, because the rule for determining the
>applicable base is NOT the stated rule.
>
>Two of the additional rules that are needed, and that do apply in these
>cases, are
>
>    "2.	The base URI is that of the encapsulating entity (message,
>	document, or none).
>
>	What is an "encapsulating entity", exactly?  The term is not
>	defined in the XML 1.0 recommendation.  What _is_ the base URI
>	of a "none"?

This is a summary of RFC 2396, as the draft now makes clear.  Further
clarification of what 2396 might have meant here is probably unwise.  This
text is included only to show where XML Base fits with 2396.

>    "3.	The base URI is that of the URI used to retrieve the
>	entity."
>
>	But WHICH entity is "the" entity?  Is it the entity that
>	contains the root element?  Is it the external entity that
>	directly contains the point in question?

There is a terminological difference here between 2396 entities and XML
entities.  We have clarified that for purposes of XML, a 2396 entity is an
XML document or external entity.

>	Suppose we have
>	    <!-- This is in file /a.xml -->
>
>	    <?xml version="1.0"?>
>	    <!DOCTYPE root [
>		<!ENTITY e "<foo my-uri='x.xml'/>">
>		<!ENTITY f SYSTEM "b.xml">
>		<!ELEMENT root (foo)>
>		<!ELEMENT foo EMPTY>
>		<!ATTLIST foo my-uri CDATA #REQUIRED>
>	    ]>
>	    <root>&f;</root>
>
>
>	    <!-- This is in file /b.xml -->
>	    &e;
>
>	When an XML processor is resolving my-uri of <foo>,
>	which is "the" entity?  If "the" entity is the innermost
>	entity containing the relevant point, it's e, and the
>	URI used to retrieve e is /a.xml.  But if it is the
>	innermost EXTERNAL entity containing the relevant point,
>	it's f, and the URI used to retrieve f is /b.xml.

It's /b.xml.  Internal entities don't affect the base; external entities do.
You can find a lot of discussion about this point on this list.
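
For illustration only (this is not text from the draft), the rule can be
sketched as a walk over the stack of entities open at the point in question.
The EntityFrame type and base_from_entities function are hypothetical names,
and the stack is written out by hand for the /a.xml and /b.xml example above
rather than produced by a real parser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EntityFrame:
    """One entity that is open at the point of interest (hypothetical model)."""
    name: str
    retrieval_uri: Optional[str]  # None for internal entities


def base_from_entities(stack: List[EntityFrame]) -> Optional[str]:
    """Return the URI of the innermost entity that was retrieved via a URI.

    Internal entities carry no retrieval URI, so they are skipped.
    """
    for frame in reversed(stack):
        if frame.retrieval_uri is not None:
            return frame.retrieval_uri
    return None


# Outermost first: the document entity, then external entity f, then
# internal entity e, which contains the <foo> element.
stack = [
    EntityFrame("document", "/a.xml"),
    EntityFrame("f", "/b.xml"),
    EntityFrame("e", None),
]
base = base_from_entities(stack)
```

Here base comes out as /b.xml, because e is internal and is passed over.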
	
>One interpretation of the rules for determining the applicable base can
>be clarified by stating them as follows:
>
>	Every XML document has an "element" structure and an "entity"
>	structure.  Section 4.3.2 "Well-Formed Parsed Entities" of
>	the XML 1.0 specification guarantees that these two structures
>	are compatible.
>
>	The context of the application determines a default base URI.
>
>	Within the scope of an external entity or a document entity
>	that was retrieved using a URI, that URI is the base URI.
>
>	Within the scope of an element having an xml:base attribute,
>	the value of that attribute is the base URI.
>
>	Inner scopes of either kind take precedence over outer scopes
>	of either kind.
>
>	The URI value of an xml:base attribute is resolved in the
>	context just outside its owning element, so does not depend
>	on itself.
>
>	The URI value of any other attribute is resolved in the
>	context just inside its owning element, so does depend on
>	that element's xml:base attribute if it has one.
>
>	A relative URI appearing in a processing instruction is
>	resolved in the context immediately containing that PI.
>	The means by which such appearances are discerned is
>	outside the scope of this recommendation.
>
>	A relative URI appearing in text content is resolved in
>	the context immediately containing that text content.
>	The means by which such appearances are discerned is
>	outside the scope of this recommendation.

We've done a lot of work clarifying the text on this issue, and I think what
we came up with is equivalent to your proposal.  I wish you'd sent it months
ago!
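
As a rough sketch of the element-scope part of those rules (an illustration
under stated assumptions, not the draft's wording), here is how an
application might compute the base URI in effect inside each element using
Python's standard library.  The function name base_uris, the sample document
and the example.org/example.com URIs are made up for the example, and entity
scopes are not modelled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree reports xml:base under the fixed XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(elem, outer_base, out=None):
    """Map each element to the base URI in effect just inside it."""
    if out is None:
        out = {}
    # An xml:base value is resolved in the context just OUTSIDE its owning
    # element, so it never depends on itself.
    own = elem.get(XML_BASE)
    inner_base = urljoin(outer_base, own) if own is not None else outer_base
    out[elem] = inner_base
    for child in elem:
        base_uris(child, inner_base, out)
    return out


DOC = """<doc xml:base="http://example.org/today/">
  <item xml:base="hotpicks/" href="pick1.xml"/>
  <item href="new.xml"/>
</doc>"""

root = ET.fromstring(DOC)
# The second argument stands in for the URI used to retrieve the document.
bases = base_uris(root, "http://example.com/default.xml")
first, second = list(root)
# Other attributes are resolved in the context just INSIDE their owner.
resolved_first = urljoin(bases[first], first.get("href"))
resolved_second = urljoin(bases[second], second.get("href"))
```

Note that treating href as a URI is the application's decision here; nothing
in the parsing step singles it out.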

>Section 4.1 appears to mean that a URI as notated in an XML document
>may use disallowed characters, but that an XML processor must convert
>URI values to the proper form.  But when, exactly?  Before the URI is
>used as a URI, or before any other code, including the application,
>sees the text?

The draft says "this value is interpreted".  Whoever is doing the
interpreting does the conversion.  It would be quite odd for an XML
processor to expose the value of the xml:base attribute as a converted
value.  It would not be unreasonable for an XML processor to expose the base
URI as a converted value.  But this is an API design question beyond the
scope of XML Base.  See the Infoset's treatment of XML Base.
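
To show what "whoever is doing the interpreting does the conversion" could
look like, here is a small sketch.  It ASSUMES the conversion is: encode each
disallowed character as UTF-8 and write each byte as %HH, and it ASSUMES the
disallowed set is all non-ASCII characters plus the RFC 2396 excluded
characters other than "#", "%", "[" and "]".  Check the draft for the
authoritative list; escape_disallowed is a hypothetical helper name.

```python
# Assumed disallowed ASCII set: control characters, space, and the RFC 2396
# "delims" and "unwise" characters, minus '#', '%', '[' and ']'.
DISALLOWED_ASCII = (
    set('<>"{}|\\^` ') | {chr(c) for c in range(0x20)} | {"\x7f"}
)


def escape_disallowed(value: str) -> str:
    """Percent-escape disallowed characters, leaving everything else alone."""
    out = []
    for ch in value:
        if ord(ch) > 0x7F or ch in DISALLOWED_ASCII:
            # Convert to UTF-8 and escape each resulting byte as %HH.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


# The raw attribute value stays as written in the document; only the
# interpreted base URI is converted.
raw = "http://example.org/my docs/caf\u00e9/"
converted = escape_disallowed(raw)
```

Existing escapes such as %20 pass through unchanged, since "%" is not in the
assumed disallowed set.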

>The major unsolved problem in this draft of XBase is
>"How does an XML processor know which strings are URIs?"
>In particular, how do XML processors that do not support
>XSchema or XLink know which strings are URIs?

That's just it - they don't.  A raw XML parser doesn't recognize any element
or attribute content as representing URIs.  Therefore it can do nothing with
xml:base other than pass it on to the application.  An application may
recognize certain content as representing URIs (through Schema awareness,
XLink awareness, or some built-in mechanism).  Such an application may
combine such relative URIs with a base for some purpose like retrieval.  If
it does, it should use XML Base to determine what the base is.

Note that right now, only XLink normatively demands that XML Base be
honored.  It is my hope that XML Datatypes also defines the "uri" datatype
as dependent on XML Base.  It is through normative reference by other
specifications that XML Base is deployed.

>I propose the following solution for attributes only.
>
>    2.5 xml:uri Attribute.
>
>	The attribute xml:uri may be inserted in XML documents
>	to specify which attributes of an element are to be
>	interpreted as URIs and so resolved according to the
>	rules in section 3.
>
>	The value of an xml:uri attribute must match the
>	Names production in the XML recommendation.  Each
>	attribute of an element whose name appears in the
>	value of an xml:uri attribute owned by the same
>	element is to be processed as a URI.
>
>	Example.
>	<nav xml:uri='first last prev next'
>             first='slide001.xml' last='slide024.xml'
>             prev='slide023.xml'/>
>
>	As the example shows, the presence of a name in the value
>	of an xml:uri attribute does not mean that such an attribute
>	MUST appear, only that IF it does, it has a URI as value.
>
>	Example:
>	    <!ELEMENT nav EMPTY>
>	    <!ATTLIST nav
>		xml:uri NMTOKENS #FIXED 'first last prev next'
>		first CDATA #REQUIRED
>		last  CDATA #REQUIRED
>		prev  CDATA #IMPLIED
>		next  CDATA #IMPLIED>

Sorry, way too late for feature requests!  This is something better
accomplished through the URI datatype mechanisms available in Schemas
anyway, IMO.

>Section C leaves another major question open.
>
>    Does an application get a resolved URI *as well as* the text
>    it would have got without XBase, or *instead of* that text?
>
>This has a major effect on the XML Infoset and Document Object Model.

According to the Infoset, xml:base attributes are not suppressed.  In
addition a [base URI] property is exposed, which is calculated as described
by XML Base.  DOM exposes attributes, with no exception for xml:base, and
does not expose base URIs, so it doesn't yet provide any conveniences for
dealing with base URIs (or any other URIs).

>The whole specification leaves it unclear just which component of an
>XML-aware application is responsible for applying the XBase rules.
>Suppose we have a parser communicating with an application using
>something like SAX.   When the XBase draft says that "These URI
>references [in HTML beyond those expressible in XLink] might be
>resolved BY AN APPLICATION relative to the base URI defined by XML
>Base", is that a hint that URI resolution in general is the
>responsibility of an application, and that an XBase-conforming
>parser need only provide the information from which resolution could
>be done, rather than doing such resolution itself?

Yes.
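
As an illustration of that division of labour (a sketch, not a description of
any particular parser), here is a SAX-style application in Python.  The
parser just reports attributes, xml:base included; the handler tracks the
base URI and resolves only the attributes the application itself knows to be
URIs.  BaseTrackingHandler, the attribute list and the example URIs are all
hypothetical, and entity boundaries are ignored.

```python
import xml.sax
from urllib.parse import urljoin


class BaseTrackingHandler(xml.sax.ContentHandler):
    """Application-side handler that applies the base URI rules itself."""

    def __init__(self, document_uri, uri_attributes):
        super().__init__()
        self.base_stack = [document_uri]      # innermost base is last
        self.uri_attributes = uri_attributes  # the application's own knowledge
        self.resolved = []

    def startElement(self, name, attrs):
        base = self.base_stack[-1]
        # xml:base is resolved against the base in effect outside the element.
        if "xml:base" in attrs:
            base = urljoin(base, attrs["xml:base"])
        self.base_stack.append(base)
        # Only attributes the application recognizes are treated as URIs.
        for attr in self.uri_attributes:
            if attr in attrs:
                self.resolved.append((name, attr, urljoin(base, attrs[attr])))

    def endElement(self, name):
        self.base_stack.pop()


DOC = (
    b'<nav xml:base="http://example.org/slides/" first="slide001.xml">'
    b'<link xml:base="extra/" next="more.xml" temp="98.6"/>'
    b"</nav>"
)

handler = BaseTrackingHandler(
    "http://example.com/doc.xml", ("first", "last", "prev", "next")
)
xml.sax.parseString(DOC, handler)
```

The temp attribute is left alone because the application never listed it as
a URI, which is exactly the "up to the application" point made above.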

Received on Friday, 11 August 2000 14:47:26 UTC