- From: by way of <duerst@w3.org>
- Date: Wed, 12 Jul 2000 10:47:42 +0900
- To: uri@w3.org
On Tue, Jul 11, 2000 at 02:17:51PM +0900, Martin J. Duerst wrote:
> Dear Members of the URI mailing list,
>
> An issue has recently come up in the resolution of last call
> comments to XML Base http://www.w3.org/TR/2000/WD-xmlbase-20000607
> (don't hesitate to read this, it's really, really short).
>
> First a bit of background:
>
> XML Base defines an attribute xml:base for XML documents
> to allow to set the base for relative URI resolution.
> The xml:base attribute cannot only be set at the root element
> (i.e. for the whole document), but also on any other element.
> In that case, the base applies for the other attributes on
> the element, and for everything within the element.
> If the value of xml:base itself is relative, it is in turn
> resolved based on an xml:base higher up in the element
> hierarchy.

In contrast to Michael Mealling, who doesn't seem to like relative URIs, I think they are great. Everything is relative anyway, even things that appear to be absolute.

One of the problems I've had with relative URIs in HTML documents is that one cannot specify multiple bases. This makes it difficult to use any server-side mechanism to include documents that might have their own base. With only one BASE provided to clients (whether explicitly provided in the document or assumed from the context), all relative URIs must be interpreted relative to only that one base, so any inclusion mechanism must translate every relative URI to the base it is intended to be relative to.

With a client-side inclusion mechanism, such as IFRAMEs, the client can be smart enough to correctly interpret relative URIs in the included document relative to the base of the included document.

So I am glad to see that XML might have a nested base feature. It is important to notice that there would still be an implicit base for any document that applies if there is not an explicit base specified in the document.
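
To make the nesting concrete, here is a minimal sketch in Python of how I read the quoted description: each xml:base value is resolved against the base in effect above it, starting from the implicit base of the document. The function name effective_base and the example.com URLs are my own illustrations, not anything from the XML Base draft, and urljoin implements the standard library's relative-reference resolution, which I am assuming agrees with RFC 2396 for simple cases like these.

```python
from urllib.parse import urljoin


def effective_base(document_base, xml_base_chain):
    """Return the base URI in effect at an element.

    document_base  -- the implicit base (the URI of the document itself)
    xml_base_chain -- xml:base values on the ancestors of the element,
                      ordered from outermost to innermost (hypothetical input)
    """
    base = document_base
    for value in xml_base_chain:
        # A relative xml:base is resolved against the base above it;
        # an absolute one simply replaces it.
        base = urljoin(base, value)
    return base


doc_base = "http://www.example.com/docs/index.xml"

# No xml:base anywhere: the implicit document base applies.
print(effective_base(doc_base, []))

# Two nested relative xml:base attributes.
base = effective_base(doc_base, ["chapters/", "one/"])
print(base)

# A relative URI inside the innermost element.
print(urljoin(base, "figure.png"))
```

With an empty chain the function returns the document's own URI unchanged, which is the implicit-base case mentioned above.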
WD-xmlbase-20000607 says:

    The attribute xml:base may be inserted in XML documents to specify a base URI other than the base URI of the document or external entity.

Also notice that "the base URI of the document or external entity" is not derived from the base of any document that merely *uses* that URI. I believe this can be concluded from a careful reading of RFC 2396.

http://www.innosoft.com/rfc/rfc2396.html#sec-5.1

The "Base URI of the encapsulating entity", which seems to be the bone of contention, is about an entity (as seen by a client, not a server) that contains the document that contains the relative URI. The language is admittedly confusing, but reading 5.1.1, 5.1.2, and 5.1.3, it should be clear that the encapsulation is specified on the client side before considering how the document or its encapsulation was retrieved or obtained using the URI.

Perhaps the problem is really about whether the encapsulation of one document by another happens on the client side or the server side. Moreover, as the boundary between client and server melts away (for example, the encapsulation of XML could be done on either the client or the server), this distinction becomes problematic. A solution to this problem might be to distinguish not based on client-side vs server-side resolution but based on whether the encapsulation is by reference or by immediate inclusion. Any reference needs to be interpreted relative to some context, but an immediate inclusion requires no interpretation to find the content - i.e. here it is.

But I think this is not what is meant when considering a multi-part MIME package of a single top-level document and several parts that are "encapsulated" by the top-level document. Each part is referenced, but it is a special kind of local reference. Instead of "here it is" we get a local reference meaning "there it is", and "it" is contained in the same package somehow, so no other remote resolution is required.
> XML 1.0 also defines entities http://www.w3.org/TR/REC-xml#sec-physical-struct.
> There are various kinds of entities, but relevant for this
> discussion are both internal general entities and external
> parsed general entities. An external parsed general entity
> is declared e.g. as follows:
>
> <!ENTITY entityName SYSTEM "http://www.example.com/example.xml">
>
> An internal general entity is declared as follows:
>
> <!ENTITY entityName "entity Content">

Michael asks why the internal general entity is a problem regarding relative URIs and bases. I'm not sure I understand either. A new entityName is specified here, so perhaps the question is what name space this name is defined in, and a BASE specifies a new name space both for resolution of name uses and for new name definitions.

> The core of the current problem comes from the following
> sentence [http://www.w3.org/TR/xmlbase#IDwkAq1]:
>
> The scope of xml:base does not extend into external entities,
> but it does extend into internal entities.

Again, I don't know what it means to have xml:base extend to internal entities. It is clear what it means to have the xml:base extend to external entities, but I don't believe it should. An important question is "When is it useful to have the document that references an external entity specify what the base is for relative URIs in that external entity?"

Relative URIs that are interpreted based on how we use the document (e.g. the external entity) containing the URIs make them into something like "your local weather station". I can imagine that this kind of "local" identifier would be useful, but we don't have anything else like it now. Even a 'news' URI with no domain name specifies a particular newsgroup or message, though it doesn't specify which news server to use.

Actually, there are (at least) two kinds of "local" identifier that might be useful. One is local to the user; the other is local to the context of the document that uses the identifier.
The second is the kind that is being considered here.

> Various attempts of interpreting Section 5.1 of RFC 2396
> (see e.g. http://www.innosoft.com/rfc/rfc2396.html#sec-5.1)
> have been undertaken, with no clear results.
>
> First, Section 5.1 speaks about a single base per document,
> having multiple bases in different areas of a document
> doesn't seem to have been a concern, or maybe was explicitly
> rejected.

You are right that a single base is assumed. I don't know whether multiple bases were considered, but the "encapsulating entity", whatever that is, starts to get at allowing multiple bases. Look at RFC 2557 for a case of how they use encapsulation.

http://www.innosoft.com/rfc/rfc2557.html

> Second, Section 5.1 doesn't seem to consider the case of
> inclusion in the way this happens with entities or similar
> cases.

It's not terribly clear what kind of inclusion they had in mind.

> Third, the words 'entity' and 'document' are used both
> in XML and in RFC 2396, but it is not clear how to relate
> these together. A document in the XML sense includes
> all the entities (including external ones). In RFC 2396,
> the only case that seems to have been considered is that
> documents can be encapsulated in entities. Nevertheless,
> an XML external entity has it's own URI, and therefore
> in many ways behaves like an entity as described in
> RFC 2396.

It is clear to me that we need definitions of, and more consistent use of, the terms "entity", "document", "resource", "identifier" and others. We need a better model for encapsulation and multiple contexts, and resolution of identifiers relative to contexts. We have many conflicting models now, each addressing a part of the problem, but it seems doubtful that anyone has a complete consistent model at this time.

Building this model is the kind of thing I was hoping a W3C URI activity would take on, but I don't care so much where this work is done - it needs to be done anyway.
-- Daniel LaLiberte liberte@crystaliz.com liberte@holonexus.org
Received on Wednesday, 12 July 2000 04:20:07 UTC