W3C home > Mailing lists > Public > www-xml-linking-comments@w3.org > July to September 2000

XML Base for relative URIs: Interpretation of 5.1 of RFC 2396

From: Martin J. Duerst <duerst@w3.org>
Date: Tue, 11 Jul 2000 14:17:51 +0900
Message-Id: <4.2.0.58.J.20000711120637.03518720@sh.w3.mag.keio.ac.jp>
To: uri@w3.org
Cc: www-xml-linking-comments@w3.org
Dear Members of the URI mailing list,

An issue has recently come up in the resolution of last call
comments to XML Base http://www.w3.org/TR/2000/WD-xmlbase-20000607
(don't hesitate to read this, it's really, really short).

First a bit of background:

XML Base defines an attribute xml:base for XML documents
to allow to set the base for relative URI resolution.
The xml:base attribute cannot only be set at the root element
(i.e. for the whole document), but also on any other element.
In that case, the base applies for the other attributes on
the element, and for everything within the element.
If the value of xml:base itself is relative, it is in turn
resolved based on an xml:base higher up in the element
hierarchy.

XML 1.0 also defines entities http://www.w3.org/TR/REC-xml#sec-physical-struct.
There are various kinds of entities, but relevant for this
discussion are both internal general entities and external
parsed general entities. An external parsed general entity
is declared e.g. as follows:

<!ENTITY  entityName  SYSTEM "http://www.example.com/example.xml">

An internal general entity is declared as follows:

<!ENTITY  entityName  "entity Content">

In both cases, the entity is invoked by &entityName;.
[Of course, it's not possible to use the same name for two
different entities.]

In some way, entities are similar to C preprocessor instructions,
internal entities correspond to #define, and external entities
to #include. But of course it's not exactly the same.

The core of the current problem comes from the following
sentence [http://www.w3.org/TR/xmlbase#IDwkAq1]:

The scope of xml:base does not extend into external entities,
but it does extend into internal entities.

The alternative proposal currently under discussion would be
to say that xml:base extends into external and internal entities.

The following is an example:

File /example/a.xml:

<!DOCTYPE example
[
<!ENTITY entity1 SYSTEM "/include/entity1.xml">
]
<example xml:base='subdir1'>
&entity1;
</example>


File /include/entity1.xml:

<a href='link.xml'>That's the question!</a>


Assuming that the href attribute in the example document
is governed by the XML Base specification, what should it
refer to?

If xml:base extends into external entities, it would
refer to /example/subdir1/link.xml. If xml:base doesn't
extend into external entities, it would refer to
/include/link.xml.

What to do in the absence of xml:base would have to be
aligned with the decision. This means that in the above just
example without xml:base, href would refer to /example/link.xml
in the case things extend into external entities, and would
refer to /include/link.xml in the case things don't extend
into external entities.


Various attempts of interpreting Section 5.1 of RFC 2396
(see e.g. http://www.innosoft.com/rfc/rfc2396.html#sec-5.1)
have been undertaken, with no clear results.

First, Section 5.1 speaks about a single base per document,
having multiple bases in different areas of a document
doesn't seem to have been a concern, or maybe was explicitly
rejected.

Second, Section 5.1 doesn't seem to consider the case of
inclusion in the way this happens with entities or similar
cases.

Third, the words 'entity' and 'document' are used both
in XML and in RFC 2396, but it is not clear how to relate
these together. A document in the XML sense includes
all the entities (including external ones). In RFC 2396,
the only case that seems to have been considered is that
documents can be encapsulated in entities. Nevertheless,
an XML external entity has it's own URI, and therefore
in many ways behaves like an entity as described in
RFC 2396.


Any opinions and any help on this issue is greatly appreciated.


Regards,    Martin.
Received on Tuesday, 11 July 2000 03:45:43 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 27 October 2009 08:39:40 GMT