General issues, how to replace entities in DTDs using XInclude.

I have been a heavy user of entities for a long time in SGML and then XML.
I am quite reluctant to give up entities, because they fit so many different
use cases.  As such, I find XInclude redundant until it can replace entities.

While XInclude does not claim to replace all entities, replacement
is the only basis on which I can accept yet another way of including
external content -- I feel we must be in a position to retire the old
mechanism in the domains where we use the new one.

Here is an informal description of additions to XInclude which could solve
these problems while remaining an infoset transform.  They would
augment, rather than replace, the existing XInclude specification, which is
useful as it stands for referencing external entities and as a set of base
principles for these proposed extensions.

1.  The ability to have local entity declarations.

I suggest 2 new element types called "xinclude:entity" and "xinclude:eref"
to model entities as an infoset transform, which can be intermingled with
other elements and other xinclude elements:

For example:

...
<!ENTITY bar "value">
...
<fooroot>
...
          &bar;
...
</fooroot>

Would become:

<fooroot xmlns:xi="http://www.w3.org/1999/XML/xinclude">
  <xi:entity name="bar">value</xi:entity>
...
          <xi:eref name="bar"/>
...
</fooroot>

Included infosets may not contain unresolved erefs -- each infoset
must be independently resolved.  But included infosets may include
first-level entities, which remain active to the end of the scope where
they are included, permitting sharing of common entities between infosets.

This allows entities to be scoped, making it easy to nest XML documents
with entities inside of other XML documents.
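
As a rough illustration of the scoping idea, here is a minimal Python sketch
of the entity/eref transform.  It assumes one particular reading of the rule:
a declaration is visible from the point where it appears to the end of its
parent element.  It also assumes entity values are plain text, and it does not
handle xinclude:include at all.  The function names are hypothetical; only the
element names and namespace URI come from the example above.

```python
import xml.etree.ElementTree as ET

XI = "http://www.w3.org/1999/XML/xinclude"  # namespace URI from the example above
ENTITY = f"{{{XI}}}entity"
EREF = f"{{{XI}}}eref"


def _append_text(parent, index, text):
    """Append text immediately before child position `index` of `parent`."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def resolve(elem, scope=None):
    """Expand xi:eref elements in place, with element-level scoping."""
    # Copy the scope so declarations made here do not leak out to the parent.
    scope = dict(scope or {})
    i = 0
    while i < len(elem):
        child = elem[i]
        if child.tag == ENTITY:
            # Assumption: the entity value is plain text only.
            scope[child.get("name")] = child.text or ""
            _append_text(elem, i, child.tail)
            del elem[i]
        elif child.tag == EREF:
            name = child.get("name")
            if name not in scope:
                raise KeyError(f"unresolved eref: {name}")
            _append_text(elem, i, scope[name] + (child.tail or ""))
            del elem[i]
        else:
            resolve(child, scope)
            i += 1
    return elem


doc = ET.fromstring(
    '<fooroot xmlns:xi="http://www.w3.org/1999/XML/xinclude">'
    '<xi:entity name="bar">value</xi:entity>'
    '<p>before <xi:eref name="bar"/> after</p>'
    '</fooroot>'
)
resolve(doc)
print(doc.find("p").text)  # before value after
```

A reference that appears outside the element where its entity was declared
raises KeyError in this sketch, which is one way to enforce the "no unresolved
erefs" rule.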

2.  Ability to reference inclusions and entities from attribute values.
This is a common use case for SGML and XML.  It was probably left out of
XInclude because local declarations were left out.

I suggest a new element type called "xinclude:attr" used to model
unspecified attributes as child elements at the start of the child list
so that they may contain inclusions and erefs.

For example:

...
<!ENTITY name "Ray Whitmer">
...
<fooroot>
...
         <use1 title="My name is &name;"/>
...
</fooroot>

Would become:

<fooroot xmlns:xi="http://www.w3.org/1999/XML/xinclude">
  <xi:entity name="name">Ray Whitmer</xi:entity>
...
         <use1><xi:attr name="title">My name is <xi:eref name="name"/></xi:attr></use1>
...
</fooroot>
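
To show how the attr elements might fold back into ordinary attributes, here
is a small Python sketch.  It assumes any inclusions and erefs inside each
xinclude:attr have already been resolved to plain text, and that attr elements
only appear at the start of the child list, as suggested above.  The function
name fold_attrs is hypothetical.

```python
import xml.etree.ElementTree as ET

XI = "http://www.w3.org/1999/XML/xinclude"  # namespace URI from the example above
ATTR = f"{{{XI}}}attr"


def fold_attrs(elem):
    """Turn leading xi:attr children into ordinary attributes, recursively."""
    while len(elem) and elem[0].tag == ATTR:
        attr = elem[0]
        if len(attr):
            # Assumption: erefs and inclusions were expanded in an earlier pass.
            raise ValueError("resolve children of xi:attr before folding")
        elem.set(attr.get("name"), attr.text or "")
        # Preserve any text that followed the xi:attr element.
        elem.text = ((elem.text or "") + (attr.tail or "")) or None
        del elem[0]
    for child in elem:
        fold_attrs(child)
    return elem


doc = ET.fromstring(
    '<fooroot xmlns:xi="http://www.w3.org/1999/XML/xinclude">'
    '<use1><xi:attr name="title">My name is Ray Whitmer</xi:attr></use1>'
    '</fooroot>'
)
fold_attrs(doc)
print(doc.find("use1").get("title"))  # My name is Ray Whitmer
```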

3.  Compatibility with legacy syntax.  There are lots of old documents, and
the existing syntax for general entity references, used throughout XHTML,
for example, is far more convenient than using xinclude elements.

XInclude-aware parsers should make the following transformations
automatically:

a.  Create entity elements inside the root element in place of general
entity declarations in the internal subset (using a nested
xinclude:include element for external entities).

b.  Transform entity references to the equivalent XInclude elements,
using alternative attribute syntax where they occur in attribute values.

c.  Bootstrapping the document type could occur by automatically adding
xincludes to the top of the root element based upon the document type.  The
parser could consult a document that maps doctypes to the appropriate
bootstrapping xinclude declarations.
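
Transformations (a) and (b) can be sketched at the text level in Python.
This is only a rough illustration using regular expressions, not a real
parser: it assumes internal general entities with double-quoted literal
values, leaves the five predefined entities and character references alone,
and handles references in element content only (not the attribute-value form).
The function names are hypothetical.

```python
import re

PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# <!ENTITY name "value"> (general entities only; parameter entities use '%')
DECL = re.compile(r'<!ENTITY\s+([A-Za-z_][\w.-]*)\s+"([^"]*)"\s*>')
# &name; (character references such as &#160; do not match)
REF = re.compile(r'&([A-Za-z_][\w.-]*);')


def declarations_to_elements(internal_subset):
    """Step (a): map each general entity declaration to an xi:entity element."""
    return [
        f'<xi:entity name="{name}">{value}</xi:entity>'
        for name, value in DECL.findall(internal_subset)
    ]


def references_to_erefs(content):
    """Step (b): map &name; in element content to an xi:eref element."""
    def repl(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return f'<xi:eref name="{name}"/>'
    return REF.sub(repl, content)


print(declarations_to_elements('<!ENTITY bar "value">'))
print(references_to_erefs("a &bar; &amp; b"))
```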

4.  Compatibility with DOM.  This proposal would solve all of the problems
DOM has today with entity references.  It would permit nearly-transparent
use of the new XInclude transforms instead of entities, or the DOM WG
could choose to support them in some new way.  Entity references would
need to update their read-only child content whenever a new entity
declaration becomes visible in the scope, which is a new problem that
seems quite solvable.  xinclude:include elements might need some
alternative form of representation.

This is my take on XInclude.  I am not (yet) speaking for my company or
any other working group.  I realize that my proposed changes are probably
considered out of scope.  People keep trying to tell me that we don't need
DTDs any longer with XML Schema and XInclude.  I would like for this to be
true.

There are corner cases which remain unsolved, but this proposal solves
enough of the domain for me to consider outlawing the corner cases, and it
permits me to get excited about a new world that uses XInclude/XML Schema
instead of DTDs.

I can provide more use cases if you are interested.

Ray Whitmer
rayw@netscape.com

Received on Tuesday, 6 March 2001 17:45:29 UTC