
Re: XML-*

From: Tim Bray <tbray@textuality.com>
Date: Sun, 15 Dec 2002 18:42:19 -0800
Message-ID: <3DFD3D8B.80304@textuality.com>
To: Norman Walsh <Norman.Walsh@Sun.COM>
Cc: www-tag@w3.org

Norman Walsh wrote:

> I would wager that almost every significant document that I've ever
> written has included at least a few entities. By my crude grep|wc -l
> estimate, there are more than 2800 entity declarations (transitively)
> in the internal subset of book.xml for "DocBook: The Definitive
> Guide".

I think we'll be talking about this Monday.  Granted, significant 
publication-oriented XML apps, particularly those with an SGML heritage, 
make heavy use of entities.

Let's set aside parameter entities: they serve a necessary function, 
but they are isolated in the DTD, not in the instance.

I think that the use of entities has worked reasonably well for the 
cases where they're simple macros of zero arguments.  I think it has 
fallen down miserably whenever you've tried to use them for their 
ostensible purpose (cf. the SGML Handbook) as some sort of 
storage/content management infrastructure; to start with, the ID/IDREFs 
break instantly.
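
One way the ID problem can show up is sketched below.  This is only an 
illustration under assumptions: the entity name "note", the id value 
"n1" and the document itself are made up, and the message does not say 
which particular failure is meant.  An internal entity that carries an 
element with an id is referenced twice, so the expanded instance ends 
up with the same id on two elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal entity holding a chunk of reusable
# content that happens to carry an id attribute.
DOC = """<!DOCTYPE doc [
<!ENTITY note '<note id="n1">Shared text</note>'>
]>
<doc>&note;&note;</doc>"""

# ElementTree's expat-based parser expands internal entities in place.
root = ET.fromstring(DOC)

# Collect the id of every <note> element in the expanded tree.
ids = [el.get("id") for el in root.iter("note")]
print(ids)

# A repeated id means the expanded document can no longer be ID-valid.
print("duplicate ids:", len(ids) != len(set(ids)))
```

A zero-argument text macro such as a product name has no such problem, 
because its replacement text carries no identifiers.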

> There are ten or fifteen in the sources for most W3C specs that I've
> seen.

For (a) naming odd characters and (b) simple boilerplate.  Hardly 

> It might be useful to have a standard for "Prescriptive Standalone
> XML" that people could point at in those applications where such a
> prescription is necessary, but removing entities from XML in general
> is just not something I'm willing to consider.

Nothing we can do will take entities out of XML 1.* at this point in 
time.  I've proposed XML-SW which has no entities or DTD at all (but 
does have a DOCTYPE declaration so you can use XML 1.* validation on it 
just fine).

There are various intermediate points you could think of, such as:
(a) entities, internal only
(b) (a) + internal subset only
(c) (b) + no recursion
(d) (c) + replacement text has to be shorter than name
(e) (b) + names for character references only

I do think we can agree that no existing or proposed schema effort is 
apt to re-invent the entity mechanism, so I could be persuaded that in 
something like XML-SW, we could live with one of (a) through (e) or a 

As for the DTD's other role, sticking something right in the document 
that lets it point at the schema that it claims you ought to validate 
with, I have no patience with that at all any more, for a variety of 
reasons.  -Tim
Received on Sunday, 15 December 2002 21:42:20 UTC
