Re: A serious detail point

At 17:57 17/4/97 CDT, Michael Sperberg-McQueen wrote:

>But why is it plausible to suppose that it will be more common to
>have tens of thousands of links all pointing at the same host
>document, than scores of thousands of links all pointing at various
>documents in a corpus or collection?

In general, if you have a reference to a fragment in a document, you are more
likely to have pointers to other fragments of the same document in your
document than to fragments of other documents. There is a good case for
allowing some form of reusable location source identifier that is shorter
than the full address in any addressing scheme. (Hence locsrc in HyTime,
which you managed to drop in the latest revision of XML.) Without such a
facility XML is of limited use.

Incidentally, I have just had to recommend dropping the use of XML from a
project simply because I could not manage multiple references to a single
document efficiently. Unless all references to a single document can point to
the current address of the file via a short name, so that not every reference
has to be updated each time the file is moved, it makes no sense to try to
manage link sets using XML.
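
To make the short-name idea concrete, here is a minimal sketch in Python. It
is not HyTime or XML syntax: the names Link, SourceTable, declare and resolve,
and the example.org addresses, are all invented for illustration. Each link
stores only a short source name plus a fragment, and the full address is held
in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A reference that names its target document indirectly (hypothetical)."""
    source: str    # short name of the target document
    fragment: str  # location inside that document


class SourceTable:
    """Maps short source names to the current address of each document."""

    def __init__(self):
        self._addresses = {}

    def declare(self, name, address):
        # Declaring a name again simply replaces the stored address.
        self._addresses[name] = address

    def resolve(self, link):
        # The full address is built only at resolution time.
        return f"{self._addresses[link.source]}#{link.fragment}"


table = SourceTable()
table.declare("manual", "http://example.org/docs/manual.xml")  # assumed address

links = [Link("manual", f"sec{i}") for i in range(1, 4)]
before = [table.resolve(link) for link in links]

# The file moves: one declaration changes, the links themselves do not.
table.declare("manual", "http://example.org/archive/manual.xml")
after = [table.resolve(link) for link in links]
```

In this sketch, moving the file costs one call to declare, however many links
name "manual" as their source.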

>The update and maintenance problem is handled nicely by the general
>entity mechanism you illustrate.  The caching problem can be handled
>with affinity groups / BOS / whatchamacallits that say "Cache this
>one, I'm going to need it often".

The problem is that XML has no concept of BOS, and no way of knowing which
referenced entities need to be cached. If locsrc were used to indicate
items that share the same source document, there would be a good clue to
caching priority in the markup. Without it there is no way of determining
the caching priority of referenced documents when a document contains a
large number of references.
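
As a rough illustration of that clue, the following Python sketch ranks source
documents by how many links name them. It assumes each link carries a short
source name, as a locsrc-style attribute would allow; the function name
caching_priority and the sample names are hypothetical, and a real processor
might weigh other factors as well.

```python
from collections import Counter


def caching_priority(link_sources):
    """Return source names ordered from most to least referenced."""
    counts = Counter(link_sources)
    return [name for name, _ in counts.most_common()]


# Assumed sample data: the source name taken from each link in a document.
sources = ["manual", "manual", "spec", "manual", "glossary", "spec"]
priority = caching_priority(sources)
```

Here "manual" is named three times, "spec" twice and "glossary" once, so a
cache with limited room would keep "manual" first.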

Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK
Phone/Fax: +44 1452 714029   WWW home page:

Received on Friday, 18 April 1997 06:51:54 UTC