Re: [dgd@cs.bu.edu: BOS confusion (analysis; suggestion to resolve Newcomb/Bryan conflict)] from Steven R. Newcomb on 1997-01-02 (w3c-sgml-wg@w3.org from January 1997)

From: Steven R. Newcomb <srn@techno.com>
Date: Thu, 2 Jan 1997 11:26:54 -0500
To: w3c-sgml-wg@www10.w3.org, peter@techno.com, vtn@techno.com
Message-Id: <199701021626.LAA00730@bruno.techno.com>

(Martin Bryan:)

> This is why I want the link information decoupled from the rest of the
> document in a way that will allow it to be referenced from any start-point
> (hub) document that needs to reference it. That way I can set up one easily
> maintainable link set and point all my documents to it. Ideally what I would
> then love to be able to do is to prune this application BOS from within the
> called document so that I can get the most efficient effective BOS for the
> current user. For my, somewhat unusual, web pages I would want this done
> under author control rather than user control as I can easily identify how
> to prune the well-maintained tree.

The light finally dawns.  Aha!  I think you have an _excellent_ set of
ideas here, Martin, if I understand what you're saying.  Please check
my playback (expressed using HyTime terminology so as to avoid any
unnecessary ambiguity or misunderstanding between Martin and me, of
which I have had a bellyful):

1.  The HyTime bounded object set (HyTime BOS) list specification
    facilities should permit a session-start ("hub") document to
    specify that another document actually specifies something I'm
    going to call, for a moment, the "base" HyTime BOS list.  I'm
    going to call this phenomenon "delegation" of the HyTime BOS
    specification.  (The fact that this other document might also
    contain all the links in the HyTime hyperdocument is a reasonable
    and possible scenario, but it is only important that such a
    scenario be supported; it is neither necessary nor desirable that
    it be required.)

2.  The hub document should then be able to make adjustments to the
    base HyTime BOS list thus specified, perhaps adding objects to it,
    and perhaps subtracting objects from it.  Such addition and
    subtraction might be done in terms of the entity tree rooted in
    the delegated-to document, and it also might be done on an
    object-by-object basis.

3.  After the base HyTime BOS list is assembled and various
    hub-document-specific adjustments have been made to the list, only
    then is the resulting HyTime BOS actually loaded by the
    application, thus becoming the effective BOS.
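
The three steps above can be sketched in a few lines of Python.  This
is only an illustration of the playback, not anything HyTime defines:
the function name effective_bos and the object names are hypothetical,
and a BOS list is modeled as a plain ordered list of object names.

```python
def effective_bos(base, additions=(), removals=()):
    """Sketch: combine a delegated 'base' BOS list with hub adjustments.

    base      -- BOS list specified by the delegated-to document (step 1)
    additions -- objects the hub document adds (step 2)
    removals  -- objects the hub document subtracts (step 2)
    Returns the list that would then be loaded as the effective BOS (step 3).
    """
    adjusted = list(base)
    for obj in additions:
        if obj not in adjusted:      # keep each object once, in order
            adjusted.append(obj)
    removed = set(removals)
    return [obj for obj in adjusted if obj not in removed]


# Hypothetical object names, for illustration only.
base = ["linkset", "chapter1", "chapter2"]
print(effective_bos(base, additions=["legal-notice"], removals=["chapter2"]))
```

Nothing is loaded until the final list exists, which is the point of
step 3.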

Is that right so far?  If so, I have some comments that may be
relevant to both XML and HyTime.  If not, the following comments may
be irrelevant to our XML discussion, but they are still mighty
interesting (to me, anyway) on account of their implications for
HyTime.  (I hope nobody on this list minds if HyTime benefits from the
XML discussion!)

I think you can do almost all of this with HyTime now, but I think not
well enough.  The hub document can, in effect, delegate specification
of the BOS to another document simply by saying that the other
document is in the BOS, and simply not saying that any other document
is in the BOS.  The hub document can prune the resulting BOS by
limiting the number of recursions permitted, once the
entity-tree-discovery process has already moved to the delegated-to
BOS specification document.  But this lacks the kind of surgical
finesse needed in the real world, since the entity-tree-discovery
process must abort itself at the same recursion level for all
branches.  Not good for HyTime.  So, first of all, I conjecture that
the HyTime BOS-entity-tree-discovery process needs to be capable of
recognizing, optionally, pruning instructions that occur at some point
(or maybe points) other than the root (the hub document).
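
To make the difference concrete, here is a rough Python sketch of an
entity-tree walk that accepts both a single root-level recursion limit
and optional pruning instructions attached to individual nodes.  The
names discover, max_depth and prune_at are assumptions made for this
sketch, and the entity tree is modeled as a simple dict from an object
name to the names it declares.

```python
def discover(tree, root, max_depth=None, prune_at=None):
    """Sketch of BOS entity-tree discovery with two kinds of pruning.

    tree      -- dict: object name -> list of object names it declares
    max_depth -- uniform recursion limit set at the root (None = no limit)
    prune_at  -- dict: object name -> levels to follow below that object
                 (a hypothetical per-node pruning instruction)
    """
    prune_at = prune_at or {}
    found = []

    def visit(name, depth, limit):
        if name in found:
            return
        found.append(name)
        if name in prune_at:
            # A local instruction tightens the limit for this branch only.
            local = depth + prune_at[name]
            limit = local if limit is None else min(limit, local)
        if limit is not None and depth >= limit:
            return
        for child in tree.get(name, []):
            visit(child, depth + 1, limit)

    visit(root, 0, max_depth)
    return found


# Hypothetical entity tree, for illustration only.
tree = {
    "hub": ["linkset"],
    "linkset": ["a", "b"],
    "a": ["a1"],
    "a1": ["a2"],
    "b": ["b1"],
}
print(discover(tree, "hub", max_depth=2))        # every branch cut at level 2
print(discover(tree, "hub", prune_at={"b": 0}))  # only the branch under b is cut
```

With max_depth alone, the branches under a and b are cut at the same
level; with prune_at, the branch under a is followed to its end while
nothing below b is picked up.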

Once the "base" BOS list is established, using the hub
document to add objects to the BOS is trivially easy; you just declare
them along with the delegated-to document in the same way.

However, you can't subtract arbitrary members from the "delegated
base" BOS list in HyTime as it stands today; all you can do is prune
the entire delegated base at the same level of recursion.  These are
problems that need to be fixed, I think.

So here are some thoughts to ponder.  They are intended to be useful
for XML as well as HyTime, but, again, they are expressed in HyTime
terms.

I propose a new concept (and yet another de novo term, sorry): "subhub
document".  A subhub is regarded by a HyTime application's BOS
entity-tree-discovery process as if it were a hub document, in that
its declared HyTime BOS (that would have resulted from a BOS discovery
process had it been the session-start document) is added to the HyTime
BOS declared by the hub document.  The hub document delegates such
control by designating all subhubs using a special new common notation
attribute, "subhub (subhub|nosubhub) nosubhub".  In the entity
declarations of subhubs, the subhub attribute has a value of subhub.
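
A minimal Python sketch of how a discovery process might honor the
proposed attribute follows.  It assumes each document is modeled as a
dict of its declared BOS members, each carrying a subhub attribute
value; the function name bos_with_subhubs and the data layout are
hypothetical, chosen only for this illustration.

```python
def bos_with_subhubs(name, docs, _seen=None):
    """Sketch: BOS declared by `name`, plus the BOS of every subhub.

    docs -- dict: document name -> dict of declared entity name ->
            "subhub" or "nosubhub" (the proposed attribute's value)
    """
    seen = _seen if _seen is not None else set()
    if name in seen:                 # guard against circular delegation
        return []
    seen.add(name)
    result = []
    for entity, flag in docs.get(name, {}).items():
        if entity not in result:
            result.append(entity)
        if flag == "subhub":
            # Treat the subhub as if it were a hub and add its BOS.
            for member in bos_with_subhubs(entity, docs, seen):
                if member not in result:
                    result.append(member)
    return result


# Hypothetical documents, for illustration only.
docs = {
    "hub": {"linkset": "subhub", "notice": "nosubhub"},
    "linkset": {"ch1": "nosubhub", "ch2": "nosubhub"},
    "notice": {"appendix": "nosubhub"},
}
print(bos_with_subhubs("hub", docs))
```

Here linkset contributes its own declared BOS because it is marked
subhub, while what notice declares is ignored because notice carries
the default value, nosubhub.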

I think Martin's scenario also demonstrates that HyTime users might
benefit from having a way to have hub (and subhub) documents not only
be able to add entities to the BOS on a case-by-case basis, but also
to exclude them from the BOS in a similar way, after the
entity-tree-discovery process (the BOS list building process) is over.
Maybe this could be done with a pathlist of recursively declared entity
names, each path leading to an entity to be excluded?  I think we'd better
avoid specifying these things by physical address, because their
physical addresses might be subject to change without notice to the
hub document owner, and, more to the point, there is often no easy way
of determining whether two physical addresses in fact address the same
object.  Would we attach the list of excluded entity pathlists to the
document element of the hub document?
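
The pathlist idea could look roughly like this in Python.  The sketch
assumes each object's entity declarations are available as a dict from
local entity name to the object it denotes; resolve_path,
apply_exclusions and the object identifiers are hypothetical names used
only for illustration.

```python
def resolve_path(declarations, hub, path):
    """Follow a path of entity names, starting from the hub's declarations.

    declarations -- dict: object id -> dict of entity name -> object id
    path         -- entity names, each declared in the object reached so far
    Returns the object the path leads to, or None if it does not resolve.
    """
    current = hub
    for entity_name in path:
        declared = declarations.get(current, {})
        if entity_name not in declared:
            return None
        current = declared[entity_name]
    return current


def apply_exclusions(bos, declarations, hub, excluded_paths):
    """Drop from an already-built BOS list every object an excluded path names."""
    excluded = {resolve_path(declarations, hub, p) for p in excluded_paths}
    return [obj for obj in bos if obj not in excluded]


# Hypothetical declarations and BOS list, for illustration only.
declarations = {
    "hub": {"links": "obj-links"},
    "obj-links": {"ch1": "obj-ch1", "ch2": "obj-ch2"},
}
bos = ["obj-links", "obj-ch1", "obj-ch2"]
print(apply_exclusions(bos, declarations, "hub", [["links", "ch2"]]))
```

The hub never mentions a physical address; it names only entities, and
the path is resolved through whatever the declarations currently say.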

> >Ouch!  Also, doesn't this mean that an author can't specify companion
> >documents that he can't write on?  Sounds impractical.
> 
> To me there is a clear distinction between "what is essential" (i.e. defined
> in the application BOS as documents that are specifically referenced as
> transclusion type references) and what could be added if the user has time
> to get them in the background (i.e. defined in the companion documents). The
> first list defines the set of entities that must be retrieved before
> starting the session (i.e. those likely to result in delay), the second
> defines the set that the system can usefully pre-fetch during any spare
> system cycles while the author is reading the file. The second set of
> documents may not have links pointed at specifically from the hub document:
> it could just be a set of files that are referenced in bibliographic
> entries. In this case you don't need to write on the second set. (There will
> of course, be cases where you do, so we need some mechanism whereby we
> have dynamic linking to documents once they have been retrieved, as a
> background operation.)

I like this requirement because it's real.  I hesitate to recommend
doing something about it in HyTime because it's in the gray zone of
almost application-specific-ness.  If making this distinction is
needed only because some links in some applications indicate
transclusions, I'd say we shouldn't make a change in HyTime to cover
it.  If, however, (as I believe), some associated objects are critical
for proper operation of a document (such as legal notices, maybe?),
and some aren't so critical, then maybe we do need a two-tiered BOS,
one "foreground" BOS to be assembled prior to the giving of any access
whatsoever, and the other "background" BOS to be assembled in
background time.  Is that right?  Would this need be answered by yet
another HyTime common notation attribute on external entity
declarations:

foregnd (foregnd|backgnd) backgnd

?  

I think that would give an author everything necessary without
requiring write access to anything but the hub document.  (If I
understand the requirement, that is.)  What do you think?
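
As a rough illustration of the two-tiered idea, the Python sketch
below splits a hub document's declared entities into a foreground list
and a background list according to the proposed attribute, with
backgnd as the default when the attribute is omitted.  The function
name split_bos and the entity names are hypothetical.

```python
def split_bos(entities):
    """Sketch: partition declared entities into two BOS tiers.

    entities -- dict: entity name -> "foregnd", "backgnd", or None
                (None models a declaration that omits the attribute)
    Returns (foreground, background) lists in declaration order.
    """
    foreground, background = [], []
    for name, value in entities.items():
        tier = value or "backgnd"    # default value in the proposed declaration
        if tier == "foregnd":
            foreground.append(name)  # must be assembled before any access
        else:
            background.append(name)  # may be fetched in background time
    return foreground, background


# Hypothetical entity declarations, for illustration only.
declared = {"legal-notice": "foregnd", "bibliography": None, "ch1": "backgnd"}
print(split_bos(declared))
```

Only the hub document's own entity declarations are consulted, so no
write access to the other documents is assumed.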

--Steve

             Steven R. Newcomb   President
         voice +1 716 271 0796   TechnoTeacher, Inc.
           fax +1 716 271 0129   (courier: 23-2 Clover Park,
      Internet: srn@techno.com    Rochester NY 14618)
           FTP: ftp.techno.com   P.O. Box 23795
    WWW: http://www.techno.com   Rochester, NY 14692-3795 USA
Received on Thursday, 2 January 1997 15:45:25 UTC