Re: [dgd@cs.bu.edu: BOS confusion (analysis; suggestion to resolve Newcomb/Bryan conflict)] from Steven R. Newcomb on 1997-01-02 (w3c-sgml-wg@w3.org from January 1997)

From: Steven R. Newcomb <srn@techno.com>
Date: Thu, 2 Jan 1997 14:46:33 -0500
To: w3c-sgml-wg@www10.w3.org
Message-Id: <199701021946.OAA00705@bruno.techno.com>

> I changed the terminology, not for its own sake, but because I am not
> convinced that a HyTime BOS meets the explicitness requirements I think we
> require. I could also argue that "working set" is a commonly understood
> term in storage management, and in this case the analogy reveals the
> intention of the term better. I have another reason as well, which we will
> get to later on.
> 
> >>    My contention is that the notion of (even restricted) recursive document
> >> requirement could be harmful.
> >
> >How so, if it's restricted?
> 
> Because I think that any restriction that's more general than the explicit
> selection of individual files for inclusion is leaving too many files open
> for inclusion.

And I think you're asking everyone to accept the idea that they can
never delegate responsibility for keeping track of the changing
physical addresses of any linked-to or linked-from documents.  To me
your position seems untenable, simply as a policy matter.  It's
already too expensive to maintain hyperlinks, and you seem to want to
make it even more expensive.  And I still fail to see any significant
increase in the risk of the BOS getting too large.  You can't control
the sheer size of linked-to or linked-from documents, or the number of
links they contain, even if you list each one separately.  So the
control-of-overhead horse has already left the barn once you
allow any document that you don't control to participate in your BOS.

> There's a big difference between what I want to reference,
> and what I want to tell a browser to download -- they may sometimes be the
> same, but they are at least as frequently different.

All too true.  I think the HyTime BOS management facilities need some
judicious tweaking.  See my note today in this list on that subject.
I'd be interested to know if the two-tiered BOS idea obviates
your concerns.

> HyTime BOS is defined in terms of entity references, and not all XML links
> will be via an external entity. At least, I take it as axiomatic that we
> are dead in the water if we don't _at least allow_ the creation of links
> that contain just a URL in an attribute value.

True enough, but URLs that are not backed up by entity declarations
will still work just fine in the same way that they already do in HTML
-- in one direction only.  (However, if you want applications to be
aware of the fact that an object is the anchor of some particular
external link, you do have to declare where the application should
look for that external link.)  In HyTime, it's perfectly OK for a link
to refer to an anchor that is outside the BOS, and it's perfectly OK
for an application to support traversal to such an anchor.  The act of
traversing such a link means bringing the document or object in which
the anchor exists into the effective BOS on an ad hoc basis, that's
all.
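
To make the ad hoc case concrete, here is a minimal sketch in Python
of how an application might grow its effective BOS when a link is
traversed.  The function name, the set-of-addresses representation,
and the example.org URLs are all assumptions made for illustration;
nothing here is taken from HyTime or from any XML draft.

```python
from urllib.parse import urldefrag

def traverse(target_url, effective_bos):
    """Follow a link; pull the target's document into the effective BOS.

    Returns True if the document was outside the BOS before traversal.
    """
    # Strip the fragment: the anchor lives inside the document,
    # and it is the document that joins the BOS.
    document = urldefrag(target_url).url
    was_outside = document not in effective_bos
    effective_bos.add(document)
    return was_outside

# Hypothetical session: the BOS starts with one document.
effective_bos = {"http://example.org/hub.xml"}
first = traverse("http://example.org/other.xml#sec2", effective_bos)
second = traverse("http://example.org/other.xml#sec3", effective_bos)
```

The second traversal adds nothing, because the document containing
the anchor is already in the effective BOS.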

>    So some references will not be part of the BOS. I am also one of the
> people who argued vociferously (and successfully) that an XML parser _must
> not_ be required to follow all entity references. This also makes the BOS
> potentially problematic, even for the entities making up a single document.

 ...but not if you can declare entities without having them
automatically wind up in the BOS, right?  Again, does the two-tiered
BOS idea answer your concern?

> As noted above, and for the same reasons we have been re-arguing here for
> links, XML processors are _not_ required to pull in a whole entity tree at
> once.

 ...and that's absolutely reasonable.  No argument.

> Well, my problem is that I think that the term that we need does _not_
> match BOS in every particular, but only in some particulars. So I can gain
> both perspicuity (my opinion) and accuracy.

Well, David, as you can see, I'm trying to get the best match I can.
For whatever it's worth, I'm willing to propose changes to HyTime's
BOS notion that will make it work for all of us.  The whole question
of what exactly constitutes the subject of hyperdocument processing is
so terrifyingly fundamental that neither XML nor HyTime can afford to
get it wrong.

> > If at the end of the ramp there's a huge step up
> >to HyTime, or if the ramp doesn't even take you anywhere near HyTime, we
> >have blown a unique opportunity to interconnect the world's knowledge
> >bases.
> 
>    There is a working example of interconnecting the world's knowledge
> bases. It is the WWW. This is in some ways sad, but undeniably true.

I can't believe that you mean to imply that the game is already lost
and there's nothing to be done.  If you thought that, you wouldn't
be reading my tedious letters to this list.

>    I don't have an explicit goal either way with respect to HyTime. The
> brief of the group does _not_ include HyTime compatibility. Personally, for
> functionality, I am applying a standard to HyTime that I describe as "no
> gratuitous incompatibility". I see no reason, technical or official, why we
> should stick to HyTime if we think that HyTime got something wrong.

Amen.  But you'll forgive me if I keep working to make HyTime right, I
hope.  (And since that's the primary focus and goal of my working
life, I'd have to be crazy to ignore what the most amazing aggregation
of hardworking smart people I've ever been privileged to observe has
to say on the matter.)

> On the
> other hand, there is a large amount of good work on the relationship of
> hypertext to markup in HyTime. So I think the semantics are mostly right. I
> think we can manage to be mostly compatible with the standard as well
> (modulo a few PIs a user can add if they need them). I only say mostly in
> case there are places (I feel anchor roles are one) where HyTime may impose
> a restriction that we can avoid.

That happens to be an issue that I have been passionately on both
sides of.  (As most HyTime committee people can attest.)  I think I'm
content to allow XML and HyTime to go their separate ways on the
question of whether anchor roles should be #FIXED (or effectively
#FIXED).  XML intentionally does *not* require that the information it
expresses conform to a particular model (a DTD).  SGML, on the other
hand, very solidly (and with extremely good reasons) insists that
everything that is expressed conforms with an a priori model (DTD).
When you have a conforming SGML document, you know it meets the terms
of an explicit "contract" between document authors and application
software developers.  It means something, therefore, to say "SGML".
It's the same situation with the question of fixing anchroles.  In the
case of XML, evidently there is a requirement that there be no
consistency in terms of the semantic roles played by the linkends.  In
the case of HyTime, however, I want to be able to say that there is a
similar contract in effect between applications developers and
hyperdocument authors: that it means something to say "HyTime".  This
does not mean that I am any less interested in XML; on the contrary,
the on-ramp may not be the same as the highway, but the highway is
pointless unless people can get access to it.  My hope is that XML
will help make HyTime accessible.  (I'm already satisfied that XML is
helping HyTime immeasurably in other ways.)

> I think we will only need 1 of these senses (Application BOS). This
> reflects the reality that an application can do what it wants. 

Ah, but it can't.  What if somebody's server is down, and the BOS that
the application would like to assemble can't be fully assembled?  You
need all three concepts: the one the author specified, the one the
application wanted to assemble, and the one the application actually
did assemble for use in this session.
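
A rough sketch of the three sets, in Python, may make the distinction
easier to see.  Everything named here (assemble_bos, fake_fetch, the
file names) is hypothetical and only illustrates the idea; it is not
a proposal for how any application should actually be built.

```python
def assemble_bos(wanted, fetch):
    """Try to fetch every wanted document; return the set actually obtained."""
    assembled = set()
    for address in sorted(wanted):
        try:
            fetch(address)
        except OSError:
            # Server down or unreachable: the document stays out of
            # the BOS for this session.
            continue
        assembled.add(address)
    return assembled

# 1. What the author specified.
author_bos = {"hub.sgm", "a.sgm", "b.sgm"}
# 2. What the application wants to assemble (it may add its own).
wanted_bos = author_bos | {"notes.sgm"}

def fake_fetch(address):
    # Stand-in for a network fetch; pretend one server is down.
    if address == "b.sgm":
        raise ConnectionError("server down")
    return b""

# 3. What the application actually assembled for this session.
actual_bos = assemble_bos(wanted_bos, fake_fetch)
missing = wanted_bos - actual_bos
```

Keeping the three sets separate lets the application report exactly
which author-specified documents are absent from the session.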

> We also need
> a method (to be determined, though I suggested one route) for an author to
> suggest a minimal set of documents that should be in the (application BOS,
> Working Document Set) for a document to be at its best.

Absolutely.

>    So I don't want to just say BOS, since that is confusing to those who
> know that there are other senses, and I don't want to say "application BOS"
> because then we will be using a modifier ("application") on every
> occurrence of a relatively recondite term none of whose other senses need
> come into our standard.

OK, David.  I'm certainly not going to make trouble over what to name
the baby.  I just wanted to make everyone aware that there are lots of
things to consider when doing the naming.  I didn't mean to imply that
you are egomaniacally wedded to the terms you yourself invent.  Far
from it.  I hate that kind of thing and it offends me when I'm accused
of it, which is all too often.  If I thought you had that attitude,
there would be no doubt in your mind that that was how I felt.
Precision is everything.  As long as it's got an agreed-upon
definition, we can call it "baldoolifling" for all I care.

>     You could change it in the "hub" or in the local companion declaration.
> But you have to change it _somewhere_, and only in _one_ place. 

Does the "subhub" concept I proposed in my other note today answer
this concern?  It doesn't demand that there be exactly one subhub,
but I think it would make it possible to do what you want to do,
without constraining everyone to use exactly one subhub.

>     On my proposal, the companion of a companion is a companion (we take
> the transitive closure of the companion relation). So we get the same
> ability to express things that we have following entity tree. What we have
> gained is decoupling of "companionship declaration" from entity
> declaration. I can imagine authors who want to do a lot of linking, but not
> use entity declarations -- they would be accommodated by this. I can also
> imagine that many authors will have links that should _not_ indicate
> companions.

I detect a possible difference here between what I've proposed in my
other note today (subhubs) and what you seem to want.  The difference
I detect is that you want to declare the physical addresses of
companion documents in some fashion other than SGML entity
declarations.  Perhaps by using UR{L|N}s.  Is that right?
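
For readers who want the "companion of a companion" rule spelled
out, here is a minimal sketch in Python of taking the transitive
closure of a companion relation.  The dictionary representation and
the document names are assumptions for illustration only; how
companions would actually be declared is exactly the open question.

```python
from collections import deque

def companion_closure(start, companions):
    """Return every document reachable from start via companion declarations.

    companions maps a document address to the addresses it declares
    as companions.  Cycles are tolerated.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for other in companions.get(doc, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    seen.discard(start)  # a document is not its own companion here
    return seen

# Hypothetical declarations: a declares b, b declares c and (cyclically) a.
declared = {"a.xml": ["b.xml"], "b.xml": ["c.xml", "a.xml"], "c.xml": []}
closure = companion_closure("a.xml", declared)
```

Note that only declared companions are followed, so ordinary links
that should not indicate companions never enter the closure.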

Best regards,

--Steve

             Steven R. Newcomb   President
         voice +1 716 271 0796   TechnoTeacher, Inc.
           fax +1 716 271 0129   (courier: 23-2 Clover Park,
      Internet: srn@techno.com    Rochester NY 14618)
           FTP: ftp.techno.com   P.O. Box 23795
    WWW: http://www.techno.com   Rochester, NY 14692-3795 USA
Received on Thursday, 2 January 1997 15:45:17 UTC