Re: [dgd@cs.bu.edu: BOS confusion (analysis; suggestion to resolve Newcomb/Bryan conflict)] from David G. Durand on 1997-01-02 (w3c-sgml-wg@w3.org from January 1997)

From: David G. Durand <dgd@cs.bu.edu>
Date: Thu, 2 Jan 1997 17:39:44 -0500
To: w3c-sgml-wg@www10.w3.org
Message-Id: <v02130500aef1dd3f6b44@[165.90.139.123]>

At 2:46 PM 1/2/97, Steven R. Newcomb wrote (quoting me occasionally):
>> >>    My contention is that the notion of (even restricted) recursive
>> >> document requirement could be harmful.
>> >
>> >How so, if it's restricted?
>>
>> Because I think that any restriction that's more general than the explicit
>> selection of individual files for inclusion is leaving too many files open
>> for inclusion.
>
>And I think you're asking everyone to accept the idea that they can
>never delegate responsibility for keeping track of the changing
>physical addresses of any linked-to or linked-from documents.

I am baffled. I have shown how this can be delegated to an arbitrary
document, or not delegated, depending on the author's desires. I just don't
think you understood the proposal in my last posting.

 One last time: Each document should be able to have a list of documents
that clients _should_ process along with it (separate from link
declarations, entity references or entity declarations). If the documents
referenced in this list make a similar requirement of other documents, then
the set required expands. _Only_ those documents referenced in this way are
suggested for such simultaneous processing, regardless of entity
declarations or link occurrences.

This is recursive inclusion, but along a brand new, author controlled,
independent set of references. No-one who is not using ilinks will ever
need to use this, as far as I can tell, but it may provide helpful hints as
to when _not_ to apply "lazy fetch" policies.
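
The expansion rule above can be sketched as a transitive closure taken over
companion declarations only. This is a minimal illustration, not part of the
proposal: the Doc record, the in-memory docs table, and the file names are
all hypothetical, and a real client would fetch and parse each document
instead of looking it up in a dictionary.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Doc:
    # Explicit companion list: the only references that are followed.
    companions: list = field(default_factory=list)
    # Ordinary links: deliberately ignored when building the working set.
    links: list = field(default_factory=list)


def working_set(start, docs):
    """Return the start document plus the transitive closure of its companions."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        doc = docs.get(url)
        if doc is None:  # unknown document: nothing more to expand
            continue
        for companion in doc.companions:
            if companion not in seen:  # also guards against cycles
                seen.add(companion)
                queue.append(companion)
    return seen


# Hypothetical documents: a.xml links to a large archive but does not
# name it as a companion, so the archive stays out of the working set.
docs = {
    "a.xml": Doc(companions=["ilinks.xml"], links=["big-archive.xml"]),
    "ilinks.xml": Doc(companions=["roles.xml", "a.xml"]),
    "roles.xml": Doc(),
    "big-archive.xml": Doc(companions=["huge.xml"]),
}

result = working_set("a.xml", docs)
```

Here the linked-to archive and its own companions are never visited, and the
cycle between a.xml and ilinks.xml terminates because each document is added
at most once.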

> To me
>your position seems untenable, simply as a policy matter.

I think it would be untenable if that were my position. I want to increase
the precision of the author, not decrease it.

> It's
>already too expensive to maintain hyperlinks, and you seem to want to
>make it even more expensive.  And I still fail to see any significant
>increase in the risk of the BOS getting too large.  You can't control
>the sheer size of linked-to or linked-from documents, or the number of
>links they contain, even if you list each one separately.

Yes, but the linked-to documents are irrelevant to my proposal. Only
explicit "companion documents" would be included. The notion of
pre-fetching along links is what I don't like. It is both incomplete (in
the case that I want to include a set of ilinks stored externally without
creating an explicit navigational link between them), and over-zealous (I
have to do work to exclude documents that I link, so that they don't end up
in the BOS).

>  So, the
>control-of-overhead horse has already left the barn as soon as you
>allow any document that you don't control to participate in your BOS.

But if I only allow documents that I do control to be explicit companions
of my document, this is not a problem. I don't control the BOS, but merely
suggest that the document working set _should_ include certain documents
(because they make sense only in relation to one another).

>> There's a big difference between what I want to reference,
>> and what I want to tell a browser to download -- they may sometimes be the
>> same, but they are at least as frequently different.
>
>All too true.  I think the HyTime BOS management facilities need some
>judicious tweaking.  See my note today in this list on that subject.
>I'd be interested to know if the two-tiered BOS idea answers or obviates
>your concerns.

   It seems too complex to me, because it requires tagging every link with
whether or not it is a "BOS0-augmenting" link or not. I don't want to have
to control my BOS every time I make a link. And I don't want to have to
make "user-invisible" links when I want to control my BOS.

And since the notion I am defining is not the HyTime BOS (based as it is on
entity reference trees), I find a new term clearer.

>.... URLs that are not backed up by entity declarations
>will still work just fine in the same way that they already do in HTML
>-- in one direction only.  (However, if you want applications to be
>aware of the fact that an object is the anchor of some particular
>external link, you do have to declare where the application should
>look for that external link.)

Yes, I do. I want to separate that process from link creation entirely,
because I don't see that they are closely related. My other point about
URLs is that I think we are foolish if we make an important facility like
this depend on the use of entity declarations -- most current users will
find this indirection quite onerous. In fact, _I_ think that it is onerous
for some kinds of simple document reference.

> In HyTime, it's perfectly OK for a link
>to refer to an anchor that is outside the BOS, and it's perfectly OK
>for an application to support traversal to such an anchor.  The act of
>traversing such a link means bringing the document or object in which
>the anchor exists into the effective BOS on an ad hoc basis, that's
>all.

But I have to declare notation attributes to accomplish this, don't I? Do
we even have notation attributes in XML?

>
>>    So some references will not be part of the BOS. I am also one of the
>> people who argued vociferously (and successfully) that an XML parser _must
>> not_ be required to follow all entity references. This also makes the BOS
>> potentially problematic, even for the entities making up a single document.
>
> ...but not if you can declare entities without having them
>automatically wind up in the BOS, right?  Again, does the two-tiered
>BOS idea answer your concern?

I don't want most entities to end up there, and I don't see why I need to
declare an entity just to get this behaviour: All I want to say is
<companion-doc url="http://foo.com/bar.xml"/>
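
As a rough illustration of how little machinery that one element needs, a
client could collect companion URLs without touching any entity declarations.
This is only a sketch using Python's standard xml.etree.ElementTree module;
the element name and url attribute come from the line above, while the
surrounding doc, p, and a elements are made up for the example.

```python
import xml.etree.ElementTree as ET


def companion_urls(xml_text):
    """Collect the url attribute of every companion-doc element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get("url") for el in root.iter("companion-doc") if el.get("url")]


# Hypothetical document: one companion declaration and one ordinary link.
sample = """<doc>
  <companion-doc url="http://foo.com/bar.xml"/>
  <p>See <a href="http://foo.com/other.xml">another document</a>.</p>
</doc>"""

urls = companion_urls(sample)
```

The ordinary link to other.xml is not picked up, which is the point: linking
and working-set hints stay separate.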

>> Well, my problem is that I think that the term that we need does _not_
>> match BOS in every particular, but only in some particulars. So I can gain
>> both perspicuity (my opinion) and accuracy.
>
>Well, David, as you can see, I'm trying to get the best match I can.
>For whatever it's worth, I'm willing to propose changes to HyTime's
>BOS notion that will make it work for all of us.  The whole question
>of what exactly constitutes the subject of hyperdocument processing is
>so terrifyingly fundamental that neither XML nor HyTime can afford to
>get it wrong.

   If you're willing to decouple BOS from entity references, then maybe
we're on the same wavelength. Personally, I'm not convinced that there is a
unique right answer, which is why I emphasise suggesting co-processing and
not requiring it.

>> > If at the end of the ramp there's a huge step up
>> >to HyTime, or if the ramp doesn't even take you anywhere near HyTime, we
>> >have blown a unique opportunity to interconnect the world's knowledge
>> >bases.
>>
>>    There is a working example of interconnecting the world's knowledge
>> bases. It is the WWW. This is in some ways sad, but undeniably true.
>
>I can't believe that you mean to imply that the game is already lost
>and there's nothing to be done.  If you thought that, you wouldn't
>be reading my tedious letters to this list.

   I just mean what I say. We have already seen an object lesson that
getting basic functionality correct and _as easy as possible_ seems to beat
getting the details right. So I want to keep things as simple and
orthogonal as possible. And I want to preserve the kinds of simplicity that
people expect from the web -- i.e., any operation that refers to a document
should be possible by naming the operation and a URL, without any other stuff
intervening. That means _no mandatory entity declarations_ in the
definition of linking.

>>    I don't have an explicit goal either way with respect to HyTime. The
>> brief of the group does _not_ include HyTime compatibility. Personally, for
>> functionality, I am applying a standard to HyTime that I describe as "no
>> gratuitous incompatibility". I see no reason, technical or official, why we
>> should stick to HyTime if we think that HyTime got something wrong.
>
>Amen.  But you'll forgive me if I keep working to make HyTime right, I
>hope.  (And since that's the primary focus and goal of my working
>life, I'd have to be crazy to ignore what the most amazing aggregation
>of hardworking smart people I've ever been privileged to observe has
>to say on the matter.)

I have no objections to improving HyTime, as you should know. In fact,
that's one of the reasons I keep paying attention to it -- even attention
that I can't afford to spare.

>> I only say mostly in
>> case there are places (I feel anchor roles are one) where HyTime may impose
>> a restriction that we can avoid.

The rest
>That happens to be an issue that I have been passionately on both
>sides of.  (As most HyTime committee people can attest.)  I think I'm
>content to allow XML and HyTime to go their separate ways on the
>question of whether anchor roles should be #FIXED (or effectively
>#FIXED).  XML intentionally does *not* require that the information it
>expresses conform to a particular model (a DTD).  SGML, on the other
>hand, very solidly (and with extremely good reasons) insists that
>everything that is expressed conforms with an a priori model (DTD).
>When you have a conforming SGML document, you know it meets the terms
>of an explicit "contract" between document authors and application
>software developers.  It means something, therefore, to say "SGML".
>It's the same situation with the question of fixing anchroles.  In the
>case of XML, evidently there is a requirement that there be no
>consistency in terms of the semantic roles played by the linkends.  In
>the case of HyTime, however, I want to be able to say that there is a
>similar contract in effect between applications developers and
>hyperdocument authors: that it means something to say "HyTime".  This
>does not mean that I am any less interested in XML; on the contrary,
>the on-ramp may not be the same as the highway, but the highway is
>pointless unless people can get access to it.  My hope is that XML
>will help make HyTime accessible.  (I'm already satisfied that XML is
>helping HyTime immeasurably in other ways.)
>
>> I think we will only need 1 of these senses (Application BOS). This
>> reflects the reality that an application can do what it wants.
>
>Ah, but it can't.  What if somebody's server is down, and the BOS that
>the application would like to assemble can't be fully assembled?  You
>need all three concepts: the one the author specified, the one the
>application wanted to assemble, and the one the application actually
>did assemble for use in this session.

Sorry, the one the application _didn't assemble_ has no effect on any
behavior that we can specify. There are two concepts still: what the author
specified, and what the application did. What can we suggest a processor do
with the set of documents that it didn't assemble? The production of error
messages is one possibility, but the application can presumably already
detect that it failed to access a document, so that base is already
covered.
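
To make the two-concept view concrete, here is a small sketch in which the
author's suggested set and the set actually assembled are the only state, and
the documents that could not be fetched show up purely as error reports. The
assemble and fake_fetch functions and the down.xml URL are hypothetical
stand-ins, not anything defined by XML or HyTime.

```python
def assemble(specified, fetch):
    """Try to fetch each suggested document; return (assembled, failures)."""
    assembled, failures = {}, {}
    for url in sorted(specified):
        try:
            assembled[url] = fetch(url)
        except OSError as err:
            # The application already knows the access failed; all that
            # remains is to report it.
            failures[url] = str(err)
    return assembled, failures


def fake_fetch(url):
    # Stand-in for a network fetch; one hypothetical server is "down".
    if url == "http://foo.com/down.xml":
        raise OSError("server unreachable")
    return "<doc/>"


# What the author specified.
specified = {"http://foo.com/bar.xml", "http://foo.com/down.xml"}

# What the application did.
assembled, failures = assemble(specified, fake_fetch)
```

Nothing downstream consumes the failed documents as a third kind of set; the
failures mapping exists only so an error message can be produced.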

>>    So I don't want to just say BOS, since that is confusing to those who
>> know that there are other senses, and I don't want to say "application BOS"
>> because then we will be using a modifier ("application") on every
>> occurrence of a relatively recondite term none of whose other senses need
>> come into our standard.

>OK, David.  I'm certainly not going to make trouble over what to name
>the baby.  I just wanted to make everyone aware that there are lots of
>things to consider when doing the naming.  I didn't mean to imply that
>you are egomaniacally wedded to the terms you yourself invent.  Far
>from it.  I hate that kind of thing and it offends me when I'm accused
>of it, which is all too often.  If I thought you had that attitude,
>there would be no doubt in your mind that that was how I felt.
>Precision is everything.  As long as it's got an agreed-upon
>definition, we can call it "baldoolifling" for all I care.

For that matter, I could live with "Application BOS" as long as the
definition is clear. I am partly using a separate term because I want to
differentiate my companion proposal from HyTime facts, until I am sure that
they are the same thing. So far I am not convinced they are the same.

>>     You could change it in the "hub" or in the local companion declaration.
>> But you have to change it _somewhere_, and only in _one_ place.
>
>Does the "subhub" concept I proposed in my other note today answer
>this concern?  It doesn't demand that there be exactly one subhub,
>but I think it would make it possible to do what you want to do,
>without constraining everyone to use exactly one subhub.

I am proposing that every document mentioned as a companion is a "subhub"
in the sense that you proposed (I think) and that the current document
being processed is always the "hub" (but only with respect to explicitly listed
companions).  And that subhubs can recursively include their companions (if
any) as sub-subhubs.

   Rub-a-dub-dub a sub-subhub in a tub.

   I just want to keep referencing, linking, entity declaration, and working
set specification orthogonal. The examples are only intended to show that
we don't need any mechanisms beyond those I have proposed.

>>     On my proposal, the companion of a companion is a companion (we take
>> the transitive closure of the companion relation). So we get the same
>> ability to express things that we have following the entity tree. What we have
>> gained is decoupling of "companionship declaration" from entity
>> declaration. I can imagine authors who want to do a lot of linking, but not
>> use entity declarations -- they would be accommodated by this. I can also
>> imagine that many authors will have links that should _not_ indicate
>> companions.

>I detect a possible difference here between what I've proposed in my
>other note today (subhubs) and what you seem to want.  The difference
>I detect is that you want to declare the physical addresses of
>companion documents in some fashion other than SGML entity
>declarations.  Perhaps by using UR{L|N}s.  Is that right?

yes. yes. yes. Hey, we may be getting somewhere!


   -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Thursday, 2 January 1997 17:33:09 UTC