Re: unmarked linkend awareness by XML engines from Steven R. Newcomb on 1996-12-30 (w3c-sgml-wg@w3.org from December 1996)

From: Steven R. Newcomb <srn@techno.com>
Date: Mon, 30 Dec 1996 17:18:43 -0500
To: w3c-sgml-wg@www10.w3.org
Message-Id: <199612302218.RAA10551@bruno.techno.com>
> The word "responsible" is one that gives me great problems.
> If I create a page that has a link to some text on a page you have created,
> in what way am I "responsible" for the contents of this page. Obviously I
> have no copyright to it. I have no right to add any contents to it. I have
> no right to add any ids to it to identify the text that I was referring to.
> I have no mechanism for ensuring you do not a) remove the text I was
> referring to, b) remove links in the text that I was referring to, c) add
> links to the text I was referring to, d) add or remove any other part of the
> document.

Of course you are right in everything you say.  You're arguing with a
point I don't believe and never intended to make.  I meant to say that
a HyTime *application* is responsible for keeping track of the
linkends of all the links in the BOS.  "Responsible" means only that
if the user stumbles across one of those linkends, the application
must be able to inform the user that it is indeed a linkend, perhaps
by rendering it as a hotspot.  It must also be able to provide
traversal services to the other linkends.  No more and no less.  This
has nothing to do with copyright or other any other rights issues.  It
has only to do with what a HyTime application is expected to know
about the current BOS, full stop.

> >  It [the BOS concept] gives
> >an application much more power to make information useful and
> >accessible, by limiting the amount of information over which such
> >power must be exercised.
> 
> Not unless you have proper versioning control and fully record the state of
> the referenced file at the time you created the links. In practice what you
> suggest however does not "limit the amount of information over which power
> is exercised" but unnecessarily extends it to a level where it is unmanageable.

For the life of me, Martin, I can't see why you say this.  It's quite
manageable.  I can only conclude that you don't understand what I'm
trying to say, and/or that we have disturbingly different
understandings about what a HyTime BOS is.

> Not according to the model you give below - there is no guidance, simply a
> mass copying of pointers found in related documents.

What "mass copying of pointers"?  I have no idea what you are talking
about.  There is nothing in my mind about copying anything at all.
Please explain what you are thinking.

> We cannot expect link management to be
> done by either an authoring system or an XML browser (which I presume is
> what David is referencing when he talks about a user's application here).
> Link management has to be a separate function, run at regular intervals
> between link creation and link navigation.

What do you mean by "link management"?  If it's link maintenance and
periodic testing, then it's not what I have been talking about
(although it's a good idea for anyone who owns links to dynamic
documents).

> >According to HyTime, the BOS *suggestion* presumably takes effect when
> >the user begins a session by entering a particular document.  Nothing
> >requires that the user accept the document author's suggestion.
> 
> How an earth can a browser or its user determine which of the "BOS
> suggestions" are relevant?

If you're only in one document at a time, there's only one suggestion
in effect at a time.  If you accept the document's suggested BOS, then
you are trusting the author of that document to take you where you
want to go.  There's no "which".  There's only one "whether" per
document, and therefore only one "whether" at a time.  What's the
problem?

> >Indeed, if the user insists on making the whole Web the BOS of his
> >session, he can sit there and wait until his beard hits the floor.
> 
> How will he ever know that he was insisting on this? Surely he is not going
> to be given a browser that has functions that say "follow all links in all
> documents linked to this one"? Even crawlers run on supercomputers are not
> stupid enough to adopt this approach: I don't see anyone realistically
> suggesting that this approach would work on a 286!

We can only assume that people who insist on things know what they're
insisting on and are prepared to live with the consequences.  On the
other hand, if an author of a document provides a suggested BOS of
such immensity that it causes users' computers to stop dead for hours
at a time, then that's a pretty bad document and a pretty
irresponsible author, or it's a document intended for the use of a
pretty specialized audience.  Ordinary users would do well not to take
that author's suggestions seriously.

> In Web terms what the "UWOS" [Unbounded Web Object Set] needs to state is
> the set of entities for which access _must_ be provided by the application
> to make sense of the "document" being accessed by the browser via that
> start-page. 

Are you proposing a new term, "UWOS"?  What do you mean by this?
In what sense is it "unbounded" if only certain entities need to
be accessed?

> The concept to "arbitrary limits on how deeply these entities
> can recursively declare other entities" is both irrelevant and highly
> dangerous on the web.

Why is it irrelevant, and why is it dangerous to limit how deeply
entities can recursively declare other entities?  On the contrary, I
think it would be highly dangerous *not* to limit such recursion.

> >If, in the course of traversing the links in the session-start
> >document, the user encounters another document, the user certainly has
> >the option of accepting *that* document's suggestion as to what the
> >BOS should be (making that document effectively the session-start
> >document), or, of adding that document's BOS to his existing,
> >session-dependent BOS, or of simply ignoring the suggestion and
> >keeping the present BOS, or of doing something else entirely.
> 
> This will not work, for a number of reasons. 
> 
> Firstly look at the situation where I make a reference to just part of one
> of your documents. The links in the part of the document that I refer to
> _may_ be relevant to the BOS, but those that occur in the rest of your
> document are unlikely to be relevant.

So what?  If they do not have any linkends in the BOS, then they
have no effect.

> (otherwise I would have referred to that part of the document as
> well).

There is no reason to make any such assumption.  Maybe the author of
the annotation document got tired and went to bed.

> How can I determine which of the entries
> in the referenced document's set would be relevant? If I copy them all then
> my object set is no longer "bounded" - it is distinctly unbounded as far as
> my document is concerned!

There is no requirement that you add any traversed-to document's
suggested BOS to your BOS.  You insist that I am saying this, but I
have repeatedly and very explicitly said exactly the opposite.  Yes,
the traversed-to document itself is necessarily in the BOS.  BUT THE
TRAVERSED-TO DOCUMENT'S SUGGESTED BOS IS NOT ADDED TO THE BOS UNLESS
THE USER DECIDES TO DO THAT!  The user is under no compulsion to do
any such thing, and neither is the application.

One way to keep down the number of documents in the BOS is to add them
only when the user requests them, perhaps by attempting to traverse a
link to a document which is not currently in the BOS.  The fact that
the document is not currently in the BOS only means that the
application is not currently responsible for the links contained in
that document.  Once you do enter that document, however, the
application becomes responsible for the links in that document that
have linkends in any documents that are already in your BOS, as well
as any linkends in the new document of links that are already in your
BOS.

> Now look what happens when you edit the referenced document, and
> delete or change some of the links in it. If those links are in the
> area I have referenced then the copy of the entity list that I took
> when authoring is now out of date. How do I determine this? If they
> are outside the area that I am referring to I don't care about the
> changes, even if this means that the "complete copy" of the entity
> set I took earlier is now out of date. But how can I tell that the
> change in the entity set is one that does not affect my UWOS?

What is all this about copying?  What's being copied, where is it
being copied to, and why?  I didn't say anything about copying.  If
you are discussing the creation of documents that record the BOSs that
are accumulated in sessions, that's an interesting idea, but I have
never suggested it.

> (What I need is to be able to reference the BOS of the referenced
> document independently of the document itself in such a way that I
> can identify when it has been changed so that I can then revalidate
> my own document's references.)

The BOS is forgotten as soon as the session is over.  It doesn't exist
anywhere but transiently and very dynamically in the memory of a
computer running a HyTime application.  Are you worrying about the
fact that a BOS's physical addresses can go stale during a session?
If so, I submit that we don't need to worry about that now.

Your remark seems to indicate that you think I am suggesting that the
each session-start document contain a copy of all of the suggested
BOSs of all the entities in its own suggested BOS.  I am not
suggesting that, and I have never intended to suggest that.

> >The existence of a BOS suggestion *enables* ilinks by making them
> >practical and scalable, which in turn *enable* the creation of
> >annotations of read-only materials which can be seen in the context of
> >the read-only document.

> If I reference your read-only material by an existing link, how can
> I determine whether there are already sets of annotations associated
> with this document?

The answer to this question is obvious: you can't.  Without recourse
to something else -- such as a publisher's catalog, or the online
equivalent of _Books In Print_, you can't know whether, for example,
someone has published a commentary on Winston Churchill's _The
Gathering Storm_.  You certainly can't tell from any copy of _The
Gathering Storm_, even those printed after the publication of the
commentary.  Despite the fact that the commentator would love to make
an advertisement for his commentary appear right inside all copies of
_The Gathering Storm_, fortunately or unfortunately, he doesn't have
write access to that document, so he can't.  I can't believe you would
suggest that everyone have write access to everything, so I'm now
wondering if you're suggesting that XML should undertake some sort of
cataloging task to produce something like _Books In Print_?  If so, I
think this idea is impractical for the XML project and unnecessary in
any event.  People are already doing that and making money at it.

I'm suggesting something quite practical and valuable, and, I would
argue, badly needed.  Please see my last note giving an example of how
Martin's annotations of Len's old document, now owned by Steven, are
used with that document by David under one scenario and by Eliot under
another.

> The only way I can know that a set of annotations exists for a
> document at present is to access the annotations separately
> first. What I really need to be able to is to have the document
> identify for me which sets of annotations point to it. In other
> words the document needs to "keep a record" of all files which
> reference it in their UWOS. All a BOS associated with a set of
> annotations could possibly tell you is the set of files the
> annotations refer to.

I take it all back.  Incredibly, you *are* suggesting that everybody
have write access to everything.  To put in bluntly, this proposal
will be as popular with publishers and information owners as a turd in
a punch bowl.

> I think David is pointing more to my idea of "the files that you
> need access to to make sense of this document", rather than a
> bounded object set.  What I think is more relevant is to use the
> History of links visited by the author of the annotations to create
> a list of the files that must be made available to users, rather
> that just saying that any file linked to any file that is referenced
> should be recorded as part of the BOS.

I have not proposed that "any file linked to any file that is
referenced should be recorded as part of the BOS."  I have repeatedly
said something quite different and much more subtle.  Hello?

Whether a document may be created by recording the author's travels in
browsing sessions is irrelevant, as is the broader question of how
documents are created in general.

> >Let me again rephrase the question I want answered:
> >
> >"Do we want XML to be able to make hot links available in documents
> >when those documents do not themselves contain those hot links?"
> 
> Yes - but this is totally different from the concept of a BOS. This is
> inheritence of anchors dependent on the route by which you get to the
> document. This must be a good thing.

Martin, please explain to me what you think a HyTime BOS is, so we can
determine why we evidently have such different ideas about it.  But
please do it offline -- not in this list.  Most of these people don't
care about what HyTime says anyway.  I think, in fact, we agree about
the ideas we're talking about (with the exception of providing write
access to everything by everybody), but we incomprehensibly disagree
about what HyTime BOSs are.

> >It's do-able if we have, as Dave puts it, "additional notations to
> >express that some document _may_ be processed when a document is
> >processed."  Or, as HyTime puts it, the bounded object set (BOS)
> >concept.
> 
> No - Dave has it correct - the HyTime BOS is not bounded enough!

  ... and please explain why you think the HyTime BOS is not bounded
enough.  This remark absolutely baffles me.


--Steve

             Steven R. Newcomb   President
         voice +1 716 271 0796   TechnoTeacher, Inc.
           fax +1 716 271 0129   (courier: 23-2 Clover Park,
      Internet: srn@techno.com    Rochester NY 14618)
           FTP: ftp.techno.com   P.O. Box 23795
    WWW: http://www.techno.com   Rochester, NY 14692-3795 USA
Received on Monday, 30 December 1996 21:06:22 UTC