Re: unmarked linkend awareness by XML engines

At 16:14 29/12/96 -0500, Steven R. Newcomb wrote:
>
>I agree with Martin Bryan that SDIF is irrelevant to this discussion.
>The BOS concept, however, is very relevant.  It is a way of limiting
>the number of links for which an application is responsible.

The word "responsible" is one that gives me great problems.
If I create a page that has a link to some text on a page you have created,
in what way am I "responsible" for the contents of this page. Obviously I
have no copyright to it. I have no right to add any contents to it. I have
no right to add any ids to it to identify the text that I was referring to.
I have no mechanism for ensuring you do not a) remove the text I was
referring to, b) remove links in the text that I was referring to, c) add
links to the text I was referring to, d) add or remove any other part of the
document.

>  It [the BOS concept] gives
>an application much more power to make information useful and
>accessible, by limiting the amount of information over which such
>power must be exercised.

Not unless you have proper versioning control and fully record the state of
the referenced file at the time you created the links. In practice what you
suggest however does not "limit the amount of information over which power
is exercised" but unnecessarily extends it to a level where it is unmanageable.

>(Dave Durand:)
>
>> No, we simply have to face the fact that end users are the only ones
>> who can decide what documents need to be in their processing set.
>
>Right.  But that doesn't mean users won't find it valuable to have an
>author's *suggested* BOS on hand.  On the contrary, most serious users
>of documents are delighted to have the document's author provide all
>possible help.

Only so long as that help remains up-to-date. On the web that typically
means for a few weeks at the most.

>> Other links to that document will only be found if the user has
>> explicitly or implicitly (via some strategy) specified what other
>> documents may contain ilinks of interest (for example by opening a
>> bookmark set of ilinks, or a guided tour set of ilinks, or an
>> annotation set of links, etc.).
>
>But, then, we agree!  This is precisely what a BOS does; it's a
>guided tour.

Not according to the model you give below - there is no guidance, simply a
mass copying of pointers found in related documents.

>> We can't check the whole world, and we can't just leave it to the
>> author (without damaging the ability to create external
>> annotations), so we have to leave it to the user (via their
>> application).
>
>I strongly agree with you, but you evidently think I disagree.  I
>think maybe you're making a false assumption about what I'm proposing.

I strongly disagree with both of you. We cannot expect link management to be
done by either an authoring system or an XML browser (which I presume is
what David is referencing when he talks about a user's application here).
Link management has to be a separate function, run at regular intervals
between link creation and link navigation.

>According to HyTime, the BOS *suggestion* presumably takes effect when
>the user begins a session by entering a particular document.  Nothing
>requires that the user accept the document author's suggestion.

How an earth can a browser or its user determine which of the "BOS
suggestions" are relevant?

>Indeed, if the user insists on making the whole Web the BOS of his
>session, he can sit there and wait until his beard hits the floor.

How will he ever know that he was insisting on this? Surely he is not going
to be given a browser that has functions that say "follow all links in all
documents linked to this one"? Even crawlers run on supercomputers are not
stupid enough to adopt this approach: I don't see anyone realistically
suggesting that this approach would work on a 286!

>Far be it from HyTime to stop him from engaging in such folly,
>however.  That session-start document contains the suggested BOS
>expressed as a set of <!ENTITY declarations with arbitrary limits on
>how deeply those entities can recursively declare other entities which
>will become part of the BOS.  The resulting pruned entity tree
>contains all the links for which the HyTime application is recommended
>to be responsible by the author of the session-start document.

In Web terms what the "UWOS" [Unbounded Web Object Set] needs to state is
the set of entities for which access _must_ be provided by the application
to make sense of the "document" being accessed by the browser via that
start-page. The concept to "arbitrary limits on how deeply these entities
can recursively declare other entities" is both irrelevant and highly
dangerous on the web.

I agree a UWOS has benefits - I disagree that a UWOS is a bounded object set.

>If, in the course of traversing the links in the session-start
>document, the user encounters another document, the user certainly has
>the option of accepting *that* document's suggestion as to what the
>BOS should be (making that document effectively the session-start
>document), or, of adding that document's BOS to his existing,
>session-dependent BOS, or of simply ignoring the suggestion and
>keeping the present BOS, or of doing something else entirely.

This will not work, for a number of reasons. 

Firstly look at the situation where I make a reference to just part of one
of your documents. The links in the part of the document that I refer to
_may_ be relevant to the BOS, but those that occur in the rest of your
document are unlikely to be relevant (otherwise I would have referred to
that part of the document as well). How can I determine which of the entries
in the referenced document's set would be relevant? If I copy them all then
my object set is no longer "bounded" - it is distinctly unbounded as far as
my document is concerned!

Now look what happens when you edit the referenced document, and delete or
change some of the links in it. If those links are in the area I have
referenced then the copy of the entity list that I took when authoring is
now out of date. How do I determine this? If they are outside the area that
I am referring to I don't care about the changes, even if this means that
the "complete copy" of the entity set I took earlier is now out of date. But
how can I tell that the change in the entity set is one that does not affect
my UWOS?

(What I need is to be able to reference the BOS of the referenced document
independently of the document itself in such a way that I can identify when
it has been changed so that I can then revalidate my own document's references.)

>The existence of a BOS suggestion *enables* ilinks by making them
>practical and scalable, which in turn *enable* the creation of
>annotations of read-only materials which can be seen in the context of
>the read-only document.

A BOS dos not "*enable* the creation of annotations of read-only materials"
because it fails completely to do the main thing that is required. If I
reference your read-only material by an existing link, how can I determine
whether there are already sets of annotations associated with this document?
The only way I can know that a set of annotations exists for a document at
present is to access the annotations separately first. What I really need to
be able to is to have the document identify for me which sets of annotations
point to it. In other words the document needs to "keep a record" of all
files which reference it in their UWOS. All a BOS associated with a set of
annotations could possibly tell you is the set of files the annotations
refer to.

> If the user begins his session with the
>annotation document and traverses to the annotated document, because
>the annotation document is necessarily in the user's BOS, all the
>annotations in the annotation document can then be made available by
>the application as hot links from within the annotated (read-only)
>document.  (I don't need to point out that you can't do this with
>HTML, or that that is one hell of a useful capability.)

While such link inheritance based on traversal routes is undoubtedly useful
it fails to meet one set of needs. What I, as a writer, want to know is not
what links I have added to the referenced document, but what other people
have thought to have been of relevance. For a researcher, rather than a
document deliverer, this is the important aspect of the web. (We should not
forget that much of the use of the Web is by researchers and should try to
meet their needs as well where possible.)

(Before you start responding to this paragraph, I know that I am treading on
dangerous grounds re the right to view/manage links - what I am pointing to
is that writers/reasearchers have different uses for links than those found
in controlled document delivery environments, and that XML should try to
allow for these as well.)

>> We may well want to consider adding additional notations to express
>> that some document _may_ be processed when a document is
>> processed. Sort of a "This document best when parsed along with ..."
>> notation.
>
>How is this different from HyTime BOS notation?  I don't see any
>difference, Dave.  I think we're actually on the same side of this
>question.

I think David is pointing more to my idea of "the files that you need access
to to make sense of this document", rather than a bounded object set. What I
think is  is more relevant is to use the History of links visited by the
author of the annotations o create a list of the files that must be made
available to users, rather that just saying that any file linked to any file
that is referenced should be recorded as part of the BOS.

>Let me again rephrase the question I want answered:
>
>"Do we want XML to be able to make hot links available in documents
>when those documents do not themselves contain those hot links?"

Yes - but this is totally different from the concept of a BOS. This is
inheritence of anchors dependent on the route by which you get to the
document. This must be a good thing.

>It's do-able if we have, as Dave puts it, "additional notations to
>express that some document _may_ be processed when a document is
>processed."  Or, as HyTime puts it, the bounded object set (BOS)
>concept.

No - Dave has it correct - the HyTime BOS is not bounded enough!
----
Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK 
Phone/Fax: +44 1452 714029   WWW home page: http://www.u-net.com/~sgml/

Received on Monday, 30 December 1996 04:37:57 UTC