Re: anchor awareness (was Re: Richer & richer semantics?) from Steven R. Newcomb on 1996-12-22 (w3c-sgml-wg@w3.org from December 1996)

From: Steven R. Newcomb <srn@techno.com>
Date: Sun, 22 Dec 1996 14:32:35 -0500
To: w3c-sgml-wg@www10.w3.org
Message-Id: <199612221932.OAA01348@bruno.techno.com>
(Tim Bray:)

> I think that Steve was making an important point, but I think that I
> didn't really get it.  So this is a request for amplification, with some 
> questions
> 
> >The question is:
> >"Does an anchor know that it is an anchor?"
> 
> What does it mean for an anchor to know it's an anchor... and I guess,
> what exactly are you terming an anchor?  

Apologies for my lack of clarity.  Long-established habits of thought
(aka brain damage from overlong exposure to HyTime) are responsible.
I should have defined my terms, and maybe a lesson here is that we
need to agree on a set of definitions for the terms we use.  I
strongly urge the adoption of HyTime parlance simply because it's
designed for maximum generality, it gets us out of the limiting rut
that exposure to HTML has imposed on our thinking about
hyperfunctionality, and it's easy to explain.  I propose the following
definitions, all of which are derived from HyTime:

anchor        

    Any thing(s) that serve(s) as any terminating point of any
    hyperlink.  Unlike many other hypertext jargons, in HyTime, a
    thing is NOT an anchor unless it actually serves as a terminus of
    one or more extant hyperlinks.  Just because something has a
    unique identifier, or because some address somewhere points at it,
    does not make it an anchor.  Because of the universality of
    HyTime addressing, literally everything is potentially an anchor.
    I guess it's reasonable to say that things that have known
    addresses have, in some sense, more potential to be anchors than
    things that don't.  But worrying about that detracts from the
    usefulness of the term, "anchor".  Better to make a clean
    definition.  Either it's linked, and therefore it's an anchor, or
    it isn't.

link or hyperlink

    An element that expresses a relationship between n anchors.  It
    may or may not contain or include all of the information necessary
    to address the anchors, but it always contains or includes
    sufficient information to obtain the addresses of all of the
    anchors, perhaps by indirect addressing and/or notations (such as
    URL notation).  The fundamental thing about a link is NOT the
    address information, however.  It's the relationship that the link
    expresses.  The relationship expressed could be anything.
    Examples of relationships that we've been concerned about in
    recent notes include:

    * transclusion.  (e.g.: insert anchor 2 at the position of anchor 1)

    * metadata.  (e.g.: license notice and subject of license notice)

    * traversibility.  (e.g.: "see also", for some (un)specified reason)

anchor role (sometimes confusingly called "link end")

    One of the roles that anchors can play in the relationship
    expressed by the link.  An anchor role (or link end) is a feature
    of a link type (i.e., of an abstract relationship).  A link often
    confers different semantics upon each of its anchors.  Each such
    semantic is expressed in the link element declaration, as an
    anchor role name.  For instance, in an "html-a" link, the declared
    roles might be "origin" and "target."

address

    Something (set of elements, attribute values, and/or
    other information objects, including but not limited to TEI
    addresses) that expresses the location of one or more things.


> Consider the following:
> 
> Example 1: http://www.textuality.com/sgml-erb/mprdv.html
> 
> not as an example, but in and of itself, embedded in the email you
> are now reading.  

First of all, your example is an address, not a link, and, as far as
we can tell so far, not an anchor either.  The thing your example
points at is not an anchor either, because we don't know of any link
to it yet.  For discussion purposes, let me replace it with a slightly
different example:

  Example 1a: 
  <a href="http://www.textuality.com/sgml-erb/mprdv.html">...</a>

Example 1a is a link (to be precise, a relationship with the semantic
of unidirectional traversibility) which confers anchor status upon two
objects: one anchor is the element shown in Example 1a, and the other
anchor is the html file mprdv.html at Tim's web site.  (I'll answer
the rest of Tim's question in terms of Example 1a.)

> I would assume this is not an anchor in the sense that you
> mean; 1-way www semantics make it a link-end but not an anchor.

I don't think we're communicating yet, but we'll get there if we're
both patient and very careful (:^).  Might I be correct in assuming
that your definition of "link-end" maps to what I would call, "anchor
considered as the origination of a traversal"?  And that your
definition of "anchor" might map to what I would call, "anchor
considered as the target of a traversal?"  If so, I agree with your
statement.

> The person who placed mprdv.html at www.textuality.com and sent the
> URL out by email was consciously creating an anchor that in some
> sense knows it's an anchor since there is an httpd server that will
> give anyone a copy, no questions asked.

I would say no, the target anchor doesn't know it's an anchor, because
the semantics of Example 1a, as defined by HTML, do not require that
conforming applications be able to report to users who start their
browsing in mprdv.html the existence of the element 

    <a href=http://www.textuality.com/sgml-erb/mprdv.html>...</a>

in whatever document it happens to be, and to support traversability to
that element.  True bidirectionality of linking requires that applications
cause *both* anchors of a two-ended link to "know" about the other anchor.
That's what I meant by "anchor awareness": the awareness of the fact that
an anchor has the status of being an anchor by virtue of the fact that
it is linked, even when that link is in an as-yet-unbrowsed document and
has never been traversed.

> On the other hand, when 
> 
> Example 2: <A NAME="sec3.17"> 
> 
> appears in an HTML document, I assume you would call this an anchor that 
> knows it's an anchor?  It exists only to provide addressing hooks.

No.  HTML's semantics do not require that Example 2 "know" that it's
an anchor in the sense that I'm trying to explain.  Example 2 is just
a potential anchor, like everything else.  It happens to be easy to
address, because it has a unique identifier, but that doesn't
necessarily mean that any link actually addresses it.  More to the
point, HTML's semantics do not require that an HTML application be
able to traverse from the element shown in Example 2 to the other
anchor of any link that confers anchor status upon it.

In HTML applications, <a href="...">...</a> links *always* know that
they themselves are also anchors; that's defined by HTML and it's easy
to implement.  It's the other end that's hard to make aware of its
status as an anchor.

> On the other hand, (Example 3: ) with some analogue of ilink, where
> you can point into a document from outside using locaddrs or some
> such, you clearly have a case that what's being pointed-at does not
> and cannot in principle know it's an anchor... or am I missing your
> point?

I think I'm saying precisely what you seem to think I couldn't
possibly be saying!

If XML's semantics so specify, XML applications *can* be made
responsible for knowing the anchor status of pointed-at anchors.  This
is exactly the question my "anchor awareness" note asks the ERB to
decide: "Should XML's linking semantics require XML applications to be
responsible for knowing the anchor status of pointed-at anchors?"  I'm
also saying:

* there are powerful arguments on both sides of this question;

* it's not easy to implement;

* it is implementable;

* it pretty much requires implementers to move to the grove paradigm;
  and

* true bidirectionality could be a compelling reason for users to move
  en masse from HTML to XML.

> This seems waaaaaaay outside our scope.

Maybe so.  Maybe what I'm asking for is a decision on that question.
It's too important a question not to consider, and the fact that
HyTime already makes these demands on implementers, and the fact that
there are working HyTime implementations out there, and the fact that
there is now a standard cookbook object model for doing it (the HyTime
property set and the DSSSL grove paradigm), all conspire to make the
issue quite real and quite unavoidable.  This decision is VERY
important and fundamental, and it MUST be decided.

> Can we not have multiway ilink-style thingies and perhaps even a
> modest amount of locaddr and so on, without getting into major
> paradigm re-engineering?  This is the key point - I had asserted that
> an ilink thing, particularly if you restrict to addressing by entity
> & ID attribute, was a low-hanging fruit that was easy to understand,
> easy to implement, and clearly better than today's HTML.  Are you saying
> that (a) this is a can of worms,

Yes, I am indeed saying that it is a (vanquishable) Hydra.  What's the
use of ilink if none of the anchors know that they are linked?  After
all, an ilink is nothing more or less than a link whose own location
can be different from [Independent of] the locations of *all* of its
anchors.

> or that (b) it is not in fact better,

Not at all.  True anchor awareness is impressively better.  It makes
many heretofore impossible things possible and even easy.  It even
addresses Terry's license notice problem in a meaningful and workable
way.  To paraphrase Terry, doing it successfully could be a
home-run/slam-dunk for XML.

> or (c) have I just missed the whole point?

I think not.  I surmise that you just couldn't believe I was proposing
something that might make such stiff demands on implementers.  Is
there sufficient will among the voting members to confront (or at
least size up) the Hydra monster of anchor awareness?


--Steve

             Steven R. Newcomb   President
         voice +1 716 271 0796   TechnoTeacher, Inc.
           fax +1 716 271 0129   (courier: 23-2 Clover Park,
      Internet: srn@techno.com    Rochester NY 14618)
           FTP: ftp.techno.com   P.O. Box 23795
    WWW: http://www.techno.com   Rochester, NY 14692-3795 USA
Received on Sunday, 22 December 1996 14:31:10 UTC