Re: Initial draft of XML-Link spec now available

I'm relieved to see this draft, because I can actually understand most of
it!  Here are some comments.  Since I mostly haven't taken part in the
wide-ranging link discussions so far, I hope I'm not rehashing anything in
my "content" comments, but I desperately needed to do some paraphrasing to
check my understanding of some of the parts.  I hope none of this starts a
new firestorm.

1.4 Terminology
  I agree with Lee that the terminology is getting in the way.  The
relationship between link-end and anchor seems all wrong.  This affects
several places:

  o The definitions of anchor, link-end, locator, and out-of-line links
  o The explanation of "Anchor" in section 3.1
  o The discussion of ALINKS in section 3.3 (how can an ALINK have only a 
    single link-end? shouldn't it have two?)

  Link-type and anchor role should be broken out and given their own
definitions.  Link database should also be defined, if it is to have such a
significant role in the proceedings (though I agree with Lee's comment
about link DBs).

  I agree with Lee about bi-directional versus multi-directional.

  I really like the terms "in-line links" and "out-of-line links."  In
fact, the salient feature of a link seems to be its linearity :-) and not
its directionality, so I wonder if it's better to have something like
ILLINK and OLLINK rather than ALINK and MLINK.  (I have no allegiance to
HTML's A, whose name makes no sense to me; better to come up with really
effective, evocative names for these...not that my suggestions meet these
criteria either!)

1.5 Types of Links

  One significant rendering instruction often associated with links (at
least in the stuff I do) is generated text.  This is perhaps more relevant
for paper output than for Web access, but it might be worth mentioning.

  I would think that link topology, in addition to being a matter of
in-line/out-of-line, is also a matter of (directional) traversal patterns.

  I don't see the difference between link formatting and link behavior.
For example, where is the link explainer "shown"?  If it's balloon help,
that's definitely behavior, isn't it?

2. Link Recognition

  First of all, *one* prefix needs to be chosen for all these
"architectural" purposes.  Is it XML-, or XHL-, or -XML-, or -XHL-?  The
leading hyphen now feels bizarre when it comes to element GIs, and suddenly
I feel a lot more willing to invade the user's namespace...  We should
decide whether all those attributes (like BEHAVIOR) need prefixes when the
default element GI isn't being used.  

  Can we officially consider PI "architectural form" summaries?  One of our
XML design principles was to offer only one way to do things where
possible.  I suspect that this won't fly for link recognition and other
types of "recognition."  (In other words, this will obviously come up in
phase 3 too.)  Are we willing to consider offering even more methods?

2.3 Link Recognition by Other Means

  The spec should note that such links are not highly interoperable among
XML-aware applications.

3.1 Information Associated with Links

  I'm getting confused here.  There's info associated with whole links and
info associated with link ends.  In the case of ALINK, they're all smooshed
together.  Can the separation be made more clear, and (I can't believe I'm
asking this) can the formal specs be shown as architectural forms?

  The "principle" of link info being in attributes is the first seriously
unclear text in the spec.  I think "markup" should read "attribute values"
and "character data" should read "element content."  Also, if this really
is a principle, it should be listed in a section at the top.

  Some other principles demonstrated by the spec:

  o Making the simple cases easy to mark up
  o Enabling sophisticated link databases
  o Requiring a "floor" of processing but not precluding new location
    addressing, link typologies, etc.

  The "musts" and "mays" of attributes should be further clarified; this
should be normative.  At the same time, we may want (in principle!) to
avoid #REQUIRED attributes, e.g. for TYPE and ROLE, and (who knows?) maybe
even for HREF.  (Again, I have no particular attachment to the name HREF.
It's better than A, though, because the "H" and the "REF" mean somewhat
obvious things.)

  Can people make their own "master" types and roles?  Or are they
constrained to making subtypes of the pre-defined starter sets we supply?
In talking with Terry, I've come to agree that we shouldn't prescribe *any*
minimum/starter set of types/roles.  This is an area for value to be added.
Terry pointed out that in DocBook, we allow for subject-classifications of
content along the lines of "schemes" (such as the Library of Congress
Subject Headings), and suggested we use the same approach for link types.
If this group really wants to suggest some possible link types, we should
(a) dictate a method for associating a link type with a scheme, (b) make
our list of types informative rather than normative, and (c) label our list
with a reserved scheme name.
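
  To make points (a) through (c) concrete, here is a rough sketch in Python.
Everything in it is an assumption made for illustration: the "scheme:type"
notation for TYPE values, the reserved scheme name XML-LINK, and the three
sample types are all invented, and none of them comes from the draft.

```python
# Hypothetical sketch: TYPE values qualified by a classification scheme.
# The "scheme:type" notation and the reserved name "XML-LINK" are assumptions
# made for illustration; the draft defines neither.

RESERVED_SCHEME = "XML-LINK"

# Informative (not normative) starter list, labeled with the reserved scheme.
# The type names here are invented examples.
INFORMATIVE_TYPES = {RESERVED_SCHEME: {"annotation", "citation", "glossary"}}


def parse_link_type(value, default_scheme=RESERVED_SCHEME):
    """Split a TYPE value into (scheme, type).

    Unqualified values fall back to the default scheme.
    """
    scheme, sep, name = value.partition(":")
    if not sep:
        return default_scheme, value
    return scheme, name


def is_listed(value, registry=INFORMATIVE_TYPES):
    """True if the type appears in the list for its scheme.

    Types from unknown schemes are still legal; they are simply not listed.
    """
    scheme, name = parse_link_type(value)
    return name in registry.get(scheme, set())
```

  With this, a value like "LCSH:Hypertext" parses to the scheme LCSH and the
type Hypertext, and is accepted without appearing in our own list, which is
the sense in which the list stays informative.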

4. Addressing

  Is "documents, nodes, and regions" a complete list?

  For the special case of anchors that are themselves link elements, how
about having an attribute (or a BEHAVIOR attribute value) on the "calling"
link-end that indicates that the buck should stop at the next anchor?
Doesn't TEI have something for this?
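
  As a paraphrase of what I'm asking for, here is a small Python sketch.
The data model (a table from element ID to the ID it points at) and the
stop_at_next flag are hypothetical stand-ins for whatever attribute or
BEHAVIOR value the spec might adopt.

```python
# Hypothetical sketch: each element ID maps to the ID it links to, or to
# None if the element is ordinary content rather than a link element.
ELEMENTS = {"a": "b", "b": "c", "c": None}


def resolve(start, elements=ELEMENTS, stop_at_next=False):
    """Follow links from start and return the ID where traversal ends.

    With stop_at_next, traversal ends at the first anchor reached, even if
    that anchor is itself a link element.
    """
    target = elements[start]
    if target is None:
        return start  # start is not a link element; nothing to traverse
    if stop_at_next:
        return target
    seen = {start}  # guard against links that point at each other
    while elements.get(target) is not None and target not in seen:
        seen.add(target)
        target = elements[target]
    return target
```

  Here resolve("a") chases the chain through to "c", while
resolve("a", stop_at_next=True) stops at "b".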

4.3 SGML Reference Types

  The reference type "SGML" is very cool!  But it isn't named well; would
EXTID be better?  Is INTID or just ID also needed, for simple IDREF
references in the current document?

4.5 TEI Locator Reference Types

  For non-query xenoforms, is the TEI locator reference type supposed to be
used with FOREIGN?  This hardly seems fair to the inventors of other
location addressing methods.  Is a section 4.7 needed for "Other" locator
reference types?

  What are the reasons for chucking regular expressions and links to spans?
Is it just to keep the spec a manageable size?

5.1 Identifying Extended Link Groups

  Here's where I really start showing my considerable confusion about links
and BOSes and whatnot.

  If I have a document full of MLINKs for whatever purpose, and in the
course of its linking it points to a particular "content" document, why do
I want to have something in that "content" document that identifies where
the MLINK is stored?  This (a) makes a nasty interdependency, (b) doesn't
cleanly allow the "content" document to conditionally point to different
"linking" documents, and (c), if I'm not mistaken, can't be done in the
case of read-only "content" documents, which are the juiciest application
of using MLINKs in the first place.

  In a sense, the "extended link group" information in a document is a
*contextual* link that binds the document to the interlinked set.  Why not
make it an *independent* link by storing an "interlinked document set
manifest" in an MLINK itself, perhaps in a document of a special
XML-MANIFEST document type?  (Terry has suggested that perhaps MIME could
be used instead; that's even further out of my depth.)
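
  Here is a sketch of what I mean, in Python rather than in any proposed
markup.  The manifest layout and all the document names are invented for
illustration; the point is only that the binding lives outside the
"content" documents, so they can stay read-only and can belong to more than
one interlinked set.

```python
# Hypothetical sketch of an "interlinked document set manifest" kept outside
# the content documents. Manifest and document names are invented.
MANIFESTS = {
    "tour-manifest": {
        "linking": ["tour-links.xml"],
        "content": ["chapter1.xml", "chapter2.xml"],
    },
    "review-manifest": {
        "linking": ["review-links.xml"],
        "content": ["chapter2.xml"],
    },
}


def linking_documents_for(content_doc, manifests=MANIFESTS):
    """Return, per manifest, the linking documents that apply to content_doc.

    The content document itself is never consulted or modified.
    """
    return {
        name: entry["linking"]
        for name, entry in manifests.items()
        if content_doc in entry["content"]
    }
```

  Which manifest is in force becomes the application's choice, which is how
the same "content" document could conditionally take part in different
linked sets.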

5.2 LINKS and LINKSET Elements

  More confusion:

  I don't see how restricting the contexts where MLINKS can be found will
simplify anything, if they're sufficiently identified as MLINKS already.
Also, if you only allow one LINKS element, it's hard to manage
"recombinant" document objects (e.g., as entities) that travel with their
own LINKS elements.

  I'm also confused about the nature of LINKSET documents.  A collection of
out-of-line links is one thing, but if you want to present it to the user
(as a guided tour or whatever), either it's going to be extracted from a
"link database" by some link-aware tool externally to the XML goings-on,
*or* it's going to be embedded in a guided-tour *document* that the user
will begin with as Step One.  Right?  If this is true, then *any* document
should be allowed to contain collections of out-of-line links, along with
titles, instructions, and whatever other hoo-ha you want.  If you had a
manifest stored externally to both this guided-tour document and the "tour
fodder," and the guided-tour links had an appropriate type mapped to
appropriate behavior, then is there even a need for special LINKSET documents?

A. The TEI extended pointer syntax

  The subsections here are all artificially demoted two levels, no doubt
because of the cutting and pasting.