Re: Initial draft of XML-Link spec now available

At 7:31 PM 1/27/97, Eve L. Maler wrote (lots of good things):
>1.4 Terminology
>  I agree with Lee that the terminology is getting in the way.

   Definitely.

>  o The discussion of ALINKS in section 3.3 (how can an ALINK have only a
>    single link-end? shouldn't it have two?)

I think so. This is a question of whether we want to interpret existing
one-way links as having an "implicit" other way. I think that we should, as
we then have a nice model, and we really remove all essential differences
between link types Inline links become an obvious consequence of allowing a
link to treat itselfs as an implicit referent. This might be profitably
extended to multi-way links as well, which might be good for some kinds of
multi-way comment, where the link could be sensibly embedded in one of the
docments at a point of reference.

>  I agree with Lee about bi-directional versus multi-directional.
so do I. Bi-directional seems confusing when applied to a link that is the
hypertextual equivalent of the Place de la Concorde.

>  I really like the terms "in-line links" and "out-of-line links."  In
>fact, the salient feature of a link seems to be its linearity :-) and not
>its directionality, so I wonder if it's better to have something like
>ILLINK and OLLINK rather than ALINK and MLINK.

I think you may be right, that these terms (in-line and out-of-line) are
more self-explanatory than any others we have had. I don't like the element
names you suggest, because they're too acronymic, and too similar.

Maybe we should collapse the two elements, and simply have an attribute
that can express that one linkend is the link element itself.

>1.5 Types of Links

>  I don't see the difference between link formatting and link behavior.
>For example, where is the link explainer "shown"?  If it's balloon help,
>that's definitely behavior, isn't it?

Well, this brings us the long discussion just finished between Len and me.
I think these are not only exactly the same thing, as you observe, but that
separating them is actively harmful.

We should offer link-behavior and link rendering as things that can be
_based on_ link type, and pointer role (To use the terms I proposed in
another message -- did they make sense?).

There's no need to wire in non-generic markup unless we're going to
acutally list behaviors. I may think that having any behavior marking in
XML-link is a terrible idea, but if we do offer the option, it is
meaningless unless we say _what_ behaviors can be specified, and _how_.
Putting in hooks to user defined type strings has real value, as our
document design experiences have shown us, but putting in hooks to an
undefined scripting language is not useful at all.

>2. Link Recognition
>
>  First of all, *one* prefix needs to be chosen for all these
>"architectural" purposes.  Is it XML-, or XHL-, or -XML-, or -XHL-?

Yes. I hope we will have the separate attribute lists to work with, so we
can do this right. I think we should avoid hardwired element names, with or
without the hyphens, and we can do that one of two ways.... Attributes are
one way

>The
>leading hyphen now feels bizarre when it comes to element GIs, and suddenly
>I feel a lot more willing to invade the user's namespace...

Please, don't touch my namespace!

>  Can we officially consider PI "architectural form" summaries?  One of our
>XML design principles was to offer only one way to do things where
>possible.  I suspect that this won't fly for link recognition and other
>types of "recognition."  (In other words, this will obviously come up in
>phase 3 too.)  Are we willing to consider offering even more methods?

If we can do attributes in the Doctype declaration, we can offer that one
way... otherwise, I think we should offer the PI method as the one way to
indicate these linking architectures.

If we have to offer element names as a shortcut for the HTML crowd (and
vile thought it is, I think the argument has much force), then we should
stop at offering _two_ ways: PI/attributs and element name. Applications
will remain free, as they always are with generic markup, to define other
link architectures, so I don't see that XML link needs to be as flexible or
extensible as, say, HyTime is.


>2.3 Link Recognition by Other Means
>
>  The spec should note that such links are not highly interoperable among
>XML-aware applications.

Since this is always an option for XML, we should just delete all
references to other means. Let's define what XML link is, and leave what it
might also be out. This helps simplify things too.

>3.1 Information Associated with Links
>
>  I'm getting confused here.  There's info associated with whole links and
>info associated with link ends. In the case of ALINK, they're all smooshed
>together.  Can the separation be made more clear, and (I can't believe I'm
>asking this) can the formal specs be shown as architectural forms?

I think this may be a good idea...

>  The "principle" of link info being in attributes is the first seriously
>unclear text in the spec.  "Markup" -> "attribute values" and "character
>data" -> "element content", I think.  Also, if this really is a principle,
>it should be listed in a section at the top.
>
>  Some other principles demonstrated by the spec:
>
>  o Making the simple cases easy to mark up
>  o Enabling sophisticated link databases
>  o Requiring a "floor" of processing but not precluding new location
>    addressing, link typologies, etc.

Great suggestions. There should be a little bulleted list of principles
like this, up at the top of the definitions.

>  The "musts" and "mays" of attributes should be further clarified; this
>should be normative.  At the same time, we may want (in principle!) to
>avoid #REQUIRED attributes, e.g. for TYPE and ROLE, and (who knows?) maybe
>even for HREF.

This seems a bad principle to me, I found myself occasionally wondering why
more things were not required.

>  Can people make their own "master" types and roles?  Or are they
>constrained to making subtypes of the pre-defined starter sets we supply?
>In talking with Terry, I've come to agree that we shouldn't prescribe *any*
>minimum/starter set of types/roles.

I agree after seeing the total dissimilarity of vocabulary and style of the
lists that people have suggested here (I can remember at least Len's,
Eliot's and mine).

> This is an area for value to be added.
The only thing we might do, is add some rules for implied values (like for
instance numbers for pointer roles that are not specified in the document).
This would mean that stylesheet authors would nto have to worry about
missing values, as there would alway be _some_ values in question. If the
implied values were well defined, it could even ease processing
specifications by people who don't believe in roles and types -- they would
just write stylesheets that used the (purely formal) default values. So
behavior-only stylesheets would actually become fractionally esier.

> Terry pointed out that in DocBook, we allow for subject-classifications of
>content along the lines of "schemes" (such as the Library of Congress
>Subject Headers), and suggested we use the same approach for link types.
>
>If this group really wants to suggest some possible link types, we should
>(a) dictate a method for associating a link type with a scheme, (b) make
>our list of types informative rather than normative, and (c) label our list
>with a reserved scheme name.

If we suggest link-types this kind of two-level mechanism is a very good
idea. I now incline against predefining a set, though.


>4. Addressing
>
>  Is "documents, nodes, and regions" a complete list?
>
>  For the special case of anchors that are themselves link elements, how
>about having an attribute (or a BEHAVIOR attribute value) on the "calling"
>link-end that indicates that the buck should stop at the next anchor?
>Doesn't TEI have something for this?

This is a place where the current spec falls on its face. The indirection
behavior of HyTime is complex but can be useful. The single-level behavior
of HTML and TEI is easy to understand, but can cause management problems
that XML can only solve by the ugly use of entity values.

But we need to decide either to live with indirection, or to live without
it. This section 4 intorduction offers the worst kind of option -- An
author can't assume that indirection is available, and an application has
no way to tell if apparent indirection is intended or not (since it can't
know what the author intended).

I made an argument for a single level of pointer indirection, which is more
than powerful enough, but no indirection at all is better than not knowing
whether it is there or is not there.

>4.5 TEI Locator Reference Types
>
>  For non-query xenoforms, is the TEI locator reference type supposed to be
>used with FOREIGN?  This hardly seems fair to the inventors of other
>location addressing methods.  Is a section 4.7 needed for "Other" locator
>reference types?

Like Terry, I dislike the interaction between attribute values here, but I
don't see a way around it, so far.

I think that the HRTYPE attribute should be a fixed list of values. If
people want to extend the list for a special purpose they will, but we
should be unambiguous about what XML-LINK includes. I think, actually that
we can live with The SGML type of reference, and URLs plus TEI locator
fragment specifiers, as we discussed on the list.

We might even be able to use purely syntactic means (like parenthesizing
SGML references to distinguish them from URLs.

If we changed the syntax of TEI locations to use a / as separator, rather
than a space, we could avoid the escaping problem.

e.g.:
"http://foo.com/bar/bletch.xml#(ROOT/DESCENDANT(2/DIV1)/DESCENDANT(3/DIV2)"
A little gross, but not too bad.

>
>  What are the reasons for chucking regular expressions and links to spans?
> Just for making the spec a manageable size?

Links to spans is a real loss. There goes the ability to reference CD-based
resources. It is also funny, because in the discussion of anchor types, the
"regions" is explicitly mentioned, but we don't actually support it, for
the obvious kind of textual region.

I'm afraid that we also need some simple method for linking to
two-dimensional (or n-dimensional) areas. In-line imagemaps are already an
instance of a limited ilink (MLINK), and we should bear in mind that they
were the first such link implemented on the web! This has a lot to do with
the fact that we don't have any accepted technology to tag images.

>5.1 Identifying Extended Link Groups
>
>  Here's where I really start showing my considerable confusion about links
>and BOSes and whatnot.

Seems to me that you are actually whowing your understanding, as you seem
essentially right to me...

>  If I have a document full of MLINKs for whatever purpose, and in the
>course of its linking it points to a particular "content" document, why do
>I want to have something in that "content" document that identifies where
>the MLINK is stored?
You don't.
> This (a) makes a nasty interdependency,
Yes, and threatens to muck with your markup, too, with its LINKS element.

>(b) doesn't
>cleanly allow the "content" document to conditionally point to different
>"linking" documents,

We may not be able to mandate this anyway, as I think people don't want to
get into the situation where you browse to a starting document that changes
your active linksets, and bounces you along to your starting place.

We need some document somewhere that asserts the binding of a document to a
linkset.

>(c), if I'm not mistaken, can't be done in the
>case of read-only "content" documents, which are the juiciest application
>of using MLINKs in the first place.

Actually, you can do it, but you have to start at some ancillary document,
and then navigate to the read-only resource. (and this navigation might be
an automatic part of the rendering of some links in the document).

>  In a sense, the "extended link group" information in a document is a
>*contextual* link that binds the document to the interlinked set.  Why not
>make it an *independent* link by storing an "interlinked document set
>manifest" in an MLINK itself, perhaps in a document of a special
>XML-MANIFEST document type?  (Terry has suggested that perhaps MIME could
>be used instead; that's even further out of my depth.)

I'm not sure that this buys us enough to be worth the complexity. This is
essentially what I was trying to describe above.

>5.2 LINKS and LINKSET Elements
>
>  More confusion:
>
>  I don't see how restricting the contexts where MLINKS can be found will
>simplify anything, if they're sufficiently identified as MLINKS already.

Nope, but it does pointless reduce the author's ability to put markup where
wshe wants.

>Also, if you only allow one LINKS element, it's hard to manage
>"recombinant" document objects (e.g., as entities) that travel with their
>own LINKS elements.

And it's an _element_. I don't want people mucking with my DTD!

>  I'm also confused about the nature of LINKSET documents.  A collection of
>out-of-line links is one thing, but if you want to present it to the user
>(as a guided tour or whatever), either it's going to be extracted from a
>"link database" by some link-aware tool externally to the XML goings-on,
>*or* it's going to be embedded in a guided-tour *document* that the user
>will begin with as Step One.

That's what I want to do!

>  Right?  If this is true, then *any* document
>should be allowed to contain collections of out-of-line links, along with
>titles, instructions, and whatever other hoo-ha you want.

Yes, you've made the argument beautifully...

> If you had a
>manifest stored externally to both this guided-tour document and the "tour
>fodder," and the guided-tour links had an approprate type mapped to
>appropriate behavior, then is there even a need for special LINKSET documents?

Nope. If you want linkset documents you can make them, but I don't see how
having them preprogrammed in helps much at all...

And it we are going to specify XLGs we are going to have to explain what
happens if other documents in a group specify XLGs. so Why not just have
recusive traversal?

This is the same argument we've already had, but I've still seen no
technical explanation of why it's any harder to implement or control than
external entity references.

  -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

Received on Wednesday, 29 January 1997 17:28:20 UTC