Re: Link-1: Syntax

On Sun, 18 May 1997 05:38:56 -0400 (EDT) Tim Bray said:
>We have a botch at the moment in that the hyperlink #HERE is ambiguous,
>does it mean the linking element, or does it mean ID(HERE)?  We need to
>fix the syntax to solve this.  Solutions I have thought of include:
>
>(1) putting the whole xpointer in parentheses, unless it's just an ID value
>    e.g. HREF="#(HERE), or
>(2) requiring that keywords such as HERE always be followed by a comma,
>    even if there's nothing after the comma,
>    e.g. HREF="#HERE,"
>
>Any other nice syntax tricks? - Tim

(3) Shifting the parentheses in the extended pointer, so that instead of
following the pattern

  keyword(args)

they follow the pattern

  (keyword,args)

Then "#HERE" is unambiguously an ID, and "#(HERE)" is unambiguously a
pointer to the pointer element itself -- a circular pointer.

This would have the effect of disallowing the current rules which allow
keywords like CHILD to take more than one set of parameters:

   CHILD(1)(3)(2)

would have to be rewritten

   (CHILD,1)(CHILD,2)(CHILD,3)

Since some people have found the multiple-arguments notation confusing,
this might count (for them at least) as an improvement.

It also takes the extended pointer notation closer to S-expressions
(at least in the traditional TEI variant with spaces instead of
commas), which may or may not make some of us break out in hives.

(4) Providing a disambiguation rule like "In cases of ambiguity, try the
IDREF interpretation; if it doesn't work, try the circular pointer."
So "#HERE" is an ID short hand if and only if there is an element with
an ID of HERE somewhere in the target document; otherwise, it's
a circular pointer.  If there *is* an ID of HERE, then either we find
some way of expressing HERE more verbosely and unambiguously (see
item (5) below) or else you just can't use HERE by itself any more.
Ditto with ROOT.

(5) In cases of ambiguity, prefer the IDREF interpretation
unconditionally:  ROOT and HERE are valid as keywords only as the first
in a series of location steps, and cannot be used on their own.  To get
the effect of the old HERE and ROOT, just prefix the keyword with
another location step; since in TEI pointers HERE and ROOT ignore their
location source, the prefixed location step has no effect.  So the old
HERE could be expressed HERE HERE, or ROOT HERE, or CHILD (1) HERE, and
the old ROOT as HERE ROOT, or ROOT ROOT, or CHILD (1) ROOT ...

(6) In cases of ambiguity, prefer the keyword interpretation; this
interpretation will never fail, so there is no need for a try/fail/retry
loop.  As a result, the short form "#foo" cannot be used with all IDs,
only with IDs other than HERE and ROOT -- for those, you have to use the
full form ID (HERE) or ID (ROOT).

(7) Since the only extended pointer expressions which can be
mistaken for SGML IDs are HERE and ROOT standing alone, require them
to be parenthesized, without requiring a parenthesis around the
entire expression.  So

  HERE             is an ID
  (HERE)           is a circular pointer
  HERE ANCESTOR(2) is a pointer to the grandparent of the link element

---

Of these solutions:

  1 is bearable but ugly in that it requires a sizable change to the TEI
notation (parentheses around *all* extended pointers) in order to avoid
a conflict introduced by our own insistence on having the #foo shorthand
-- and also in that it requires parens around all pointers to avoid a
grand total of two cases of ambiguity.  This cannon will certainly kill
the sparrow, but I'd prefer something lighter weight.
  2 is too ugly to contemplate
  3 is kind of pretty in the original TEI notation -- at least, if you
like Lisp -- but I don't like it much with the commas needed to shoehorn
extended pointers into URLs.  I'd say it's bearable, but not inviting.
  4 has the virtue of requiring no syntactic change, and providing
absolutely interoperable results, but the minor flaw of requiring too
much work to decide what to do, and the fatal flaw of allowing the
innocent addition of an ID in one part of a document to break links
elsewhere in the document.
  5 is probably better than 4, but I believe the current xml-link
draft doesn't even allow ROOT and HERE anywhere but the beginning
of the expression; we'd have to go back to the TEI rules (under
which the sequence of keywords is unconstrained) to make this work.
  6 is my preference: requires no change to the syntax, and has
clear rules about what happens
  7 I like less, because the parens around just HERE and ROOT feel
too ad hoc.  But I like it better than 1.


-C. M. Sperberg-McQueen

Received on Thursday, 22 May 1997 17:03:13 UTC