- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Thu, 22 May 97 15:07:25 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Sun, 18 May 1997 05:38:56 -0400 (EDT) Tim Bray said: >We have a botch at the moment in that the hyperlink #HERE is ambiguous, >does it mean the linking element, or does it mean ID(HERE)? We need to >fix the syntax to solve this. Solutions I have thought of include: > >(1) putting the whole xpointer in parentheses, unless it's just an ID value > e.g. HREF="#(HERE), or >(2) requiring that keywords such as HERE always be followed by a comma, > even if there's nothing after the comma, > e.g. HREF="#HERE," > >Any other nice syntax tricks? - Tim (3) Shifting the parentheses in the extended pointer, so that instead of following the pattern keyword(args) they follow the pattern (keyword,args) Then "#HERE" is unambiguously an ID, and "#(HERE)" is unambiguously a pointer to the pointer element itself -- a circular pointer. This would have the effect of disallowing the current rules which allow keywords like CHILD to take more than one set of parameters: CHILD(1)(3)(2) would have to be rewritten (CHILD,1)(CHILD,2)(CHILD,3) Since some people have found the multiple-arguments notation confusing, this might count (for them at least) as an improvement. It also takes the extended pointer notation closer to S-expressions (at least in the traditional TEI variant with spaces instead of commas), which may or may not make some of us break out in hives. (4) Providing a disambiguation rule like "In cases of ambiguity, try the IDREF interpretation; if it doesn't work, try the circular pointer." So "#HERE" is an ID short hand if and only if there is an element with an ID of HERE somewhere in the target document; otherwise, it's a circular pointer. If there *is* an ID of HERE, then either we find some way of expressing HERE more verbosely and unambiguously (see item (5) below) or else you just can't use HERE by itself any more. Ditto with ROOT. (5) In cases of ambiguity, prefer the IDREF interpretation unconditionally: ROOT and HERE are valid as keywords only as the first in a series of location steps, and cannot be used on their own. To get the effect of the old HERE and ROOT, just prefix the keyword with another location step; since in TEI pointers HERE and ROOT ignore their location source, the prefixed location step has no effect. So the old HERE could be expressed HERE HERE, or ROOT HERE, or CHILD (1) HERE, and the old ROOT as HERE ROOT, or ROOT ROOT, or CHILD (1) ROOT ... (6) In cases of ambiguity, prefer the keyword interpretation; this interpretation will never fail, so there is no need for a try/fail/retry loop. As a result, the short form "#foo" cannot be used with all IDs, only with IDs other than HERE and ROOT -- for those, you have to use the full form ID (HERE) or ID (ROOT). (7) Since the only extended pointer expressions which can be mistaken for SGML IDs are HERE and ROOT standing alone, require them to be parenthesized, without requiring a parenthesis around the entire expression. So HERE is an ID (HERE) is a circular pointer HERE ANCESTOR(2) is a pointer to the grandparent of the link element --- Of these solutions: 1 is bearable but ugly in that it requires a sizable change to the TEI notation (parentheses around *all* extended pointers) in order to avoid a conflict introduced by our own insistence on having the #foo shorthand -- and also in that it requires parens around all pointers to avoid a grand total of two cases of ambiguity. This cannon will certainly kill the sparrow, but I'd prefer something lighter weight. 2 is too ugly to contemplate 3 is kind of pretty in the original TEI notation -- at least, if you like Lisp -- but I don't like it much with the commas needed to shoehorn extended pointers into URLs. I'd say it's bearable, but not inviting. 4 has the virtue of requiring no syntactic change, and providing absolutely interoperable results, but the minor flaw of requiring too much work to decide what to do, and the fatal flaw of allowing the innocent addition of an ID in one part of a document to break links elsewhere in the document. 5 is probably better than 4, but I believe the current xml-link draft doesn't even allow ROOT and HERE anywhere but the beginning of the expression; we'd have to go back to the TEI rules (under which the sequence of keywords is unconstrained) to make this work. 6 is my preference: requires no change to the syntax, and has clear rules about what happens 7 I like less, because the parens around just HERE and ROOT feel too ad hoc. But I like it better than 1. -C. M. Sperberg-McQueen
Received on Thursday, 22 May 1997 17:03:13 UTC