- From: W. Eliot Kimber <eliot@isogen.com>
- Date: Sun, 22 Dec 1996 11:11:15 -0900
- To: w3c-sgml-wg@www10.w3.org
At 05:51 PM 12/21/96 -0800, Tim Bray wrote:
>I think that Steve was making an important point, but I think that I
>didn't really get it.  So this is a request for amplification, with some
>questions
>
>>The question is:
>>"Does an anchor know that it is an anchor?"
>
>What does it mean for an anchor to know it's an anchor... and I guess,
>what exactly are you terming an anchor?  Consider the following:

In HyTime terminology, an "anchor" is an object (or list of objects) that
is addressed by a hyperlink as a particular anchor role. This definition of
"anchor" is slightly different from that used in HTML and in some other
hypertext formalisms (e.g., the Dexter hypertext model).

In HyTime, anything that can be addressed by any means can *potentially* be
an anchor. As you can address things without explicit identifiers, things
can be addressed without their knowledge. Also, in HyTime terms, a thing
becomes an anchor only when a hyperlink addresses it, not before. Thus,
putting an ID on something does not, in and of itself, make it an anchor.

A hyperlinking element can be one of its own anchors (a "self anchor",
meaning the link links to itself).

To relate this to HTML, the A element is a hyperlink when the HREF
attribute is used. It is a "contextual link" because it is also one of its
own anchors. The other anchor is whatever the URL points to, which could be
another entire page, a named A element, or something returned by a query
(HyTime considers CGI scripts to be a form of query, using the notion of
"query" to mean "anything that HyTime doesn't define directly").

>Example 1: http://www.textuality.com/sgml-erb/mprdv.html
>
>not as an example, but in and of itself, embedded in the email you
>are now reading.  I would assume this is not an anchor in the sense that you
>mean; 1-way www semantics make it a link-end but not an anchor.

Actually in this example, the URL is an anchor, but not a link. The link
exists virtually in the mailer.
In HyTime, it would be defined something like this:

<link refmark="url-in-text" refsub="objects-with-url">
Links all occurrences of URL strings in body of mail message
(reference marks) to the objects with that URL (reference subjects).
Uses the Perl support of the mail program to resolve the addresses.
Location source for both queries is current mail message.
</link>

<!-- Queries that implement semantics of link shown above. -->
<!NOTATION perl PUBLIC "-//L.Wall//NOTATION Programming perl//EN"
                       "<isbn>0-937175-64-1" >
<queryloc id=url-in-text notation=perl>
@URLS = ();
open(MAIL, "< current-message");
while (<MAIL>) {
  chop;
  if (/((http:|ftp:|mailto:)[\w\d\.\/\-\~\?\#]+)/) {
    # Above regex probably flawed but you get the point.
    push(@URLS, $1);
  }
}
return(@URLS);  # returns list of URLs addressed
</queryloc>
<queryloc id=objects-with-url notation=perl>
@objects = ();
foreach $url (@URLS) {
  push(@objects, &resolve_url($url));
}
return(@objects);
</queryloc>

In other words, you have a two-anchor link with two anchor roles, "refmark"
and "refsub". In this case, the strings addressed by the "url-in-text"
function are the "refmark" anchor. The objects with those URLs are the
"refsub" anchor. Both anchors are lists in this case and defined using
inter-dependent queries (a very useful technique).

You probably wouldn't actually implement a mail program's functionality
this way (but you could and it would make for a very interesting general
facility to be able to define new links of this sort). But it shows how you
can use links and queries to express, in a standard way, both the
individual functions (matching URLs in text and resolving them) and the
relationship between the two functions.

>  The person
>who placed mprdv.html at www.textuality.com and sent the URL out by email
>was consciously creating an anchor that in some sense knows it's an anchor
>since there is an httpd server that will give anyone a copy, no questions
>asked.
But the anchor doesn't know it's an anchor, it only knows it's a string.
The application of linking semantics that make it an anchor are separate
from the string (and there's no requirement that they ever be applied even
though the author may reasonably expect them to be most of the time--if I'm
reading the mail with RN or something, the URL is not an anchor at that
moment).

>On the other hand, when
>
>Example 2: <A NAME="sec3.17">
>
>appears in an HTML document, I assume you would call this an anchor that
>knows it's an anchor?  It exists only to provide addressing hooks.

Even though the *semantic* of the A element when the NAME attribute is used
is to provide a point that *can* be linked to, as far as HyTime is
concerned, it isn't an anchor until someone *does* link to it. In other
words, just putting an ID on something doesn't make it an anchor. Or, said
another way, because you can potentially address anything, everything is
always potentially an anchor. Putting an ID on something doesn't make it
more or less likely to be an anchor except to the degree that limits in
your addressing functionality make it easier or harder to address things
with particular properties.

>On the other hand, (Example 3: ) with some analogue of ilink, where you can
>point into a document from outside using locaddrs or some such, you clearly
>have a case that what's being pointed-at does not and cannot in principle
>know it's an anchor... or am I missing your point?

I think Steve might be asking about three things:

1. Should we allow completely independent links (unilateral addressing of
   all anchors of a link)? If we disallow it, then at least one anchor of
   every link will know it's an anchor because the link is always one of
   its own anchors. I think we're all agreed that we need independent
   links.

2. Should the elements always indicate, in their syntactic representation,
   that they are anchors (e.g., an attribute called "anchored-by" that
   lists the addresses of those things that point at it)? This could be
   reasonable in a tightly-controlled, closed system of documents, but is
   probably not reasonable or possible in a Web environment and I doubt
   that anyone would seriously propose it.

3. Should the *methods* associated with objects *always* be informed when
   they are addressed as an anchor? This is a bit more subtle, because it
   can be difficult or impossible to do this in all environments (e.g.,
   when the anchors are addressed by a query against the entire Web). In
   other words, in the general case it's useful or necessary to defer
   resolving some anchor addresses until the anchor is traversed to (or
   access to the anchor is otherwise requested). This means that there will
   always be anchors that do not know they are anchors at the time the link
   is created, only at the time an attempt is made to address the anchor.

The data represented by XML documents is dead and lifeless--it is just
data. But there are always presumably processing methods associated with
the data--browser styles, retrieval methods, whatever. Thus there will
always be *something* that is "responsible" for the data objects and could
potentially be informed about their anchor status. These methods might be
specialized link management systems, e.g., HyTime engines, Xanadu systems,
Hyper-G servers, etc.

["Dynamic" or "active" documents are, I presume, documents where the
methods for certain data objects are tightly bound to the data object,
e.g., a script embedded in an HTML document. However, such documents are no
more or less active than documents to which the equivalent function is
applied by external methods--in fact I'd argue they're *less* dynamic
because the tight binding tends to limit you to only one possible behavior,
instead of an infinity of possible behaviors.]
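The deferred resolution described in point 3 can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the names
Link, traverse, urls_in_text and on_anchored are hypothetical and are not
defined by HyTime, and the URL pattern is deliberately simplified. The link
holds a query per anchor role, runs the query only when the role is first
traversed, and only then tells the responsible method that each object has
become an anchor.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout; none of this is HyTime-defined API.
Query = Callable[[], List[str]]


@dataclass
class Link:
    """A link whose anchors are addressed by queries, one per anchor role."""
    name: str
    queries: Dict[str, Query]
    resolved: Dict[str, List[str]] = field(default_factory=dict, init=False)

    def traverse(self, role: str,
                 notify: Callable[[str, str, str], None]) -> List[str]:
        """Resolve the role's query on first use and notify each anchor found."""
        if role not in self.resolved:
            self.resolved[role] = list(self.queries[role]())
            for obj in self.resolved[role]:
                notify(obj, self.name, role)
        return self.resolved[role]


message = "See http://www.textuality.com/sgml-erb/mprdv.html for details."
notified: List[Tuple[str, str, str]] = []


def urls_in_text() -> List[str]:
    # Simplified URL pattern, for illustration only.
    return re.findall(r"(?:http|ftp|mailto):[\w./~?#-]+", message)


def on_anchored(obj: str, link_name: str, role: str) -> None:
    # Stand-in for whatever method is "responsible" for the object.
    notified.append((obj, link_name, role))


link = Link("mail-urls", {"refmark": urls_in_text})
assert notified == []  # the link exists, but no anchor has been told yet
anchors = link.traverse("refmark", on_anchored)
link.traverse("refmark", on_anchored)  # already resolved: no second notification
```

Creating the link notifies nobody; the string in the message learns it is
an anchor only when the "refmark" role is first traversed, which is the
situation point 3 describes.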
Note that in the HyTime model where you have a HyTime engine, the method
for any object can always interrogate the HyTime semantic grove to
determine if the object is an anchor, because the HyTime semantic grove is
where the information about all the links it is managing is maintained.
Thus you should expect any general-purpose HyTime engine to have an
"anchored-by?" function that will return a list of all the links of which
the object is an anchor. Given that list of links, you can then ask the
HyTime engine about the other anchors of those links.

[And note that since the HyTime semantic grove is a grove, it can be
interrogated by DSSSL styles and transforms using normal DSSSL functions,
so there's an obvious path to using DSSSL to apply presentation and/or
behavior to HyTime-managed hyperlinks. Groves are so cool.]

This is essentially what systems like Hyper-G/HyperWave or Xanadu do:
maintain a database of what is linked to what. HyTime just defines a
specific data model and some base semantics for it. No magic here, just
standardization of typical practice (I'm not sure there's enough consensus
to say "standard practice").

Cheers,

E.
--
W. Eliot Kimber (eliot@isogen.com)
Senior SGML Consulting Engineer, Highland Consulting
2200 North Lamar Street, Suite 230, Dallas, Texas 75202
+1-214-953-0004 +1-214-953-3152 fax
http://www.isogen.com (work) http://www.drmacro.com (home)
"Rats in the morning, rats in the afternoon...if they don't go away,
I'll be re-educated soon..."
    --Austin Lounge Lizards, "1984 Blues"
Received on Sunday, 22 December 1996 13:13:01 UTC