Link indirection from Peter Flynn on 1997-01-04 (w3c-sgml-wg@w3.org from January 1997)

From: Peter Flynn <pflynn@curia.ucc.ie>
Date: 04 Jan 1997 16:33:45 +0000 (GMT)
To: w3c-sgml-wg@www10.w3.org
Message-id: <199701041633.QAA09963@curia.ucc.ie>
   <urlloc id="bookrev">http://www.drmacro.com/bookrev</urlloc>
   <P>
   See <link linkend="bookrev">my book review site</link> for
   a draft introduction to HyTime.

I see two problems here: 

   a. the current paradigm (however faulty or misleading) is that
      element _content_ is destined for display/formatting: browser
      authors will need a shift if they are to handle content which
      gets used for some other purpose;

   b. the extra level of indirection is going to be a major
      stumbling-block for authors accustomed to HREFs.

I have always favored the principle that if you strip away all element
markup from an instance, you should be left with readable continuous
prose -- unformatted, yes, but essentially "the text" of the document.
I've run up against this before -- Eve and Terry know my mixed
feelings about the way DocBook expects INDEXTERM to be used -- but I
am still left with a very uneasy feeling about DTDs which expect
indirect referential material to be inserted as element content
instead of attribute values. 

   But surely you can also do something like:
    <p>See <link linkend="http://www.drmacro.com/bookrev">my book review
   site</link> for ...

This is the obvious way to do it for simple one-way links where the
return unspecified, and presumed to be the function of the browser's
BACK button: it may not have the formal elegance of a queryloc, but to
someone whose business is the text, not the markup, it seems to make
more sense. For more complex associations, I agree completely that we
absolutely need the more powerful methods of HyTime or TEI or whatever.

      We can't make link formats depend on the server: so if we have path
   extensions on URLs, like his proposed "/chapter=4/p=23" then we need
   browsers to be able to parse these addresses -- and I think that violates
   the URL opaqueness condition that I stated, along with its consequencs.

If I read this right, we have three choices for implementing the more
complex addressing:

   i) break and rewrite the rules for URLs (and probably HTTP as well)
      and allow arbitrary strings including spaces -- somehow;

  ii) stick to the existing rules, hexify [^A-Za-z0-9], and either
      educate authors to do it in their sleep, or someone write nice
      little pop-up editor functions that do x-www-urlencoding on the
      fly;

 iii) extend the existing mechanisms (in transmission as well as at
      the server end) to catch TEI-EPN or whatever, and pass it to an
      embedded parser-searcher which will perform the relevant
      functions. 

David wrote:

   Call me reactionary, but I'm just trying to speak for the people
   who already know HTML who just want to get the job done -- they may
   accept additional complexity for additional power, but if what they
   can already do becomes harder, they won't necessarily take the
   trouble to enter our brave new world of descriptive markup.

I think you're right to be concerned, and I think we do need two
levels: a transitional level which provides essentially what we can be
done with HTML functionality, but on a more structured and rational
basis; in preparation for a higher level which allows the
implementation of much greater things.

///Peter
Received on Saturday, 4 January 1997 11:34:11 UTC