Re: [dgd@cs.bu.edu: BOS confusion (analysis; suggestion to resolve Newcomb/Bryan conflict)] from W. Eliot Kimber on 1997-01-03 (w3c-sgml-wg@w3.org from January 1997)

From: W. Eliot Kimber <eliot@isogen.com>
Date: Fri, 03 Jan 1997 12:03:06 -0900
To: w3c-sgml-wg@www10.w3.org
Message-Id: <3.0.32.19970103112849.00b08e80@uu10.psi.com>
At 05:42 PM 1/3/97 +0000, Martin Bryan wrote:
>
>>I don't know what Martin means by "link chains". This is not a HyTime term.
>>If he means "location ladders", then you can have location ladders without
>>requiring entity declarations (although the use of entities would be the
>>SGML/HyTime recommended method).
>
>I did mean location ladders. The HyTime TC says (twice, so you have to
>believe it!) that "The ultimate location source for any location ladder is
>the root of a grove". It also says "If the location source is a document,
>subdocument or data entity, the location source is within the grove
>constructed from that entity". (Note the use of the term entity!)

This is all true, but it's not the whole story.  The TC also defines rules
for "implied location sources".  For queries, the location source may be
inherent in the query--thus, the query may define the grove that is its
ultimate location source and it may do so *without reference to a declared
entity*.  In the case of URLs used as a query, the location source is
inherent in the URL and the grove would be the grove constructed from the
object at the address specified by the URL.  Thus, there is no requirement
to declare entities in order to establish the location source of query
location addresses.

Note also that the term "entity" is the *only* term by which SGML (and thus
HyTime) can refer to SGML documents as all SGML documents are, by
definition "document entities".  Therefore, any HTML, SGML, or XML document
located by a URL is an entity, whether it has been declared explicitly or not.

>What I am unclear about is what would constitute a the grove of something
>pointed to using a URL of the form report.html#point1 if the HTML page
>pointed to was constructed as follows:
>
><p>The points raised were:
><ul><li><a name=point1></a>How useful is HyTime
><li><a name="point2"></a>How useful is HyTime

The grove would have to be constructed by an "HTML grove constructor" that
knows how to account for these differences (the HTML parser in SoftQuad's
HoTMetaL is an example of a tool that has the smarts to put most HTML
documents into an SGML "grove"--in this case, HoTMetaL's internal object
model of SGML documents, which is functionally a grove).

Remember that to be addressible, each data content notation must have a
corresponding "grove construction process" that knows how to interpret the
data and express it in terms of a particular property set.  The property
set is simply an aggreement on the object model to be used among a set of
applications--there's no magic, here.

This requirement exists for all addressing systems: HyTime merely codifies
it and provides some formalisms for expressing the details of the process
in an abstract and generalized way.  But regardless of how you do it, you
have to define for each data notation what the objects are you can address
and how they are related, otherwise you have chaos and inconsistent
behaviors.  You can defined it either using something like property sets or
you can define it behaviorally by "when you do X you get Y".  I prefer the
property set approach (we had much too much of the latter in HyTime
orginally and it lead to a lot of ambiguitity and confusion).
Unfortunately, behaviorial specifications are usually easier to state
initially and appear to be easier to understand (until you run into
ambiguous cases, then it gets very difficult to determine what should
happen).  Of course, only providing a behavioral specification does let the
first implementor define the real rules through their implementation
choices, which is to the implementor's proprietary advantage....

>Given the fact that most documents on the Web would not allow sensible
>groves to be constructed from existing named anchors the relevance of HyTime
>location sources for linking XML files to HTML anchors must be considered.

As demonstrated above, your inference is incorrect.  Sensible groves can be
built from HTML documents given the appropriate grove construction process
(using either the SGML property set or some other property set--the
property set used doesn't affect the ability to use HyTime location
addressing).  Note that the "grove construction process" doesn't have to be
literally implemented--it can consist simply of a design document that says
how to *behave as if* you had constructed the grove--in other words, it's
an API spec where the input is location addresses of a particular form and
the output is the data addressed (or pointers to it, obviously).

Cheers,

E.
--
W. Eliot Kimber (eliot@isogen.com) 
Senior SGML Consulting Engineer, Highland Consulting
2200 North Lamar Street, Suite 230, Dallas, Texas 75202
+1-214-953-0004 +1-214-953-3152 fax
http://www.isogen.com (work) http://www.drmacro.com (home)
"Rats in the morning, rats in the afternoon...if they don't go away, I'll be
re-educated soon..."                 --Austin Lounge Lizards, "1984 Blues"
Received on Friday, 3 January 1997 14:04:50 UTC