Re: Permitting non-indirect links from David G. Durand on 1997-01-09 (w3c-sgml-wg@w3.org from January 1997)

From: David G. Durand <dgd@cs.bu.edu>
Date: Thu, 9 Jan 1997 11:57:05 -0500
To: w3c-sgml-wg@www10.w3.org
Message-Id: <v02130501aefacd4fb5f7@[128.148.19.115]>
At 4:19 AM 1/9/97, Martin Bryan wrote:
>At 15:51 8/1/97 -0500, Steven J. DeRose wrote:
>>HyTime (until the TC) provided no support for cases similar to this. So, all
>>those documents would have to be re-authored even though what they're doing
>>is reasonable.
>
>How many HyTime books would need to be reauthored for Web delivery?
>(Do I need a second pair of hands to count them?)

Steve's point was that this requirement of HyTime was a defect that would
require re-authoring all SGML documents to _become_ HyTime documents. This
may contribute to the two-hands scenario you mention. I agree that XML
should _allow_ indirection, but it _cannot_ require indirection. At least,
not if we want people to use it.

>There's no question that we need to allow a construct for embedded links to
>URLs. The question is whether this is really a "comparably simple construct"
>which might require the authors to re-author their links or a "link
>identifying architectural form" that can be used to tell XML browsers that
>this existing element is to be processed as if it were an XML link. I
>contend the latter is what is needed.

I think we are all agreed that some kind of AF is the ideal for this. But
no-one is yet dealing with the problems of attribute defaulting straight
out. When we get to it, we need to deal with (or find ways to make
acceptable) several conflicting forces:

   1. People tag by hand. The tagging must _appear_ sensible to
hand-taggers to gain acceptance (especially since good XML authoring tools
will lag the introduction of XML viewing support, if history is any
guide).

   2. DTDs will frequently be unavailable -- so we either need to use DTD
subsets, or PIs (Please please no!), or namespace pollution, or explicit
attributes (which conflict with 1 severely), or #CURRENT-style variants,
which introduce the kind of linear parse dependency we suffered so much to
eliminate in XML itself.

   3. For most plausible applications of XML, linking support, while
modularized, is not optional. This means that any decision we make that
violates a goal we tried to satisfy for XML parsing (e.g. DTD optionality
for simple documents, no long-distance dependencies, etc.) is _de facto_ a
decision to abandon that property for essentially all actual
implementations.

>>A link with a URL on an attribute is structurally the same, it just has
>>slightly different syntax: it combines the two attributes into one, and it
>>puts a system identifier (essentially) right there, instead of indirecting
>>through the DTD.
>
>The problem is that because it is not indirected it cannot be managed. XML
>must have manageable links if we are to get people to move from the simple
>world of HTML to the more complex world of XML.

You don't need to use this feature of XML if you can't manage it. People
expect it and want it (and not all of those people are totally clueless).
If we don't have direct links, XML will simply not be used. At least some of
the time, I work with typical web designers, and they would just laugh and
go on to the next technology.

>> Lots of SGML "cheats" with public or system ids on
>>attributes, rather than indirecting through entity declarations. We can say
>>that's awful if we like, but its frequency shows it is attractive, perhaps
>>because:
>>
>>a) it keeps the whole reference in one place, which greatly simplifies
>>readability and maintainability.
>
>The last thing we want is to keep the "whole reference in one place" - that's
>just the problem with managing URLs. They are kept all over the place, not
>at a central, controllable point.

You are arguing strategies of link management, not features. Agreed, you
have a requirement for indirection, which everyone seems to buy into. Others
have a requirement for non-indirection, and you should leave them to make
that tradeoff (of immediate simplicity versus centralized management) on
their own.
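
To make the tradeoff concrete, here is a minimal sketch of a resolver that
accepts both styles. It is not taken from any XML draft: the attribute names
`href` and `entity`, the `ENTITIES` table, and the example.org address are
all hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the entity declarations a DTD would supply.
ENTITIES: Dict[str, str] = {"spec": "http://example.org/spec.html"}


def resolve_link(attrs: Dict[str, str],
                 entities: Dict[str, str]) -> Optional[str]:
    """Return the target address for a link element's attributes.

    A direct link carries the address on the element itself (assumed
    attribute name: "href"). An indirect link names an entity (assumed
    attribute name: "entity") that is looked up in a central table.
    """
    if "href" in attrs:
        # Direct: the whole reference is in one place; no table is needed.
        return attrs["href"]
    name = attrs.get("entity")
    if name is not None:
        # Indirect: one table entry can be updated for every reference.
        return entities.get(name)
    return None


direct = resolve_link({"href": "http://example.org/spec.html"}, ENTITIES)
indirect = resolve_link({"entity": "spec"}, ENTITIES)
```

Both calls yield the same address. The direct form works with no
declarations at hand, while the indirect form lets a site change one table
entry and have every reference follow it.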

>>d) it saves system resources because you don't have to keep all your entity
>>dcls around on the off chance that the last element in the document may
>>mention one a second time. since most entities that are declared to allow
>>linking, are referenced only once or a very few times, you gain a lot on
>>average even though we can dream up scenarios where you'd lose.
>
>On my site 10% of the entities are referenced more than 20 times, and 30-40%
>are referenced more than once. The phrase "off chance" hardly applies! The
>overhead for referencing the ones only pointed to once is more than offset
>by those that are referenced many many times.

Fine. Then the strategy you think is best for you actually is. This is not
really a surprise.

   -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Thursday, 9 January 1997 11:50:05 UTC