Re: Permitting non-indirect links from Martin Bryan on 1997-01-16 (w3c-sgml-wg@w3.org from January 1997)

From: Martin Bryan <mtbryan@sgml.u-net.com>
Date: Thu, 16 Jan 1997 10:48:30 +0000
To: w3c-sgml-wg@www10.w3.org
Message-Id: <1.5.4.32.19970116104830.006b72d4@mail.u-net.com>
David Durand wrote:
>>There are two problems with this appoach. One is the need to define entities
>>for each part of each URL in the BOS or document, and the second is that
>>each link has to look different so my second link has to read:
>><a href="http://&base-server2;/&rootpath2;/&that-subject-path;/&that-doc">
>
>They only have to look this different if it uses a different server,
>different root path, different subject path, and different document path.

I was just making it clear to other readers that the entity names you gave,
which looked very nice for the first example, are not reusable for every URL
as entities cannot be redeclared within docs. The result is that by the
100th source of files you get some pretty wierd entity names, and a chance
of clashing with other entity names in the DTD.

>Declaring entities is no harder than the extra elements that the HyTime
>approach would require.

I'm not saying it would be - remember that I have said before that entities
in a BOS are a good way of assigning referencable names to files. The same
applies to naming reusable fragments of file identifiers, if you are willing
to live with parsing the entity declarations. The problem is that our
DTD-less browser would be required to parse the entity declarations to get
the URLs. I thought we wanted to avoid this.

>>A more user friendly approach is:
>>
>><a URL-method="http" URL-DNS="www.echo.lu" URL-path="oii/en"
>>URL-file="alpha.html" URL-fragment="X">
>
>This seems syntactically equivalent to a hard-coded link. Where does the
>indirection come in in this example?

It is a hard-coded example: it was just a simple example to show how managed
info on DNS names, path names at DNS and files within paths could be
presented using attributes - the indirection came latter.

>>This is the same as the capture on HoTMetaL, but instead of resolving the
>>whole URL into a single string at capture stage it leaves the resolution to
>>the point at which calling of the file is needed.
>
>I don't understand this sentence. The above example has a seriws of
>hard-wired values, and seems utterly equivalent to the following:
><a URL="http://www.echo.lu/"oii/en/alpha.html#X>

The difference is that I can change one part of the hardwired string without
changing the whole string: therefore changing all files that have moved
directory or moved server becomes a lot easier.

>I see the breakup of the URL into attribute values, but I don't see that
>this does anything but save a programmer from writing a 50-line URL parser.

The problem is that link management is done by real users, not programmers.
We don't want to have to write a URL parser to change all relevant directory
pathnames!

>>With indirection of the strings to XML location address IDREFS (beginning
>>xla for clarity) this would become:
>>
>><a URL-method=http XML-DNS=xla12345 XML-path=xla34567 XML-file=xla98765
>>XML-fragment=xla76345>
>
>Now we see some indirection, but how does it differ semantically from:
><a URL="http://&xla12345;/&xla34567;/xla34567;alpha.html#X">

Only in that it points to id name space rather than entity name space. Id
name space is created in the document instance, which must be parsed by the
browser, whereas entity name space is created in the DTD, which is not
necessarily parsed by the browser.


>>>HyTime location ladders can specify this, and I think that the use of
>>>relative URLs, along with the location facilities we have been talking
>>>about, would actually let you do the equivalent with links.
>>
>>The problem is that HyTime location ladders don't seem to allow you to point
>>to subsections of location sources. For example, can I do a location address
>>just to a DNS, and then another just to a directory path at the DNS. I don't
>>think I can using pure HyTime, but I would sure like to for XML link >
>>management.
>
>You could do something like this, I think (Eliot: apologies for any
>syntactic gaffes, but I am prety sure of the semantics)
>
><a dest="url-chain">
><location id="url-chain" locsrc="chain1" url="#x">
><location id="chain1" locsrc="chain2" url="alpha.html">
><location id="chain2" locsrc="chain3" url="/oii/en/"

The problem is that the last location address is wrong as far as HyTime is
concerned as it does not point to a referencable entity. We would need DNS
and pathnames to be allowed as valid "entity-source-identifiers" for this to
work. There seems to be no reason why it should not, it just hasn't been
envisaged to date.

>We would need to define that relative URL semantics apply when a URL is the
>locsrc for a URL. This makes sense.
>
>However, I don't like the HyTime solution that much, as locsrc has all the
>decoupling problems of ilink, but offers less obvious value.

The value of locsrc is its reusability. You only need to define it once and
then reference it via the ID. This means you only have one entry to manage
for each location source. This is a vast saving in management for heavily
reused address components. It is also infinitely quicker to update, and
generally a safer management approach.

>I am not saying that it's worthless, but I think we can live without locsrc
>and still have a _lot_ of power.
>
>As far as I know, the attribute value syntax that martin is proposing could
>not be a valid HyTime location latter (though it could of course be a
>query, since any markup at all can be defined as a query).
>
>So Martin's proposed markup is only HyTime compatible in the weakest way.

I agree that the HyTime compatibility is weak, due to the problems stated
above, but the problem of name space management in a DTD-less environment
require that we rely on names in instances rather than names in DTDs. My
concern in all this is not whether it is technically feasible, but whether
it is realistically manageable over a long time period. As far as this goes
I will continue to play devil's advocate and suggest alternate ways of doing
things that might have better management potential.

>>Similarly the SOIBase concept of formal system identifiers does not allow
>>for segmentation of the base, which is what I am asking for.
>
>Entities can be used in the same way there too.

But they cannot be used in catalogs: how do I move my URL locators to
catalogs if they have entity declarations in them. Do I have to resolve them
and then update them latter? Bang goes my management idea of leaving
resolution of the URL until the point the file is actually called.

>Now I know that entities are (over) used in SGML to make up for language
>deficiencies, but I think that URL composition is actually a pretty natural
>use of entities: certainly it would not take long to explain to authors.

It would take no  longer to explain how my four/five attributes map to the
URL syntax: they are a direct mapping of the contructs in the URL RFC! And
they do not change name for each URL locator, while the entity names would
need to.

>And although it is somewhat verbose, any indirection as powerful as Martin
>seems to want is going to be verbose, so I don't feel that we're condemning
>naive users to endless obscurity if we decide that entities are good
>enough.

If you consider the HoTMetal URL creation menu verbose then how come
HoTMetaL is a leader in the field? All I am doing is say store the info
HoTMetaL creates in its separate boxes as separate attributes until such
time as you call the file, instead of resolving the complete URL the minute
you close the attribute menu.

The fact that URL definitions using ID locators is more verbose than doing
an entitiy declaration is irrelvant to users. The IDs and associated
declarations will be generated automatically by the editor without user
intervention. The question is whether browser recognition of entity
declarations takes longer than browser resolution of IDs. If the entity
declarations have to be read as part of the DTD I would contend that at
browser level HyTime location addressing may be faster than entity resolution!
----
Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK 
Phone/Fax: +44 1452 714029   WWW home page: http://www.u-net.com/~sgml/
Received on Thursday, 16 January 1997 05:55:33 UTC