Re: URLs/MIME only? from Gavin Nicol on 1997-01-02 (w3c-sgml-wg@w3.org from January 1997)

From: Gavin Nicol <gtn@ebt.com>
Date: Thu, 2 Jan 1997 12:47:37 -0500
To: dgd@cs.bu.edu
CC: w3c-sgml-wg@www10.w3.org
Message-Id: <199701021747.MAA06470@nathaniel.ebt>
>>Why? I think this depends upon whether you see documents or
>>(p)elements as the atomic objects. I think there is a *very* strong
>>case for looking at elements as atoms, especially when we start
>>thinking about transclusion etc. We can build an awful lot upon the
>>two forms above in a *simple* and *standard* way.
>
>The problem that I see is that HTTP and current implementations make a
>pretty strong URL<->transmission unit equivalence. XML has a notion of
>transmission unit that is a natural one, and that is the entity. So we are
>making entities into fundamental addressable units. Now we could also make
>elements into addressable units, but doing so makes some pretty hard
>requirements on servers, if those URLs are to be resolvable. 

Then I would suggest that linking to servers that do *not* support such
a capability be done only with the URLs that that server can
understand. Each server has a list of addresses that it manages, and
it controls the addresses (i.e., servers that don't support sub-document
addressing never serve out URLs implying they do).

>Now I agree that a dynabase-like approach is the "right" way to build
>sophisticated servers, but I think the demands of document parsing,
>etc. that are entailed by making that approach mandatory will prevent
>the implementation of the lightweight, quick-hack servers that have
>driven the evolution of the web.

I don't think so. Again, each server controls its own address
space. However, I also think that any XML document that you
*could* address directly, will also be small enough so that you can
parse it on the fly, and serve out chunks, without significant delays
(especially in the face of caching proxies, and client-side
caching). For larger documents (SGML/XML), you're going to need
something like DynaWeb (not DynaBase) because nothing else will scale
as well (i.e., linear parsing becomes a real loser after a few hundred
kilobytes).
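
As a rough illustration of "parse it on the fly, and serve out chunks",
here is a minimal Python sketch. The address syntax (slash-separated,
1-based child indices) and the name `resolve_element` are assumptions
made purely for illustration; they are not the URL format proposed in
this thread.

```python
import xml.etree.ElementTree as ET


def resolve_element(xml_text, path):
    """Return the serialized element addressed by a child-index path.

    `path` is a hypothetical address such as "2/1": take the 2nd child
    of the root, then that element's 1st child. An empty path addresses
    the whole document (the flat, non-hierarchical case).
    """
    node = ET.fromstring(xml_text)  # linear parse of the whole document
    for step in filter(None, path.split("/")):
        index = int(step) - 1  # addresses are 1-based
        children = list(node)
        if not 0 <= index < len(children):
            raise LookupError(f"no child {step} under <{node.tag}>")
        node = children[index]
    return ET.tostring(node, encoding="unicode")


DOC = "<book><title>T</title><chapter><p>first</p><p>second</p></chapter></book>"
print(resolve_element(DOC, "2/2"))  # the second <p> of the chapter
```

Every request here re-parses the document from the start, which is
tolerable for small files and is exactly the cost that stops scaling
for large ones.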

Then there is also the question of addressing vs retrieval. Being able
to address something is entirely different from being able to retrieve
it, and in many cases, you do wish to address something, but *not*
retrieve it, except in the context of the original base document.

>>Points 2-4 are red herrings, and fall out of point 1.
>
>They may not be logically independent, but I don't think they are
>red-herrings, either. Doing as you ask requires a rather different model
>from the one we have been implicitly using:
>   XML entities <-> URLs to access them
>   XML elements must be parsed to be detected.

I have never seen it explicitly stated that there must be a 1-1
correspondence between a URL and a single given entity. In fact, given
content negotiation and whatnot, I doubt that you could actually
guarantee it anyway (ever seen a URL that retrieves different objects
based on HTTP fields?). 
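
A small Python sketch of the point, assuming a server that keeps two
hypothetical variants behind one URL and picks between them from the
Accept header. The names `VARIANTS` and `select_variant` are
illustrative, and the matching is deliberately simplified (q-values
are ignored).

```python
# Hypothetical representations stored behind a single URL.
VARIANTS = {
    "text/xml": "<greeting>hello</greeting>",
    "text/plain": "hello",
}


def select_variant(accept_header, variants=VARIANTS):
    """Pick the first media type in the Accept header that the server has.

    Simplification: q-values and wildcards other than */* are ignored.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "*/*":
            media_type = next(iter(variants))  # server's default variant
        if media_type in variants:
            return media_type, variants[media_type]
    return None  # a real server would answer 406 Not Acceptable


# Same URL, two different objects, depending only on a request field.
print(select_variant("text/plain"))
print(select_variant("text/xml;q=0.9, text/plain"))
```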

While I do agree that this is the *assumed* method, I (a) cannot see
such a big problem in creating servers that can serve out elements,
and, (b) never argued that all servers *must* have this
capability. Rather, I was suggesting a URL format that is generally
applicable to hierarchical data structures. If your server doesn't
*have* a hierarchical structure, then it doesn't need to use such
hierarchical addresses (I should note that the flat case is a sub-class
of the hierarchical case, so the same address format handles both cases).

>Yes, but it prevents the "slap XML files into an existing server, with a
>new mime type" deployment approach. It may be one hell of a convoluted noun
>phrase, but it is going to be a common implementation strategy.

It doesn't hinder such deployment at all (and in fact, I fully expect
that most *initial* implementations will be precisely this).

>>God forbid. That would (logically) mean that you could switch encoding
>>mid-stream... which is precisely how this fellow interpreted it!!
>
>Unless I'm wrong, PIs are still legal in XML in general, and they are
>unconstrained as to functionality (though we might want to state that PIs
>_cannot_ change parsing behavior, if we don't already). So he might even
>_have_ a conforming extension to XML, albeit a pretty vile one.

We probably need more language discussing the properties of the <?XML>
PI, or better, get the encoding stuff removed from it (especially as
now we have quite a bit of parsing to do before we can even come to
the supposed encoding declaration).
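
To make the chicken-and-egg problem concrete, here is a hedged Python
sketch. It assumes the declaration carries an `encoding="..."`
pseudo-attribute and that everything up to it is ASCII-compatible; the
name `declared_encoding` is hypothetical. The receiver has to guess
enough of the encoding to read the bytes that are supposed to announce
the encoding.

```python
import re

# Matches an encoding pseudo-attribute inside a leading <?xml ...?> PI.
# Assumption: the bytes up to and including the declaration are
# ASCII-compatible, which is the very thing we are trying to find out.
_DECL = re.compile(
    rb'^\s*<\?xml\b[^>]*?\bencoding\s*=\s*'
    rb'["\']([A-Za-z][A-Za-z0-9._-]*)["\']',
    re.IGNORECASE,
)


def declared_encoding(raw):
    """Return the declared encoding name, or None if it cannot be read."""
    match = _DECL.match(raw[:200])
    return match.group(1).decode("ascii") if match else None


text = '<?XML version="1.0" encoding="ISO-8859-1"?><doc/>'
print(declared_encoding(text.encode("ascii")))    # readable as ASCII
print(declared_encoding(text.encode("utf-16")))   # not readable this way
```

The second call cannot find the declaration at all, because the bytes
that spell it are not ASCII in that encoding.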
Received on Thursday, 2 January 1997 12:49:35 UTC