Re: MIME, SGML, UDIs, HTML and W3

Bob Peterson (peterson@choctaw.csc.ti.com)
Fri, 19 Jun 92 13:29:14 CDT


From: peterson@choctaw.csc.ti.com (Bob Peterson)
Message-Id: <9206191829.AA23472@choctaw.csc.ti.com>
Subject: Re: MIME, SGML, UDIs, HTML and W3
To: timbl@zippy.lcs.mit.edu (Tim Berners-Lee)
Date: Fri, 19 Jun 92 13:29:14 CDT
Cc: connolly@pixel.convex.com, enag@ifi.uio.no, www-talk@nxoc01.cern.ch,
In-Reply-To: <9206111622.AA03819@zippy.lcs.mit.edu>; from "Tim Berners-Lee" at Jun 11, 92 12:22 pm

You said:
> 
> 	...
> PHILOSOPHY
> 
> 	In the W3 world, the model is of a dynamic world of
> 	documents which generally have some "home"
> 	(or several), which can be found using sufficient
> 	intelligence and the help of one's friends given the UDI.

  My group has thought about the identity issue in the context of
object identifiers in a distributed object-oriented database.

> 	A mail message has no home, and so in principle the parts
> 	of it have no home. When a hypertext multipart message
> 	(really consisting of multiple hypertext documents)
> 	has links between its parts they refer to each other
> 	within a completely isolated context.

  In the OODB we think of an address (UDI or object identifier) as
relative to some enclosing context.  Different parts of an address make
sense only in the correct context.  For example, the mail system
accesses several address contexts to resolve a mail address such as
peterson@csc.ti.com: .com, ti.com, csc.ti.com, and the email address
namespace.  Each context understands its part and returns a reference
to the next, usually more specific, context.  The program(s) attempting
to resolve the address understand the result of an address lookup, and
use each result appropriately.
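
  As a minimal sketch of this chain of contexts (the Context class,
the resolve function and the mailbox value are hypothetical names made
up for illustration, not part of any real mail system), each context
resolves only its own part and hands back the next context:

```python
# Hypothetical sketch: each context understands one part of the address
# and returns a reference to the next, more specific, context.
class Context:
    def __init__(self, name):
        self.name = name
        self.entries = {}  # part -> nested Context, or a final value

    def bind(self, part, target):
        self.entries[part] = target
        return target

    def lookup(self, part):
        # Raises KeyError if this context does not know the part.
        return self.entries[part]


def resolve(root, address):
    user, domain = address.split("@")
    context = root
    visited = []
    # Walk the domain from the most global label to the most specific.
    for label in reversed(domain.split(".")):
        context = context.lookup(label)
        visited.append(context.name)
    # The final context acts as the email address namespace.
    return visited, context.lookup(user)


root = Context("root")
com = root.bind("com", Context(".com"))
ti = com.bind("ti", Context("ti.com"))
csc = ti.bind("csc", Context("csc.ti.com"))
csc.bind("peterson", "mailbox for peterson")

visited, mailbox = resolve(root, "peterson@csc.ti.com")
print(visited, mailbox)
```

  The resolver never needs to know what is inside a context; it only
needs to understand what a lookup returns and pass the next part along.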

  I claim a UDI makes sense only in a particular context.  If a UDI
makes explicit all contexts except the most global, then a UDI easily
refers to a different part of the same multipart message.

> 	There are now two possibilities when the message is in fact
> 	archived and made readable. One is we say that the parts
> 	are then addressed as parts of the message, wherever it
> 	may be.

  This might enable operating on a message when the "home" is the
process' address space, i.e., before the message is placed into a file
system or other addressing context.  In effect the context is the
machine and the process' address space, but these can be, and generally
are, defaulted or assumed rather than explicitly stated.

>	         The other is to say that the parts of the message
> 	are very likely things which had some original home.
> 	In that case, the message is just giving the receiver
> 	a copy to save him the (perhaps insurmountable) trouble
> 	of retrieving it.  In this case the parts should be
> 	identified with their original UDIs so that the
> 	receiver is not confused with multiple documents which
> 	are in fact the same thing. 

  I wonder about attaching two UDIs to a message: a (required)
absolute UDI, referring to the original home, and a second (optional)
UDI referring to a "less expensive" copy.  ("Less expensive" is, of
course, arbitrarily defined.)  Think of the latter as a hint, i.e., if
the user attempts to resolve the UDI the system first looks for the
hint and, if found, uses it.  If the hint is absent or fails, then the
system tries to use the (more expensive) required UDI.

  Of course thinking about this might be simpler if we refer to one UDI
with two parts: one required, the other optional.
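
  A rough sketch of the one-UDI-with-two-parts idea follows.  The
TwoPartUDI type, the fetch callback and the address strings are
assumptions invented for the example; no real UDI syntax is implied.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TwoPartUDI:
    original: str               # required: the document's original home
    hint: Optional[str] = None  # optional: a "less expensive" copy


def dereference(udi: TwoPartUDI, fetch: Callable[[str], str]) -> str:
    """Try the hint first; fall back to the required original UDI."""
    if udi.hint is not None:
        try:
            return fetch(udi.hint)
        except LookupError:
            pass  # the hint failed, so fall through to the original
    return fetch(udi.original)


# Hypothetical store in which only the original home is reachable.
store = {"original-home/doc": "document text"}


def fetch(address: str) -> str:
    return store[address]  # a missing key raises KeyError, a LookupError


udi = TwoPartUDI(original="original-home/doc", hint="message-part/2")
result = dereference(udi, fetch)
print(result)
```

  Here the hint points at a copy that is not available, so the lookup
quietly falls back to the original home.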

  Benefits of this approach include retaining the reference to the
original site while, at the same time, supporting replication of the
document in an arbitrary number of locations.  If the optional UDI is
relative to the containing message then (1) the reference never fails,
and (2) performance is excellent.  Retaining the original UDI should
help some applications monitor the original for revisions, e.g., an
archive site could cache a document but check periodically with the
original site for an updated version.  Retaining the original can also
help resolve the validity of a document, e.g., by enabling comparison
of the original and cached copies.
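
  The periodic check against the original might look like the sketch
below.  It assumes the archive site can fetch the original's bytes
through some callback (fetch_original is a made-up name) and compares
digests to decide whether the cached copy is stale.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def refresh_if_stale(cached: bytes, fetch_original):
    """Return (document, changed): the current bytes and whether the
    cached copy differed from the original."""
    original = fetch_original()
    if digest(original) != digest(cached):
        return original, True
    return cached, False


document, changed = refresh_if_stale(b"version 1", lambda: b"version 2")
print(document, changed)
```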

  One could implement the optional UDI as a table external to the
document.  When dereferencing a UDI the table is checked first and, if
the UDI is found, the associated optional UDI is used.  This has the
advantage of not modifying the original document, including not
changing the result of any error detection arithmetic, e.g., checksums.
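
  A sketch of the external table, again with made-up names
(dereference_with_table, hint_table, the address strings) and a
CRC-32 standing in for whatever error detection is actually in use:

```python
import zlib


def dereference_with_table(udi, hint_table, fetch):
    """Check the external table first; fall back to the UDI itself."""
    hint = hint_table.get(udi)
    if hint is not None:
        try:
            return fetch(hint)
        except LookupError:
            pass  # the hinted copy is gone, so use the original UDI
    return fetch(udi)


# The document itself is never touched, so its checksum is stable.
document = b"See original-home/figure for the figure."
checksum_before = zlib.crc32(document)

# The hint lives outside the document, keyed by the original UDI.
hint_table = {"original-home/figure": "local-cache/figure"}

copies = {
    "original-home/figure": "figure (original)",
    "local-cache/figure": "figure (cached copy)",
}


def fetch(address):
    return copies[address]  # a missing key raises KeyError, a LookupError


result = dereference_with_table("original-home/figure", hint_table, fetch)
checksum_after = zlib.crc32(document)
print(result, checksum_before == checksum_after)
```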

> 
> I think that's all the comments I have on what I've read so far..
> 
> 	Tim

    Bob

-- 
Bob Peterson             Work: peterson@csc.ti.com              Expressway Site
Texas Instruments        Home: peterson@zgnews.lonestar.org     North Building
P.O. Box 655474, MS238   TIMSG: RWP  Landline: +1 214 995 6080  Aisle A4
Dallas, Tx USA 75265                 FAX line: +1 214 995 0304  2-88V97