Re: Simple solution? Pub. Idents. vs URN. from Jon Bosak on 1996-11-28 (w3c-sgml-wg@w3.org from November 1996)

From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
Date: Thu, 28 Nov 1996 11:22:54 -0800
To: w3c-sgml-wg@w3.org
CC: bosak@atlantic-83.Eng.Sun.COM
Message-Id: <199611281922.LAA03161@boethius.eng.sun.com>

[Paul Prescod:]

| > Domain name resolution is when I'm given one of a possibly unlimited
| > number of names for a machine and I need to find the one machine
| > that it refers to.
| >
| > Document resolution is when I'm given the name of a document and I
| > need to find one of its possibly unlimited number of copies.
| 
| Not so. The W3C Addressing/Activity page describes how it is already
| possible for one domain name address to point to multiple IP addresses. So
| domain name resolution is *also* sometimes a many-to-one, one-to-one, or
| one-to-many mapping. That this process is currently difficult and underused
| is a bug in DNS and DNS systems, not in the basic DNS problem space.

I didn't know that, and if I had, I would have written what I said
differently.  But it doesn't change the point I was trying to make.

| > To think of "the document" as a thing bound to a location in the same
| > way that one thinks of "the machine" as a thing bound to a location
| > is, in my opinion, to commit a category error that hopelessly muddles
| > all further thought on the subject.  
| 
| The same is true of machines. If DNS does not allow me to set up a machine in
| Tokyo and Johannesburg with the same name as one in California, to serve the
| same information, then DNS (or DNS implementation) is incomplete.

The parallel between the multiple copies of a document and the
multiple mirrors of a machine is logically incorrect.

The key fact upon which everything else depends is that texts are
fundamentally different from every other thing in the world.  They
occupy a different level of reality and follow different rules.  This
is not some crackpot notion of mine (I have a few of those, but this
is not one of them).  There happens to have been some intense analysis
of this question over the last few decades, and I'm simply reporting
some of what I remember.

Consider two parallel cases.  You and I each have a copy of a book,
let us say Burton's Anatomy of Melancholy, and you and I each have a
computer that has been assigned the name boethius.eng.sun.com (that's
actually the name of my machine at work; let's suppose that it's the
name of yours, too).  Let us further hypothesize that my copy of
Burton is ever so different in its physical characteristics from
yours; mine is in three octavo volumes, two of which have been
repaired with some tape, in a green cloth binding, and was published
in 1893, whereas yours (let us imagine) is in a single leather-bound
quarto volume published in 1925.  And let us suppose by contrast that
the two computers are as physically alike as two computers possibly
can be: the same system configuration, the same peripherals, even the
same data, right down to the last bit.  We'll even have duplicate
serial number tags put on them.

But while our copies of Burton are as physically different as can be
imagined, and our copies of boethius are as physically similar as can
be imagined, the fact is that we are in possession of the identical
text, but we are not in possession of the identical computer.  And the
reason is that by "text" we do not mean its physical incarnation.
This is how texts are different from everything else in the world.

Here's the proof.  According to the Law of Identity (more precisely,
Leibniz's Law), two things, A and B, are identical if, and only if,
everything that can truly be predicated of A can truly be predicated
of B, and vice versa.  Let us apply this test to our parallel.

Suppose I say to you that in Burton's Anatomy of Melancholy, Part III,
Sect. IV, Mem. II, Subs. VI (Burton was an early user of TEI location
addressing), he says that the Cure of Despair can be summed up in the
statement

   Be not solitary, be not idle.

If you check your copy, you will find my statement to be true.
However dissimilar the physical realizations may be, we are in
possession of the identical text.  But if I say to you that the top of
the second hard drive housing of boethius.eng.sun.com is adorned with
a small clay bird that my daughter gave me as an office-warming
present, and you check your installation, you will find this not to be
true.  However similar their physical realizations may be, we are not
in possession of the identical computer.  (The fact that they do not
occupy the same physical space and therefore do not share the same GPS
coordinates would guarantee this anyway, but I wanted to make a nice
illustration.)

Thinking of copies of texts as being parallel to copies of machines is
a category error because the predicates that we are logically allowed
to apply to texts are different from the predicates that we are
logically allowed to apply to everything else.  We may predicate of a
text, for example, that it is well-written, or reeks of sentiment, or
evokes the nineteenth century, or indicates that the author had an
unhappy childhood, or supports my claim to the throne of the Kingdom
of Belgravia.  We cannot predicate these things of the file referred
to by a URL or of the pile of paper upon which the document may be
printed.

The point is that machine identifiers and document identifiers are
doing logically distinct things.  At the end of the DNS trail, a
domain name identifies some physical object or objects.  At the end of
the FPI trail, the FPI (if properly implemented; take ISBNs as the
canonical example) does not identify a physical object but rather any
of a possibly unlimited number of realizations of a nonphysical
object.  Yes, an FPI resolver has to be able to deliver you a physical
thing that you can read, but that thing is *not* what is being
identified by the FPI; the thing that's being identified is the text
itself, and the text itself does not have a physical location.
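
As a rough illustration of that distinction, here is a minimal Python
sketch.  Everything in it is invented for the purpose: the catalog, the
identifier string, and the names Realization and resolve are
assumptions, and nothing here corresponds to an actual FPI resolver.
It only shows the shape of the idea: the identifier names the text,
while location is a property of each individual copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Realization:
    """One concrete copy of a text (hypothetical model)."""
    location: str  # where this particular copy happens to be
    form: str      # its physical or stored form


# Hypothetical catalog: the key names the text, never a place.
CATALOG = {
    "burton-anatomy-of-melancholy": [
        Realization("Jon's shelf", "three octavo volumes, 1893"),
        Realization("your shelf", "one quarto volume, 1925"),
    ],
}


def resolve(text_id, catalog=CATALOG):
    """Return any one realization of the identified text."""
    copies = catalog.get(text_id, [])
    if not copies:
        raise LookupError(f"no known realization of {text_id!r}")
    return copies[0]  # any copy would do; the first is an arbitrary choice


mine, yours = CATALOG["burton-anatomy-of-melancholy"]
# The two copies are different objects with different locations.
print(mine == yours)
# Yet either one is an acceptable answer for the same identifier.
print(resolve("burton-anatomy-of-melancholy") in (mine, yours))
```

Note that the identifier never appears as a field of Realization and
no location appears in the identifier; in this sketch, swapping which
copy resolve returns would change nothing about what was identified.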

All of this would be completely irrelevant and certainly out of place
in this discussion if it weren't for my inability to escape the
feeling that a failure to understand the distinction between the text
and its realizations is at the root of a great deal of difficulty in
solving the document location problem.  Since my misgivings do not
point directly to a solution, however, this thread should probably be
taken as no more than a piece of free holiday philosophizing.

Jon
Received on Thursday, 28 November 1996 14:25:09 UTC