Re: Documents, Cars, Hills, and Valleys

----- Original Message -----
From: "Mark Nottingham" <mnot@mnot.net>
To: "Sean B. Palmer" <sean@mysterylights.com>
Cc: <www-rdf-interest@w3.org>; "Tim Berners-Lee" <timbl@w3.org>
Sent: Wednesday, April 10, 2002 3:57 PM
Subject: Re: Documents, Cars, Hills, and Valleys


> My .02 -
>
> I'm on the Car side. The URI identifies a thing, whether it be document,
> car, or doomsday device.

Yes, in general.

> You manipulate and view it with things that you
> might call 'documents' (although I prefer Jeff Mogul's attempt [1] at a
> more precise terminology).

The documents and the cars are different, however, for HTTP.
This is not a property of URIs. It is a fact of HTTP.

> Otherwise, what are CGI/ASP/JSP/CFM/PHP/XSLT/etc. scripts?

They are not themselves on the web. As Roy would say, don't confuse
the resource with its implementation. If your home page is implemented
with a cgi script, that doesn't mean it *is* a cgi script.
Some php3-implemented resources actually have a link to the php3 script
as a separate resource.  That aside, the script isn't a resource on
the web any more than the cooling fan of the server which runs the script is
the resource.

The http: part of the web is a web of well-defined resources, even if some
are generic with respect to specific axes such as content-type, version,
and language.

> They're
> clearly not just identifying documents; they're interpreted on the
> server, and involve state that often doesn't reside in those "documents."

You are introducing concepts of "residing inside" which are not part of the
architecture.
The script is part of the implementation of the server for a document
whose content changes with time and maybe other things.


> People are a bit more difficult, yes, but some form of gateway should do
> the trick, even if it only involves levers, lights and food pellets.
>
> [ As such, gateways act as a form of information hiding; when you access
> them, you're accessing a constrained interface to the identified thing,
> rather than the thing itself. I don't think this introduces a problem,
> though, because HTTP itself is a constraining interface. ]
>
> All of this said, I think it generally bad practice to say
>    <http://www.mnot.net/> a :Person .
> but instead, it's better to say
>    [ :homepage <http://www.mnot.net/> ] a :Person .

Much better! In fact, the only tenable position.
Because if you adopt the notion that
<http://www.mnot.net/> a :Person.

I would be forced to conclude that you, Mark, will expire
alas too soon: [1]

Expires: Thu, 11 Apr 2002 09:14:08 GMT

<http://www.mnot.net/> a :Person;
         http:expires "20020411T091408".

which gives you only a few hours. Sad.
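
The step from the header to the statement above is mechanical. As a minimal
sketch in Python, assuming the compact timestamp form used in the N3 above
and taking the Date value from the headers in [1] (the function names are
hypothetical, chosen only for illustration):

```python
from email.utils import parsedate_to_datetime


def expires_to_stamp(header_value: str) -> str:
    """Turn an HTTP date (e.g. an Expires value) into a compact timestamp."""
    moment = parsedate_to_datetime(header_value)
    return moment.strftime("%Y%m%dT%H%M%S")


def expiry_statement(uri: str, expires_value: str) -> str:
    """Build the N3-style statement; the http:expires property is the one
    used in the message, not a published vocabulary term."""
    return f'<{uri}> http:expires "{expires_to_stamp(expires_value)}".'


def seconds_until_expiry(date_value: str, expires_value: str) -> int:
    """Lifetime of the response as seen at the moment the server sent it."""
    delta = parsedate_to_datetime(expires_value) - parsedate_to_datetime(date_value)
    return int(delta.total_seconds())


if __name__ == "__main__":
    date = "Wed, 10 Apr 2002 21:14:08 GMT"
    expires = "Thu, 11 Apr 2002 09:14:08 GMT"
    print(expiry_statement("http://www.mnot.net/", expires))
    print(seconds_until_expiry(date, expires))
```

With the Date and Expires values from [1], the last function gives 43200
seconds, i.e. twelve hours, which matches the max-age the server sent.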

("Off with her HEAD!" cried the Queen. "Oh, surely
you mean off with the HEAD of the HTTP reply
which returned a representation of a picture of me?"
protested Alice. "Same thing! GET me her HEAD!"
cried the Queen)


My point, which I have made again and again, is that HTTP GET
is a protocol for talking about generic documents.
(One can argue different things about POSTable services,
but let's stick to the web as an information space for now.)

You could imagine a protocol (say SWTP) which directly
responds to requests about things.

<swtp://www.mnot.net/> a :Person.

is quite reasonable, and the SWTP protocol would be written
to return a document containing information related to
the thing identified. It could contain all sorts of information
about the person involved, and when different bits of that
information expire.  It would maybe give a separate identifier to
the document it returns.

But that protocol is not HTTP.
HTTP has a lot of sophisticated design for the rendering of
generic documents.   To try and force it into swtp: functionality
is a kludge which would ruin it.

The semantic web must model HTTP faithfully.

The solution of course is very simple, because the # allows us
to jump from documents to things through the mime type.
We just invent a new language, not a new protocol. This cost is
much smaller.

Because even a semantic web which talks about Mark and his cars
is still doing it with documents, and the document way of working
is still useful, and the HTTP machinery for talking about the
properties of the documents themselves is important.

Given that the # allows us to be free of any restriction, we avoid forcing
HTTP to be what it ain't and still get all we need.
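
The division of labour can be sketched in a few lines of Python. The
fragment is stripped before anything goes over the wire, so HTTP only ever
deals with the document; what the local identifier denotes is up to the
language of the document that comes back. The URI http://www.mnot.net/#mark
and the function name are assumptions made for illustration only:

```python
from urllib.parse import urldefrag


def split_reference(uri: str) -> tuple[str, str]:
    """Separate the part HTTP acts on from the local identifier.

    The first element is what a client would GET (a document); the second
    is interpreted according to the media type of that document and may
    name anything the document's language can talk about.
    """
    document_uri, local_id = urldefrag(uri)
    return document_uri, local_id


if __name__ == "__main__":
    # Hypothetical identifier for a person described in a home page.
    doc, thing = split_reference("http://www.mnot.net/#mark")
    print("GET", doc)               # HTTP headers such as Expires apply here
    print("local identifier:", thing)
```

Any Expires header on the response then describes the document, and nothing
is implied about the lifetime of whatever the local identifier names.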

> Because AFAIK all homepages *are* documents or scripts or some other
> form of machine-based state, not gateways to people (counterexamples
> gratefully accepted). This situation, however, shouldn't be used to
> justify a restrictive characterisation of the resources which happen to
> be identified with the scheme 'http'.
>
> It may be interesting to look at other schemes and see how they restrict
> what is identified. From what I can see, schemes which can be
> dereferenced - whether it be imap, ftp or tel - tend to restrict how you
> access something, rather than define what something is. A 'tel' URI
> might identify a person, an answering machine, or dial-a-date.

That would be very fuzzy thinking.  The notion of a telephone number is
a very commonly understood one.  They have certain very nice properties.
You can use them indirectly, through a property, to identify someone,
but they are NOT that person.   A "tel:" URI does not identify
a person (identify in the sense of the I in URI).

You can write

:mark a :Person;  contact:home [ contact:phone "+1-123-456-7890" ].

and that may indeed identify Mark.  And the concepts of home
and phone number as relationships may be good or bad or indifferent in
your application. But the phone number space identifies endpoints
in the POTS protocol, things with which there is an expectation
(from the POTS protocol definition) that
one can establish some form of limited-bandwidth communication,
but that is all.

> On that point, I'm a bit surprised that TBL advocates the document view
> (or perhaps I just misunderstand the framing of the issue).

I only advocate it for HTTP.

You, your home page, and your phone number are all distinct.
It is important never to give them the same URI.
Properties make it trivial to indirectly identify someone through their
home page or their telephone number.
In natural language, these things don't matter, but when building
machines, they do.

> The axiom of
> universality [2] dictates that "The Web works best when anything of
> value and identity is a first class object." Part of the test of
> independent invention [3] says that it should be possible to gateway new
> systems into old ones (e.g., the Person Identity Protocol into HTTP).
> Taken together, these arguments evoke more than 'document.'

You are arguing here generally, about evoking things.
This applies to URIs in general.

Let us design the gateway in practice.  We have
to imagine what the pip: space offers. Suppose
you can get information about a person and call them.

   pip:/us/ny/nyc/1965/12/11/Allen/Joe.745  is mapped to
   http://pip-gateway.us/ny/nyc/1965/12/11#Allen_Joe_745

The gateway has the property of giving an HTTP client (which
understands documents) a document about a bunch of people,
where the local identifier within that document Allen_Joe_745
identifies not a part of the document but a person,
Joe Allen.  The document would have to be in a semantic web
language which can talk about people. RDF would do fine.
So an RDF client would get information about a person
through a gateway from a system which has identifiers for
people.  It wouldn't get the person, of course.

The RDF could contain information that the person could
be called at the H.XXX endpoint

callto://ny_nyc_1965_12_11_Allen_Joe_745.pip-gateway.us/

so we have mapped the functionality as well as we could.
That is all you can do with a gateway.
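
The mapping above can be sketched as follows in Python. This is a guess at
one rule consistent with the single example given, not a definition of any
real pip: scheme; the function names and the reading of the first path
segment as a country code are assumptions:

```python
GATEWAY_PREFIX = "pip-gateway"  # host name pattern taken from the example


def _split_pip(pip_uri: str) -> tuple[str, list[str], str]:
    """Break a pip: URI into country, middle path segments and local id."""
    scheme, _, path = pip_uri.partition(":")
    if scheme != "pip" or not path.startswith("/"):
        raise ValueError(f"not a pip: URI: {pip_uri!r}")
    segments = path.strip("/").split("/")
    if len(segments) < 3:
        raise ValueError(f"too few path segments: {pip_uri!r}")
    country, *middle, family, given = segments
    # "Allen" + "Joe.745" becomes the local identifier "Allen_Joe_745".
    local_id = f"{family}_{given}".replace(".", "_")
    return country, middle, local_id


def pip_to_http(pip_uri: str) -> str:
    """Document about a group of people, with a fragment naming one person."""
    country, middle, local_id = _split_pip(pip_uri)
    return f"http://{GATEWAY_PREFIX}.{country}/{'/'.join(middle)}#{local_id}"


def pip_to_callto(pip_uri: str) -> str:
    """Endpoint at which the person could be called, per the example."""
    country, middle, local_id = _split_pip(pip_uri)
    label = "_".join(middle + [local_id])
    return f"callto://{label}.{GATEWAY_PREFIX}.{country}/"


if __name__ == "__main__":
    pip = "pip:/us/ny/nyc/1965/12/11/Allen/Joe.745"
    print(pip_to_http(pip))
    print(pip_to_callto(pip))
```

An HTTP client dereferencing the first result fetches only the part before
the #, which is the document about everyone sharing that path.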

Tim
>
> 1. http://www.research.compaq.com/wrl/people/mogul/www2002/mogulwww2002preprint.pdf
> 2. http://www.w3.org/DesignIssues/Axioms.html#uri
> 3. http://www.w3.org/DesignIssues/Evolution.html#ToII
>
> --
> Mark Nottingham
> http://www.mnot.net/


[1]
http://cgi.w3.org/cgi-bin/headers?url=http%3A%2F%2Fwww.mnot.net%2F&auth=on
  contained when I sampled it:

200 OK
Server: Apache/1.3.12
Content-Type: text/html
Expires: Thu, 11 Apr 2002 09:14:08 GMT
Accept-Ranges: bytes
Date: Wed, 10 Apr 2002 21:14:08 GMT
Cache-Control: max-age=43200
Connection: close
Etag: "2ec2a-2319-3cac5305"
Content-Length: 8985
Last-Modified: Thu, 04 Apr 2002 13:20:05 GMT

Received on Wednesday, 10 April 2002 18:21:07 UTC