Re: Uniform access to descriptions from Michaeljohn Clement on 2008-04-13 (www-tag@w3.org from April 2008)

From: Michaeljohn Clement <mj@mjclement.com>
Date: Sun, 13 Apr 2008 16:17:16 -0600
To: wangxiao@musc.edu
CC: Pat Hayes <phayes@ihmc.us>, "www-tag@w3.org WG" <www-tag@w3.org>, noah_mendelsohn@us.ibm.com, Jonathan Rees <jar@creativecommons.org>, Phil Archer <parcher@icra.org>, "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
Message-ID: <4802866C.1030905@mjclement.com>

Xiaoshu Wang wrote:
> Michaeljohn Clement wrote:
>> Either the URI from which you get a 200 OK response
>> identifies an information resource, in which case we can make
>> statements about it, or it does not in which case we cannot any longer
>> make statements about the page by using the URI.
> 
> Then, you haven't gotten to the essence of what I tried to say. It does
> not lose that capability.

You are right.  My statement "we cannot any longer make statements 
about the page by using the URI" was too strong.  However, we cannot 
do so as easily.  If the URI does not unambiguously identify the Web 
page, then we must go the extra step of creating something that does.

> First, whatever information resource is, 200
> doesn't allow you to identify that resource unless you know it is a
> byte-copy of that resource. 

That assertion is essentially the negation of the httpRange-14 
resolution.

But you are wrong: there is no such thing as a byte-copy of an 
information resource (in the AWWW sense).  Obviously a clear 
definition of what an IR is has been hard to come by, but it is 
not something of which one can take a byte-copy.  Anything that 
can be byte-copied would be an awww:representation.

> If there is more than one content-type
> bound to the URI, regardless of the old view or my view, you already lose
> that capability.

Not at all.  The representations served under those content-types 
are all awww:representations of the same awww:resource, and the 
"old view" is carefully constructed to preserve just that capability.

>> We can't even say what the URI identifies anymore without getting
>> out-of-band data about it, which will not often exist.
>   
> This again is wrong.  Don't you know what
> "http://www.ihmc.us/users/phayes/PatHayes" denotes by reading it?

Of course: it (the URI) denotes a resource.  Which turns out to be 
a Web page which thinks it is a person.

In your view, of course, it denotes Pat Hayes.

Now if I were to say:

"http://www.ihmc.us/users/phayes/PatHayes is wrong."

That statement becomes ambiguous: it could be about the page or 
about Pat Hayes.

Incidentally, this corresponds well with normal natural language 
usage.  If I want to talk about Pat Hayes, I will use his name, 
not his URI.  If I do use the URI casually and without further 
qualification, it is likely that I mean to say something about 
the page itself.

(Note that that URI in our own discussions has become, not a 
synonym of "Pat Hayes", but rather the name of "Pat's famous 
page".)

> Don't
> you know what "http://www.w3.org" is about by reading it?

I guess you mean reading the content of the page, not reading 
the URI, but even if I read both, I don't know what 
"http://www.w3.org/" is /about/, though I do know what it names, 
namely, the homepage of the W3C.

> What do you mean *out-of-band* data? 

In ordinary usage, say, in a browser, once a page is loaded, I 
can use the URI to identify the page and make statements about it.

If, as you say, the status code is not sufficient to tell me that I 
can do this, then I need additional data to say that the URI I 
loaded actually identifies the page, rather than, say, the moon, or 
Pat Hayes.

My browser may give me a button that I can use to rate the page, or 
to indicate that I think it is a spam page, or some assertion of that 
nature.  This could be done by publishing triples to my own personal 
Web space.  All of those obvious applications are hindered if I cannot 
easily use the URI to identify the page, are they not?
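
As a minimal sketch of the "rate this page" case above: the URI the 
browser loaded is used directly as the subject of a statement, which 
is then serialized as one N-Triples line.  The vocabulary namespace 
and the function names are hypothetical, invented only for this 
illustration.

```python
# Hypothetical vocabulary namespace; not a real, published one.
REVIEW = "http://example.org/review#"


def ntriple(subject: str, predicate: str, value: str) -> str:
    """Serialize one statement with a plain literal object as N-Triples."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


def rate_page(page_uri: str, rating: str) -> str:
    # The loaded URI itself is the subject; no new name has to be minted.
    return ntriple(page_uri, REVIEW + "rating", rating)


print(rate_page("http://www.w3.org/", "useful"))
# <http://www.w3.org/> <http://example.org/review#rating> "useful" .
```

This only works as intended if the URI can be taken to identify the 
page, which is the point under dispute.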

We can also embrace ambiguity, as Roy Fielding would have us do, and 
throw up our hands and say that a URI may well identify both the moon 
and a Web page, and we will rely on context to distinguish them.  Then 
we may have to invent something new to do what RDF was intended to do, 
since it will become impossible to reason about triples that use URIs 
without first implementing a general AI to distinguish when two uses 
of the same URI actually identify the same thing and when they do not, 
even when that URI is only used correctly.

> Or you intend to get a *complete* knowledge of
> what a URI denotes? 

Is there a distinction you are making between "denotes" and 
"identifies"?  In any case, I don't want to have to get a complete 
knowledge about anything in order to be able to say that 
"http://www.w3.org/" identifies the W3C homepage.

Once I see "HTTP/1.1 200 OK" come over the wire, I know enough.
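
A rough sketch of the rule being relied on here, following the 
httpRange-14 resolution: a 2xx response says the URI identifies an 
information resource, a 303 says it could identify anything, and an 
error says nothing.  The function name and the returned strings are 
hypothetical, and other status codes are simply lumped in with 
"unknown" for the sake of the sketch.

```python
def what_status_tells_us(status: int) -> str:
    """Map an HTTP status code to what it says about the identified resource."""
    if 200 <= status < 300:
        # httpRange-14 (a): the URI identifies an information resource.
        return "information resource"
    if status == 303:
        # httpRange-14 (b): the URI could identify any resource at all.
        return "any resource"
    # httpRange-14 (c) covers 4xx errors; everything else is treated the
    # same way here as a simplifying assumption.
    return "unknown"


for code in (200, 303, 404):
    print(code, "->", what_status_tells_us(code))
```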

> But isn't it a reality that we never have
> complete knowledge about a reality?

An information resource isn't a reality, it's a part of an abstract 
model.  That model is a useful approximation of what we want our 
software to do, so it helps us write interoperable software.

Within that model, we have complete knowledge about what kind of 
thing a URI identifies, because we get to choose and define what 
our formalisms mean.

> See above.  In human language, use prepositions such as "the HTML
> representation of resource x".  In machine, build some terminologies,
> such as "_:y abc:htmlOf x".  Then, is the meaning of _:y clear now?
> Wouldn't this, in fact, give you a more precise way to describe something?

It gives a more precise way by taking away the easy way to make 
statements about resource x without having to first create and 
define a new name for it.  In a distributed system like the Web, 
if we have to re-name other people's Web resources before we can 
make statements about them, it complicates doing so.
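
To make the cost concrete, here is a small sketch comparing the two 
styles.  The abc: prefix comes from the quoted proposal; its 
expansion, the rating vocabulary and the function names are all 
hypothetical.

```python
ABC = "http://example.org/abc#"        # hypothetical expansion of abc:
REVIEW = "http://example.org/review#"  # hypothetical rating vocabulary


def direct(page_uri: str, rating: str) -> list:
    """One triple: the page's own URI is the subject."""
    return [(f"<{page_uri}>", f"<{REVIEW}rating>", f'"{rating}"')]


def indirect(resource_uri: str, rating: str) -> list:
    """Two triples: first name the HTML representation, then rate it."""
    node = "_:y"  # blank node label, meaningful only inside this one graph
    return [
        (node, f"<{ABC}htmlOf>", f"<{resource_uri}>"),
        (node, f"<{REVIEW}rating>", f'"{rating}"'),
    ]


for s, p, o in direct("http://www.w3.org/", "useful"):
    print(s, p, o, ".")
for s, p, o in indirect("http://www.w3.org/", "useful"):
    print(s, p, o, ".")
```

The indirect form needs an extra triple and an agreed-upon htmlOf 
term, and because a blank node label is local to one graph, nobody 
else can refer to _:y from their own statements.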

Michaeljohn
Received on Sunday, 13 April 2008 22:18:04 UTC