Re: httpRange-14 from Roy T. Fielding on 2003-07-29 (www-tag@w3.org from July 2003)

From: Roy T. Fielding <fielding@apache.org>
Date: Tue, 29 Jul 2003 01:33:58 -0700
To: Tim Berners-Lee <timbl@w3.org>
Cc: Norman Walsh <Norman.Walsh@Sun.COM>, www-tag@w3.org
Message-Id: <65F93DB6-C19F-11D7-BF88-000393753936@apache.org>
>> You have left out all of the HTTP-based services that are not
>> documents, by any stretch of the imagination, and yet still are
>> identified by "http" URIs.  "almost all" is not ALL.
>
> You are of course right that there are services,
> but those still are not people. Norm's argument stands.

How can it stand?  If you point at one person and observe that they
are tall, does it stand to logic that all people are tall?  Or even
most?  No.  Assertions about an entire set are false if they are
found to be false for any member of that set.

> You are formally right, but with respect to Norm's argument,
> you are splitting hairs.
> If we call the things which are there for GET "documents"
> and the things which are there for POST, "services",
> and the things which are just there for HEAD "hopefuls",
> then all these classes of thing (of which we talk about documents
> mostly, as Norm did in his simplification) are still not people.
>
> If you take the union of all these things, it does not include people.

How do you know?  Do we need to turn this into a Turing test?

You are claiming that the identifier places restrictions on what
can be named because it can be used as an identifier in an anchor
(or submit, "Location" dialog, etc.) and activated.  That is the
basis of the "http" means "document" claim of httpRange-14.

I am saying that the information system matches identifiers to
representations according to a set of rules completely unknown
to the client.  Given any URI of any scheme, I can place that URI
into a correctly implemented user agent and it will be acted upon
as if it were what you call an "information resource".  That is
because the ONLY thing that makes it an information resource is
the context in which the URI was used: the retrieval context of
an information retrieval system.

If we take the same identifier and place it in a non-retrieval
context, such as an xmlns attribute or an RDF assertion, then
it no longer acts like an "information resource."  The URI does
not change, nor does the resource, so any claim that the scheme
causes the resource to fall into one category or another is false.
That holds true for all URI schemes, without exception.
There must not be any exceptions, since it is an axiom of the current
Web architecture that identification is orthogonal to interaction.
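
As a minimal sketch of the non-retrieval case, assuming Python's standard
xml.etree.ElementTree parser: the namespace name below is a hypothetical
"http" URI (http://example.org/vocab, not from the original message), and
the parser treats it purely as a name. Nothing is retrieved, and the URI
string is the same one that a user agent could dereference in a retrieval
context.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name, chosen only for illustration.
NS = "http://example.org/vocab"

# The same "http" URI is used here in a non-retrieval context (xmlns).
doc = f'<note xmlns="{NS}"><to>TAG</to></note>'


def namespace_of(tag: str) -> str:
    """Return the namespace part of an ElementTree tag in {ns}local form."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


root = ET.fromstring(doc)  # parses the string; no network access occurs
print(namespace_of(root.tag))
```

The parser only compares the URI as a string; whether anything can be
obtained from it by GET plays no part in how the name is used here.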

> It is still important for somebody who visits Mark Baker's web page
> to be able to make comments about it as a work without having
> to call him to find out whether the URI is being used for him or
> his dog or a galaxy somewhere.

It is important for somebody to be able to make comments about the
resource, what might be obtained from that resource, who manages
that resource, and how all of that might vary over time.  That stuff
is rarely defined by the scheme ("data" being one exception).

>> The fact is that "most users" don't know that the target of a submit
>> button is a URI that usually begins with "http".  It is also a fact
>> that most users of automated tools based on libwww-perl never see
>> the URI.  It is reasonable to expect the same will be true of WSA.
>> If most users think that "http" means document, then all I can say
>> is that most users are not developers of WWW technology.
>>
>> http URIs identify resources via the http naming convention.  That
>> is all that ever needs to be said, anywhere, for any system that
>> makes use of http URIs (semantic or otherwise).
>
> I am sorry, while there are two distinct things which are meant
> by a single URI (say, Mark and his web page), then the system
> is unusable for semantic web purposes until we can resolve which is used.

I don't believe that is true.  Use an explicit assertion; that is more
reliable than a false assumption, regardless of the system being defined.

> [...]
>> In any case, making claims about resources by examining their
>> scheme completely fails when considering all of the other schemes,
>> especially "urn".
>
> Nonsense.
> The mailto: URIs denote endpoints in a store and forward
> message-passing system called email.  The operations which can be
> performed are primarily to mail to them using SMTP, hence the name
> "mailto".  They also appear in the protocol in other ways.
> That was the intent of the Web design, and it remains it from my
> point of view.

Tim, you are fond of saying that the scheme refers to a specification
that, in turn, defines the meaning of identifiers within that scheme.
The mailto specification does not define itself as you claim.  mailto
is a means for obtaining a pre-filled composition for Internet mail.
mailto does not contain a naming authority -- at most it contains a
mailbox address (not necessarily qualified with a DNS domain) and a
set of field=value pairs for establishing the content of a message.
SMTP is not identified -- it is left to the user agent configuration
to determine how the message should be transferred.  It isn't possible
to follow that specification and claim that mailto == mailbox.
In any case, "mailto" != "urn".
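
The structure described above (no naming authority, a mailbox address, and
field=value pairs) can be seen by splitting a mailto URI with Python's
standard urllib.parse. This is a sketch under stated assumptions: the
helper name describe_mailto and the example address are hypothetical, and
generic URI splitting is not a full mailto parser.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def describe_mailto(uri: str) -> dict:
    """Split a mailto URI into the parts the message discusses."""
    parts = urlsplit(uri)
    if parts.scheme != "mailto":
        raise ValueError("not a mailto URI")
    return {
        # mailto has no "//authority" component, so this is empty.
        "authority": parts.netloc,
        # The mailbox address; it need not be qualified with a DNS domain.
        "mailbox": unquote(parts.path),
        # field=value pairs that pre-fill the composed message.
        "fields": parse_qs(parts.query),
    }


# Hypothetical example URI, used only for illustration.
info = describe_mailto(
    "mailto:roy@example.org?subject=httpRange-14&body=Hello%20TAG"
)
print(info)
```

Nothing in the split result names SMTP or any other transfer protocol;
how the message is sent is left to whatever agent consumes the URI.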

> They do not support GET, and I know you can say that you can make a 
> gateway to them dreaming up something which is for you a 
> representation of them, and I can too, but that is a kludge.

It is running code.  A system that real people purchased, made use of,
and later got swallowed by the Microsoft empire.  The only reason you
don't see it every day is because the folks at Netscape reduced the
flexibility of the Web by creating a "user-friendly" dialog with a
fixed set of known schemes to replace the prior unbounded set of
{scheme}_proxy environment variables -- a dialog that was later copied
by MSIE and Safari and doubtless many others.

> It does not render a representation of an information resource, it
> tells you what some program knows about messages sent to and from or
> the owner of the mailbox. Useful, but not a GET of the mailbox.
> (Folks, we are talking SMTP, not IMAP here!  Not a random access
> retrieval protocol, not a space of accessible information objects, but
> a posting and forwarding system)

No, we aren't talking SMTP here.  It would be really useful if this
discussion were grounded in what was implemented.

> It is really useful that we know that ISBNs apply to books, ISSNs to
> magazines, SSNs to people, vehicle license number to vehicle licenses.

SSNs to accounts.  Yes, it is useful.  It is also unnecessary.  More
importantly, in the case of both http and urn (the one I was talking
about), no such distinctions can be discovered by looking at the scheme.

> It is also really important in the architecture to be able to declare 
> new different types of conceptual object and make a new subspace of 
> the URI space for them to be.

Why?  I think it would be useful if we had a mechanism for stating
metadata about a resource using assertions.

>> There are no special categories of "information resource".
>
> You say no, I say yes.
> For me an information resource is an important concept.
> For me that is what the information space which is the (HTTP GET) web 
> is made of.
> It seemed, independently, quite clear to Pat too - in fact, he was
> appalled and confused by the arch doc's lack of distinction of the 
> concept.

He was appalled and confused by the claims made in the document about
the uniqueness of meaning for URIs.  Those claims did not come from me
or any discussion amongst the TAG.  We don't need separate classes of
resources to clear up the confusion, particularly since such a
distinction doesn't disambiguate a situation where one information
resource is representing another information resource.  What we need
is a clear separation of concerns between the process of identification
and the use of an information retrieval system.

>> Any URI
>> provided within a retrieval context is assumed to be an information
>> resource.
>
> The architecture is *not* one in which the classes are deduced from 
> context.

Well, then, what architecture are we talking about?  It obviously
isn't the one that contains HTML, DOM, URI, HTTP, etc.

>>   The scheme is irrelevant to such assumptions.  The right
>> solution is to fix the Semantic Web so that it doesn't throw away
>> method semantics, as it does currently by assuming a URI denotes
>> what is obtained by a response to GET.
>
> It does NOT assume that it denotes what you get in response.

Then you can't say anything about how the true nature of the
resource might differ from the pages you get interacting with it.
The claimed ambiguity was on the basis of retrieval, which means
GET.

> Please don't put words in people's mouths, incorrect ones.
> This is explained in painstaking detail in
> http://www.w3.org/DesignIssues/HTTP-URI.html,
>  which was presented at the TAG F2F in Vancouver in 2002-09.

Which has been responded to, in painstaking detail.  For that
matter, I would appreciate it if you would stop bringing out the
car example and claiming things about what I think when I have
already told you FIVE times now they are not what I think.

I am sorry, Tim, but on this issue you have consistently refused
to listen to any of my comments and those of the rest of the TAG --
you can't even distinguish between our comments and those of
Mark Baker.  You made up your mind a year ago and haven't heard
a word since.

In my considered opinion, your "desirable distinction" for SW is
undesirable for the deployed Web, fatal for Web Services, and
not considered necessary by anyone else I've talked to who
is currently working on SW.  I am not giving you that opinion
because I like to wear out my fingers on the keyboard or spend
oodles of money traveling to face-to-face meetings.

I have heard and understood every one of your comments, and I
understand that you believe it is important for the Semantic Web
to be able to distinguish these things.  However, the design
principle of separation of concerns that led to the "orthogonal
protocols deserve orthogonal specifications" constraint on the
Web architecture is far more important than any of the perceived
benefits you claim for SW.  ANY claim that a resource type
distinction can always be determined by examining the URI scheme
is false and always will be false, so if you build such an
assumption into the Semantic Web then you are dooming that system
to a rather dismal future.  Find another solution to your problem,
preferably one that doesn't run counter to the established design
principles that we have worked with for over a decade.

....Roy
Received on Tuesday, 29 July 2003 04:34:52 UTC