Re: usage of 'resource' vs 'representation' in HTML 5, CSS, HTML 4, SVG, ... from Jonathan Rees on 2010-01-11 (www-tag@w3.org from January 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Mon, 11 Jan 2010 08:41:56 -0500
To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Cc: Dan Connolly <connolly@w3.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <760bcb2a1001110541n62140b75s7a3bff3f7c46c3de@mail.gmail.com>
I, too, am sympathetic to arguments that say you shouldn't talk about
"things that don't exist". Conversations about phlogiston and gremlins
generally come to no good. But existence is a very slippery idea, and
even if we agreed on what existed and what didn't, it's not clear that
any agreement on a theory of existence would have any place in a
technical specification.

Human conversation is full of statements whose subjects and objects
are things that "don't exist". In law we have things like
corporations, courts, suits, easements, etc., none of which has better
claim to "existence" than HTTP's "resource". Computer science and the
web are populated almost exclusively with things that don't exist:
algorithms, data structures, files, databases, services, virtual
hosts, protocols, and so on.

I think the right question is not whether such entities exist, but
whether speech using such designators as subjects and objects (a) has
useful and clear consequences and (b) does not lead us into fallacies.
I suspect these considerations are probably closer to the substance of
the original complaint about "resource" than any opinions about
existence.

The "resource" fiction on the surface seems to be a back-construction:
R is whatever it has to be in order for a server authorized to
"associate" R with a URI U to be correctly implementing the HTTP spec
(which talks about resources from time to time) when it responds as it
does. That is, if I have a statement (in a spec, say) P(R) where R
putatively designates a resource, there is always a logically
equivalent statement P'(S,U) where S is a server authorized to process
requests for URI U designating R. Because I never have to talk about R
if I don't want to, the notion of "resource" is not *necessary*.
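
The equivalence between P(R) and P'(S,U) can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the `Server`
and `Resource` classes, the two predicates, and the example URIs are all
hypothetical and come from no spec. The point is that the resource-flavored
statement is defined entirely in terms of the server-and-URI statement.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server S: maps a URI to a (media_type, body) response."""
    responses: dict[str, tuple[str, bytes]] = field(default_factory=dict)

    def get(self, uri: str) -> tuple[str, bytes]:
        return self.responses[uri]


@dataclass
class Resource:
    """The 'fiction' R: here nothing more than a (server, URI) pair."""
    server: Server
    uri: str


def serves_html(server: Server, uri: str) -> bool:
    """P'(S, U): a statement about what server S does at URI U."""
    media_type, _body = server.get(uri)
    return media_type == "text/html"


def has_html_representation(resource: Resource) -> bool:
    """P(R): the same statement, phrased in terms of a resource."""
    return serves_html(resource.server, resource.uri)


# Hypothetical data for the sketch.
s = Server({
    "http://example.org/xyz.html": ("text/html", b"<p>hello</p>"),
    "http://example.org/logo.png": ("image/png", b""),
})
r = Resource(s, "http://example.org/xyz.html")

# The two phrasings always agree, so R can be eliminated when convenient.
print(has_html_representation(r) == serves_html(s, r.uri))
```

Whether the `Resource` wrapper earns its keep is exactly the question of
utility and convenience, not of existence.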

If someone wants "resource" in the HTTP sense to remain in the HTML
spec, even if under a different term, its use should be justified
based on its utility and convenience in describing and reasoning about
what's going on (and it should be shown to not promote fallacies).
Just as it is easier to talk about a table in a relational database,
instead of a DBMS's behavior when some string S appears in "select ...
from S..." (as if the DBMS were "real"!), it would seem to me
plausible that there might be situations where talking about a
resource is easier than talking about what a server does at some URI.
Whether such situations exist in the HTML spec is a question I can't
answer.
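
The database analogy can be made concrete with Python's standard `sqlite3`
module. This is a rough sketch, not a claim about any particular DBMS: the
table name `books`, its columns and rows, and the helper `row_count` are
hypothetical. The first phrasing talks about "the table"; the second spells
out the same fact as the DBMS's behavior when a string appears after "from".

```python
import sqlite3

# In-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Title A", 2001), ("Title B", 2002)],
)


def row_count(table: str) -> int:
    """Convenient phrasing: 'the table has N rows'."""
    # Table names cannot be bound as SQL parameters, so this sketch
    # checks the name against the catalog before interpolating it.
    known = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"no such table: {table}")
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# Behavioral phrasing: what the DBMS returns when the string "books"
# appears in "select ... from books".
behavioral = len(conn.execute("SELECT * FROM books").fetchall())

print(row_count("books") == behavioral)
```

Both phrasings report the same thing; the first is simply shorter to say
and easier to reason with.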

If nothing is gained by using "resource" sensu RFC 2616 in the HTML
spec, that's fine, but then it's simple manners, I think, to stay away
from repurposing a term that's pervasive in IETF and W3C documents and
that has been established with a particular meaning at considerable
cost. Choice of terminology is rarely so important that it justifies
major disruptions. Better to find some other term for whatever it is
one wanted to attach to "resource".

Best
Jonathan

On Mon, Jan 11, 2010 at 2:30 AM, "Martin J. Dürst"
<duerst@it.aoyama.ac.jp> wrote:
> Hello Dan, dear TAG,
>
> On 2009/12/10 13:55, Dan Connolly wrote:
>>
>> The clock has started on this issue in the HTML WG; proposals are due
>> January 16, 2010
>> http://lists.w3.org/Archives/Public/public-html/2009Dec/0256.html
>>
>> I'm thinking out loud a bit here...
>>
>> I'm sympathetic to this viewpoint:
>>
>> "the confusion is caused by trying to reference something that doesn't
>> exist. There is no such thing as what you call a "resource" -- it's an
>> abstract concept that has no correspondence to the real world. It is
>> unnecessary and makes talking about our infrastructure more complicated."
>> -- http://lists.w3.org/Archives/Public/public-html/2009Sep/1133.html
>
> I was reading this a few weeks ago, and didn't feel comfortable with it.
> Recently, I think I found a way to explain why. Consider the following text:
>
> "the confusion is caused by trying to reference something that doesn't
> exist. There is no such thing as what you call a "number" -- it's an
> abstract concept that has no correspondence to the real world."
>
> Indeed numbers don't exist in the real world. I can have five oranges or
> five apples, or hold up five fingers, but "five", what's that? Can you show
> me "five"? No. It will always be "five something".
>
> Nevertheless, I guess most people agree that "five", and numbers in general,
> are immensely useful, even if they don't exist in the real world.
>
>
>> Meanwhile, the translation impact suggests otherwise:
>> http://lists.w3.org/Archives/Public/public-html/2009Sep/1136.html
>
> That talks about the issue of translating the term "resource".
>
> There's also the issue of language negotiation.
> http://httpd.apache.org/docs/2.2/, for example, looks different for me in
> Opera (where the top language in Accept-Language is English) and in Safari
> (where it's Japanese). Yet it's the same URI, and the same resource (the top
> page of Apache HTTP Server Version 2.2 documentation).
>
>> Ian points out usage that suggests "a resource is a bag of bits"
>> in HTML 4, CSS, SVG etc.
>> http://lists.w3.org/Archives/Public/public-html/2009Sep/1132.html
>>
>> Roy Fielding dismisses those as "just examples," but I think it's
>> a bit more subtle than that... I think the webarch view of those usages
>> is that typically, a URI identifies
>> pretty much a file... the kind whose contents change over time, not the
>> contents of the file at any one time. So to say '<xyz.html> identifies
>> an HTML file' is not to say that it identifies a bag/sequence of bits,
>> but rather that it identifies a resource whose representations have mime
>> type text/html .
>> But as I say, I'm sympathetic to the position that (outside of the
>> Semantic Web) this
>> abstraction just makes talking about all this stuff more complicated.
>>
>> Meanwhile, Ian also says:
>>
>> This is actually intended to refer to "bag of bits". It identifies a
>> bag of bits in the same way that a telephone number identifies a
>> person. Sure, if you call a number at different times you might end up
>> with different people, but you're still using a phone number to
>> identify a person, you just don't know which one until you try to use
>> the phone.
>
> Well, that's true in some cases (where you use the phone number to
> "identify" several people living there), but in many cases, it's actually
> not true. There's definitely the very frequent case that you have a phone
> number identifying (for yourself) a single person, but where you're not sure
> you'll get him/her, or somebody else from the family (who exactly you don't
> really care). There's also the case that the phone gets answered by a (to
> you) complete stranger or somebody you wouldn't expect (and therefore
> wouldn't identify by that phone number) because that (to you) complete
> stranger happened to visit and e.g. was asked to pick up the phone and tell
> you that the phone's owner is just washing the dishes or whatever, and will
> call back soon.
>
>> I find that usage of "identify" very unappealing. I think normal usage
>> of "identify"
>> is unambiguous. If I say "In this game, teams are identified by color" and
>> then tell you that blue identifies team X and a different team Y, you'd
>> consider that nonsense.
>
> Yes indeed.
>
> Regards,    Martin.
>
>> I wonder about some terminology that just relates URIs with byte
>> sequences,
>> without going thru the intermediate concept of resources, and yet doesn't
>> use "identify" in this confusing sense.
>>
>> Something like:
>>
>> A URL is a key typically used to retrieve a page from the Web; more
>> generally,
>> it is used as an address in the Web, whether to find documents, mailboxes,
>> services, applications, etc.
>>
>> "navigation marker" also appeals to me, though I'm not sure there's any
>> specific place
>> in the HTML 5 spec to talk about it that way.
>>
>> So "find" in place of "identify". Somewhat ironic... "find" is a
>> synonym for "locate"...
>> so maybe...
>>
>> A URL is a key, typically used to locate a Web page; more generally, it is
>> used to locate mailboxes, services, applications, etc.
>>
>> (footnote: I try to toe the party line where the standard term is 'URI'
>> rather than 'URL',
>> but only out of duty/burden/obligation; somewhere between RFC2396 in '98
>> and 3986 in '05, I tried to convince TimBL and the TAG that it's pushing
>> water uphill to try
>> to get the community to learn 'URI' rather than just going with the flow
>> and using 'URL',
>> but I couldn't make the sale. I'm reasonably happy to see arguments on
>> both sides examined
>> in some detail in the context of working out IRI interop stuff.)
>>
>> But maybe not... I think the analogy with files suggests that 'locate'
>> raises
>> the same issues as 'identify'; that is: filenames name files... or
>> identify files... or
>> locate files; in any case, when you open a file, edit it, and save it
>> back, it's
>> still the same file, and the filename identifies/names/locates/refers to
>> the file,
>> not its contents at a given time. This analogy works with variables in a
>> program, too:
>>
>> x = 1
>> y = 2
>> x = y + 2
>>
>> There's just one variable called/named x; the name 'x' doesn't refer to
>> 1 nor to 4, but rather to
>> the place in memory that holds 1 at first and then 4.
>>
>> I guess it's only in very informal glosses that you can skip from the
>> URL to the sequence-of-bytes
>> without referring to the notion in between... though 'retrieve' does
>> seem to get around it.
>> Filenames can be used to retrieve sequences of bytes... variable names
>> can be used
>> to retrieve values. 'retrieve' doesn't generalize to mailto: and POST so
>> well, but as Ian
>> pointed out somewhere in the thread, the HTML 5 spec doesn't need that
>> generalization.
>>
>> One specific case that the terminology showed up in the HTML 5 spec was
>> around
>> caches, I think; in that case, it's clear to me that the simplest way to
>> talk about
>> it is to talk about caching responses... or the content of response
>> messages.
>> Something like that.
>>
>> I hope to look at a few specific cases of HTML 5 spec text, but it's
>> late here and
>> I already spent a lot more time on this message than I intended to...
>>
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
>
>
Received on Monday, 11 January 2010 13:42:30 UTC