RE: Question about the On Linking Alternative Representations TAG Finding from Schleiff, Marty on 2008-08-11 (www-tag@w3.org from August 2008)

From: Schleiff, Marty <marty.schleiff@boeing.com>
Date: Mon, 11 Aug 2008 08:56:10 -0700
To: "Richard Cyganiak" <richard@cyganiak.de>, <wangxiao@musc.edu>
Cc: "Sebastien Lambla" <seb@serialseb.com>, "T.V Raman" <raman@google.com>, <john.kemp@nokia.com>, <www-tag@w3.org>, <kidehen@openlinksw.com>, <tthibodeau@openlinksw.com>
Message-ID: <B9EC5E8740E62240888BCF58FC8B008C01FA7D94@XCH-NW-10V1.nw.nos.boeing.com>

Hi Richard (& All),

Richard said, "The hardcopy picture is a rendering of the <bitstream,
MIME type> tuple on paper. The hardcopy picture is neither a resource
(unless someone assigns a URI to it, but again, let's not go there), nor
is it a representation (because it's not a <bitstream, MIME type> tuple,
but a piece of paper with ink on it)." 

If something is a resource (and I think "some thing" is a pretty good
definition of a resource), then it's a resource whether or not someone
assigns it a URI. If there can be no resource without an assigned URI,
then there were no resources before URIs were specified.


Marty.Schleiff@boeing.com; CISSP
Associate Technical Fellow - Cyber Identity Specialist
Information Security - Technical Controls
(206) 679-5933

-----Original Message-----
From: Richard Cyganiak [mailto:richard@cyganiak.de] 
Sent: Saturday, August 09, 2008 2:13 AM
To: wangxiao@musc.edu
Cc: Sebastien Lambla; T.V Raman; john.kemp@nokia.com; www-tag@w3.org;
kidehen@openlinksw.com; tthibodeau@openlinksw.com
Subject: Re: Question about the On Linking Alternative Representations
TAG Finding


Xiaoshu,

On 8 Aug 2008, at 13:20, Xiaoshu Wang wrote:
>> A resource is anything named by a URI.
>>
>> A representation is a <bitstream, MIME type> tuple.
> It sounds simple but not really.  For instance, is
> "http://dfdf.inesc-id.pt/tr/doc/web-arch/img/fig2" a resource or a
> representation?

It's a URI, a string starting with "http://". You probably meant to ask
what it *identifies*. Well, it evidently identifies a resource, because
that's what URIs do. I don't know what that resource is supposed to be,
so I cannot tell if it is a representation or not.  
Only the person who minted the URI knows, and has not chosen to tell us
explicitly.

> A *representation*, in my opinion, is what is delivered to the client,

Too simple. A representation is a <bitstream, MIME type> tuple.  
Representations are sometimes delivered to clients, sometimes sent from
the client to the server, sometimes stored in a cache and so on.  
It's the real bits going over the wire.
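
As a rough illustration only (the class and field names below are made up
for this sketch and come from no spec), the tuple definition can be written
down directly. Two representations are equal exactly when both the bits and
the MIME type agree, regardless of whether they came from a GET, were sent
in a PUT, or sit in a cache:

```python
from typing import NamedTuple


class Representation(NamedTuple):
    """A <bitstream, MIME type> tuple; names are illustrative assumptions."""
    bitstream: bytes
    mime_type: str


# Hypothetical values: the same tuple, whatever direction it travelled in.
from_get = Representation(b"<p>hello</p>", "text/html")
from_cache = Representation(b"<p>hello</p>", "text/html")

# Same bits, different MIME type: a different representation.
as_plain_text = Representation(b"<p>hello</p>", "text/plain")
```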

> a *resource* is whatever the provider intends it to be.

Yes.

> They are always different - not in the sense of whether they are
> *bitstreams* or not.

They usually are, but not by definition. A URI can name anything. So I
can name a particular <bitstream, MIME type> tuple with a URI, and hence
the URI identifies a resource that is a representation. (I'm not saying
that this is a particularly useful thing to do. In fact, don't do
it.)

> A *representation* of a resource denoted by a URI doesn't have to be 
> delivered electronically.  This is the often wrongly conceived idea 
> that http-URI must be bound to HTTP protocol.  If I bind a URI to a 
> postal service as its transportation protocol, you (in principle) can 
> go to a postal office to request
> "http://dfdf.inesc-id.pt/tr/doc/web-arch/img/fig2" and a hardcopy
> print, which is NOT a bitstream, can be delivered to you.  Do you
> consider that picture a resource or a representation?

The hardcopy picture is a rendering of the <bitstream, MIME type> tuple
on paper. The hardcopy picture is neither a resource (unless someone
assigns a URI to it, but again, let's not go there), nor is it a
representation (because it's not a <bitstream, MIME type> tuple, but a
piece of paper with ink on it).

This is irrelevant to Web architecture, by the way. The Web is an
abstract machine that, among other things, emits representations
(<bitstream, MIME type> tuples) in response to GET requests. What we do
with the bits afterwards -- render them on a screen, print them on
paper, store them on a disk -- is outside the scope of Web architecture
and I don't see why we should talk about this.

>> That's a simple and objective distinction. The interesting and 
>> subjective question is how to best model an application using those 
>> two modelling primitives.
>>
>> There are two schools of thought on this. One school maintains a 
>> distinction between "documents" and "things described in the 
>> documents" in their modelling; the other school says that this 
>> distinction is unnecessary.
>>
>> The former modelling has been elevated to an axiom of Web 
>> architecture by the httpRange-14 decision. It has many advantages 
>> over the latter (cleaner handling of metadata, enables grouping of 
>> many descriptions in a single document, ...), and it has some 
>> disadvantages (more complex, ...). These have been discussed 
>> endlessly and I have no interest in resurrecting that debate.
>>
>>> There are only two choices.  (1) As T.V Raman said, don't make any 
>>> distinction between them.
>>
>> The distinction is made in the specs; and it's made by a large and 
>> significant part of the Web community (see REST). Raman might not 
>> consider it important, and that worries me a bit, but it doesn't 
>> diminish the importance of the distinction.
>>
>>> (2) As I have always proposed, to make an absolute distinction.   
>>> That is: to think every URI denotes a *resource*
>>
>> No one disagrees with this.
>>
>>> and what is dereferenced from the URI is the *representation* of 
>>> that resource.
>>
>> Not quite. A representation is what you get back when you do a GET on
>> the resource, or what you send when you do a POST/PUT.
> I am not sure how many people will agree on the latter part.  I 
> cannot.

People talk much more about GET than about POST and PUT, but I'm pretty
sure that I have correctly captured the spirit of the HTTP spec, of
Roy's Chapter 5, and of AWWW when I say that we can change a resource's
state by submitting a representation using POST or PUT. If you disagree,
I'm pretty sure that we are simply using language differently, and you
should probably use another term instead of "representation" for what
you have in mind. ("state of the resource"?)

Repeat after me: Representations are <bitstream, MIME type> tuples.

>> Dereferencing is the process of "reaching through the network" in 
>> order to perform one of the supported operations on a resource.
>>
>>> To think whether something is *in* a document or not is just a form 
>>> of self-contradiction because the goal is to make the web 
>>> *self-descriptive*.  Hence, a document (or resource) is both in and 
>>> not in itself.
>>
>> What do you mean when you say "something is in a document"? I can 
>> understand the phrase "something is described in a document".
>> Obviously a document can describe itself. I don't see the 
>> contradiction.
> I intended it the same way you described it: "something described
> inside a document".  I am trying to understand what you mean by
> "303 redirects are about creating URIs for 'things described inside
> documents'".  Do you mean if something talks about itself, it should
> 303 redirect?  Or something else?

I mean something else.

I have this notion in my head that the Web is a collection of documents,
and a web document is not the same as the things the web document talks
about. Hence it's better not to use the same URI for a web document and
the things the web document talks about (except where it talks about
itself). I can't state it any simpler than this. I consider it
self-evident. If you don't agree, I give up trying to communicate this
idea and we just have to accept that we live in different realities ;-)
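
To make the separate-URIs idea concrete, here is a minimal sketch of the
303-plus-conneg interaction from the diagram quoted below. The example.com
URIs, the function name and the choice of MIME types are assumptions made
for illustration; only the shape of the exchange (303, then a negotiated
response with Content-Location) is taken from the diagram.

```python
# Hypothetical URIs: one for the thing, one for the document describing it.
THING = "http://example.com/some_resource"
DESCRIPTION = "http://example.com/description_of_some_resource"

# Variants of the description document, keyed by MIME type (assumed types).
VARIANTS = {
    "text/html": DESCRIPTION + ".html",
    "application/rdf+xml": DESCRIPTION + ".rdf",
}


def respond(uri, accept):
    """Return (status, headers) for a GET on uri with a single Accept type."""
    if uri == THING:
        # The thing itself has no representation; point at its description.
        return 303, {"Location": DESCRIPTION}
    if uri == DESCRIPTION:
        if accept in VARIANTS:
            # Content negotiation picks a variant and names it.
            return 200, {
                "Content-Type": accept,
                "Content-Location": VARIANTS[accept],
            }
        return 406, {}
    return 404, {}
```

A client asking for THING is first told to look at DESCRIPTION, and only
the second request is content-negotiated.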

>> [snip]
>>>>   some_resource
>>>>      |
>>>>      +--303--> description_of_some_resource
>>>>                   |
>>>>                   +--Content-Location--> 
>>>> description_of_some_resource.{html|rdf}
>>>>
>>>> That's the clean and proper way of combining the 303 approach with 
>>>> content negotiation!
>>> It is *clean* only when the distinction of *resource* vs.
>>> *representation/description* is unambiguous, which it hardly is.
>>
>> Those who use the approach described here simply make a modelling 
>> distinction between documents and the things described in the 
>> documents. That distinction *is* unambiguous. (But it is
>> subjective.) The described approach is "clean" in the sense of HTTP 
>> interactions. And it is "clean" in that it enables the modelling 
>> style described above.
> But the web is about facilitating ad hoc communication.  If you have 
> an unambiguous but subjective distinction and I have mine?  Is it 
> going to be unambiguous or not when we intend to communicate with each
> other?

I can unambiguously communicate my subjective choice, and you can
recognize the difference in our choices and work around it. No technical
solution will protect you from subjectivity on the Web.

>>> In either case, i.e., solutions (1) and (2) mentioned above,
>>> 303 is unnecessary.  Sure, it does no harm.  But it does slow down
>>> the web, and our goal should be to make the web more efficient, not
>>> less.
>>
>> That's why I keep insisting that you should use hash URIs, which do 
>> not exhibit this downside, instead of the 303 approach. The solutions
>> mentioned above are for those who have decided to use 303s anyway,
>> despite the well-known downsides.
> First, there is the issue of reality: vocabularies such as Dublin
> Core already use slash URIs.

If the additional redirect becomes a serious performance problem, then
the 303 users will be slowly losing linkshare to hash-based
alternatives. Let the market fairy sort it out.

> Second, there are clear use cases where a #hash URI is not
> appropriate.  Consider the document size if a domain vocabulary, such
> as that of SNOMED, were to use hash URIs.

This is not an issue. You can do <document1#it>, <document2#it> and so
on. You can chunk your documents any way you like. Granted, it's less
flexible than 303 redirects.
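
A small sketch of why chunking works, assuming made-up example.com URIs:
the client strips the fragment before making the request, so terms that
live in different documents trigger different (and smaller) retrievals.

```python
from urllib.parse import urldefrag


def document_to_fetch(uri: str) -> str:
    """Return the URI a client actually requests: the part before '#'."""
    return urldefrag(uri).url


# Hypothetical vocabulary terms spread over two documents.
terms = [
    "http://example.com/document1#it",
    "http://example.com/document2#it",
]

# Each term maps to its own document; the fragment never goes on the wire.
documents = [document_to_fetch(term) for term in terms]
```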

> Third, there is still the issue of the nature of a resource, because
> what a hash URI denotes when there are multiple representations/
> variants isn't clearly defined. If a (generic) resource, say
> http://example.com/gr, has two representations, an RDF and an HTML
> one, that can be retrieved via conneg, let's give each of them a URI:
> http://example.com/gr.rdf and http://example.com/gr.html.  What will
> http://example.com/gr#a denote?

It will denote whatever the variants say it denotes. It's possible to
coordinate the variants to make them communicate the same idea.

But you are right, this is a somewhat messy area and the different specs
involved don't answer all questions. I think however that we have a
quite clear understanding of what the desirable answers would be, that
is, what the specs *should* say to make things work out.

> And what is the relationship between http://example.com/gr.rdf#a and 
> http://example.com/gr.html#a ?

There is none, unless explicitly stated.

> This is a muddled area, on which I hope the TAG can find time to give
> some recommendations.

+1.

Best,
Richard


>
>
> Regards,
>
> Xiaoshu
Received on Monday, 11 August 2008 15:57:51 UTC