Re: URIs, used in RDF, that do not have associated documentation from Pat Hayes on 2012-03-28 (www-tag@w3.org from March 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 28 Mar 2012 13:26:31 -0500
To: トーレ　エリクソン <tore.eriksson@po.rd.taisho.co.jp>
Cc: Jonathan A Rees <rees@mumble.net>, www-tag@w3.org, tore.eriksson@gmail.com
Message-Id: <C560CAF0-A48C-40E0-A672-6A45C11C2B53@ihmc.us>
On Mar 27, 2012, at 9:44 PM, トーレ　エリクソン wrote:

> Jonathan Rees wrote:
>> 2012/3/27 トーレ　エリクソン <tore.eriksson@po.rd.taisho.co.jp>:
>>>>>> Remember that I've stated my dismay that httpRange-14(a) says "is an
>>>>>> information resource" rather than addressing the ambiguity mentioned
>>>>>> in Fielding's email (as illustrated by the Flickr and Jamendo cases).
>>>>>> httpRange-14(a) as written doesn't really help, except that its
>>>>>> authors and nearly everyone else have interpreted it to resolve the
>>>>>> ambiguity in a particular way - that the URI refers to what you
>>>>>> retrieve (generically if you will), not to what is described by what
>>>>>> you retrieve. This interpretation *has* been helpful because it lets
>>>>>> you use these RDF-less URIs in RDF and be understood. That the
>>>>>> resolution didn't say what was meant was a colossal screwup IMO. But
>>>>>> let's set that aside and just look at the question.
>>>>> 
>>>>> Saying that the URI refers to what you retrieve is not consistent with
>>>>> the HTTP specification, in which resources and representations are
>>>>> clearly separate entities. Saying that they are equivalent confounds them,
>>>>> and one of the axioms of RDF is that two things should not use the same
>>>>> URI.
>>>> 
>>>> Nobody who has followed this discussion is saying this. Not me, not
>>>> Tim, not anyone. (I think Ian Hickson said it once, but he's not
>>>> involved in this discussion.) If this is what you think then I'm not
>>>> sure how we can discuss this matter.
>>> 
>>> I'm sorry if this came across the wrong way. In hindsight, "equivalent"
>>> was not the right word to use. You wrote that the common interpretation
>>> of httpRange-14 is that the URI refers to what is retrieved.
>> 
>> *generically*. To just say what is retrieved or what you get is just
>> shorthand for the generic-resource story, which is tedious to repeat
>> over and over.
> 
> I am aware of this distinction.
> 
>>> A lot of the
>>> messages on the mailing list are describing a GET/200 as a URI returning
>>> content, and as it returning instances of the resource. Add to this the
>>> fact that the URI also denotes an information resource, and that the
>>> definition of information resources is that all of their essential
>>> characteristics can be conveyed in a message.
>> 
>> I have been campaigning against this horrid definition since 2007. Any
>> application of it is a matter of judgment, so it cannot be used
>> between people who don't trust one another to be reasonable.

I'm with Jonathan on that one.

>>> Since the message in
>>> question is supposed to be the retrieved representation,
>> 
>> That's a jump!  Not sure how you get that.
> 
> I can't really say that it has been explicitly stated, but I always
> assumed that this is what was intended. The httpRange discussion has
> always been full of examples (galaxies, weather, birds, and humans) of
> things that can't be transmitted, and has compared these with
> information resources that can be transmitted.

No, that is not the point. The contrast is not between things that can vs. cannot be *transmitted*: it is to do with things that can vs. cannot be *contacted* by HTTP (or xxTP). What matters about weather, galaxies, people etc. is that the entire HTTP story misses them completely: they aren't HTTP endpoints, they aren't causally part of the Web or Internet *machinery*. They don't respond to a GET command because they aren't the kind of thing that ever can be involved in the entire business of transfer-protocol machinery. They aren't *computational* entities. They are invisible to the world of machine architecture. Whether they can be "represented" (in any of the myriad senses being used out there) by the payload of an HTTP transaction is irrelevant and a complete red herring. Maybe in some sense all the essential characteristics of, say, a physical book can be transmitted by some sufficiently elaborate and detailed protocol: that is completely irrelevant if the book in question is not connected by a wire or a fiber-optic cable (or maybe a quantum entanglement, or whatever the future will bring) to a computer which is attached to the world-wide communication network. THAT is what matters in this discussion.

> Since the only way to do transmission
> in HTTP is retrieval of representation, this jump has always seemed to
> be in order.
> 
>>> this implies
>>> that the information resource and its representation share all essential
>>> characteristics.
>> 
>> No, the representations can have characteristics that the resource doesn't.
> 
> Sure, media type, length, MD5 checksum, &c.
> 
>> And the exact manner of "conveying" isn't spelled out. As I said
>> before it is more accurate that the representations "possess" the
>> resource's "essential characteristics" if you can convince yourself
>> that "e. c."s are like what I call metadata predicates, things like
>> dc:title and so on.
> 
> No, I thought that the essential characteristic of a picture is how it
> looks.

Tell that to a museum curator who has to worry about insurance.

> 
>> But really I think it's folly to microanalyze such a poorly articulated theory.
> 
> Yes, there sure are bigger fish to fry here.
> 
>>> Although not strict equivalence, I always thought that
>>> this sounded like they are constrained to be very similar.
>> 
>> The more generic the resource, the fewer similarities there will be.
>> It depends on the full range of representations. There will be very
>> little you can say about resources with highly variable
>> representations, even though there will be a lot to say about
>> individual representations.
> 
> That is why I think we should focus on describing resources explicitly
> and not through looking at their representations (unless *they* describe
> the resource explicitly in RDF of course).
> 
>>> On the other
>>> hand, the HTTP specification says that what is used as a representation of
>>> a URI is something the URI owner defines. If he defines them not to
>>> share their essential characteristics, then httpRange-14 fails.
>> 
>> The spec isn't as clear as all that. Also nobody says that
>> identification and representation are completely decoupled. But if you
>> are saying the foundations just don't make any sense, I agree.
>> 
>> I have raised this foolishness with the HTTP WG and am waiting to
>> hear back from them.
>> 
>>> In a theoretical sense, no one (excluding Ian Hickson for now) says that
>>> they share the same URI. But on one hand the URI *denotes* the resource,
>>> on the other hand the URI *refers* to the representation. If you can
>>> point me at a precise definition of these two verbs, that would be
>>> very helpful, but even if they mean different things, there is ample
>>> room for misunderstandings.
>> 
>> I'm not sure where you get that the URI refers to the representation.
>> I don't think people use them that way in RDF, and it's not implied by
>> anything I've said.
> 
> This is a direct quote from your initial mail:
> 
> "nearly everyone else have interpreted it to resolve the ambiguity in a
> particular way - that the URI refers to what you retrieve (generically
> if you will), not to what is described by what you retrieve."
> 
> And what is retrieved is the representation, isn't it?
> 
>>>> My views on the matter are put down here:
>>>> http://www.w3.org/2001/tag/awwsw/ir/latest/
>>>> Tim has written about this as well:
>>>> http://www.w3.org/DesignIssues/Generic.html
>>>> 
>>>> I have never claimed that the generic resource idea follows from the
>>>> HTTP spec. I have just said that, empirically speaking, many people
>>>> writing metadata assume that this is how things work. It would
>>>> therefore be a good idea to codify the practice, or at least avoid
>>>> making statements that discourage it.
>>>> 
>>>> You may disagree with the ideas, but don't claim I'm not aware of how
>>>> HTTP or Web architecture work.
>>> 
>>> I am profoundly sorry if this is how it came across. I have the uttermost
>>> trust in your knowledge of this area and am grateful for all the hours
>>> you are putting in trying to clarify these issues. My response was not
>>> directed at you personally, it was more of a comment on how httpRange-14
>>> is applied. The content of your original mail was also used as a reply
>>> to my proposal, which due to a mistake on my part ended up off list.
>>> Thank you very much for re-posting it, by the way. The offending
>>> paragraph was meant as a short introduction to my position in this
>>> matter for new readers, that is all.
>> 
>> I didn't take personal offense, that doesn't matter, I just don't like
>> it when people draw unjustified conclusions, especially when it seems
>> to make someone else appear illogical.
> 
> Once more, I'm sorry for making you appear illogical.
> 
>>> Speaking of which, I am really interested in your reply to the subsequent
>>> part of the mail since you argued that my proposal would break deployed
>>> RDF, which certainly is not my intention. Acknowledging that I was the
>>> one who put it in there in the first place, I would really appreciate it if
>>> the discussion doesn't get side-tracked by the IR issue, and instead
>>> continues with the interesting stuff.
>> 
>> Your proposal says, quote: "a representation retrieved
>> from a HTTP URI will [...] always be a description (of the state) of the
>> resource" - that's what I was triggered by. I can't conceive how a
>> representation retrieved from, say, http://www.yahoo.com/ could be
>> construed as a description of the identified resource. It's not a
>> description of anything. The representation is the content of the
>> resource, but not a description. I *can* imagine that what you're
>> saying is such URIs shouldn't be used in RDF, which is a coherent
>> position, but one that deprecates lots of URIs, as I said before.
> 
> The representation retrieved from <http://www.yahoo.com/> is an HTML
> document, and this document describes the current state of the resource.

No, it describes (if it describes anything) how to make a nice jazzy image appear on the screen in your browser window. That is what it *describes*. Its relationship to the resource that emitted it is different: it is more like a kind of imprint, or file copy, of that. But not a description. 

> Since Yahoo don't provide any explicit URI documentation (I didn't check
> very hard though) in RDF, I don't know what class the resource is of.
> This makes it fragile to say things about it in RDF, but RDF is built on
> the open world assumption, so nothing will break horrendously if you get
> it wrong.

Not sure if I follow this, but I don't see how the OWA makes it OK to be wrong.

Pat Hayes

> 
>> I suspect you didn't mean what you said. You might have intended to
>> classify URIs as content-oriented vs. description-oriented (or
>> representations as content vs. description), which would be a better
>> match to current and desired practice, I think. Then representations
>> wouldn't *always* be descriptions. The question then would be where to
>> draw the line.
> 
> I did mean what I said. Classifying URIs as content-oriented vs.
> description-oriented is in my opinion the root problem. What I want to
> say is:
> 
> * All HTTP URIs are description-oriented, even if they return a 200 *
> 
> Neither Tim Berners-Lee nor I want to draw the line. His position is
> that all URIs responding with a 200 are content-oriented, my position is
> that they, as well as the 303s, are description-oriented. Both
> positions have their merits and demerits. I have mentally applied my
> position to a lot of problems that have been discussed on these mailing
> lists during the last decade, and I think it works out well in most
> cases.
> 
> I know it is a radical change, but this is what I am proposing. Many
> other people have proposed versions of this position during the years,
> but it has never stuck. Apparently the point is hard to get across; even
> if you write it out in prose, people assume that you meant something
> else... I hope you understand what I'm trying to say now, and any advice
> on how to formulate this more clearly would be much appreciated.
> 
> Tore
> 
> 
> 
> 
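
For readers following the thread, the two positions contrasted in the quoted message (a 200 response is content-oriented under the common httpRange-14 reading, while Tore's proposal treats both 200 and 303 responses as description-oriented) can be written down as a small Python sketch. The names `Orientation`, `classify`, and the two policy strings are illustrative assumptions made here; they come from no specification and from none of the participants, and any status code the thread does not discuss is simply reported as unknown.

```python
from enum import Enum


class Orientation(Enum):
    """How the retrieved representation relates to the resource the URI names."""
    CONTENT = "content-oriented"
    DESCRIPTION = "description-oriented"
    UNKNOWN = "unknown"


# Hypothetical policy labels, invented for this sketch.
HTTPRANGE14 = "httprange14"            # the common reading of httpRange-14
ALL_DESCRIPTIONS = "all-descriptions"  # the proposal made in the quoted message


def classify(status: int, policy: str) -> Orientation:
    """Classify a response status under one of the two positions in the thread."""
    if policy not in (HTTPRANGE14, ALL_DESCRIPTIONS):
        raise ValueError(f"unknown policy: {policy!r}")
    if status == 303:
        # Both positions treat what a 303 leads to as a description.
        return Orientation.DESCRIPTION
    if status == 200:
        # This is the only case where the two positions disagree.
        if policy == HTTPRANGE14:
            return Orientation.CONTENT
        return Orientation.DESCRIPTION
    # Other status codes are not discussed in the thread.
    return Orientation.UNKNOWN


if __name__ == "__main__":
    for code in (200, 303, 404):
        print(code, classify(code, HTTPRANGE14).value,
              classify(code, ALL_DESCRIPTIONS).value)
```

The sketch only makes visible that the disagreement is confined to the 200 case; it says nothing about which position is right, nor about what "description" should mean.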

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 28 March 2012 18:27:12 UTC