- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Mon, 24 Apr 2006 14:49:14 -0400
- To: "Frank Manola" <fmanola@acm.org>
- Cc: "Pat Hayes" <phayes@ihmc.us>, <public-swbp-wg@w3.org>, "Guus Schreiber" <guus@few.vu.nl>, "Steve Pepper" <pepper@ontopia.net>, "Mark van Assem" <mark@cs.vu.nl>, "Ralph R. Swick" <swick@w3.org>
Frank,

Excellent explanation! Thanks for adding this clarification.

David Booth

> -----Original Message-----
> From: Frank Manola [mailto:fmanola@acm.org]
> Sent: Monday, April 24, 2006 12:50 PM
> To: Booth, David (HP Software - Boston)
> Cc: Pat Hayes; public-swbp-wg@w3.org; Guus Schreiber; Steve Pepper;
>     Mark van Assem; Ralph R. Swick
> Subject: Re: on documents and terms [was: RE: [WNET] new proposal
>     WN URIs and related issues]
>
> Booth, David (HP Software - Boston) wrote:
> >> From: Frank Manola
> >>>> From: David Booth
> >>>>> From: Pat Hayes
> >>>>>
> >>>>> It might be best to start with a definition of what you
> >>>>> consider an information resource to be. Since the TAG do not
> >>>>> define this critical term, yet base important engineering
> >>>>> decisions on it, any authoritative exposition would be of
> >>>>> immense value. My current understanding is that an
> >>>>> information resource is some thing that can be transmitted
> >>>>> over a network by a transfer protocol. On this understanding,
> >>>>> one could argue that a word was an information resource.
> >>>>
> >>>> Definitely not. That would be a "representation", not an
> >>>> "information resource". The information resource is the
> >>>> *source* of "representations" that can be transmitted over a
> >>>> network.
> >>
> >> Sorry to butt in, but a couple of minor comments:
> >>
> >> "Definitely not" may be technically correct, but I think a bit
> >> more context is needed here. The TAG Architecture document says:
> >>
> >>    "It is conventional on the hypertext Web to describe Web
> >>    pages, images, product catalogs, etc. as "resources". The
> >>    distinguishing characteristic of these resources is that all
> >>    of their essential characteristics can be conveyed in a
> >>    message. We identify this set as "information resources."
> >>
> >>    This document is an example of an information resource. It
> >>    consists of words and punctuation symbols and graphics and
> >>    other artifacts that can be encoded, with varying degrees of
> >>    fidelity, into a sequence of bits. There is nothing about the
> >>    essential information content of this document that cannot in
> >>    principle be transfered in a message. In the case of this
> >>    document, the message payload is the representation of this
> >>    document."
> >>
> >> So, referring to the next sentence, it would seem that an RDF
> >> ontology and an HTML web page *are* information resources. What
> >> gets transmitted over the wire, however, would be representations
> >> of those information resources. Right?
> >
> > You're right. I should have been clearer that it depends on what
> > you mean by "RDF ontology" or "HTML web page". If you're referring
> > to the abstract document that may change over time then yes, it is
> > an information resource. If you're referring to a particular
> > instantiation of that document that may be transmitted over the
> > wire then no, it is a representation. Pat was referring to
> > something that could be transmitted over the wire.
> >
> > An information resource cannot be transmitted over the wire. It is
> > an abstraction. Thus, I believe the WebArch sentence above that
> > says:
> >
> >    "all of their essential characteristics can be conveyed
> >    in a message"
> >
> > is slightly incorrect and should have said something like:
> >
> >    "all of their *current* essential characteristics can be
> >    conveyed in a message"
> >
> > because a representation only gives a snapshot of that information
> > resource at one particular moment, whereas the "information
> > resource" is the abstract source/set of those representations over
> > time.
> >
>
> David--
>
> What you say is correct, but I think that some of the qualifications
> about *current* characteristics and *information* resources could be
> misinterpreted.
>
> First off, if I understand this business properly, *no* resources,
> information or not, can be sent over a network or conveyed in
> messages. Only *representations* of resources can be sent or conveyed
> in this way. The distinction between information resources and other
> resources isn't about whether or not representations of them can be
> sent or conveyed (*only* representations of resources can be sent or
> conveyed, and non-information resources can have associated
> representations that can be sent or conveyed), but rather about
> whether or not those representations convey the "essential
> characteristics" of those resources.
>
> This separation of concepts serves a number of purposes. One of them
> is to deal with time-varying resources. However, it's not necessary
> for the resource to vary over time: a resource may be static, and the
> same separation of concepts applies. In the case of a static
> resource, what you'd get for a request is a snapshot, but the *same*
> snapshot. The separation is there to allow for the time-varying case,
> and for you to be able to coin separate URIs for the time-varying
> resource, and for particular "versions" (e.g., over time) of it.
>
> Another purpose is to distinguish the resource in the abstract from
> different representations of it that may be returned for different
> purposes. Examples of this are illustrated in
> http://www.w3.org/TR/swbp-vocab-pub/, where an RDF vocabulary is
> returned in either RDF/XML or HTML, depending on what the user wants.
> Here's where things can get further confused (or, at least, where *I*
> may be further confused).
>
> Take the case of an RDF vocabulary referenced by a single URI, say
> http://example.myvocab. However, "under the covers" there are really
> two documents available, http://example.myvocab.rdf and
> http://example.myvocab.html. A user may want either the rdf or the
> html version of the vocabulary, depending on what she/he is trying to
> do, and the discussion in http://www.w3.org/TR/swbp-vocab-pub/ shows
> how you can get the version you want if you ask simply for
> http://example.myvocab. Now, my understanding is that:
>
> a. There are *three* resources here, http://example.myvocab,
> http://example.myvocab.rdf, and http://example.myvocab.html. These
> are all resources in spite of the fact that http://example.myvocab is
> in some sense "more abstract" (less of a specific representation)
> than the other two.
>
> b. Even when the server selects one of the versions, either
> http://example.myvocab.rdf or http://example.myvocab.html to return,
> what gets returned is still a representation of one of these
> resources, not the resource itself.
>
> c. *All* of these are "information resources", in that their
> "essential characteristics can be conveyed in a message". That is,
> considered independently, the essential characteristics of
> http://example.myvocab.rdf can be conveyed in a message, the
> essential characteristics of http://example.myvocab.html can be
> conveyed in a message, and presumably the essential characteristics
> of http://example.myvocab can be conveyed in a message (although what
> actually gets sent is a representation of one of those other files).
>
> This is a place where I find the definition of "information resource"
> (put together with the httpRange-14 guidance) somewhat problematic,
> in that:
>
> d. whether a given representation conveys the "essential
> characteristics" of some resource is (necessarily) kind of fuzzy, and
>
> e. it seems as if the *server* (presumably acting as an intermediary
> for whoever put the resource out there in first place) has a lot more
> to say about whether the "essential characteristics" of the resource
> are being conveyed (via the return code it sends when you ask for the
> resource) than the user does, even though "essential for what" seems
> like an application-dependent decision.
>
> I (think I) understand at least some of the architectural tradeoffs
> involved, but I can't help but think that an HTTP response code is a
> pretty low-bandwidth mechanism for conveying this kind of information
> (in fact, I'm inclined toward Pat's position), and that we ought to
> be looking more at ways to use the RDF/OWL/... class of languages to
> provide metadata about:
>
> f. what kind of thing (or kinds of things) dereferencing a given URI
> is going to return, and
>
> g. what kinds of things a given kind of return might be useful for
> (e.g., users could document what they've been able to do with what's
> been returned, in an extensible fashion).
>
> --Frank
>
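
For readers following the single-URI case in Frank's message, here is a rough sketch of how a server might choose between the two "under the covers" documents when asked for http://example.myvocab. It is only an illustration under stated assumptions: the function names, the variant table, the media types and the HTML default are hypothetical and are not taken from the thread or from http://www.w3.org/TR/swbp-vocab-pub/. The Accept parsing is deliberately simplified (partial wildcards such as text/* are not handled).

```python
# Hypothetical sketch of server-side variant selection for one generic URI.
# All names and the default below are assumptions made for illustration.

# Assumed mapping from media type to the specific document for that variant.
VARIANTS = {
    "application/rdf+xml": "http://example.myvocab.rdf",
    "text/html": "http://example.myvocab.html",
}
DEFAULT_TYPE = "text/html"  # assumption: fall back to the HTML document


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue  # skip empty entries, e.g. from an empty header
        q = 1.0  # q defaults to 1 when the client gives no q parameter
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal-q entries keep the client's order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def select_variant(accept_header):
    """Pick which specific document represents the generic resource."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # the client explicitly refuses this type
        if media_type in VARIANTS:
            return VARIANTS[media_type]
        if media_type == "*/*":
            return VARIANTS[DEFAULT_TYPE]
    return VARIANTS[DEFAULT_TYPE]


print(select_variant("application/rdf+xml"))
print(select_variant("text/html,application/xhtml+xml;q=0.9"))
```

Whatever the function returns, the bytes actually sent are still a representation of one of the three resources, which is point (b) above; the sketch only models the selection step, not the response code the server would use.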
Received on Monday, 24 April 2006 18:49:41 UTC