- From: Pat Hayes <phayes@ihmc.us>
- Date: Mon, 24 Apr 2006 14:14:34 -0500
- To: Frank Manola <fmanola@acm.org>
>>>> From: Pat Hayes
>>>>
>>>> It might be best to start with a definition of what you consider an
>>>> information resource to be. Since the TAG do not define this critical
>>>> term, yet base important engineering decisions on it, any
>>>> authoritative exposition would be of immense value. My current
>>>> understanding is that an information resource is some thing that can
>>>> be transmitted over a network by a transfer protocol. On this
>>>> understanding, one could argue that a word was an information
>>>> resource.
>>>
>>>Definitely not. That would be a "representation", not an "information
>>>resource". The information resource is the *source* of
>>>"representations" that can be transmitted over a network.
>
>Sorry to butt in, but a couple of minor comments:

Not minor at all, Frank. This might get to one of the hearts of the issue.

>"Definitely not" may be technically correct, but
>I think a bit more context is needed here. The
>TAG Architecture document says:
>
>"It is conventional on the hypertext Web to
>describe Web pages, images, product catalogs,
>etc. as “resources”. The distinguishing
>characteristic of these resources is that all of
>their essential characteristics can be conveyed
>in a message. We identify this set as
>“information resources.”
>
>This document is an example of an information
>resource. It consists of words and punctuation
>symbols and graphics and other artifacts that
>can be encoded, with varying degrees of
>fidelity, into a sequence of bits. There is
>nothing about the essential information content
>of this document that cannot in principle be
>transferred in a message. In the case of this
>document, the message payload is the
>representation of this document."

OK, reading the above carefully, in the light of David's comment, I seem to discern an implicit distinction between several things.
Let me be excruciatingly pedantic here for a second, and make very, very careful distinctions between several things involved in a hypothetical HTTP GET, which to keep things as simple as possible I will assume is the successful getting of an XHTML web page from a server, with a 2xx code, no problems. There seem to be several entities involved in this.

1. An "HTTP endpoint", which is a computational process running on hardware, which processes the GET request and emits HTTP codes and bit-strings.

2. The sequence of bits or bytes whose transmission from (1) constitutes the successful completion of the GET request.

3. The Web page itself: a document, consisting of characters, which conform to XHTML syntactic rules.

4. The encoding of the Web page (3) which is used by the process (1) to produce the bit sequence (2).

5. The encoding of the Web page (3) which is produced from the bit sequence (2) in the browser which issued the GET request, and used by it to render a visual form of the Web page (3) on the user's screen.

and we could of course go further, distinguishing the image on the screen from its binary representation, the state of the process from the process itself, and so on (and on).

Now, I tend to blur some of these distinctions, myself. For example, I tend to think of 2 through 5 as simply being 'the Web page'; or if I am being more careful, to identify 2, 4 and 5 as 'renderings' or 'encodings' or 'tokens' of the single, abstract, Web page (3). And I often don't bother to distinguish between 1 and 4. This gives a simplified picture, which is adequate for many purposes, in which we happily ignore the type/token distinction (as we normally do in English) and where issuing a GET is a bit like asking an usher for A concert program, at which she then hands you a copy from her pile of identical copies, and you take it away and read it without bothering her further, and if anyone asks you what you are reading you say, THE program.
(You could say that each copy is a 'representation' of the great concert program in the sky, or of all the other copies, or of the state of the printing platen at the moment the ink hit the paper, but there's not usually much point in being that picky about these distinctions.)

It seems (?) that David is concerned to maintain a clear distinction between 1 and 2, and wants to be clear that the information resource is the former. I am however not sure what the status of 3 is, on this account. It hardly seems reasonable to say that 2 is a representation of 1 in the way that it might be to say that it is a 'representation' (token) of 3.

Now, I guess there is a coherent position which considers 4 to be a part or an aspect of (a state of) 1, so views 2 as a representation of (a state of) 1, and considers 1, now considered to be embodying or including 4, to be the actual information resource; that seems to be closest to what the REST model says, and it is what David seemed to be saying. But it does not seem to be what the TAG says when it declares that an information resource is a document, i.e. 3. If anything, this would have all of 2, 4 and 5 being 'representations' of 3. Whatever 1 is, it certainly is not a document in the sense of (3). And, as you point out, other W3C sources speak of information resources as being transmitted over a network, which makes sense only for 2, speaking strictly. So, as so often in trying to understand the TAG, I am left in a state of muddled confusion as to what everyone is talking about.

Just to clarify another source of muddle, I would not call any of these things "representations" of any of the others. In my usage of the word "representation", there is no representation of anything involved in the entire architectural story of how an HTTP GET is processed. Nothing represents anything here, because there are no semantic relationships involved.
The various bitstrings are simply copies of one another, and the relationship of a document to its bitstring encoding is that of a rendering or encoding, rather than a representation: a token/type relationship. (The bitstring does not *describe* the document it encodes. If it did, it would have to describe it using a syntax, but bitstrings, pretty much by their very nature, do not have any syntax.)

This usage of "representation", which I have slowly come to understand is common in the TAG documents, is entirely alien to the uses of that word in logic, linguistics and semantics (and AI/KR), which are the uses found throughout the RDF and OWL specification documents. This is not the sense of "representation" in which, for example, an RDF ontology of weather might be said to represent the weather conditions in Oaxacala. On this TAG sense of "representation", one would presumably say that any written token of a word was a 'representation' of the abstract word itself. This usage might be glossed as 'represents-as-token', or maybe 'represents-as-brass-rubbing', rather than 'represents-by-description'.

I also note in passing that with this notion of representation, it is (literally) impossible for any bit-string to 'represent' anything other than a document. In particular, it is impossible to 'represent' the weather over Oaxacala in this sense. Of course one can 'represent' a weather *report*; and that report might represent, in an entirely different sense, the real weather; but being-a-representation-of is not transitive.

>So, referring to the next sentence, it would
>seem that an RDF ontology and an HTML web page
>*are* information resources. What gets
>transmitted over the wire, however, would be
>representations of those information resources.
>Right?

An RDF ontology, at any rate, is either an RDF graph or an RDF/XML XML document. Either way, it is not an HTTP endpoint or an abstraction of an HTTP endpoint. So it cannot be an information resource in David's sense, seems to me.
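
As a rough illustration of the five entities distinguished in the HTTP GET walk-through above, here is a minimal Python sketch. It does no real networking, and every name in it (`WebPage`, `HttpEndpoint`, `browser_receive`, the sample page text, the choice of UTF-8) is a hypothetical stand-in introduced only for this example. It is meant to show the token/type reading: entities 2, 4 and 5 are encodings (copies) of the characters of entity 3, while entity 1 is a process and not a document at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebPage:
    """Entity 3: the abstract document, a sequence of characters."""
    text: str


class HttpEndpoint:
    """Entity 1: the process that answers GET requests (hypothetical model)."""

    def __init__(self, page: WebPage, charset: str = "utf-8") -> None:
        # Entity 4: the server-side encoding of the page, held by the process.
        self._stored = page.text.encode(charset)

    def get(self) -> bytes:
        # Entity 2: the byte sequence whose transmission completes the GET.
        return bytes(self._stored)


def browser_receive(wire_bytes: bytes, charset: str = "utf-8") -> WebPage:
    """Entity 5: the client-side encoding, rebuilt from the wire bytes."""
    return WebPage(wire_bytes.decode(charset))


page = WebPage("<html><body><p>Concert program</p></body></html>")
endpoint = HttpEndpoint(page)
wire = endpoint.get()
received = browser_receive(wire)

# The bytes are an encoding of the page's characters, so decoding them
# yields the same document again; nothing in them describes the page.
print(received == page)                # True
print(isinstance(endpoint, WebPage))   # False: the endpoint is not a document
```

On this sketch the question in dispute becomes which object deserves the name "information resource": the `HttpEndpoint` instance (the REST-style reading) or the `WebPage` value (the document reading). The code does not settle that; it only keeps the candidates apart.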

Pat
--
---------------------------------------------------------------------
IHMC                     (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.     (850)202 4416   office
Pensacola                (850)202 4440   fax
FL 32502                 (850)291 0667   cell
phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
Received on Monday, 24 April 2006 19:15:01 UTC