- From: Pat Hayes <phayes@ihmc.us>
- Date: Tue, 25 Apr 2006 11:12:23 -0500
- To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
- Cc: "Pat Hayes" <phayes@ihmc.us>, <public-swbp-wg@w3.org>, "Frank Manola" <fmanola@acm.org>
> > From: Pat Hayes [mailto:phayes@ihmc.us]
>> . . .
>> Not minor at all, Frank. This might get to one of the hearts of the issue.
>>
>> >"Definitely not" may be technically correct, but I think a bit more context is needed here. The TAG Architecture document says:
>> >
>> >"It is conventional on the hypertext Web to describe Web pages, images, product catalogs, etc. as “resources”. The distinguishing characteristic of these resources is that all of their essential characteristics can be conveyed in a message. We identify this set as “information resources.”
>> >
>> >This document is an example of an information resource. It consists of words and punctuation symbols and graphics and other artifacts that can be encoded, with varying degrees of fidelity, into a sequence of bits. There is nothing about the essential information content of this document that cannot in principle be transferred in a message. In the case of this document, the message payload is the representation of this document."
>>
>> OK, reading the above carefully, in the light of David's comment, I seem to discern an implicit distinction between several things. Let me be excruciatingly pedantic here for a second, and make very, very careful distinctions between several things involved in a hypothetical HTTP GET, which to keep things as simple as possible I will assume is the successful getting of an XHTML web page from a server, with a 2xx code, no problems. There seem to be several entities involved in this.
>>
>> 1. An "HTTP endpoint", which is a computational process running on hardware, which processes the GET request and emits http codes and bit-strings.
>
>Yes. This is the "information resource", defined in an operational style.
>
>Side note: I actually used the term "logical HTTP endpoint", not just "HTTP endpoint", because an "information resource" is associated with an entire URL minus the fragment identifier, whereas a Web server is (normally?) associated with only the domain+server part (ignoring the path and query string parts). For example, given the URI http://example.org/foo?bar#fum , http://example.org/ , http://example.org/foo and http://example.org?bar may all correspond to different "information resources", even though they would be served by the same Web server associated with example.org.

Yes. I really don't care about distinctions like this for the present discussion. But still, I take it, by 'logical endpoint', you do mean to refer to some kind of computational process running on a machine, however this is conceptualized. If not, I am unable to follow your meaning; if so, then this is not the same kind of thing as a document, a piece of XML, or a bit- or byte-stream.

>> 2. The sequence of bits or bytes whose transmission from (1) constitutes the successful completion of the GET request.
>
>I'm not sure what you mean here. If you're including the bits that are part of the HTTP protocol handshake itself, then it would be more than just the "representation". If not, then I don't know how you mean #2 to be different from #4 below.

As to the first point, I would like to know how you distinguish those parts of the transmission which you consider to be the 'representation' from those you do not, and say what it is about the former that makes them particularly 'representational'. As to the second, the difference between #2 and #4 is that #2 is transmitted, while #4 resides on, or at, the network location described as #1. I have a mental picture here of #2 being pretty much a read-out or copy of #4.

>> 3. The Web page itself: a document, consisting of characters, which conform to XHTML syntactic rules.
>
>If it conforms to XHTML syntactic rules it sounds like you are talking about a particular instance of a document rather than a document in the abstract sense (which may change over time)

No, a document does not change over time, in either the abstract or concrete sense. To refer to documents changing over time is simply an ontological error. There is nothing in the XML spec that refers to documents changing over time. Literary documents, legal documents and other documents do not change over time. In many cases, it is part of the very reason for having the document that it does not change over time. RDF graphs do not change over time. According to the TAG and REST, resources are defined to be able to change over time (more properly, to be functions from times to representations) but that does not imply that documents are resources: this is in fact one of the issues that we need to get clear. It seems that they cannot be, in fact, for this very reason: the only way to describe this situation coherently is to say that a resource can be a function from times to documents (the 'version' at that time).

>, so this sounds to me like a "representation".

I cannot follow you here. Something is classified as a 'representation' simply by virtue of it not changing over time? This is the most extraordinary idea, and bears absolutely no relationship to the normal uses of this terminology. How can an abstract document be a representation of one of its own instances or tokens? This simply does not make sense: it would seem to make the representing relationship circular.

>> 4. The encoding of the Web page (3) which is used by the process (1) to produce the bit sequence (2)
>
>This also sounds like the "representation", though stated more specifically. The WebArch says: 'HTTP . . . uses the "Content-Type" and "Content-Encoding" header fields to further identify the format of the representation'.
>See http://www.w3.org/TR/webarch/#intro

The WebArch seems to use language incoherently. Part of my goal here is to try to disentangle what its authors intended to say. Citing it as authoritative is about as useful as quoting scripture to an atheist. I have no clear idea what the document means, in particular, by 'representation', other than it is clearly not what the rest of the world means.

>> 5. The encoding of the Web page (3) which is produced from the bit sequence (2) in the browser which issued the GET request and used by it to render a visual form of the Web page (3) on the user's screen.
>
>This sounds like an internal browser-dependent version of the "representation".

You refer to 'the representation' in the singular, which seems to indicate that some of these distinctions are irrelevant. Fair enough; but can you indicate which of my cases you would lump together as being (versions of ?) 'the' representation, and what you mean by a 'version'?

>> and we could of course go further, distinguishing the image on the screen from its binary representation, the state of the process from the process itself, and so on (and on.)
>>
>> Now, I tend to blur some of these distinctions, myself. For example, I tend to think of 2 through 5 as simply being 'the Web page'; or if I am being more careful, to identify 2, 4 and 5 as 'renderings' or 'encodings' or 'tokens' of the single, abstract, Web page (3). And I often don't bother to distinguish between 1 and 4. This gives a simplified picture, which is adequate for many purposes, in which we happily ignore the type/token distinction (as we normally do in English) and where issuing a GET is a bit like asking an usher for A concert program, at which she then hands you a copy from her pile of identical copies, and you take it away and read it without bothering her further, and if anyone asks you what you are reading you say, THE program.
>> (You could say that each copy is a 'representation' of the great concert program in the sky, or of all the other copies, or of the state of the printing platen at the moment the ink hit the paper, but there's not usually much point in being that picky about these distinctions.)
>
>True, but if one is discussing the TAG's WebArch document ( at http://www.w3.org/TR/webarch/ ), it is essential to make this distinction, because the difference between a "representation" and an "information resource" is essential to the WebArch.

I repeat, the WebArch is incomprehensible. The point, for me, of this entire discussion is to try to make sense of it. I know that the WebArch makes this distinction between "representation" and "information resource", but it never defines either of these terms, so I have no idea WHAT distinction this is supposed to actually BE. To be told that an incomprehensible distinction is 'essential' is not very much help.

>> . . .
>> Just to clarify another source of muddle, I would not call any of these things "representations" of any of the others. In my usage of the word "representation", there is no representation of anything involved in the entire architectural story of how an http GET is processed. Nothing represents anything here, because there are no semantic relationships involved. The various bitstrings are simply copies of one another, and the relationship of a document to its bitstring encoding is that of a rendering or encoding, rather than a representation: a token/type relationship. (The bitstring does not *describe* the document it encodes. If it did, it would have to describe it using a syntax, but bitstrings, pretty much by their very nature, do not have any syntax.)
>
>I agree. I don't like the term "representation" either, but I guess the TAG needed a term and that was the term they picked.

Fine, provided that they gave some indication of what they intended it to mean, in this new technical sense. But they do not, and never have done. They did not explicate a notion and then say, we will call this "representation". They simply used the word, as though their readers shared in this usage, and refused to give any explication of what they mean. Your message continues in this unhelpful tradition, in fact.

>> . . .
>> An RDF ontology, at any rate, is either an RDF graph or an RDF/XML XML document. Either way, it is not an HTTP endpoint or an abstraction of an HTTP endpoint. So it cannot be an information resource in David's sense, seems to me.
>
>Yes, it can be if instances of it are intended to be served via HTTP.

No, I am sorry, it cannot. The fact is that an HTTP endpoint, given your answer above to my question, is not even in the same category as an RDF ontology: it is not the same KIND of thing. So if an information resource is an HTTP endpoint, then it cannot possibly be an RDF ontology. If you want an RDF ontology to be an information resource, then you must change your definition. This has got nothing to do with the transfer protocol.

>My proposed definition[1] is very narrow in at least two ways: (1) it ignores "documents" that are never intended to be served (because they are not very relevant to the "information resource"/"representation" discussion); and (2) it is restricted to the HTTP protocol, because that's where the issue of resource identity (and the httpRange-14 issue) comes up.

Fine, I don't want to take issue with either of those restrictions. My point is more basic: running code at a network communication endpoint, on the one hand; and documents or ontologies, on the other, are simply not the same kind of thing. If an information resource is defined to be the former, then one of the latter can't be an information resource.

Pat

>
>[1] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Apr/0053.html
>
>David Booth

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
Received on Tuesday, 25 April 2006 16:12:42 UTC