- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 4 Dec 2007 18:17:49 -0500
- To: wangxiao@musc.edu
- Cc: Chimezie Ogbuji <chimezie@gmail.com>, Mikael Nilsson <mikael@nilsson.name>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Xiaoshu Wang writes:

> I am not sure if I have misunderstand. Do you want to say that
> "information *is* bit-stream"?

Long ago Dan Connolly proposed what I think is often a pretty good rule of thumb for contributions to this list [1]: "If a thread goes back and forth three times without anybody suggesting textual changes to the document, something's wrong." I like that guideline, and I think we should take it as advice to wrap up at least this little bit of the discussion, perhaps without agreement if necessary. Still, you've asked a direct question, so I think that deserves at least a try at a direct answer:

No, I don't think I'd say that the "information is the bit stream". The way I understand Shannon, we can start by imagining, as an example, a fixed set of messages I might wish to convey to you:

1. The light is red.
2. The light is yellow.
3. The light is green.
4. The light is off.

We know in advance that these are the possible messages. If I, using some code or other, manage to tell you that, for example, the light is green, then we can say I have conveyed that information to you.

Now, as someone who is interested in semantics you might say "which light? what do you mean by green? Were you trying to distinguish regular green from dark green, or did you mean any shade of green is OK?" Both Shannon and I, for purposes of this exchange, are saying: "Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem." [2] They're not irrelevant to the semantic web, but I think they are not essential to the fundamental notion of information, as in "information resource". All that's required here is that you and I agree that there are four choices, which we agree to label with the sentences above.
Now, we may have taken the trouble to agree on very exact semantics for, say, the color green, so you'll know exactly what frequency my light color will be. Other communities may be much more informal about it. The Web allows both, though I don't doubt that the semantic Web will allow much more precise reasoning to be done when people define their RDF properties with carefully defined and communicated semantics, rather than loosely or informally defined ones.

Getting back to your question, I don't think the bits are information. One encoding for the above information is as 2 bits per message, 00 = the light is red, and so on. Shannon points out that this is an optimal encoding bandwidth-wise only if the 4 choices are equally likely, but we don't care about that. You and I could instead agree on a more verbose encoding, such as the UTF-8 for the characters RED = The light is red, GREEN = The light is green, and so on. I would say that the bits for this coding are very different (UTF-8 vs little 2 bit sequences) but the information conveyed is the same.

Going a bit further, I can pretty well signal my views on information resources using this example: Let's say that I, as the owner of http://example.com/lightStatus, decide that I want to assign that URI to a resource which has as its state the 4-way choice above. It knows that a light is either red, green, yellow or off. I believe that I can fully convey the essence of that resource in a message (I've just shown at least two ways), so it's an information resource. It doesn't matter whether I had in mind some particular real world traffic light, some rigorous definition of the color green, etc. It's a resource that can answer a question one of 4 ways, and it's an information resource.

I might instead be thinking of a real actual traffic light, the kind that will actually bend your car if you run into it. That's not an information resource, because I can't use messages to bend your car as it runs into the light.
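As a rough illustration of the two encodings discussed above (2 bits per message versus UTF-8 text), here is a minimal Python sketch. All names in it (LightState, TWO_BIT, the encode/decode helpers) are hypothetical. Only "00 = the light is red" and the labels RED and GREEN come from the message; the remaining bit assignments and the labels YELLOW and OFF are assumed for the sake of the example.

```python
from enum import Enum


class LightState(Enum):
    """The four messages that sender and receiver agree on in advance."""
    RED = "The light is red."
    YELLOW = "The light is yellow."
    GREEN = "The light is green."
    OFF = "The light is off."


# Compact code: 2 bits per message. Only 00 = red is given in the text;
# the other three assignments are assumptions.
TWO_BIT = {
    LightState.RED: "00",
    LightState.YELLOW: "01",
    LightState.GREEN: "10",
    LightState.OFF: "11",
}
TWO_BIT_REVERSE = {bits: state for state, bits in TWO_BIT.items()}


def encode_two_bit(state: LightState) -> str:
    return TWO_BIT[state]


def decode_two_bit(bits: str) -> LightState:
    return TWO_BIT_REVERSE[bits]


def encode_utf8(state: LightState) -> bytes:
    # Verbose code: the UTF-8 bytes of the label, e.g. b"GREEN".
    return state.name.encode("utf-8")


def decode_utf8(data: bytes) -> LightState:
    return LightState[data.decode("utf-8")]


state = LightState.GREEN
compact = encode_two_bit(state)   # a 2-bit string
verbose = encode_utf8(state)      # 5 bytes, i.e. 40 bits

# Different bits on the wire, same choice among the four messages.
assert decode_two_bit(compact) is decode_utf8(verbose) is state
```

The bit patterns differ (2 bits versus 40 bits for GREEN), but either one picks out the same one of the four agreed choices, which is the sense in which the information conveyed is the same.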
The real light has mass and is screwed down to the street. If I want to put up on the Web a resource that tells you whether that light is red, yellow, green or off, I should use a 303 redirect from the URI of the big heavy light itself.

That's how I see it anyway. Again, I think we should wrap up this bit of the discussion. I think we've aired enough of the issues that others on this thread can draw on whatever aspects of your or my analysis that they find useful.

Noah

[1] http://www.w3.org/2001/tag/tatn
[2] http://plan9.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 4 December 2007 22:16:42 UTC