- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 4 Dec 2007 13:55:09 -0500
- To: wangxiao@musc.edu
- Cc: Chimezie Ogbuji <chimezie@gmail.com>, Mikael Nilsson <mikael@nilsson.name>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Xiaoshu Wang wrote:

> First, Shannon quantifies the information to investigate communication.

Yes.

> Second, information is embedded in a message, i.e., it is the
> content of the message, yes?

No. Assuming binary coding is used, the message is a sequence of bits. It is presumed that the sender and receiver agree in advance on the range of possible information values (my term, not Shannon's) that a given message might convey; each distinct message essentially selects one of those values. From Shannon's 1948 paper [1]:

"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages."

-- and --

"If the number of messages in the set is finite then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely."

So, messages convey information, in that (when successfully transmitted) they cause the sender and receiver to agree on a choice among N possibilities.

By the way, for Shannon's purposes, information is quantified: a message that allows you to choose one of 1024 possibilities conveys more information than one that allows you to choose between just two. The former eliminates 1023 options while the latter eliminates only 1. That's why, with all choices equally likely, it takes at minimum 10 bits (log2 of 1024) to convey the first message, but only one bit to convey the second.

Revisiting what Tim said on 25 November:

> Information has been quantified by Shannon, who allows us to
> measure it and do some math about it. You can model it in
> various ways. One way is to imagine that I have very little
> idea of your state of mind, or your situation. Then you
> send me information: you publish something I read on the web.
> As a result of reading it, I have significantly cut down the
> possibilities for what I imagine your state of mind to be.

Yes, exactly. I think that's a pretty good informal statement of what I quoted from Shannon above, and is consistent with Shannon's usage in the rest of his paper. The N bits sent on the wire allow you to get agreement between sender and receiver on a choice among at most 2^N possibilities. The higher-level significance of those choices is usually also the subject of agreement in advance between sender and receiver, even though Shannon didn't need to say much about that, since he was mainly concerned with reliable transmission of the bits (or other code).

So, when I return to you a text/html page with an HTTP 200, you get an entity body that is a sequence of bits. You and I agree which sequence, of all the possible ones, I have sent you. The HTTP specification then delegates to the specification for the text/html media type to tell you more: the page has a body, perhaps with certain paragraphs and headings. So, with those specifications on hand, you and I agree that I have sent you an HTML document with certain constructs in it.

By the way, I have been hoping to ground the TAG's discussion of versioning more firmly in this view of information as being choices among predetermined options, but that's a subject for a different permathread.

Noah

> Then, you said the information is abstract

[1] http://plan9.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 4 December 2007 17:54:28 UTC