Re: The meaning of "representation"

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 4 Dec 2007 18:17:49 -0500
To: wangxiao@musc.edu
Cc: Chimezie Ogbuji <chimezie@gmail.com>, Mikael Nilsson <mikael@nilsson.name>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Message-ID: <OF321C6F01.0EB38BCE-ON852573A7.00785FE2-852573A7.007A2A7E@lotus.com>

Xiaoshu Wang writes:

> I am not sure if I have misunderstand. Do you want to say that 
> "information *is* bit-stream"?

Long ago Dan Connolly proposed what I think is often a pretty good rule of 
thumb for contributions to this list[1]: "If a thread goes back and forth 
three times without anybody suggesting textual changes to the document, 
something's wrong."  I like that guideline, and I think we should take it 
as advice to wrap up at least this little bit of the discussion, perhaps 
without agreement if necessary. 

Still, you've asked a direct question, so I think that deserves at least a 
try at a direct answer:  No, I don't think I'd say that the "information 
is the bit stream".  The way I understand Shannon, we can start by 
imagining, as an example, a fixed set of messages I might wish to convey 
to you:

1. The light is red.
2. The light is yellow.
3. The light is green.
4. The light is off.

We know in advance that these are the possible messages.  If I, using some 
code or other, manage to tell you that, for example, the light is green, 
then we can say I have conveyed that information to you.  Now, as someone 
who is interested in semantics you might say "which light?  what do you 
mean by green?  Were you trying to distinguish regular green from dark 
green, or did you mean any shade of green is OK?"  Both Shannon and I, for 
purposes of this exchange, are saying [2]: "Frequently the messages have 
meaning; that is they refer to or are correlated according to some system 
with certain physical or conceptual entities. These semantic aspects of 
communication are irrelevant to the engineering problem."  They're not 
irrelevant to the semantic web, but I think they are not essential to the 
fundamental notion of information, as in "information resource". 

All that's required here is that you and I agree that there are four 
choices, which we agree to label with the sentences above.  Now, we may 
have taken the trouble to agree on very exact semantics for, say, the 
color green, so you'll know exactly what frequency my light color will be. 
Other communities may be much more informal about it.  The Web allows 
both, though I don't doubt that the semantic Web will allow much more 
precise reasoning to be done when people's RDF properties have carefully 
defined and communicated semantics, rather than loosely or informally 
defined ones. 

Getting back to your question, I don't think the bits are information. One 
encoding for the above information is as 2 bits per message, 00 = the 
light is red, and so on.  Shannon points out that this is an optimal 
encoding bandwidth-wise only if the 4 choices are equally likely, but we 
don't care about that here.  You and I could instead agree on a more 
verbose encoding, such as the UTF-8 bytes for the characters RED = The 
light is red, GREEN = The light is green, and so on.  I would say that the 
bits for the two codings are very different (UTF-8 vs. little 2-bit 
sequences) but the information conveyed is the same.
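
To make the two codings concrete, here is a minimal Python sketch.  Only 
"00 = the light is red" comes from the text above; the other three 2-bit 
assignments, the labels YELLOW and OFF, and all function names are 
assumptions made for illustration.  The entropy helper is just one way to 
check the "optimal only if equally likely" remark.

```python
import math

# The four messages agreed on in advance.
MESSAGES = [
    "The light is red.",
    "The light is yellow.",
    "The light is green.",
    "The light is off.",
]

# Coding A: two bits per message. Only "00" = red is given in the text;
# the remaining assignments are assumed for this sketch.
TWO_BIT = {"00": 0, "01": 1, "10": 2, "11": 3}

# Coding B: the UTF-8 bytes of a short label (YELLOW and OFF are assumed).
LABELS = {b"RED": 0, b"YELLOW": 1, b"GREEN": 2, b"OFF": 3}


def decode_two_bit(bits: str) -> str:
    """Map a 2-bit string back to the agreed message."""
    return MESSAGES[TWO_BIT[bits]]


def decode_label(data: bytes) -> str:
    """Map UTF-8 label bytes back to the agreed message."""
    return MESSAGES[LABELS[data]]


def entropy_bits(probabilities) -> float:
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

Decoding "10" and decoding the UTF-8 bytes of "GREEN" both yield "The 
light is green.", even though one input is 2 bits and the other is 40. 
With four equally likely choices entropy_bits gives exactly 2 bits, which 
is why the 2-bit code cannot be beaten in that case; for a skewed 
distribution the value drops below 2.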

Going a bit further, I can pretty well signal my views on information 
resources using this example:  Let's say that I, as the owner of 
http://example.com/lightStatus, decide to assign that URI to a resource 
which has as its state the 4-way choice above.  It knows that a light 
is either red, green, yellow or off.  I believe that I can fully convey 
the essence of that resource in a message (I've just shown at least two 
ways), so it's an information resource.  It doesn't matter whether I had 
in mind some particular real world traffic light, some rigorous definition 
of the color green, etc.  It's a resource that can answer a question one 
of 4 ways, and it's an information resource.  I might instead be thinking 
of a real actual traffic light, the kind that will actually bend your car 
if you run into it.  That's not an information resource, because I can't 
use messages to bend your car as it runs into the light.  The thing has 
mass and is screwed down to the street.  If I want to put up on the Web a 
resource that tells you whether that light is red, yellow, green or off, I 
should use a 303 redirect from the URI of the big heavy light itself.
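
As a rough sketch of that arrangement, the Python below models the two 
GET responses as a plain function rather than a real server.  The path 
/light for the physical traffic light, the function name respond, and the 
text/plain body are all hypothetical choices for illustration; only 
/lightStatus and the 303 come from the paragraph above.

```python
# Hypothetical path identifying the physical traffic light itself.
LIGHT_PATH = "/light"
# Path of http://example.com/lightStatus, the information resource.
STATUS_PATH = "/lightStatus"


def respond(path: str, state: str = "GREEN"):
    """Return (status_code, headers, body) for a GET on the given path."""
    if path == LIGHT_PATH:
        # The light itself cannot be sent in a message, so redirect
        # (303 See Other) to the resource that describes its state.
        return 303, {"Location": STATUS_PATH}, b""
    if path == STATUS_PATH:
        # The 4-way state can be fully conveyed in a message.
        headers = {"Content-Type": "text/plain; charset=utf-8"}
        return 200, headers, state.encode("utf-8")
    return 404, {}, b""
```

A GET on /light returns 303 with Location: /lightStatus and an empty 
body; a GET on /lightStatus returns 200 with a representation such as 
GREEN.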

That's how I see it anyway.  Again, I think we should wrap up this bit of 
the discussion.  I think we've aired enough of the issues that others on 
this thread can draw on whatever aspects of your analysis or mine they 
find useful. 

Noah




[1] http://www.w3.org/2001/tag/tatn
[2] http://plan9.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 4 December 2007 22:16:42 GMT