Re: The meaning of "representation" from Tim Berners-Lee on 2007-11-25 (www-tag@w3.org from November 2007)

From: Tim Berners-Lee <timbl@w3.org>
Date: Sun, 25 Nov 2007 14:09:28 -0500
To: wangxiao@musc.edu
Cc: www-tag@w3.org, Mikael Nilsson <mikael@nilsson.name>, Chimezie Ogbuji <chimezie@gmail.com>
Message-Id: <52E6BE83-B51D-4C20-AF99-F7904192DD2B@w3.org>
On 2007-11-25, at 08:47, Xiaoshu Wang wrote:

> Tim Berners-Lee wrote:
>> a) The definition in AWW resources "all of their essential  
>> characteristics can be conveyed in a message" does not speak to  
>> where you are coming from. You take "essential characteristics" as  
>> meaning "Properties' rather than " content".  So, not having  
>> understood that, you are happy to consider Pat to be one.  That I  
>> think is a problem with that definition.  However, remember natural  
>> language is an imprecise  tool for making these definitions, and  
>> efforts is needed on the part of reader as well as writer. An  
>> Information Resource is information.  Pat is not.
> I am not, and also won't present to be, a linguist.  But if we were  
> to replace "properties" with "content"? Will that invalid my  
> argument and strength yours?

Yes, it would invalidate your argument.  Using your notation, C is the
content of X.  C, if you like, expresses the meaning of X.
Two representations a and b will then normally have exactly Ca = Cb = C,
as you say.  (In the case that b is, for example, in a less expressive
language, Cb is a subset of C.  In this case, such as my accessing your
document on a cellphone which can't do pictures, the system relies on my
understanding that I might not have got all of it, and if you serve
different representations, not all complete, then you as a publisher
recognize that and accept the consequences.  This happens, for example,
with image resolution.)  But the basic model is that all
representations of an information resource convey the same information.


In an open world, anyone can say anything *about* a document: its
properties.  I can say that document X is written by you, is
misleading, is interesting, etc.  This is all *about* the
document; it is not the *content* of the document.  However, one cannot
add to the *content* of the document.
If I serve up a representation (say at b) which I maintain is a
representation of your document ( <a> sameAs <b> ) but I add more
triples, then I am lying, misrepresenting what you said.
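
The model in the two paragraphs above can be sketched in a few lines of
Python.  This is only an illustration under a loud assumption: the content
C of a document is treated as a plain set of triples, and the names
is_faithful, Ca, Cb and Cx, as well as the sample triples, are made up for
the example.

```python
# Hypothetical model: the content C of document <a> is a set of triples.
Triple = tuple[str, str, str]


def is_faithful(representation: set[Triple], content: set[Triple]) -> bool:
    """A representation may leave statements out, but must not add any."""
    return representation <= content


# Made-up content for illustration only.
C = {
    ("<a>", "dc:title", "A technical note"),
    ("<a>", "dc:creator", "the author"),
}

Ca = set(C)                                        # full representation: Ca == C
Cb = {("<a>", "dc:title", "A technical note")}     # less expressive: subset of C
Cx = C | {("<a>", "ex:endorsedBy", "someone")}     # adds a triple the author never wrote

print(is_faithful(Ca, C), is_faithful(Cb, C), is_faithful(Cx, C))
# prints: True True False
```

Saying things *about* <a> would live in some other graph entirely; it is
only claiming Cx to be a representation of <a> that is the misrepresentation.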



>  Or, will it help me or others to get an objective definition of  
> "information resource"?
> I am not, and also won't present to be, a philosopher. But, what is  
> an "information"?

Information has been quantified by Shannon, who allows us to measure
it and do some math about it.
You can model it in various ways.  One way is to imagine that I have
very little idea of your state of mind, or your situation.  Then
you send me information: you publish something I read on the web.  As
a result of reading it, I have significantly cut down the
possibilities for what I imagine your state of mind to be.
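
As a rough worked example of "cutting down the possibilities", here is a
minimal sketch.  It assumes, purely for illustration, that all the
possibilities are equally likely, in which case the information gained is
the base-2 logarithm of the ratio; the function name bits_gained and the
numbers are invented for the example.

```python
import math


def bits_gained(possibilities_before: int, possibilities_after: int) -> float:
    """Bits of information gained when equally likely possibilities are narrowed."""
    if not 0 < possibilities_after <= possibilities_before:
        raise ValueError("expected 0 < after <= before")
    return math.log2(possibilities_before / possibilities_after)


# Before reading: 1024 states of mind seem possible; after reading: 4 remain.
print(bits_gained(1024, 4))  # 8.0
```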

> IMHO, Information is never a static or physical thing.

It is not physical.  It is abstract.  But you can have a very static
thing like the RDFS ontology, which is a static set of statements.

>  Information is acquired through a process but not presented as  
> being is.

You discuss the web as a communication system, but that is I think too
low a level. It is useful to rise above the level of communication
system, and think of it as a world of interconnected documents. It is
true that the communication system brings this WWW world into  
existence.   And we are discussing the details of how that happens.   
But the goal is to make a web of documents in an abstract space. That  
is the web.

> This is the hardest part, ant it took me quite a while to get it,  
> during the design the core DFDF ontology (I explained it somewhat in  
> DFDF primer).  For me, Pat is a person.  Only through interacting  
> with him or his web proxy http://www.ihmc.us/users/phayes/PatHayes  
> do I get some information about Pat.  The same is
> http://www.w3.org.  If I don't interact with http://www.w3.org, it might
> just as well be Pat.

>
>> - You try to make an architecture in which Pat Hay's famous page is  
>> true. IMO his page is false, and misleading.
> This is what prompted me to write things up.   As you said, "natural  
> language is an imprecise  tool".  So, can we be clear what do you  
> mean "Pat Hay's famous page is false".  Do you mean?
> (1) http://www.ihmc.us/users/phayes/PatHayes is false?

You are "begging the question" - it all depends on what you mean by  
'false'.   I specifically said it in N3, using "FalseDocument".  That  
is the class (for the sake of this argument) of documents which convey  
a falsehood.

If the document were in RDF, I could say it at a higher level:

   [ is log:semantics of <http://www.ihmc.us/users/phayes/PatHayes> ] a log:Falsehood.

In other words, the graph you get when you access Pat's document is a
logical falsehood.
It would contain something like the triple

<http://www.ihmc.us/users/phayes/PatHayes> a foaf:Person.

when I know that
<http://www.ihmc.us/users/phayes/PatHayes> a gen:InformationResource,

and those two classes are disjoint.
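
The clash can be made concrete with a small sketch.  This is not how any
real reasoner works; it assumes type statements are given as simple
(resource, class) pairs, and the names DISJOINT and find_clashes are
hypothetical.

```python
PAT = "http://www.ihmc.us/users/phayes/PatHayes"

# Assumed for this argument: these two classes share no members.
DISJOINT = {frozenset({"foaf:Person", "gen:InformationResource"})}


def find_clashes(type_statements):
    """Return (resource, class_a, class_b) for resources typed with two disjoint classes."""
    types = {}
    for resource, cls in type_statements:
        types.setdefault(resource, set()).add(cls)
    clashes = []
    for resource, classes in types.items():
        for pair in DISJOINT:
            if pair <= classes:
                clashes.append((resource, *sorted(pair)))
    return clashes


statements = [
    (PAT, "foaf:Person"),              # what the document would assert
    (PAT, "gen:InformationResource"),  # what is known about the URI
]
print(find_clashes(statements))
```

With only one of the two statements present the list comes back empty; it
is the combination that is contradictory.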


> (2) the representation of http://www.ihmc.us/users/phayes/PatHayes  
> is false?

See above.  I don't argue about the representation of things, just as  
I keep talking about your technical note, not about a representation  
of it.

(Your instinct was good when you protested the generation of RDF  
triples from HTTP as you mentioned very properly the importance of  
orthogonality.   That is the issue here too.  In fact I think modeling  
HTTP using the RDF language )


> I really want to understand on which ground they can be false.  We  
> cannot say (1) is wrong, right? because Pat is the innocent party,  
> who are dragged into this debate by me.  But how can (2) be wrong  
> too?  As you said latter, a representation is a set of bit, and the  
> identity is the content.  And the representation of that URI did  
> talk about Pat, doesn't it?

It says that a document is Pat.  That is wrong.

>
>> - You say a Representation is different from the "content of the  
>> representation". However, as the representation is a set of bits,  
>> its identity is is contents, IMO.
> See above.
>>
>> -  If I understand correctly, I think you use content-negotiation  
>> for distinguishing between the binary data DFDF things and the RDF  
>> metadata DFDF things, which are completely different.  One is not   
>> substitute  for the other.  Content negotiation is inappropriate.    
>> This may be a source of great confusion.
>>
>> - You at one point propose, to counter the need for IR as a first- 
>> class object,  double sets of vocabulary, one set for talking about  
>> the document and one its subject.  This has been suggested in the  
>> past mid-argument, I forget where exactly.   I find that approach  
>> unsatisfactory for a number of reasons.
> With regard to the web, there are creators for resource

I take this to mean:  people create things.

> and there are creators for web pages.

I take this to mean:  People create information resources (aka web  
pages).

>  No matter what, we need to create an additional URI.

What for?  The thi

>  303 tries to create a different URI to separate resource from page.

It allows the URI for a person (say) and the URI for a page to be  
different.  Yes. Even Pat is convinced of this now.
You really can't build a serious knowledge representation system in
which you can't talk about a person and their web page separately as
first class objects.

> A new vocabulary tries to separate the semantics of similar wording.  
> IMHO, the latter is a better engineer design because the same  
> vocabulary can be reused to describe a great many resources but the  
> former must create a URI for a resource and redirect.

The # works really well in this case.

>
>>  - You can't use arbitrary predicates, without having some  
>> mechanism for generating them.
> I am a bit lost here.  If dublin core add a dc:repCreator to their  
> vocabulary, will it be arbitrary?


Suppose MD is a URI for a book which actually dereferences to
the library card about the book:

	<MD>    dc:title            "Moby Dick".
	<MD>    dc:repTitle         "Catalog card 98712345".
	<MD>    dc:creatorName      "Herman Melville".
	<MD>    dc:repCreatorName   "A.N. Librarian".
	<MD>    dc:creationDate     "1851".
	<MD>    dc:repCreationDate  "1998".

Basically, I think you are saying that for each property about the  
book, I need a shadow one about the web resource about the book.

And that is just to resolve the ambiguity in the *object*.  Suppose now
I want to make a link "predates".  Do I have to make four predicates

		predates
		repPredates
		predatesRep
		repPredatesRep

to make sure I can refer, for the subject and/or the object, to either
the subject of the page or the page itself?
It is much better to have separate URIs:


	<MD#book>    dc:title          "Moby Dick".
	<MD>         dc:title          "Catalog card 98712345".
	<MD#book>    dc:creatorName    "Herman Melville".
	<MD>         dc:creatorName    "A.N. Librarian".
	<MD#book>    dc:creationDate   "1851".
	<MD>         dc:creationDate   "1998".
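
The difference in vocabulary size can be counted with a short sketch.  It
assumes each end of a link may mean either the thing or the page about it;
the function shadow_variants and the second resource <X> are hypothetical,
introduced only for the illustration.

```python
from itertools import product


def shadow_variants(predicate: str) -> list[str]:
    """Predicate names needed when each end of a link may mean thing or page."""
    names = []
    for subject_is_rep, object_is_rep in product([False, True], repeat=2):
        name = predicate
        if subject_is_rep:
            name = "rep" + name[0].upper() + name[1:]
        if object_is_rep:
            name += "Rep"
        names.append(name)
    return names


print(shadow_variants("predates"))
# ['predates', 'predatesRep', 'repPredates', 'repPredatesRep']

# With separate URIs, one predicate covers all four combinations.
md_ends = ["<MD#book>", "<MD>"]
x_ends = ["<X#book>", "<X>"]  # hypothetical second resource
triples = [(s, "predates", o) for s, o in product(md_ends, x_ends)]
print(len({p for _, p, _ in triples}), "predicate,", len(triples), "statements")
# 1 predicate, 4 statements
```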

Tim
Received on Sunday, 25 November 2007 19:09:38 UTC