Re: resources and URIs from Tim Berners-Lee on 2003-07-17 (www-tag@w3.org from July 2003)

From: Tim Berners-Lee <timbl@w3.org>
Date: Wed, 16 Jul 2003 23:43:59 -0400
To: pat hayes <phayes@ihmc.us>
Cc: www-tag@w3.org, Pat Hayes <phayes@ai.uwf.edu>
Message-Id: <E68BBAEE-B808-11D7-94B5-000393914268@w3.org>
On Tuesday, Jul 15, 2003, at 19:20 US/Eastern, pat hayes wrote:

> Gentlemen, I would like to ask you to please clarify the meaning of 
> the terms 'resource' and 'representation' in 
> http://www.w3.org/TR/2003/WD-webarch-20030627/.
>
> Allow me to elaborate.  Your introductory example asserts the 
> following:
>
> "Objects in the networked information system called resources are 
> identified by Uniform Resource Identifiers ( URIs ). "
>
> and later the document says:
>
> "URIs identify resources. When a representation of one resource refers 
> to another resource with a URI, a link is formed between the two 
> resources. The networked information system is built of linked 
> resources, and the large-scale effect is a shared information space. 
> The value of the Web grows exponentially as a function of the number 
> of linked resources (the "network effect").  "
>
> These, and other pieces of text concerning 'resources' published by 
> other W3C authorities,  seem to clearly indicate that the word 
> "resource" is intended to refer to the entities *in* the networked 
> information system: they are the kind of thing we use words like 
> 'website', 'client' and 'server' to describe; they are things with a 
> computational state, things with which one can communicate, things 
> which send and receive information which can be transmitted along 
> optical fibers and twisted pairs, things that can be linked to one 
> another.
>

Yes. Let us actually call these things Information Resources.  They are 
an important subclass of Resources.  You make a very good point, and I 
have asked for the Architecture document to be changed to reflect this.

Specifically, the things addressed directly by http: are all 
information resources.
This does *not* apply to other schemes.
But HTTP resources are as you describe, and I have been trying to get that
acknowledged in the arch doc.
(The issue is httpRange-14.)


> So far this is clear; and the account of 'representation' given in the 
> document is also then reasonably clear:
>
> "Agents (such as servers, browsers and multimedia players) communicate 
> resource state through a non-exclusive set of data formats, used 
> separately or in combination (e.g., XHTML, CSS, PNG, XLink, RDF/XML, 
> SVG, SMIL animation). In the travel scenario, Dan's user agent uses 
> the URI to request a representation of the identified resource. In 
> this scenario, the representation consists of XHTML with embedded 
> weather maps in SVG. "
>
> On this picture, the information (which Dan, in your introductory 
> example, reads on his screen, and which is in some sense all about the 
> weather in Oaxaca) is a representation of the (current state of) some 
> entity *in the WWW itself*: a resource in the global information 
> network: the state of some computer system, or maybe some abstraction 
> of a computer system.
>
> However, it is also clear that neither the weather in Oaxaca, nor 
> Oaxaca itself, are entities of this kind:  weather and cities in 
> Mexico are not the kind of entities which can be thought of as 
> 'objects on the networked information system'. Other examples abound, 
> eg http://chandra.harvard.edu/photo/2003/ngc1068/index.html  is 
> clearly about a galaxy containing a supermassive black hole, which is 
> also not something one would expect to find as part of a networked 
> information system, given the likely physical constraints on network 
> architecture.

Yes. In RDF, the "#" is like an operator which combines the identifier 
of an information resource with a local identifier used within that 
resource, and together forms an identifier for the abstract thing, like 
the weather.
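
A minimal sketch of that composition, assuming Python's standard 
urllib.parse module. The function name compose and the local name 
"weather" are illustrative assumptions, not anything defined by the 
specs or by the weather example itself.

```python
from urllib.parse import urldefrag


def compose(document_uri: str, local_name: str) -> str:
    """Join an information resource's URI and a local identifier with '#'."""
    if "#" in document_uri:
        raise ValueError("document URI must not already carry a fragment")
    return f"{document_uri}#{local_name}"


# The page Dan retrieves is the information resource.
page = "http://weather.example.com/oaxaca"

# The composed URI is an identifier for the abstract thing the page is about.
# The local name "weather" is a made-up example.
thing = compose(page, "weather")

# urldefrag splits the composed identifier back into its two parts.
document, local = urldefrag(thing)

print(thing)     # http://weather.example.com/oaxaca#weather
print(document)  # http://weather.example.com/oaxaca
print(local)     # weather
```

The point of the sketch is only that the two identifiers are distinct 
strings with a mechanical relationship: one names the document, the 
other names something the document talks about.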

> It seems that there is a systematic ambiguity between two senses of 
> 'resource' (or maybe two senses of 'representation') here. In your 
> first example, I doubt very much that Dan, when looking at his screen 
> after telling his browser to retrieve 
> http://weather.example.com/oaxaca, thinks of what he is reading as in 
> any sense about the state of something on the WW information network. 
> Certainly if I were in his shoes, I would be reading it as being about 
> Oaxaca and weather: that is why he is reading it, presumably: to find 
> out something about the weather in Oaxaca.  So what this 
> representation is *about* is not, apparently a resource:

It is a resource - resource is like daml:Thing.  The Arch doc confuses 
people at the moment by not introducing the class of information 
resources.  I am told that if I can convince Roy that this is a useful 
distinction, then we will probably be done, and I thought I almost had, 
but then he says no.

>  so it is not a representation of a resource, in the usual sense of 
> 'representation' and what is apparently your sense of 'resource'. 
> Similarly, http://chandra.harvard.edu/photo/2003/ngc1068/index.html 
> sure reads to me like it is about NGC 1068. But this means that either 
> it is a 'representation' which is not about what it is 'of', or else 
> that NGC 1068 is an 'object in the networked information system'; 
> neither of which seem to me to be remotely plausible as factual claims 
> using the ordinary senses of the words, and kind of brain-damaged as 
> attempts at a formal definition of some kind of architectural/semantic 
> theory.
>
> Now, this could be just a matter of philosophical opinion, were it not 
> for the fact that semantic web languages like RDF and OWL have been 
> given *formal* semantic theories which have direct architectural 
> consequences for Web agents, and which depend crucially on notions 
> like the term 'about' I have used rather loosely above.  RDF uses URI 
> references as *names* to *refer* to entities. So if a web page such as 
> http://chandra.harvard.edu/photo/2003/ngc1068/index.html were to 
> include RDF markup, one might expect to find things like this in it:
>
> <rdf:Description
> rdf:about="http://chandra.harvard.edu/NGC/ngc1068"
> rdf:type="http://chandra.harvard.edu/AOtype/Activegalaxy7">
> </rdf:Description>
>

No, that would be illegal by my way of thinking.
http://chandra.harvard.edu/NGC/ngc1068 is an information resource.
You would expect

<rdf:Description
rdf:about="http://chandra.harvard.edu/NGC#ngc1068"
rdf:type="http://chandra.harvard.edu/AOtype/Activegalaxy7">
</rdf:Description>

or, in the http://chandra.harvard.edu/NGC information resource,

<rdf:Description
rdf:about="#ngc1068"
rdf:type="http://chandra.harvard.edu/AOtype/Activegalaxy7">
</rdf:Description>

where you can see that local identifiers can be used to refer to 
abstract things, because that is what the RDF language spec says.
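
The equivalence of the two forms can be checked with ordinary relative 
reference resolution. This is a small sketch assuming Python's standard 
urllib.parse.urljoin; the variable names are illustrative.

```python
from urllib.parse import urljoin

# The information resource that contains the markup.
base = "http://chandra.harvard.edu/NGC"

# The local identifier as written in rdf:about inside that resource.
relative = "#ngc1068"

# Resolving the fragment-only reference against the base keeps the
# document URI and appends the local identifier.
absolute = urljoin(base, relative)

print(absolute)  # http://chandra.harvard.edu/NGC#ngc1068
```

So rdf:about="#ngc1068" inside http://chandra.harvard.edu/NGC and 
rdf:about="http://chandra.harvard.edu/NGC#ngc1068" anywhere else resolve 
to the same identifier.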

> where the URIs refer respectively to a galaxy and an RDFS class of 
> galaxy types. This is completely incompatible with what your document 
> says about resources and representations. 

It is incompatible with your assumption that ALL resources are 
information resources, just because there is so much talk of 
information resources.
That assumption is not made by the specs though.

>  Using the URI in this way does not create any kind of link between 
> anything on this planet and NGC 1068 (which is, fortunately, about 50 
> million light-years away).  But RDF/RDFS/DAML/OWL/OIL and all the 
> other emerging Semantic Web formalisms *require* that URIs be used in 
> this way, as *referring expressions*, not as informational links in a 
> global architecture.

The magic of the "#".
Now you see why http://.../foo.rdf and http://.../foo.rdf#bar are such 
different things.
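
A rough sketch of the difference, again assuming Python's standard 
urllib.parse. The helper names retrieval_target and kind are 
hypothetical, and the classification in kind encodes only the position 
argued in this message (an http URI without "#" names an information 
resource), not a rule from any spec. The sketch reuses the NGC URIs from 
above rather than the elided foo.rdf ones.

```python
from urllib.parse import urldefrag


def retrieval_target(uri: str) -> str:
    """Return the URI a client would request: the fragment is not sent."""
    return urldefrag(uri).url


def kind(uri: str) -> str:
    """Classify an http URI under the view argued here (an assumption)."""
    if urldefrag(uri).fragment:
        return "thing named within a document"
    return "information resource"


doc_uri = "http://chandra.harvard.edu/NGC"
galaxy_uri = "http://chandra.harvard.edu/NGC#ngc1068"

# Both URIs lead to the same retrieval, but they identify different things.
print(retrieval_target(doc_uri))     # http://chandra.harvard.edu/NGC
print(retrieval_target(galaxy_uri))  # http://chandra.harvard.edu/NGC
print(kind(doc_uri))                 # information resource
print(kind(galaxy_uri))              # thing named within a document
```

Retrieval only ever touches the document; the "#" form can therefore 
refer to a galaxy without claiming that the galaxy is on the network.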

> The RDF/RDFS/OWL semantics assumes that URI references refer to 
> "resources" , but it explicitly denies that this word "resource" is 
> limited to the kinds of resource that you seem to be talking about. On 
> the SW view, *anything* is a resource: galaxies, regions of France, 
> kinds of wine, sodium atoms, classes, mathematical abstractions, even 
> fictional entities: anything that can be referred to by a name. None 
> of these can possibly be "objects in a networked information system".

Yes, "objects in the networked information system" are just another 
subclass of Resource, from RDF's point of view.

>   So whatever you are talking about, and whatever they are talking 
> about, y'all cannot possibly be using the words "resource" and 
> "representation" in the same sense.
>
> As a result, several of the assertions you make in this document are 
> not correct. For example
>
> 2.8.2
> "Emerging Semantic Web technologies, including "DAML+OIL" [ DAMLOIL ] 
> and "Web Ontology Language (OWL)" [ OWL10 ], define RDF properties 
> such as equivalentTo and FunctionalProperty to state -- or at least 
> claim -- formally that two URIs identify the same resource. "
>
> is incorrect. These assertions claim that two URI references *denote* 
> the same entity in all interpretations. That is not the same notion as 
> 'identify'.

No?  How do they differ?  One seems to be expressed in MTspeak, the 
other in normal software engineering speak.

> In fact, there is no such notion as 'identify' in RDF/RDFS/OWL 
> semantics; and the first principle in section 2 ("All important 
> resources SHOULD be identified by a URI ") is meaningless when taken 
> literally in the context of semantic web languages,

Which just shows that taking one statement literally in a foreign 
context does not preserve its meaning in natural language.  The arch 
doc is not written in the languages in which OWL is described.  You can 
probably translate though.

> as URIs there typically cannot be said to identify anything: they act 
> as names whose possible referents are constrained by the assertions 
> made using them, but they are not 'linked' to anything, not 'bound' to 
> anything, and are not obliged to 'identify' anything;

I say they identify things, you say they can't.  So I suppose you have 
to show me what *you* mean by identify which breaks.

> and the universes of discourse may contain entities which cannot 
> possibly be all identified or even referred to by URIs, since there 
> are too many of them, or it is physically impossible to identify them 
> with enough precision, or simply because it is impractical to do so.
>

No one said that there was a URI for every resource.  It is quite a 
different thing to say that *any* resource can have a URI (which is 
true) than to say that *every* resource has a URI.  The system does not 
require that every resource has a URI.

> ------
>
> Sorry this comes across so negatively, but there seems to be a central 
> misunderstanding right at the center of several architectural accounts 
> of the Web, and I think it is important to get it sorted out.
>

Indeed, this distinction between information resources and Resources in 
general has not had an easy passage, but it is absolutely 
necessary.  I hope we can get it into the arch doc soon.

Also, the fact that an identifier can be composed, using "#", of a term 
used in a document combined with the global identifier for the document, 
in order to construct a global identifier for the thing identified by 
the term, while so simple, is subject to a lot of criticism.  But it 
works and resolves these issues - even if sometimes the document is 
imaginary.

Tim
Received on Wednesday, 16 July 2003 23:44:04 UTC