resources and URIs from pat hayes on 2003-07-15 (www-tag@w3.org from July 2003)

From: pat hayes <phayes@ihmc.us>
Date: Tue, 15 Jul 2003 18:20:10 -0500
To: www-tag@w3.org
Cc: Pat Hayes <phayes@ai.uwf.edu>
Message-Id: <p06001215bb39faa42145@[10.0.100.23]>

Gentlemen, I would like to ask you to please clarify the meaning of
the terms 'resource' and 'representation' in 
http://www.w3.org/TR/2003/WD-webarch-20030627/.

Allow me to elaborate.  Your introductory example asserts the following:

"Objects in the networked information system called resources are 
identified by Uniform Resource Identifiers ( URIs ). "

and later the document says:

"URIs identify resources. When a representation of one resource 
refers to another resource with a URI, a link is formed between the 
two resources. The networked information system is built of linked 
resources, and the large-scale effect is a shared information space. 
The value of the Web grows exponentially as a function of the number 
of linked resources (the "network effect").  "

These, and other pieces of text concerning 'resources' published by
other W3C authorities, seem to clearly indicate that the word
"resource" is intended to refer to the entities *in* the networked 
information system: they are the kind of thing we use words like 
'website', 'client' and 'server' to describe; they are things with a 
computational state, things with which one can communicate, things 
which send and receive information which can be transmitted along 
optical fibers and twisted pairs, things that can be linked to one
another.

So far this is clear; and the account of 'representation' given in 
the document is also then reasonably clear:

"Agents (such as servers, browsers and multimedia players) 
communicate resource state through a non-exclusive set of data 
formats, used separately or in combination (e.g., XHTML, CSS, PNG, 
XLink, RDF/XML, SVG, SMIL animation). In the travel scenario, Dan's 
user agent uses the URI to request a representation of the identified 
resource. In this scenario, the representation consists of XHTML with 
embedded weather maps in SVG. "

On this picture, the information (which Dan, in your introductory 
example, reads on his screen, and which is in some sense all about 
the weather in Oaxaca) is a representation of the (current state of) 
some entity *in the WWW itself*: a resource in the global information 
network: the state of some computer system, or maybe some abstraction 
of a computer system.

However, it is also clear that neither the weather in Oaxaca nor
Oaxaca itself is an entity of this kind: weather and cities in
Mexico are not the kind of entities which can be thought of as
'objects in the networked information system'. Other examples abound,
e.g. http://chandra.harvard.edu/photo/2003/ngc1068/index.html is
clearly about a galaxy containing a supermassive black hole, which is
also not something one would expect to find as part of a networked
information system, given the likely physical constraints on network
architecture.

It seems that there is a systematic ambiguity between two senses of 
'resource' (or maybe two senses of 'representation') here. In your 
first example, I doubt very much that Dan, when looking at his screen 
after telling his browser to retrieve 
http://weather.example.com/oaxaca, thinks of what he is reading as in 
any sense about the state of something on the WWW information network.
Certainly if I were in his shoes, I would be reading it as being
about Oaxaca and weather: that is why he is reading it, presumably:
to find out something about the weather in Oaxaca.  So what this
representation is *about* is not, apparently, a resource: so it is not
a representation of a resource, in the usual sense of
'representation' and what is apparently your sense of 'resource'.
Similarly, http://chandra.harvard.edu/photo/2003/ngc1068/index.html 
sure reads to me like it is about NGC 1068. But this means that
either it is a 'representation' which is not about what it is 'of',
or else that NGC 1068 is an 'object in the networked information
system'; neither of which seems to me to be remotely plausible as
a factual claim using the ordinary senses of the words, and both of
which seem kind of brain-damaged as attempts at a formal definition
of some kind of architectural/semantic theory.

Now, this could be just a matter of philosophical opinion, were it 
not for the fact that semantic web languages like RDF and OWL have 
been given *formal* semantic theories which have direct architectural 
consequences for Web agents, and which depend crucially on notions 
like the term 'about' I have used rather loosely above.  RDF uses URI 
references as *names* to *refer* to entities. So if a web page such 
as http://chandra.harvard.edu/photo/2003/ngc1068/index.html were to 
include RDF markup, one might expect to find things like this in it:

<rdf:Description
rdf:about="http://chandra.harvard.edu/NGC/ngc1068"
rdf:type="http://chandra.harvard.edu/AOtype/Activegalaxy7">
</rdf:Description>

where the URIs refer respectively to a galaxy and an RDFS class of 
galaxy types. This is completely incompatible with what your document 
says about resources and representations.  Using the URI in this way
does not create any kind of link between anything on this planet and
NGC 1068 (which is, fortunately, about 50 million light-years away).
But RDF/RDFS/DAML/OWL/OIL and all the other emerging Semantic Web 
formalisms *require* that URIs be used in this way, as *referring 
expressions*, not as informational links in a global architecture.
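
To make the point concrete, here is a minimal sketch in Python, using
only the standard library. It reads a description like the one above
and treats each URI purely as a name, never dereferencing it. The
rdf:RDF wrapper, the namespace declaration and the function name
`triples` are assumptions added so that the fragment parses; they are
not taken from any actual Chandra page.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Well-formed wrapper around the quoted description. The rdf:RDF element
# and the namespace declaration are additions made for this sketch.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}">
  <rdf:Description
      rdf:about="http://chandra.harvard.edu/NGC/ngc1068"
      rdf:type="http://chandra.harvard.edu/AOtype/Activegalaxy7"/>
</rdf:RDF>"""


def triples(xml_text):
    """Return (subject, predicate, object) triples of URI strings.

    Nothing is fetched over the network: the URIs are handled only as names.
    """
    root = ET.fromstring(xml_text)
    about_key = f"{{{RDF_NS}}}about"
    result = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(about_key)
        for attr, value in desc.attrib.items():
            if attr == about_key:
                continue
            # ElementTree spells qualified names as "{namespace}local".
            predicate = attr[1:].replace("}", "")
            result.append((subject, predicate, value))
    return result


for s, p, o in triples(DOC):
    print(s, p, o)
```

The code never opens a connection to chandra.harvard.edu; all it
records is that one name is asserted to have another name as its type.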

The RDF/RDFS/OWL semantics assumes that URI references refer to
"resources", but it explicitly denies that this word "resource" is
limited to the kinds of resource that you seem to be talking about. 
On the SW view, *anything* is a resource: galaxies, regions of 
France, kinds of wine, sodium atoms, classes, mathematical 
abstractions, even fictional entities: anything that can be referred 
to by a name. None of these can possibly be "objects in a networked 
information system".  So whatever you are talking about, and whatever 
they are talking about, y'all cannot possibly be using the words 
"resource" and "representation" in the same sense.

As a result, several of the assertions you make in this document are 
not correct. For example

2.8.2
"Emerging Semantic Web technologies, including "DAML+OIL" [ DAMLOIL ]
and "Web Ontology Language (OWL)" [ OWL10 ], define RDF properties 
such as equivalentTo and FunctionalProperty to state -- or at least 
claim -- formally that two URIs identify the same resource. "

is incorrect. These assertions claim that two URI references *denote* 
the same entity in all interpretations. That is not the same notion 
as 'identify'.

In fact, there is no such notion as 'identify' in RDF/RDFS/OWL 
semantics; and the first principle in section 2 ("All important 
resources SHOULD be identified by a URI ") is meaningless when taken 
literally in the context of semantic web languages, as URIs there 
typically cannot be said to identify anything: they act as names 
whose possible referents are constrained by the assertions made using 
them, but they are not 'linked' to anything, not 'bound' to anything, 
and are not obliged to 'identify' anything; and the universes of 
discourse may contain entities which cannot possibly be all 
identified or even referred to by URIs, since there are too many of 
them, or it is physically impossible to identify them with enough 
precision, or simply because it is impractical to do so.
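
A toy illustration of what 'constrained but not bound' means here,
again as a hedged Python sketch rather than anything from the RDF
specifications. The three-element universe, the names ex:a and ex:b,
and the helper functions are all hypothetical. An interpretation maps
each name to some element of the universe; an assertion that two names
denote the same thing rules out some interpretations but does not pick
out a single referent.

```python
from itertools import product

# Hypothetical toy universe of discourse and two hypothetical URI references.
UNIVERSE = ["galaxy", "city", "sand_grain"]
NAMES = ["ex:a", "ex:b"]


def interpretations(names, universe):
    """Yield every possible mapping from names to elements of the universe."""
    for referents in product(universe, repeat=len(names)):
        yield dict(zip(names, referents))


def satisfies_same(interp, x, y):
    """True if the interpretation makes x and y denote the same entity."""
    return interp[x] == interp[y]


all_interps = list(interpretations(NAMES, UNIVERSE))
# Interpretations that satisfy the assertion "ex:a and ex:b denote the same thing".
models = [i for i in all_interps if satisfies_same(i, "ex:a", "ex:b")]

print(len(all_interps), "interpretations,", len(models), "satisfy the assertion")
```

Three interpretations survive the assertion, one for each element of
the universe: the two names are constrained to co-denote, yet neither
has been tied to any particular thing.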

------

Sorry this comes across so negatively, but there seems to be a
fundamental misunderstanding right at the center of several architectural
accounts of the Web, and I think it is important to get it sorted out.

There are two distinct models of how names refer. In some ways, URIs 
are like file names in a programming language: they provide a way to 
access a piece of information, a global address of some entity which 
delivers information on demand. In this sense they provide a link
between network components, can be considered to be unique, and it
is reasonable to claim that every one of the resources they link 
should be identified by a URI.  This is the old sense of URLs, which 
of course has now been generalized: but URIs, particularly when 
discussed in an architectural framework, seem to retain a kind of 
shadow of this URL heritage.  In other ways, emphasized more recently 
and particularly by the semantic web languages, URIs are more like 
referring names in an assertional language: they simply denote 
things. In this sense they do not provide links (naming something
does not establish a link to it, e.g. one can name entities which no
longer exist or could possibly exist, such as the 19th century); they
are not unique; and it is ridiculous to claim that every entity 
should have a name (does every grain of sand on Pensacola beach have 
a URI? But it is easy to invent a URI for the rdfs:ClassOf all the 
sand grains on Pensacola Beach, and to assert - possibly in a rule 
language - that every such grain is made of quartz. This requires 
that every grain be considered to be a 'resource'.). Many of these 
architectural requirements make sense only if we interpret the 
language of the documents as though URIs were slight generalizations 
of URLs. For example - due to Roy Fielding - consider a webcam which 
delivers a view of a room when suitably pinged; then one might say 
that the room is the resource 'identified' by the webcam's URI; this 
kind of generalization of 'resource' allows the edges of the Web to 
extend a little further into the world surrounding what is usually 
thought of as the global network.  But names, and the idea of 
reference, extend much further than this: they extend to the entire 
universe of things that exist, will exist, have ever existed or could 
possibly exist.  Most of what is said about URIs *does not make 
sense* when one tries to read the language of the documents as though 
URIs are general referring names; and yet the semantic web standards 
are being written based on this assumption.

We need to get clear on this issue, or else we will continue to be 
mired in confusion.

Let me suggest that it would be worth distinguishing between what a 
representation is *about*, and what resource *produced* it. The 
document currently says that URIs are used to retrieve 
representations 'of' a resource.  It is easy to read this as saying 
that the representation is 'about' the resource: that it 'refers to' 
or 'describes' the resource; but this is evidently incompatible with 
the notion of a resource as something that must be 'part of' an 
informational network.  This point has nothing particularly to do 
with the semantic web, by the way: it is just as true of the current
web, as the Chandra example (and indeed your own Oaxaca weather
example) shows. The *source* of a representation and what might be
called the *topic or content* of the representation need have very 
little to do with one another (although trust may be based on a 
judgement of the authority of the former to make claims about the 
latter). Right now the two ideas seem to be confused; I think it 
would be clearer if they could be explicitly separated. This 
shouldn't make any deep difference to the purely architectural issues 
you are describing, but it will greatly help to clarify the semantic 
issues which depend in part on them, and which we still have not 
managed to fully harmonize with them.
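
One way to picture the suggested separation, offered only as a sketch:
a record that keeps the source of a representation and its topic in
different fields. The class name, the field names and the URI used to
name the weather are hypothetical and are not taken from the
architecture document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    source: str      # URI of the network resource that produced the data
    about: str       # URI reference used as a name for the topic
    media_type: str
    body: str


def describes_its_source(rep):
    """True only when the topic of the representation is its own source."""
    return rep.source == rep.about


weather = Representation(
    source="http://weather.example.com/oaxaca",
    # Hypothetical name for the weather itself, which is not on the network.
    about="http://example.org/names/weather-in-oaxaca",
    media_type="application/xhtml+xml",
    body="<p>...</p>",
)

print(describes_its_source(weather))
```

With the two fields kept apart, a trust judgement can be stated as a
relation between `source` and `about` instead of being hidden in a
single word 'of'.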

Pat Hayes

-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 15 July 2003 19:20:20 UTC