Re: URIs as names-for-reference vs locations-for-access

In the world of RDF, one is free to dereference a URI. That is a real 
problem, because a representationREST may be temporarily inaccessible due 
to network failure or server maintenance. In the world of OWL, we have the 
owl:imports vocabulary and the concept of an imports closure, so it is no 
longer a problem. Hence, semantic inference in the world of RDF has to be 
done manually by merging RDF graphs, whereas semantic inference in the 
world of OWL can be done dynamically when the collection of ontologies, 
axioms, and facts is imports-closed.
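
As a rough illustration of the imports closure idea (a minimal sketch, not 
an OWL implementation): the ontology URIs and the IMPORTS table below are 
made up, and the table stands in for the owl:imports statements one would 
otherwise obtain by dereferencing each ontology.

```python
from collections import deque

# Hypothetical data: each ontology URI maps to the URIs it owl:imports.
# A real system would obtain these by fetching and parsing each ontology.
IMPORTS = {
    "http://example.org/a": ["http://example.org/b"],
    "http://example.org/b": ["http://example.org/c", "http://example.org/a"],
    "http://example.org/c": [],
}


def imports_closure(root, imports):
    """Return every ontology URI reachable from root via owl:imports."""
    closure = set()
    queue = deque([root])
    while queue:
        uri = queue.popleft()
        if uri in closure:
            continue  # already visited; this also handles import cycles
        closure.add(uri)
        queue.extend(imports.get(uri, ()))
    return closure


def is_imports_closed(collection, imports):
    """True if everything imported by a member is itself a member."""
    return all(
        target in collection
        for uri in collection
        for target in imports.get(uri, ())
    )


closure = imports_closure("http://example.org/a", IMPORTS)
print(sorted(closure))
print(is_imports_closed(closure, IMPORTS))
```

A collection holding only http://example.org/a would not be imports-closed 
under this table, since it imports http://example.org/b.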

It seems that it is not a problem in the semantic web.


Jeremy

----- Original Message ----- 
From: "Harry Halpin" <hhalpin@ibiblio.org>
To: <www-tag@w3.org>
Cc: <www-rdf-interest@w3.org>; <semantic-web@w3.org>
Sent: Tuesday, April 05, 2005 10:59 AM
Subject: URIs as names-for-reference vs locations-for-access


>
> Ah, a title might be courteous....
>
> Again, there seem to be the usual questions about the SemWeb popping up,
> and in particular http-range-14. There also doesn't seem to be much 
> progress on these issues. Here are some notes that I think may be helpful,
> which basically try to distinguish between URIs as names for reference 
> versus URIs as locations for physical access, as well as try to define the 
> elusive term "on the Web" as being something that, if the Web were 
> destroyed, would also be destroyed. I also distinguish between the use of 
> representation in REST versus representation in AI/philosophy, which are 
> not always the same. I think these distinctions, and taking them 
> seriously, are clearly very important to http-range-14.
>
> The full text is here, and benefited from some discussion with Pat Hayes:
>
> http://www.ibiblio.org/hhalpin/homepage/notes/uri.html
>
> Text version below:
> -----------------------------------------------------------------------
> URIs as Names for Reference and as Locations for Access
> httpRange-14 notes
> By Harry Halpin
> Thanks to Pat Hayes for some examples and commentary, although any errors 
> are due to me of course!
>
>
> What do URIs identify?
>
> In essence, one reason the Web works is that, using a web protocol like 
> HTTP (Hypertext Transfer Protocol), a client can send a request to 
> a server to perform an operation such as HTTP GET on a given URI and 
> dereference something, often a web-page. However, this very basic feature 
> of the Web is bedeviled by a question: "What is the range of the HTTP 
> dereference function?" In other words, what do URIs identify? In theory 
> this question has been solved by the W3C TAG's AWWW: URIs refer to 
> anything. Upon inspection, the official definition is actually circular: 
> "We do not limit the scope of what might be a resource...it is used in a 
> general sense for whatever might be identified by a URI." The question 
> then arises: if a resource is just anything that could theoretically 
> be identified with a URI, is there anything that cannot be identified? It 
> would seem not. This view is given by the AWWW as "our use of the term 
> resource is intentionally more broad. Other things, such as cars and dogs 
> ... are resources too." However, referring to a web-page and referring to 
> the car in my garage are similar, but not exactly the same. The essential 
> difference is this: in the first case, on the Web, we have physical, 
> connected access to the Web-page, while in the second case, if we are 
> using Semantic Web logic to refer to my car, we only have the ability to 
> refer to my car by a URI name, and this gives no direct, connected, or 
> physical access. When one uses a URI as a name there is a disconnect, as 
> the thing named may not be on the Web.
>
> The division between representation and resource existed but was not 
> explicitly stated to, and definitely not noticed by, most of the users of 
> the original hypertext Web. URLs seem to have been originally meant to 
> identify the location of representations, such as HTML web-pages, or 
> possibly sets of representations, such as when, through content 
> negotiation, a news website figures out where you live and then serves 
> you your local news. With the 
> advent of the Semantic Web, the problem of httpRange-14 comes up precisely 
> because a URI can be used to refer to anything, not just web pages. To be 
> more precise, the issue comes up because URIs can refer to things that are 
> not "on the Web" and so do not necessarily have a Web-accessible 
> representation. Despite this, these things that are "not on the Web" 
> are fundamentally "on the Web" in another sense, since they can be 
> reasoned about by the Semantic Web. The crucial question is: what does "on 
> the Web" mean? To answer that question we must pursue the historical chain of 
> events from URL to URN to URI.
>
> Locations
>
> Uniform Resource Locators (URLs) did not suffer from the httpRange-14 
> issue, unlike their nearly identical brethren, URIs. Unlike URIs, URLs 
> identified a specific type of thing: a location, which is a physical 
> place. This location was assumed to be on the Web. By "on the Web," we 
> mean something that is physically connected to the Web. A URL denotes a 
> location on some web-server which serves representations (HTML document, 
> music file to download, whatever) to visiting web clients. A location can 
> be connected to the Web because it is - even after endless redirection - 
> in a physical place.
>
> Take a mundane example: my address. An address is just a location that 
> has a thing that can (usually) be found at that location, and there exists 
> a specified system for finding the location of an address. This allows 
> multiple locations to be ordered in a way that humans (in the case of 
> street addresses) or machines (in the case of IP addresses) can navigate 
> easily. In the case of my address, if one wants to find me, they can try 
> to look for me at the location of my address - and I'm sometimes not 
> there, so my address can give the person trying to find me a metaphysical 
> 404 error. A location can, and should, give you direct, connected, 
> physical access to the thing at the location. URLs are used as names of 
> locations, and sending an HTTP GET (or POST, or HEAD, and so on) to a 
> server requires the server, if possible, to go to the location and 
> physically access the thing at the location, usually by copying it and 
> sending a copy to your computer. Or sending a very real 404 error.
>
> On the Web
>
> Something could be found on the Web if it is physically and causally 
> connected to the Web. This means that whatever is "on the Web" 
> could be encoded into bits and transferred over the Web. However, this is 
> only "on the Web" in the strongest sense: as in always on the Web. 
> A thing can be only on the Web sometimes, or only partially on the Web, or 
> only rarely on the Web. By our definition, a thing is on the Web if it 
> could not be removed from the Web without loss of its functionality. One 
> can imagine a whole range of possibilities, from being "strongly" on the 
> Web (all the time) to "weakly" on the Web (occasionally). Thus, both 
> documents and servers are "on the Web", and humans are "on the Web" only 
> in a weak sense, since they interact with the Web only indirectly through 
> typing on keyboards. Things like the Eiffel Tower or Louis XVI are 
> definitely "not on the Web", since Louis XVI is long gone and cannot at 
> any point directly connect physically to the Web, while the Eiffel Tower 
> is only represented on the Web, and is not physically sending any bytes to 
> anyone itself. The Eiffel Tower is composed not of bytes, but of steel. 
> This 
> brings us to "representations" on the Web. What is the difference between 
> something merely having a representation on the Web and something being 
> fully on the Web? Rephrasing Brian Smith: a thing is on the Web if, were 
> the Web itself destroyed, that thing would also be destroyed. 
> If not, it's not fully on the Web. If someone destroyed the Web, this 
> would not damage me if I were being denoted by a URI, but my homepage at 
> that URI would be up in smoke if that's what people were using to refer to 
> me. I am not on the Web in a strong sense, but my homepage sure is. 
> There are lots of middling cases: my computer is weakly on the Web, more 
> so than myself. If my httpd daemon went down and my computer could no 
> longer access the Web, or the Web itself collapsed, the computer qua 
> computer still exists, but the computer qua Web server went up in smoke 
> with the rest of the Web. One good question yet to be answered is: when 
> are humans on the Web in a strong sense? Would it require our credit card 
> details to be in a chip beneath our skin with a URI, and wireless 
> internet monitoring us with a GPS that sent messages over the Internet? 
> Those examples seem too simplistic and extreme. Still, what is the 
> difference between something being represented on the Web and being on 
> the Web? One necessary but not nearly sufficient condition for 
> "representation" would be that a thing X represents another thing Y if you 
> can destroy thing X and thing Y remains unscathed. Representations qua 
> representations are on the Web, and would be destroyed if the Web was 
> destroyed. However, what they represent would not be destroyed, unless 
> what the representation represented also was on the Web.
>
> Representations: REST and AI
>
> Before going any further, we have to distinguish two different uses of the 
> word "representation." The first is the use of "representation" as it is 
> used in artificial intelligence, cognitive science, and philosophy. In this 
> use, a representation is something that "denotes" or "is about" something 
> else, although often additional requirements are put on exactly what type 
> of things the representation or its denotation may be. This will be called 
> "representationAI." The second use is the use of "representation" as used 
> by REST (the Representational State Transfer web architecture theory of 
> Roy Fielding), where a representation can be whatever a URI returns 
> from an HTTP request. This will be called a "representationREST". A 
> representationREST, unlike a representationAI, does not necessarily refer 
> to or denote any other thing - although it might! The two definitions are 
> not the same, but not mutually exclusive either. So, the difference 
> between "on the Web" and "not on the Web" is also a test of both types of 
> representation. A representationAI can qua representationAI be entirely on 
> the Web if what it represents is also on the Web. Lots of representations, 
> such as an analog photo on my desk, are not on the Web at all. In another 
> case, a picture of me on the Web is on the Web qua itself but not on the 
> Web qua me, because it denotes me, not something on the Web. If the Web 
> was destroyed, it would only destroy the bytes of the representationAI, 
> not necessarily what the representation denoted. Also, representationsAI 
> may have layers of representationAI, as one representation may denote 
> other representationsAI, leading to all sorts of interesting chains of 
> reference. However, representationsREST are by definition on the Web, and 
> would be destroyed if the Web was destroyed, at least as the possible 
> objects of HTTP operations. This is because representationsREST are 
> defined precisely as the bytes that are sent over the Web. One could argue 
> that copies of them archived to a computer might survive. However, those 
> copies would no longer be representationsREST qua the Web, but just 
> whatever they are without the Web being involved. This argument does 
> reveal that both sorts of representation are functional categories that 
> are dependent on their context, as something is never a representationREST 
> without being on the Web (or in some parallel universe, another system 
> that implements REST). Something is never a representationAI without 
> something being represented.
>
> Virtual Locations and Digitality
>
> This idea of physically being on the Web can be abstracted from the 
> concept of location. "Being on the Web" does not mean a thing has one URL 
> or even physical location. Something could be on the Web and have multiple 
> URLs, or multiple copies in different physical locations. A location can 
> be a virtual location, an abstraction over a set of possible physical 
> representations, as long as it really is a location. What exactly is the 
> "thing" at a URL location? It's not just a particular server, nor is it 
> some abstract resource. It is actually some bytes, a representationREST or 
> set of representationsREST, which one has to actually GET using a web 
> client to determine whether it is a representationAI. The particular 
> server where the actual representationREST lives is actually denoted by 
> another type of location: wherever it is on the server, and the server has 
> a very concrete IP address. A URL can be a name that denotes a virtual 
> location, which is then forwarded to the place where the concrete bits are 
> stored. These bits are usually on a server somewhere. When one accesses 
> http://www.w3c.org, if I am in Japan I get the mirror of the W3C web-pages 
> in Japan, if I'm in the US I get the one hosted at MIT, but I get the same 
> "resource," regardless. Here the concept of resource as stated by TAG 
> starts making some sense. It's a concept about the contents of a 
> representationREST. However, this resource is not identical to the thing 
> physically received as bytes (that's the representationREST). A resource 
> seems to be the abstract idea of the common information between all the 
> possible representationsREST returned. To properly understand resource 
> then one needs a thorough inspection of theories of information and 
> content, which is beyond the scope of this little note. Still, what is 
> physically returned by an HTTP GET is just the representationREST, which 
> may differ between MIT and Kyoto, while it might not between INRIA and 
> MIT. The fact that the Web is digital becomes crucially important: the 
> "copyability" of the representationsREST, due to their digital nature, is 
> crucial to why the Web works, just as crucial as a universal naming 
> scheme. Yet, things not "on the Web" (Pat Hayes qua Pat Hayes, my dog, 
> etc.) don't have this property of copyability. A picture on the Web of Pat 
> Hayes is digital, but Pat Hayes is not, no matter how much time he spends 
> online.
>
> What's in a Name?
>
> A name is entirely different from a location. Unlike a location, a name 
> does not necessarily give you access to the thing named, and the thing 
> named we will call the referent of the name. The set of all referents of a 
> name (or denotations of a representation for that matter) we will call its 
> interpretation. In fact, names are usually used when connected, physical 
> access is impossible, and as such are place-holders for the physical thing 
> precisely because there is no physical access. This concept of "names" is 
> more in line with the URN effort, which essentially tries to serve as 
> rigid designators in the Kripkean sense for the Web. Since a name does not 
> have any connection to a referent, putting a name on the Web via a URI 
> (such as a URN) does absolutely nothing at all to the referent of the 
> name. When anyone accesses the resource "Pat Hayes" from the URI 
> http://www.ihmc.us/users/phayes/PatHayes.html, Pat Hayes does not 
> magically appear next to them. What that URI currently can return from an 
> HTTP GET is a representationREST: a Web-page in HTML encoded as very 
> physical bytes somewhere that get sent to me over a wire as very physical 
> bytes, and then displayed by a very physical computer, showing the social 
> security number of Pat Hayes and other defining details. It could even 
> theoretically return a definition of Pat Hayes in RDF. Yet this 
> particular URI's representationREST 
> also serves double-duty as a representationAI, since it contains pictures 
> of the actual Pat Hayes, relevant facts about him, and so on. Pat Hayes 
> himself is not on the Web, since if the Web is destroyed Pat Hayes would 
> merrily go along, and probably with more spare time.
>
> So, the use of a URI as a "name" causes a URI to be used as a 
> representationAI. However, what exactly the interpretation of a URI as a 
> "name" actually is goes beyond the physics of transferring bytes. This 
> interpretation is either the yet-to-come metaphysics of the Semantic Web, 
> social meaning, or something else - who knows? But what is important is 
> that it is a non-physical, non-causal, non-connected relationship, unlike 
> the relationship of a location which is a physical, connected, causal 
> relationship. Note that URIs used as names-for-reference are common in the 
> Semantic Web, and the Semantic Web depends on there being names with 
> interpretations to reason over. Because there is no direct access to the 
> thing the URI-as-name identifies, unlike the use of a URI-as-location, the 
> Semantic Web uses URIs without any necessary use of representationsREST. A 
> URI in the Semantic Web is used more like a "place-holder" or even 
> (stretching it a bit) a "key," without any HTTP operation returning any 
> bytes from a server in terms of representationREST. Thus, the Semantic Web 
> uses URIs as representationsAI, while the Good-Old HyperText Web uses URIs 
> as representationsREST.
>
> Double Lives as Names and Locations
>
> The key to the confusion is that HTTP fundamentally will dereference 
> whatever a URI refers to, and there are two distinct types of functional 
> roles a URI can play: name and location. A URI can serve as an 
> identifier-as-a-name, which is a non-physical relation of reference, and 
> as an identifier of a location, which is a physical relation of access. 
> Just naming something has no effect on the thing named: naming something 
> does not bathe the thing named in any type of energy that we can detect 
> via a physical radar. There is no way to build a detector to detect what 
> exactly someone means by a URI, although we can guess from talking to them 
> or accessing representations they give us. Locations give you physical, 
> connected access to a thing. If you go to a location to get something, and 
> the thing is there, you return with it physically in hand. A name might, 
> but does not have to and usually does not, give one any sort of physical, 
> connected access to the thing named.
>
> The word "identifier" is even more vague than name or location, and here 
> the problem of the "identity" crisis appears: how do we know if the URI is 
> being used for something as a name or as a location? The URI itself does 
> not tell us. Even worse, what does "identify" mean, and how can we tell if 
> two things identify the same thing? With representationsAI that is 
> sometimes very clear, as in photographs, and sometimes not so clear, as in 
> abstract art. Even the integers have problems with identification: does 
> "11" identify eleven in decimal or three in binary? We won't know - and 
> can't know unless we are given some sort of decoding scheme. In 
> programming language tradition "identifier" has a pretty secure meaning 
> and in that context the access/reference distinction is theoretically 
> important but not of great practical significance, since everything you 
> can refer to is physically accessible by the computer and has an address 
> in memory. This is not true of logic, and definitely not true of 
> model-theoretic semantics. Importantly, the access and reference 
> distinction holds on the Web with many things that have URIs. In an 
> information space, things may be identified without being accessed via a 
> physical connection. In terms of the AWWW, an "information resource" is 
> probably similar to the use of URI-as-access, while the use of a URI for 
> reference without access corresponds to a "non-information resource."
>
> Solving the Identity Crisis
>
> Then there's the identity crisis: a single URI can actually play both 
> roles (name with no access and location with access) at the same time, 
> which gives us a powerful device for some applications. The official view, 
> that representations are supposed to be interpreted by applications 
> depending on MIME types, is clearly focused on the use of a URI as a 
> location for access; yet nothing forbids a URI that returns a 
> representationREST or some other data from being used to tell the web 
> client that this URI is also a name for reference in addition to a 
> location for access. In fact, for a URI used only as a name, MIME-types 
> are clearly irrelevant. At least for the time being!
>
> It would be useful to distinguish when a URI is used as a "name" or as a 
> "location," and whether some URIs can only be used as names or only as 
> locations. In other words, this depends on whether the thing (which would 
> be the "resource") identified by URI is on the Web or not. This already 
> reduces to the "non-information resource" and "information resource" 
> distinction on some level, and so is not a return to the historical Dark 
> Ages of the Web. Since they share a common syntax, it does make sense to 
> unite URLs and URNs on a level as URIs, and even to use URLs as "names." 
> The identity crisis can be solved pretty easily, as shown by the Web 
> Proper Names proposal. First, a separate URI scheme (wpn:// or tdb://) can 
> distinguish the use of URI as names for reference from URI as locations 
> for access. Second, to capitalise even further on the identity crisis, the 
> two uses can be distinguished without a new URI scheme through the use of a 
> representationREST, by having a type of representation format which says 
> that this URI is a "name" as opposed to a "location." In fact, one could 
> even have a special MIME-type to distinguish names for things: imagine the 
> "name" MIME-type, or the "application/xhtml+xml+name" type.
>
> The Future...
>
> However, one subject which needs more exploration is the "interpretation" 
> of URIs as names. How does one tell, if a URI is a name for reference, 
> what its interpretation is? All the RDF statements that apply to that URI? 
> And if so, how do we get them in a decentralized system? SPARQL? URIQA? 
> Magic? In other words, assuming the URI gave you machine-readable 
> descriptions in some Semantic Web language, should 
> the use of a URI-as-a-name really mean that this URI refers to (or 
> denotes) whatever is necessary to satisfy the Semantic Web description? 
> The Semantic Web allows one to build a number of roles and assertions, and 
> one would assume that its interpretation is those other Semantic Web URIs 
> that are satisfied by these roles and assertions. However, the SemWeb as 
> it stands just has URIs as Semantic Web objects referring as names to 
> other URIs as Semantic Web objects, and does not fulfill what the Semantic 
> Web really needs: a way to move out of the Web and to the wide world 
> beyond the Web. The Web needs to be integrated more into the world, and 
> there lies the true holy grail of the Semantic Web. This is not just a 
> problem for the Web, but the fundamental problem that proved to be the 
> ultimate bane of AI. Indeed, it's easy to just attach a model theory to 
> any formal system and say "We have semantics." Yes, that's strictly true - 
> but let's not forget the adjective "model-theoretic." And models of the 
> real world can be wrong, and often are. The real burden of the Semantic 
> Web will lie on the ability of people and machines to produce models using 
> SemWeb languages whose model-theoretic interpretations are relevant to the 
> real world, and match them in interesting and useful ways that allow the 
> Web to do things that are either impossible or very difficult on the 
> current Web. Can people and machines do this in a large, decentralized 
> manner? Are the SemWeb standards sufficient for the task? Yet, while the 
> answer to that question is unknown, the winds seem favorable.
>
>
> 

Received on Tuesday, 5 April 2005 04:52:10 UTC