- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Tue, 5 Apr 2005 01:01:31 -0400 (EDT)
- To: Jeremy Wong 黃泓量 <50263336@student.cityu.edu.hk>
- Cc: www-tag@w3.org, www-rdf-interest@w3.org, semantic-web@w3.org
I think you miss the point.
You're free to dereference a URI all you want, but obviously
http://www.ihmc.us/users/phayes/PatHayes.html
does not dereference to Pat Hayes in the same way that
http://www.ihmc.us/users/user.php?UserID=4
clearly dereferences to his web-page.
Even if
http://www.ihmc.us/users/phayes/PatHayes.html
gave you via content-negotiation/URIQA/whatever some nice RDF,
you get RDF, not Pat Hayes. And you can't owl:imports Pat Hayes,
although you can probably get more RDF statements about him if you really
want to (well, assuming someone has coded them somewhere).
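To make that concrete, here is a minimal Python sketch of what asking that URI for RDF via content negotiation might look like. The request is only built, never sent, and whether the server would honor the Accept header at all is an assumption. The function name rdf_request is just for illustration.

```python
from urllib.request import Request

# URI taken from this thread; whether it serves RDF is an assumption.
PAT_URI = "http://www.ihmc.us/users/phayes/PatHayes.html"


def rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"}, method="GET")


req = rdf_request(PAT_URI)
# Whatever a server answers to this is bytes (a representation),
# never the person the URI may be used to name.
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Even in the best case the reply is a document about Pat Hayes, which is the whole point.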
Confusing the RDF/web-page with Pat Hayes is basically confusing
the map for the territory. Not that doing that can't be useful quite
often.
It seems it's not a problem, it's a feature of the SemWeb :)
-harry
On Tue, 5 Apr 2005, Jeremy Wong 黃泓量 wrote:
> In the world of RDF, one is free to dereference a URI. It is really a problem
> because a representationREST may be temporarily inaccessible due to network
> failure or server maintenance. In the world of OWL, we have the vocabulary
> owl:imports. We have the concept of imports closure. It becomes not a
> problem. Hence, semantic inference in the world of RDF should be done
> manually by merging RDF graphs. Semantic inference in the world of OWL can be
> done dynamically when the collection of ontologies and axioms and facts is
> imports closed.
>
> It seems that it is not a problem in the semantic web.
>
>
> Jeremy
>
> ----- Original Message ----- From: "Harry Halpin" <hhalpin@ibiblio.org>
> To: <www-tag@w3.org>
> Cc: <www-rdf-interest@w3.org>; <semantic-web@w3.org>
> Sent: Tuesday, April 05, 2005 10:59 AM
> Subject: URIs as names-for-reference vs locations-for-access
>
>
>>
>> Ah, a title might be courteous....
>>
>> Again, there seem to be the usual questions about the SemWeb popping up,
>> and in particular http-range-14. There also doesn't seem to be much
>> progress on these issues. Here are some notes that I think may be helpful,
>> which basically try to distinguish between URIs as names for reference
>> versus URIs as locations for physical access, as well as try to define the
>> elusive term "on the Web" as being something that, if the Web were destroyed,
>> would also be destroyed. Also I distinguish between the use of
>> representation in REST versus representation in AI/philosophy, which are
>> not always the same. I think these distinctions, and taking them seriously,
>> are clearly very important to http-range-14.
>>
>> The full text is here, and benefited from some discussion with Pat Hayes:
>>
>> http://www.ibiblio.org/hhalpin/homepage/notes/uri.html
>>
>> Text version below:
>> -----------------------------------------------------------------------
>> URIs as Names for Reference and as Locations for Access
>> httpRange-14 notes
>> By Harry Halpin
>> Thanks to Pat Hayes for some examples and commentary, although any errors
>> are due to me of course!
>>
>>
>> What do URIs identify?
>>
>> In essence, one reason the Web works is that, using a web protocol like
>> HTTP (Hypertext Transfer Protocol), a client can send a request to
>> a server to perform an operation such as HTTP GET on a given URI and
>> dereference something, often a web-page. However, this very basic feature
>> of the Web is bedeviled by a question: "What is the range of the HTTP
>> dereference function?" In other words, what do URIs identify? In theory
>> this question has been solved by the W3C TAG's AWWW: URIs refer to
>> anything. Upon inspection, the official definition is actually circular:
>> "We do not limit the scope of what might be a resource...it is used in a
>> general sense for whatever might be identified by a URI." The question then
>> arises: if a resource is just anything that could theoretically be identified
>> with a URI, is there anything that cannot be identified? It would
>> seem not. This view is given by the AWWW as "our use of the term resource
>> is intentionally more broad. Other things, such as cars and dogs ... are
>> resources too." However, referring to a web-page and referring to the car
>> in my garage are similar, but not exactly the same. The essential difference
>> is this: in the first case we have physical, connected access to the
>> web-page on the Web, while in the second case, if we are using Semantic Web
>> logic to refer to my car, we only have the ability to refer to my car by a
>> URI name, and this gives no direct, connected, or physical access. When one
>> uses a URI as a name there is a disconnect, as the thing named may not be on
>> the Web.
>>
>> The division between representation and resource existed but was not
>> explicitly stated, and definitely not noticed by, most of the users of the
>> original hypertext Web. URLs seem to be originally meant to identify the
>> location of representations, such as HTML web-pages, or possibly sets of
>> representations, such as when, through content negotiation, a news website
>> figures out where you live and then serves you your local news. With the
>> advent of the Semantic Web, the problem of httpRange-14 comes up precisely
>> because a URI can be used to refer to anything, not just web pages. To be
>> more precise, the issue comes up because URIs can refer to things that are
>> not "on the Web" and so do not necessarily have a Web-accessible
>> representation. Despite this, these things that are "not on the Web" are
>> fundamentally "on the Web" in another sense, since they can be reasoned
>> about by the Semantic Web. The crucial question is what "on the Web"
>> means. To answer that question we must pursue the historical chain of events
>> from URL to URN to URI.
>>
>> Locations
>>
>> Uniform Resource Locators (URLs) did not suffer from the httpRange-14
>> issue, unlike their nearly identical brethren URIs. Unlike URIs, URLs
>> identified a specific type of thing: a location, which is a physical place.
>> This location was assumed to be on the Web. By "on the Web," I mean something
>> that is physically connected to the Web. A URL denotes a location on some
>> web-server which serves representations (HTML document, music file to
>> download, whatever) to visiting web clients. A location can be connected to
>> the Web because it is - even after endless redirection - in a physical place.
>>
>> Take a mundane example: my address. An address is just a location that
>> has a thing that can (usually) be found at that location, and there exists
>> a specified system for finding the location of an address. This allows
>> multiple locations to be ordered in a way that humans (in the case of street
>> addresses) or machines (in the case of IP addresses) can navigate easily. In
>> the case of my address, if someone wants to find me, they can try to look
>> for me at the location of my address - and I'm sometimes not there, so my
>> address can give the person trying to find me a metaphysical 404 error. A
>> location can, and should, give you direct, connected, physical access to
>> the thing at the location. URLs are used as names of locations, and sending
>> an HTTP GET (or POST, or HEAD, and so on) to a server requires the server,
>> if possible, to go to the location and physically access the thing at the
>> location, usually by copying it and sending a copy to your computer. Or
>> sending a very real 404 error.
>>
>> On the Web
>>
>> Something could be found on the Web if it is physically and causally
>> connected to the Web. This means that whatever is "on the Web" could be
>> encoded into bits and transferred over the Web. However, this is only "on
>> the Web" in the strongest sense: as in always on the Web. A thing
>> can be only on the Web sometimes, or only partially on the Web, or only
>> rarely on the Web. By our definition, a thing is on the Web if it could not
>> be removed from the Web without loss of its functionality. One can imagine a
>> whole range of possibilities, from being "strongly" on the Web (all the time)
>> to "weakly" on the Web (occasionally). Thus, both documents and servers are
>> "on the Web", and humans are "on the Web" only in a weak sense, since they
>> interact with the Web only indirectly, through typing on keyboards.
>> Things like the Eiffel Tower or Louis XVI are definitely "not on the Web,"
>> since Louis XVI is long gone and cannot at any point directly
>> connect physically to the Web, while the Eiffel Tower is only represented
>> on the Web, and is not physically sending any bytes to anyone itself. The
>> Eiffel Tower is composed not of bytes, but of steel. This brings us to
>> "representations" on the Web. What is the difference between something
>> merely having a representation on the Web and something being fully on the
>> Web? Rephrasing Brian Smith: Something is on the Web if, were the Web
>> itself destroyed, that thing would also be destroyed. If not, it's not
>> fully on the Web. If someone destroyed the Web, this would not damage me if
>> I were being denoted by a URI, but my homepage at that URI would go up in
>> smoke, if that's what people were using to refer to me by. I am not on the
>> Web in a strong sense, but my homepage sure is. There are lots of middling
>> cases: my computer is weakly on the Web, more so than myself. If my httpd
>> daemon went down and my computer could no longer access the Web, or the Web
>> itself collapsed, the computer qua computer would still exist, but the
>> computer qua Web server would go up in smoke with the rest of the Web. One
>> good question yet to be answered is: when are humans on the Web in a strong
>> sense? Would it require our credit card details to be in a chip beneath our
>> skin with a URI, and wireless internet monitoring us with a GPS that sent
>> messages over the Internet? Those examples seem both too simplistic and too
>> extreme. Still, what is the difference between something being represented
>> on the Web and being on the Web? One necessary but not nearly sufficient
>> condition for
>> "representation" would be that a thing X represents another thing Y if you
>> can destroy thing X and thing Y remains unscathed. Representations qua
>> representations are on the Web, and would be destroyed if the Web was
>> destroyed. However, what they represent would not be destroyed, unless what
>> the representation represented also was on the Web.
>>
>> Representations: REST and AI
>>
>> Before going any further, we have to distinguish two different uses of the
>> word "representation." The first is the use of "representation" as it is
>> used in artificial intelligence, cognitive science, and philosophy. In this
>> use, a representation is something that "denotes" or "is about" something
>> else, although often additional requirements are put on exactly what type
>> of things the representation or its denotation may be. This will be called
>> "representationAI." The second use is the use of "representation" as used
>> by REST (the Representational State Transfer web architecture theory of Roy
>> Fielding), where a representation can be whatever a URI returns from an
>> HTTP request. This will be called a "representationREST". A
>> representationREST, unlike a representationAI, does not necessarily refer
>> to or denote any other thing - although it might! The two definitions are
>> not the same, but not mutually exclusive either. So, the difference between
>> "on the Web" and "not on the Web" is also a test of both types of
>> representation. A representationAI can qua representationAI be entirely on
>> the Web if what it represents is also on the Web. Lots of representations,
>> such as an analog photo on my desk, are not on the Web at all. In another
>> case, a picture of me on the Web is on the Web qua itself but not on the
>> Web qua me, because it denotes me, not something on the Web. If the Web was
>> destroyed, it would only destroy the bytes of the representationAI, not
>> necessarily what the representation denoted. Also, representationsAI may
>> have layers of representationAI, as one representation may denote other
>> representationsAI, leading to all sorts of interesting chains of reference.
>> However, representationsREST are by definition on the Web, and would be
>> destroyed if the Web was destroyed, at least as the possible objects of
>> HTTP operations. This is because representationsREST are defined precisely
>> as the bytes that are sent over the Web. One could argue that copies of
>> them archived to a computer might survive. However, those copies would no
>> longer be representationsREST qua the Web, but just whatever they are
>> without the Web being involved. This argument does reveal that both sorts
>> of representation are functional categories that are dependent on their
>> context, as something is never a representationREST without being on the
>> Web (or in some parallel universe, another system that implements REST).
>> Something is never a representationAI without something being represented.
>>
>> Virtual Locations and Digitality
>>
>> This idea of physically being on the Web can be abstracted from the concept
>> of location. "Being on the Web" does not mean a thing has one URL or even
>> one physical location. Something could be on the Web and have multiple URLs,
>> or multiple copies in different physical locations. A location can be a
>> virtual location, an abstraction over a set of possible physical
>> representations, as long as it really is a location. What exactly is the
>> "thing" at a URL location? It's not just a particular server, nor is it
>> some abstract resource. It is actually some bytes, a representationREST or
>> set of representationsREST, which one has to actually GET using a web
>> client to determine whether it's a representationAI. The particular
>> server where the actual representationREST lives is actually denoted by
>> another type of location: wherever it is on the server, and the server has
>> a very concrete IP address. A URL can be a name that denotes a virtual
>> location, which is then forwarded to the place where the concrete bits are
>> stored. These bits are usually on a server somewhere. When I access
>> http://www.w3c.org, if I am in Japan I get the mirror of the W3C web-pages
>> in Japan, if I'm in the US I get the one hosted at MIT, but I get the same
>> "resource," regardless. Here the concept of resource as stated by TAG
>> starts making some sense. It's a concept about the contents of a
>> representationREST. However, this resource is not identical to the thing
>> physically received as bytes (that's the representationREST). A resource
>> seems to be the abstract idea of the common information between all the
>> possible representationsREST returned. To properly understand resource then
>> one needs a thorough inspection of theories of information and content,
>> which is beyond the scope of this little note. Still, what is physically
>> returned by an HTTP GET is just the representationREST, which may differ
>> between MIT and Kyoto, while it might not between INRIA and MIT. The fact
>> that the Web is digital becomes crucially important: the "copyability" of
>> the representationsREST, due to their digital nature, is crucial to why the
>> Web works, just as crucial as a universal naming scheme. Yet, things not
>> "on the Web" (Pat Hayes qua Pat Hayes, my dog, etc) don't have this
>> property of copyability. A picture on the Web of Pat Hayes is digital, but
>> Pat Hayes is not, no matter how much time he spends online.
>>
>> What's in a Name?
>>
>> A name is entirely different from a location. Unlike a location, a name
>> does not necessarily give you access to the thing named, and this thing
>> named we will call the referent of the name. The set of all referents of a
>> name (or denotations of a representation for that matter) we will call its
>> interpretation. In fact, names are usually used when connected, physical
>> access is impossible, and as such are place-holders for the physical thing
>> precisely because there is no physical access. This concept of "names" is
>> more in line with the URN effort, which essentially tries to serve as rigid
>> designators in the Kripkean sense for the Web. Since a name does not have
>> any connection to a referent, putting a name on the Web via a URI (such as
>> a URN) does absolutely nothing at all to the referent of the name. When
>> anyone accesses the resource "Pat Hayes" from the URI
>> http://www.ihmc.us/users/phayes/PatHayes.html, Pat Hayes does not magically
>> appear next to them. What that URI currently can return from an HTTP GET is
>> a representationREST: a Web-page in HTML encoded as very physical bytes
>> somewhere that get sent to me over a wire as very physical bytes, and then
>> displayed by a very physical computer, showing the social security number
>> of Pat Hayes and other defining details. It could even theoretically return
>> a definition of Pat Hayes in RDF. Yet this particular URI's representationREST
>> also serves double-duty as a representationAI, since it contains pictures
>> of the actual Pat Hayes, relevant facts about him, and so on. Pat Hayes
>> himself is not on the Web, since if the Web is destroyed Pat Hayes would
>> merrily go along, and probably with more spare time.
>>
>> So, the use of a URI as a "name" causes a URI to be used as a
>> representationAI. However, what exactly the interpretation of a URI as a
>> "name" actually is goes beyond the physics of transferring bytes. This
>> interpretation is either the yet-to-come metaphysics of the Semantic Web,
>> social meaning, or something else - who knows? But what is important is
>> that it is a non-physical, non-causal, non-connected relationship, unlike
>> the relationship of a location which is a physical, connected, causal
>> relationship. Note that URIs used as names-for-reference are common in the
>> Semantic Web, and the Semantic Web depends on there being names with
>> interpretations to reason over. Because there is no direct access to the
>> thing the URI-as-name identifies, unlike the use of a URI-as-location, the
>> Semantic Web uses URIs without any necessary use of representationsREST.
>> URIs in the Semantic Web are used more like "place-holders" or even
>> (stretching it a bit) "keys," without any HTTP operation returning any
>> bytes from a server in terms of representationREST. Thus, the Semantic Web
>> uses URIs as representationsAI, while the Good-Old HyperText Web uses URIs
>> as representationsREST.
>>
>> Double Lives as Names and Locations
>>
>> The key to the confusion is that HTTP fundamentally will dereference
>> whatever a URI refers to, and there are two distinct types of functional
>> roles a URI can play: name and location. A URI can serve as an
>> identifier-as-a-name, which is a non-physical relation of reference, and as
>> an identifier of a location, which is a physical relation of access. Just
>> naming something has no effect on the thing named: naming something does
>> not bathe the thing named in any type of energy that we can detect via a
>> physical radar. There is no way to build a detector to detect what exactly
>> someone means by a URI, although we can guess from talking to them or
>> accessing representations they give us. Locations give you physical,
>> connected, access to a thing. If you go to a location to get something, if
>> the thing is there you return with it physically in hand. A name might, but
>> does not have to and usually does not, give one any sort of physical,
>> connected access to the thing named.
>>
>> The word "identifier" is even more vague than name or location, and here
>> the problem of the "identity" crisis appears: how do we know if the URI is
>> being used for something as a name or as a location? The URI itself does
>> not tell us. Even worse, what does "identify" mean, and how can we tell if
>> two things identify the same thing? With representationsAI that is
>> sometimes very clear, as in photographs, and sometimes not so clear, as in
>> abstract art. Even the integers have problems with identification: does
>> "11" identify eleven in decimal or three in binary? We won't know - and
>> can't know unless we are given some sort of decoding scheme. In the
>> programming language tradition "identifier" has a pretty secure meaning and in that
>> context the access/reference distinction is theoretically important but not
>> of great practical significance, since everything you can refer to is
>> physically accessible by the computer and has an address in memory. This is
>> not true of logic, and definitely not true of model-theoretic semantics.
>> Importantly, the access and reference distinction holds on the Web with
>> many things that have URIs. In an information space, things may be
>> identified without being accessed via a physical connection. In terms of
>> the AWWW, an "information resource" is probably similar to the use of
>> URI-as-access, while the use of a URI for reference without access
>> corresponds to a "non-information resource."
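The "11" example in the paragraph above can be sketched in a few lines of Python. The numeral alone does not determine what is identified; the decoding scheme (here, the base) has to be supplied from outside. The helper name decode is only illustrative.

```python
def decode(numeral: str, base: int) -> int:
    """Interpret a numeral under an explicitly supplied decoding scheme."""
    return int(numeral, base)


# The same two characters identify different integers under different schemes.
as_decimal = decode("11", 10)   # eleven
as_binary = decode("11", 2)     # three
print(as_decimal, as_binary)
```

Nothing in the string "11" says which reading is meant, just as nothing in a URI says whether it is being used as a name or as a location.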
>>
>> Solving the Identity Crisis
>>
>> Then there's the identity crisis: a single URI can actually play both roles
>> (name with no access and location with access) at the same time, which
>> gives us a powerful device for some applications. The official view that
>> representations are supposed to be interpreted by applications
>> depending on MIME types is clearly focused on the use of a URI as a
>> location for access; yet nothing forbids the representationREST, or some
>> other data that a URI returns, from being used to tell the web client that
>> this URI is also a name for reference in addition to a location for access.
>> In fact, for a URI used only as a name, MIME-types are clearly irrelevant.
>> At least for the time being!
>>
>> It would be useful to distinguish when a URI is used as a "name" or as a
>> "location," and if some URIs can only be used as names or only as
>> locations. In other words, this depends on whether the thing (which would
>> be the "resource") identified by URI is on the Web or not. This already
>> reduces to the "non-information resource" and "information resource"
>> distinction on some level, and so is not a return to the historical Dark
>> Ages of the Web. Since they share a common syntax, it does make sense to
>> unite URLs and URNs on a level as URIs, and even to use URLs as "names."
>> The identity crisis can be solved pretty easily, as shown by the Web Proper
>> Names proposal. First, a separate URI scheme (wpn:// or tdb://) can
>> distinguish the use of URIs as names for reference from URIs as locations
>> for access. Second, to capitalise even further on the identity crisis, the
>> distinction can be made without a new URI scheme through the use of a
>> representationREST, by having a type of representation format which says
>> that this URI is a "name" as opposed to a "location." In fact, one could
>> even have a special MIME-type to distinguish names for things: imagine the
>> "name" MIME-type, or the "application/xhtml+xml+name" type.
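A rough Python sketch of how a client might apply the two proposals in the paragraph above. Everything here is hypothetical: the wpn and tdb schemes and the "+name" media-type suffix are only the ideas floated in the note, not registered schemes or types, and role_of is an illustrative helper rather than any existing API.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical: schemes the note suggests could mark a URI as a pure name.
NAME_SCHEMES = {"wpn", "tdb"}
# Hypothetical: media-type suffix imagined in the note ("application/xhtml+xml+name").
NAME_MIME_SUFFIX = "+name"


def role_of(uri: str, content_type: Optional[str] = None) -> str:
    """Guess whether a URI is being used as a 'name' or as a 'location'."""
    if urlsplit(uri).scheme.lower() in NAME_SCHEMES:
        return "name"
    if content_type is not None:
        # Drop parameters such as "; charset=utf-8" before checking the suffix.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type.endswith(NAME_MIME_SUFFIX):
            return "name"
    return "location"


print(role_of("tdb://example.org/pat-hayes"))
print(role_of("http://www.ihmc.us/users/phayes/PatHayes.html", "text/html"))
print(role_of("http://example.org/pat", "application/xhtml+xml+name; charset=utf-8"))
```

The scheme check needs no network access at all, while the media-type check only works after a representationREST has actually been fetched, which mirrors the trade-off between the two proposals.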
>>
>> The Future...
>>
>> However, one subject which needs more exploration is the "interpretation"
>> of URIs as names. How does one tell, if a URI is a name for reference, what
>> its interpretation is? All the RDF statements that apply to that URI? And
>> if so, how do we get them in a decentralized system? SPARQL? URIQA? Magic?
>> In other words, assuming the URI gave you machine-readable descriptions in
>> some Semantic Web language, should the use of a
>> URI-as-a-name really mean that this URI refers to (or denotes) whatever is
>> necessary to satisfy the Semantic Web description? The Semantic Web allows
>> one to build a number of roles and assertions, and one would assume that
>> its interpretation is those other Semantic Web URIs that are satisfied by
>> these roles and assertions. However, the SemWeb as it stands just has URIs
>> as Semantic Web objects referring as names to other URIs as Semantic Web
>> objects, and does not fulfill what the Semantic Web really needs: a way to
>> move out of the Web and to the wide world beyond the Web. The Web needs to
>> be integrated more into the world, and there lies the true holy grail of
>> the Semantic Web. This is not just a problem for the Web, but the
>> fundamental problem that proved to be the ultimate bane of AI. Indeed, it's
>> easy to just attach a model theory to any formal system and say "We have
>> semantics." Yes, that's strictly true - but let's not forget the adjective
>> "model-theoretic." And models of the real world can be wrong, and often
>> are. The real burden of the Semantic Web will lie on the ability of people
>> and machines to produce models using SemWeb languages whose model-theoretic
>> interpretations are relevant to the real world, and match them in
>> interesting and useful ways that allow the Web to do things that are either
>> impossible or very difficult on the current Web. Can people and machines do
>> this in a large, decentralized manner? Are the SemWeb standards sufficient
>> for the task? While the answers to those questions are unknown, the winds
>> seem favorable.
>>
>>
>>
>
>
--
--harry
Harry Halpin
Informatics, University of Edinburgh
http://www.ibiblio.org/hhalpin
Received on Tuesday, 5 April 2005 05:01:49 UTC