- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Tue, 5 Apr 2005 01:01:31 -0400 (EDT)
- To: Jeremy Wong 黃泓量 <50263336@student.cityu.edu.hk>
- Cc: www-tag@w3.org, www-rdf-interest@w3.org, semantic-web@w3.org
I think you miss the point. You're free to dereference a URI all you want,
but obviously http://www.ihmc.us/users/phayes/PatHayes.html does not
dereference to Pat Hayes in the same way that
http://www.ihmc.us/users/user.php?UserID=4 clearly dereferences to his
web-page. Even if http://www.ihmc.us/users/phayes/PatHayes.html gave you
via content-negotiation/URIQA/whatever some nice RDF, you get RDF, not Pat
Hayes. And you can't owl:imports Pat Hayes, although you can probably get
more RDF statements about him if you really want to (well, assuming someone
has coded them somewhere). Confusing the RDF/web-page for Pat Hayes is
basically confusing the map for the territory. Not that doing that can't be
useful quite often.

It seems it's not a problem, it's a feature of the SemWeb :)

   -harry

On Tue, 5 Apr 2005, Jeremy Wong 黃泓量 wrote:

> In the world of RDF, it is free to dereference a URI. It is really a problem
> because a representationREST may be temporarily unaccessible due to network
> failure and server maintenance. In the world of OWL, we have the vocabulary
> owl:imports. We have the concept of imports closure. It becomes not a
> problem. Hence, semantic inference in the world of RDF should be done
> manually by merging RDF graphs. Semantic inference in the world of OWL can be
> done dynamically when the collection of ontologies and axioms and facts is
> imports closed.
>
> It seems that it is not a problem in the semantic web.
>
>
> Jeremy
>
> ----- Original Message ----- From: "Harry Halpin" <hhalpin@ibiblio.org>
> To: <www-tag@w3.org>
> Cc: <www-rdf-interest@w3.org>; <semantic-web@w3.org>
> Sent: Tuesday, April 05, 2005 10:59 AM
> Subject: URIs as names-for-reference vs locations-for-access
>
>
>>
>> Ah, a title might be courteous....
>>
>> Again, there seem to be the usual questions about the SemWeb popping up,
>> and in particular http-range-14. There also doesn't seem to be much
>> progress on these issues.
>> Here are some notes that I think may be helpful,
>> which basically try to distinguish between URIs as names for reference
>> versus URIs as locations for physical access, as well as try to define the
>> elusive term "on the Web" as being something that, if the Web was
>> destroyed, would also be destroyed. I also distinguish between the use of
>> representation in REST versus representation in AI/philosophy, which are
>> not always the same. I think these distinctions, and taking them seriously,
>> are clearly very important to http-range-14.
>>
>> The full text is here, and benefited from some discussion with Pat Hayes:
>>
>> http://www.ibiblio.org/hhalpin/homepage/notes/uri.html
>>
>> Text version below:
>> -----------------------------------------------------------------------
>> URIs as Names for Reference and as Locations for Access
>> httpRange-14 notes
>> By Harry Halpin
>> Thanks to Pat Hayes for some examples and commentary, although any errors
>> are due to me of course!
>>
>>
>> What do URIs identify?
>>
>> In essence, one reason the Web works is that, using a web protocol like
>> HTTP (Hypertext Transfer Protocol), a client can send a request to
>> a server to do an operation such as HTTP GET for a given URI and
>> dereference something, often a web-page. However, this very basic feature
>> of the Web is bedeviled by a question: "What is the range of the HTTP
>> dereference function?" In other words, what do URIs identify? In theory
>> this question has been solved by the W3C TAG's AWWW: URIs refer to
>> anything. Upon inspection, the official definition is actually circular:
>> "We do not limit the scope of what might be a resource...it is used in a
>> general sense for whatever might be identified by a URI." The question then
>> arises: if a resource is just anything that could theoretically be
>> identified with a URI, is there anything that cannot be identified? It
>> would seem not.
>> This view is given by the AWWW as "our use of the term resource
>> is intentionally more broad. Other things, such as cars and dogs ... are
>> resources too." However, referring to a web-page and referring to the car
>> in my garage are similar, but not exactly the same. The essential
>> difference is this: in the first case, on the Web, we have physical,
>> connected access to the Web-page, while in the second case, if we are
>> using Semantic Web logic to refer to my car, we only have the ability to
>> refer to my car by a URI name, and this gives no direct, connected, or
>> physical access. When one uses a URI as a name there is a disconnect, as
>> the thing named may not be on the Web.
>>
>> The division between representation and resource existed but was not
>> explicitly stated to, and definitely not noticed by, most of the users of
>> the original hypertext Web. URLs seem to have been originally meant to
>> identify the location of representations, such as HTML web-pages, or
>> possibly sets of representations, such as when through content negotiation
>> a news website figures out where you live and then serves you your local
>> news. With the advent of the Semantic Web, the problem of httpRange-14
>> comes up precisely because a URI can be used to refer to anything, not
>> just web pages. To be more precise, the issue comes up because URIs can
>> refer to things that are not "on the Web" and so do not necessarily have a
>> Web-accessible representation. Despite this, these things that are "not on
>> the Web" are fundamentally "on the Web" in another sense, since they can
>> be reasoned about by the Semantic Web. The crucial question is: what does
>> "on the Web" mean? To answer that question we must pursue the historical
>> chain of events from URL to URN to URI.
>>
>> Locations
>>
>> Uniform Resource Locators (URLs) did not suffer from the httpRange-14
>> issue, unlike their nearly identical brethren URIs.
>> Unlike URIs, URLs
>> identified a specific type of thing: a location, which is a physical place.
>> This location was assumed to be on the Web. By "on the Web," we mean
>> something that is physically connected to the Web. A URL denotes a location
>> on some web-server which serves representations (HTML document, music file
>> to download, whatever) to visiting web clients. A location can be connected
>> to the Web because it is - even after endless redirection - in a physical
>> place.
>>
>> Take a mundane example: my address. An address is just a location that
>> has a thing that can (usually) be found at that location, and there exists
>> a specified system for finding the location of an address. This allows
>> multiple locations to be ordered in a way that humans, as in street
>> addresses (or machines, in the case of IP addresses), can navigate easily.
>> In the case of my address, if someone wants to find me, they can try to
>> look for me at the location of my address - and I'm sometimes not there, so
>> my address can give the person trying to find me a metaphysical 404 error.
>> A location can, and should, give you direct, connected, physical access to
>> the thing at the location. URLs are used as names of locations, and sending
>> an HTTP GET (or POST, or HEAD, and so on) to a server requires the server,
>> if possible, to go to the location and physically access the thing at the
>> location, usually by copying it and sending a copy to your computer. Or
>> sending a very real 404 error.
>>
>> On the Web
>>
>> Something can be found on the Web if it is physically and causally
>> connected to the Web. This means that whatever is "on the Web" can be
>> encoded into bits and transferred over the Web. However, this is only "on
>> the Web" in the strongest sense: as in always on the Web. A thing
>> can be only on the Web sometimes, or only partially on the Web, or only
>> rarely on the Web.
>> By our definition, a thing is on the Web if it could not be removed from
>> the Web without loss of its functionality. One can imagine a whole range of
>> possibilities, from being "strongly" on the Web (all the time) to "weakly"
>> on the Web (occasionally). Thus, both documents and servers are "on the
>> Web", and humans are "on the Web" only in a weak sense, since they
>> interact with the Web only indirectly, through typing on keyboards.
>> Things like the Eiffel Tower or Louis XVI are definitely "not on the Web",
>> since Louis XVI is long gone and cannot at any point directly
>> connect physically to the Web, while the Eiffel Tower is only represented
>> on the Web, and is not physically sending any bytes to anyone itself. The
>> Eiffel Tower is composed not of bytes, but of iron. This brings us to
>> "representations" on the Web. What is the difference between something
>> merely having a representation on the Web and something being fully on the
>> Web? Rephrasing Brian Smith: some thing is on the Web if, were the Web
>> itself destroyed, that thing would also be destroyed. If not, it's not
>> fully on the Web. If someone destroyed the Web, this would not damage me if
>> I were being denoted by a URI, but my homepage at that URI would go up in
>> smoke, if that's what people were using to refer to me. I am not on the
>> Web in a strong sense, but my homepage sure is. There are lots of middling
>> cases: my computer is weakly on the Web, more so than myself. If my httpd
>> daemon went down and my computer could no longer access the Web, or the Web
>> itself collapsed, the computer qua computer would still exist, but the
>> computer qua Web server would go up in smoke with the rest of the Web. One
>> good question yet to be answered is: when are humans on the Web in a strong
>> sense? Would it require our credit card details to be in a chip beneath our
>> skin with a URI, and wireless internet monitoring us with a GPS that sent
>> messages over the Internet?
>> Those examples also seem too simplistic and extreme. Still,
>> what is the difference between something being represented on the Web and
>> being on the Web? One necessary but not nearly sufficient condition for
>> "representation" would be that a thing X represents another thing Y only if
>> you can destroy thing X and thing Y remains unscathed. Representations qua
>> representations are on the Web, and would be destroyed if the Web was
>> destroyed. However, what they represent would not be destroyed, unless what
>> the representation represented also was on the Web.
>>
>> Representations: REST and AI
>>
>> Before going any further, we have to distinguish two different uses of the
>> word "representation." The first is the use of "representation" as it is
>> used in artificial intelligence, cognitive science, and philosophy. In this
>> use, a representation is something that "denotes" or "is about" something
>> else, although often additional requirements are put on exactly what type
>> of things the representation or its denotation may be. This will be called
>> "representationAI." The second use is the use of "representation" as used
>> by REST (the Representational State Transfer web architecture theory of Roy
>> Fielding), where a representation can be whatever a URI returns from an
>> HTTP request. This will be called a "representationREST". A
>> representationREST, unlike a representationAI, does not necessarily refer
>> to or denote any other thing - although it might! The two definitions are
>> not the same, but not mutually exclusive either. So, the difference between
>> "on the Web" and "not on the Web" is also a test of both types of
>> representation. A representationAI can qua representationAI be entirely on
>> the Web if what it represents is also on the Web. Lots of representations,
>> such as an analog photo on my desk, are not on the Web at all.
>> In another
>> case, a picture of me on the Web is on the Web qua itself but not on the
>> Web qua me, because it denotes me, not something on the Web. If the Web was
>> destroyed, it would only destroy the bytes of the representationAI, not
>> necessarily what the representation denoted. Also, representationsAI may
>> have layers of representationAI, as one representation may denote other
>> representationsAI, leading to all sorts of interesting chains of reference.
>> However, representationsREST are by definition on the Web, and would be
>> destroyed if the Web was destroyed, at least as the possible objects of
>> HTTP operations. This is because representationsREST are defined precisely
>> as the bytes that are sent over the Web. One could argue that copies of
>> them archived to a computer might survive. However, those copies would no
>> longer be representationsREST qua the Web, but just whatever they are
>> without the Web being involved. This argument does reveal that both sorts
>> of representation are functional categories that are dependent on their
>> context, as something is never a representationREST without being on the
>> Web (or in some parallel universe, another system that implements REST).
>> Something is never a representationAI without something being represented.
>>
>> Virtual Locations and Digitality
>>
>> This idea of physically being on the Web can be abstracted from the concept
>> of location. "Being on the Web" does not mean a thing has one URL or even
>> one physical location. Something could be on the Web and have multiple
>> URLs, or multiple copies in different physical locations. A location can be
>> a virtual location, an abstraction over a set of possible physical
>> representations, as long as it really is a location. What exactly is the
>> "thing" at a URL location? It's not just a particular server, nor is it
>> some abstract resource.
>> It is actually some bytes, a representationREST or
>> set of representationsREST, which one has to actually GET using a web
>> client to determine whether it's a representationAI. The particular
>> server where the actual representationREST lives is actually denoted by
>> another type of location: wherever it is on the server, and the server has
>> a very concrete IP address. A URL can be a name that denotes a virtual
>> location, which is then forwarded to the place where the concrete bits are
>> stored. These bits are usually on a server somewhere. When one accesses
>> http://www.w3c.org, if I am in Japan I get the mirror of the W3C web-pages
>> in Japan, if I'm in the US I get the one hosted at MIT, but I get the same
>> "resource," regardless. Here the concept of resource as stated by TAG
>> starts making some sense. It's a concept about the contents of a
>> representationREST. However, this resource is not identical to the thing
>> physically received as bytes (that's the representationREST). A resource
>> seems to be the abstract idea of the common information between all the
>> possible representationsREST returned. To properly understand resource,
>> then, one needs a thorough inspection of theories of information and
>> content, which is beyond the scope of this little note. Still, what is
>> physically returned by an HTTP GET is just the representationREST, which
>> may differ between MIT and Kyoto, while it might not between INRIA and MIT.
>> The fact that the Web is digital becomes crucially important: the
>> "copyability" of the representationsREST, due to their digital nature, is
>> crucial to why the Web works, just as crucial as a universal naming scheme.
>> Yet, things not "on the Web" (Pat Hayes qua Pat Hayes, my dog, etc.) don't
>> have this property of copyability. A picture on the Web of Pat Hayes is
>> digital, but Pat Hayes is not, no matter how much time he spends online.
>>
>> What's in a Name?
>>
>> A name is entirely different from a location. Unlike a location, a name
>> does not necessarily give you access to the thing named, and the thing
>> named we will call the referent of the name. The set of all referents of a
>> name (or denotations of a representation for that matter) we will call its
>> interpretation. In fact, names are usually used when connected, physical
>> access is impossible, and as such are place-holders for the physical thing
>> precisely because there is no physical access. This concept of "names" is
>> more in line with the URN effort, which essentially tries to provide rigid
>> designators in the Kripkean sense for the Web. Since a name does not have
>> any connection to its referent, putting a name on the Web via a URI (such
>> as a URN) does absolutely nothing at all to the referent of the name. When
>> anyone accesses the resource "Pat Hayes" from the URI
>> http://www.ihmc.us/users/phayes/PatHayes.html, Pat Hayes does not magically
>> appear next to them. What that URI currently can return from an HTTP GET is
>> a representationREST: a Web-page in HTML encoded as very physical bytes
>> somewhere that get sent to me over a wire as very physical bytes, and then
>> displayed by a very physical computer, showing the social security number
>> of Pat Hayes and other defining details. It could even theoretically return
>> a definition of Pat Hayes in RDF. Yet this particular URI's
>> representationREST also serves double-duty as a representationAI, since it
>> contains pictures of the actual Pat Hayes, relevant facts about him, and so
>> on. Pat Hayes himself is not on the Web, since if the Web is destroyed Pat
>> Hayes would merrily go along, and probably with more spare time.
>>
>> So, the use of a URI as a "name" causes a URI to be used as a
>> representationAI. However, what exactly the interpretation of a URI as a
>> "name" actually is goes beyond the physics of transferring bytes.
>> This
>> interpretation is either the yet-to-come metaphysics of the Semantic Web,
>> social meaning, or something else - who knows? But what is important is
>> that it is a non-physical, non-causal, non-connected relationship, unlike
>> the relationship of a location, which is a physical, connected, causal
>> relationship. Note that URIs used as names-for-reference are common in the
>> Semantic Web, and the Semantic Web depends on there being names with
>> interpretations to reason over. Because there is no direct access to the
>> thing the URI-as-name identifies, unlike the use of a URI-as-location, the
>> Semantic Web uses URIs without any necessary use of representationsREST.
>> URIs in the Semantic Web are used more like "place-holders" or even
>> (stretching it a bit) "keys," without any HTTP operation returning any
>> bytes from a server in terms of representationREST. Thus, the Semantic Web
>> uses URIs as representationsAI, while the Good-Old HyperText Web uses URIs
>> as representationsREST.
>>
>> Double Lives as Names and Locations
>>
>> The key to the confusion is that http fundamentally will dereference
>> whatever a URI refers to, and there are two distinct types of functional
>> roles a URI can play: name and location. A URI can serve as an
>> identifier-as-a-name, which is a non-physical relation of reference, and as
>> an identifier of a location, which is a physical relation of access. Just
>> naming something has no effect on the thing named: naming something does
>> not bathe the thing named in any type of energy that we can detect via a
>> physical radar. There is no way to build a detector to detect what exactly
>> someone means by a URI, although we can guess from talking to them or
>> accessing representations they give us. Locations give you physical,
>> connected access to a thing. If you go to a location to get something, if
>> the thing is there you return with it physically in hand.
>> A name might, but
>> does not have to and usually does not, give one any sort of physical,
>> connected access to the thing named.
>>
>> The word "identifier" is even more vague than name or location, and here
>> the problem of the "identity" crisis appears: how do we know if the URI is
>> being used for something as a name or as a location? The URI itself does
>> not tell us. Even worse, what does "identify" mean, and how can we tell if
>> two things identify the same thing? With representationsAI that is
>> sometimes very clear, as in photographs, and sometimes not so clear, as in
>> abstract art. Even the integers have problems with identification: does
>> "11" identify eleven in decimal or three in binary? We won't know - and
>> can't know unless we are given some sort of decoding scheme. In the
>> programming language tradition "identifier" has a pretty secure meaning,
>> and in that context the access/reference distinction is theoretically
>> important but not of great practical significance, since everything you can
>> refer to is physically accessible by the computer and has an address in
>> memory. This is not true of logic, and definitely not true of
>> model-theoretic semantics. Importantly, the access and reference
>> distinction holds on the Web with many things that have URIs. In an
>> information space, things may be identified without being accessed via a
>> physical connection. In terms of the AWWW, an "information resource" is
>> probably similar to the use of URI-as-access, while the use of a URI for
>> reference without access corresponds to a "non-information resource."
>>
>> Solving the Identity Crisis
>>
>> Then there's the identity crisis: a single URI can actually play both roles
>> (name with no access and location with access) at the same time, which
>> gives us a powerful device for some applications.
>> The official view, that
>> representations are supposed to be interpreted by applications
>> depending on MIME types, is clearly focused on the use of a URI as a
>> location for access; yet nothing forbids a URI that returns a
>> representationREST or some other data to be used to tell the web client
>> that this URI is also a name for reference in addition to a location for
>> access. In fact, for a URI used only as a name, MIME-types are clearly
>> irrelevant. At least for the time being!
>>
>> It would be useful to distinguish when a URI is used as a "name" or as a
>> "location," and whether some URIs can only be used as names or only as
>> locations. In other words, this depends on whether the thing (which would
>> be the "resource") identified by the URI is on the Web or not. This already
>> reduces to the "non-information resource" and "information resource"
>> distinction on some level, and so is not a return to the historical Dark
>> Ages of the Web. Since they share a common syntax, it does make sense to
>> unite URLs and URNs on a level as URIs, and even to use URLs as "names."
>> The identity crisis can be solved pretty easily, as shown by the Web Proper
>> Names proposal. First, a separate URI scheme (wpn:// or tdb://) can
>> distinguish the use of URIs as names for reference from URIs as locations
>> for access. To capitalise even further on the identity crisis, the two uses
>> can be distinguished without a new URI scheme through a
>> representationREST, by having a type of representation format which says
>> that this URI is a "name" as opposed to a "location." In fact, one could
>> even have a special MIME-type to distinguish names for things: imagine the
>> "name" MIME-type, or the "application/xhtml+xml+name" type.
>>
>> The Future...
>>
>> However, one subject which needs more exploration is the "interpretation"
>> of URIs as names. How does one tell, if a URI is used as a name for
>> reference, what its interpretation is?
>> All the RDF statements that apply to that URI? And
>> if so, how do we get them in a decentralized system? SPARQL? URIQA? Magic?
>> In other words, assuming the URI gave you machine-readable descriptions in
>> some Semantic Web language, should the use of a
>> URI-as-a-name really mean that this URI refers to (or denotes) whatever is
>> necessary to satisfy the Semantic Web description? The Semantic Web allows
>> one to build a number of roles and assertions, and one would assume that
>> its interpretation is those other Semantic Web URIs that are satisfied by
>> these roles and assertions. However, the SemWeb as it stands just has URIs
>> as Semantic Web objects referring as names to other URIs as Semantic Web
>> objects, and does not fulfill what the Semantic Web really needs: a way to
>> move out of the Web and to the wide world beyond the Web. The Web needs to
>> be integrated more into the world, and there lies the true holy grail of
>> the Semantic Web. This is not just a problem for the Web, but the
>> fundamental problem that proved to be the ultimate bane of AI. Indeed, it's
>> easy to just attach a model theory to any formal system and say "We have
>> semantics." Yes, that's strictly true - but let's not forget the adjective
>> "model-theoretic." And models of the real world can be wrong, and often
>> are. The real burden of the Semantic Web will lie on the ability of people
>> and machines to produce models using SemWeb languages whose model-theoretic
>> interpretations are relevant to the real world, and match them in
>> interesting and useful ways that allow the Web to do things that are either
>> impossible or very difficult on the current Web. Can people and machines do
>> this in a large, decentralized manner? Are the SemWeb standards sufficient
>> for the task? Yet, while the answers to those questions are unknown, the
>> winds seem favorable.
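
The two conventions floated under "Solving the Identity Crisis" (a separate
URI scheme, or a special MIME-type) could be sketched roughly as follows.
This is only an illustration under stated assumptions: the function name
role_of, the set NAME_SCHEMES, and the "+name" suffix check are hypothetical
and are not part of any standard or of the Web Proper Names proposal itself.
Only the scheme names wpn and tdb and the "name" and
"application/xhtml+xml+name" types come from the notes. As the notes say, a
bare URI does not tell us its role, so the sketch can only guess from such
conventions.

```python
from urllib.parse import urlsplit

# Assumption: schemes treated as "names for reference". wpn and tdb are the
# schemes mentioned in the notes; urn is added here for illustration.
NAME_SCHEMES = {"wpn", "tdb", "urn"}

# Assumption: a media type ending in "+name" (or the bare type "name")
# marks the returned representation as describing a name, not a location.
NAME_SUFFIX = "+name"


def role_of(uri, content_type=None):
    """Guess whether a URI is used as a 'name' or as a 'location'."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in NAME_SCHEMES:
        return "name"
    if content_type is not None:
        # Drop parameters such as "; charset=utf-8" before inspecting.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type == "name" or media_type.endswith(NAME_SUFFIX):
            return "name"
    if scheme in {"http", "https"}:
        return "location"
    # Without a convention to go on, the URI alone does not tell us.
    return "unknown"
```

For example, role_of("http://www.ihmc.us/users/phayes/PatHayes.html") gives
"location", while the same URI served with the hypothetical
"application/xhtml+xml+name" type would be treated as a "name".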

-- 
   --harry

   Harry Halpin
   Informatics, University of Edinburgh
   http://www.ibiblio.org/hhalpin
Received on Tuesday, 5 April 2005 05:01:49 UTC