- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Mon, 25 Feb 2008 11:49:03 -0800
- To: Danny Ayers <danny.ayers@gmail.com>
- Cc: noah_mendelsohn@us.ibm.com, "W3C TAG" <www-tag@w3.org>
On Feb 25, 2008, at 4:52 AM, Danny Ayers wrote:

> Roy asks whether "A key requirement of the Semantic Web is that URIs
> be used to identify resources unambiguously". Well, yes, I'd suggest
> it is, in exactly the same way the Web is dependent on an essentially
> unambiguous naming scheme - the resource identified and the URI are
> intimately bound, thanks to their somewhat circular, fixpoint kind of
> definition.

Right, the Web is bound by a consistency (or lack thereof) in
representations/results for any given URI. But that consistency
only exists when observed over time.

> On the other hand the relationship between a resource and
> the thing it stands for does have ambiguity - the publisher may be
> clear, but the consumer of such information is limited to making their
> best interpretation of whatever (ultimately human-readable)
> definitions the publisher has provided.

And that's a problem because ...? Let's try an example.

1) a resource owner might have a good idea of what resource they
intend to identify when they mint a URI, or they might just be
uploading a hundred photos named IMG_98nnn.jpg. The Web doesn't
depend on cool URIs -- they are just a nice thing to have. The
resource owner just thinks of it as "My vacation in Nevada".

2) an author, having discovered a useful trove of photos, adds
links to their personal favorites along with metadata to describe
the resource that they are linking to. The Web doesn't depend on
the author's link semantics matching the owner's resource semantics,
even though it would be nice if they matched.

3) a thousand other authors do the same, using their own notions of
the semantics that are important to them. One person notices that
the scenic backdrop of our owner, snacking on sandwiches by the side
of a road in Nevada, contains what looks like an alien spacecraft
sticking out the side of an exposed bluff.
Naturally, they slashdot the photo as evidence that UFOs exist, and
it is linked to by another fifty thousand UFO enthusiasts as "proof
that aliens exist among us (just ignore the guy with the sandwich)".

4) Google's spider wanders by, notices all these links to these
photos in this collection, and then builds an index based on the
links and text surrounding the references by others to particular
photos, with extra weight given to photos that are described in the
same way by multiple references.

5) The owner receives much fan mail and questions about this
otherwise boring picture and (having read all the webarch documents)
decides to maintain that URI as a permanent home for "I was abducted
by a UFO in Nevada".

Here is the problem. People mint URIs for various reasons and rarely
decide what they mean until long after. People use URIs, and through
their use assign meaning that may have little or nothing to do with
the owner's original meaning. This mix of meanings and intentions is
always ambiguous, even when the owner does take the time to carefully
describe what they intend by the semantics of a name.

Note that this entire example uses "Information Resources". As I said
throughout the earlier discussions, there is no relevant distinction
from the web's point of view between "information resources" and
"non-information resources." Those categories exist purely for the
sake of argument, based on the theory that it is somehow more
important to perceive the ambiguity between those sets than it is to
perceive the ambiguity *within* those sets. In fact, it is an
entirely pointless exercise in maneuvering closed-world assumptions,
instead of facing up to the real requirement: the Web is not a closed
world.

The question isn't "can we remove ambiguity?" It should be "can we
understand a relationship given that ambiguity is almost a
certainty?" Because that's what life on the Web is all about --
communicating in spite of decentralized authority.
There is nothing that we can do to the Web to make it less ambiguous
without undoing the very design that made it successful in the first
place -- a loose, decentralized, counter-authoritarian
interconnectedness.

> But I think Roy does highlight the most important part of the issue
> when he says:
> [[
> On the Web, millions of people mint URIs, and millions more use them
> in references. Millions of human beings, conversing over time, with an
> occasional URI thrown in to refer to a subject under discussion.
> ]]
>
> Ok, the Semantic Web is an extension of the existing
> (document-oriented) Web. Flipping that over, I think it's reasonable
> to consider the existing Web as a projection or view of (some subset
> of) the Semantic Web.
>
> From this perspective, regular HTML links can been seen as expressions
> of (s, p, o) statements, where the predicate isn't explicitly typed.
> The relation can be typed, using the rel/rev attributes in concert
> with a HTML Meta Data profile - GRDDL is the nearest we have to a
> formalism for this. But it's common practice to use a kind of
> human-friendly implicit typing, for example using <a
> href="http://www.ics.uci.edu/~fielding/">Roy Fielding</a> to refer to
> a person.

Note, however, that HTML anchors do not (by default) express an
"is_a" type of relationship from the content to the identified
resource. They are usually "more_about" relationships.

> But I'm suggesting the Semantic Web *does* need to distinguish between
> Roy the person and Roy's homepage. A reasonable RDF expression of the
> link above might be something like:
>
> <> dc:related <http://www.ics.uci.edu/~fielding/> .
> [ foaf:name "Roy Fielding";
> foaf:homepage <http://www.ics.uci.edu/~fielding/> ] .

I think we are jumping back off the rails at this point. There is
no doubt that the Semantic Web needs to make logical assertions.
It does so by defining things like foaf:name and foaf:homepage in
unambiguous ways, not by restricting the identifier range and
certainly not by making assumptions about HTTP status codes.

The above was true between 1993 and 2000. Today, my home page is
<http://roy.gbiv.com/>. The Semantic Web should be capable of
understanding that, even when it is temporarily untrue, because
time is essential to understanding the Web.

> But does the (document) Web need to distinguish between Roy the person
> and Roy's homepage? Evidently not, given the utility of simple linkage
> like that above.

That's not what the document Web is doing with an anchor. The only
time that we can make a valid assumption about the relationship
expressed by an HTML anchor is when the rel="" attribute is used
correctly. Likewise, there is nothing (aside from syntax issues)
that prevents less ambiguous relationships from being expressed
within the same representation, within the protocol stream, or
within other representations on the Web (like RDF).

> The only way I can see to square this circle is to differentiate
> between two kinds of interpretation. For example:
>
> $ wget http://www.w3.org/People/Berners-Lee/card.n3#i
> ...
> HTTP request sent, awaiting response... 200 OK
>
> Web interpretation: this is somehow related to Tim
> Semantic Web interpretation: we got a 200, so this is about Tim - what
> does the RDF here say?
>
> If we'd got a 303, sure, we could follow the httpRange-14 resolution's
> interpretation. But I don't think we can realistically assume
> 200=Information Resource. I don't think this problem can be completely
> resolved with a technical trick at the HTTP layer.

Nobody *needs* to assume 200=Information Resource. That is another
completely artificial case for the sake of useless argumentation.
The 303 solution exists for people who do not want to imply that
their named resource can be represented.
That's all it means for a GET to return 303 (ever since 1994, when
the original meaning of "redirect with new method" was deprecated
due to lack of implementation and security issues and replaced with
"redirect to see other resource"). Knowing the nature of the
resource is irrelevant. The use case for the 303 recommendation was
to avoid contradiction, not to avoid ambiguity.

....Roy
Received on Monday, 25 February 2008 19:49:19 UTC