- From: Xiaoshu Wang <wangxiao@musc.edu>
- Date: Mon, 22 Oct 2007 14:44:26 +0100
- To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
- CC: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, W3C-TAG Group WG <www-tag@w3.org>, Alan Ruttenberg <alanruttenberg@gmail.com>, Jonathan A Rees <jar@mumble.net>, Dan Connolly <connolly@w3.org>, Tim Berners-Lee <timbl@w3.org>
Williams, Stuart (HP Labs, Bristol) wrote:
>> I think, the root cause for all these is the httpRange-14.
>> The way its resolution is written just sounds like an
>> inference. After some thoughts, I start to think that
>> "httpRange-14" gets it wrong. The issue is raised to solve
>> the URI ambiguity. But what it does is to open more issues
>> than it has solved.
>
> Would you care to enumerate some of those please?
> I'm particularly interested in those problems that you attribute to the
> 'resolution' rather than those that exist independent of it.
>
> After much email and debate, the TAG resolved the question such that Web
> Architecture places no constraints on what can be referred to using http
> scheme URIs, with or without a '#' in a given URI.

Yes, I think this is the correct conclusion because it separates the URI
from its *dereference* protocol.

> A consequence of following the TAG's advice at [1] ...
> <snip>.

I was aware of it and I thought it was right too. But a project that I
am working on, and the questions posted on various mailing list
archives, led me to start rethinking it.

>> The whole issue, I think, relies on how we understand the
>> relationship between the following two things.
>>
>> 1) The thing that a URI denotes, let's call it T.
>
> That would be what AWWW calls a resource, right?

Yes.

>> 2) The thing that you get back from dereferencing the URI,
>> let's call it R.
>
> That would be what AWWW calls a representation, right?

Yes.

>> The important question is whether T should be R? Most people
>> think so, but I think we should not.
>
> In which case I think you and AWWW are in agreement.

Hmm.. not really. I think AWWW's opinion is that for some resources,
i.e., information resources, T=R. At least, most people reading the
httpRange-14 resolution would get that impression.

>> First, a protocol, such
>> as HTTP, is just one of the many protocols that can be used to
>> "dereference" a URI.
>> Second, HTTP content negotiation
>> makes it impossible that R is T. For instance, suppose we
>> normalize every HTTP GET by moving the Accept header
>> into a query string. Then, given a URI like "http://example.com/foo"
>>
>> T = http://example.com/foo
>>
>> But R can be one of the following
>>
>> R1 = http://example.com/foo?Accept=text/html
>> R2 = http://example.com/foo?Accept=application/rdf+xml
>> R3 = http://example.com/foo?Accept=anything
>>
>> And they have completely different URIs.
>
> Hmmm.... this seems to confuse resources with representations. T can be
> taken as a reference to a generic resource while R1, R2 and R3 can be
> taken as references to more specific resources which give access to a
> narrower set of representations than T (at some given instant).

That is exactly the point: is there a URI for R? (I think not.) If
someone thinks so, what is the URI for the returned representation?

>> In other words,
>> what a URI identifies will *never* be the same as what the
>> URI dereferences to, unless we explicitly assert it.
>
> ? don't understand the claim.

What I mean is: what a URI identifies is always a resource in the sense
of TBL's generic resource, regardless of whether it is a network
resource or not.

For example, let's use "http://example.com/abook" to denote a particular
book. This URI can be grounded in various systems, each of which may
have different mechanisms (protocols) to dereference the URI. Which
system and which protocol to use is up to the client.

- In a traditional marketplace, such as a bookstore, a client may get
  back a printed copy of the book.
- In a book-reading club, a client will get back a stream of sound
  waves.
- On the web, a client will get back a bit-stream, which can be further
  subdivided by MIME type into an HTML, RDF, or PDF stream...

But those things - printed copy, sound wave, bit-stream - are NOT the
book identified by "http://example.com/abook".
Each is one particular representation of the book. They may be referred
to as

_:aPrintCopy awww:hardCopyOf <http://example.com/abook>.
_:anAudio awww:soundOf <http://example.com/abook>.
_:anHTMLRep awww:informationResourceOf <http://example.com/abook>.
_:anPDFFile awww:informationResourceOf <http://example.com/abook>.
.....

Please note my last two assertions: I think it is more appropriate to
define *information resource* as the set of all representations of all
generic URIs.

Such a view has a few advantages.

1) It is much easier to understand and more consistent, because it
doesn't matter whether a URI identifies a network resource, a person, a
namespace, or an ontology. We understand that what we get back is just a
particular representation of the resource, which we try to understand
within a given information system.

2) It is more efficient. We don't need 303 redirects anymore.

3) It avoids unnecessary proliferation of URIs and allows various
*information resources* to be logically grouped under the same URI
without being physically bound to each other. This is particularly
important for me because I am developing an RDF-based Data Format
Description Framework (http://dfdf.inesc-id.pt), where the data format
description (in RDF) is separated from the data encoding (in binary of
any form).

4) It also solves the conceptual problem of URIs with a fragment
identifier. With content negotiation, the nature of a URI with a
fragment identifier becomes a problem under the traditional view of IR
and non-IR. For instance, what will "http://example.com/#chapter1"
identify? Say we intend the URI to identify the first chapter of the
book; what should it denote in an HTML representation? Would it be wrong
if the URI identified a <div> or an <h1> element? Under the traditional
distinction between IR and non-IR, this is very likely to be considered
wrong. And the best we can do is to carefully avoid name conflicts among
the fragment identifiers in each representation.
But this, in turn, hurts usability. For instance, if I request the HTML
representation of a particular ontological term, I would like the
browser to scroll automatically to the relevant section instead of
having to find it myself. With this newly proposed view of the
relationships between URI, HTTP, Resource, and
Representation/InformationResource, it will be O.K. to use
"http://example.com/#chapter1" to identify a <div> or <h1>. This is
because what one gets back for a fragment URI is, as for a primary URI,
just one representation of the resource, not *the* representation.

Cheers,

Xiaoshu
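
As an illustration of the Accept-header normalization discussed earlier in this message (T versus R1, R2, ...), here is a minimal Python sketch. It is only a thought experiment under stated assumptions: the function name `normalize_get` and the `Accept` query parameter are hypothetical, and nothing in HTTP itself defines such a rewriting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_get(uri: str, accept: str) -> str:
    """Fold an Accept header value into the query string of `uri`.

    Hypothetical normalization for illustration only; HTTP does not
    define an "Accept" query parameter.
    """
    parts = urlsplit(uri)
    # Keep any query parameters the URI already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("Accept", accept))
    # safe="/+" leaves media types such as application/rdf+xml readable.
    new_query = urlencode(query, safe="/+")
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, new_query, parts.fragment)
    )


T = "http://example.com/foo"
# One normalized URI per negotiated media type.
R = [normalize_get(T, media_type)
     for media_type in ("text/html", "application/rdf+xml")]

for r in R:
    print(r)
```

Each normalized URI differs from T and from the others, which is the sense in which the negotiated representations would "have completely different URIs" if they were named at all.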
Received on Monday, 22 October 2007 13:46:31 UTC