- From: Mitch Denny <mitch.denny@warbyte.com>
- Date: Tue, 18 Dec 2001 21:06:23 +1100
- To: <www-tag@w3.org>
Peter/all, I can only sympathise with the problems that you face with Internet-based resources disappearing or moving. The source of the problem, I'm afraid, is probably the exact same thing that makes the Internet such a valuable tool. It's fair to say that those driving the evolution of all the supporting technology have a lot of vision, but unfortunately design prowess is really only displayed within working groups with a narrow focus. Don't get me wrong, this approach has been largely successful; in terms of human achievement I think the Internet and related technologies rate right up there. Today we have some fantastic infrastructure with only a few minor problems. The problem you have expressed crosses a few technology boundaries, and I understand that this group exists to solve just such problems.

I'd like to see further exploration of your UCI idea: what are the problems it would solve? One roadblock that I see is that Internet hosts are definitely autonomous entities, so any prescriptive solution that rigidly defines a static structure would almost certainly not be accepted.

As an alternative, how about a relatively simple extension to the HTTP protocol via a header that defines the volatility of the requested URI:

    Volatility: Dynamic 2592000

The above would suggest to the user-agent that the page is dynamic and could change on each request, but that the URI used to request it is valid for thirty days (given in seconds). The user-agent could analyse that header and, when a bookmark is requested, determine whether it needs to pull the page down and archive it (see the sketch further down). Of course, this behaviour would be based on user preferences.

From my perspective there are several key benefits to this approach:

- Doesn't break existing user-agent and server implementations.
- Can be implemented on the server side now by systems that support dynamic content like JSP, PHP, ASP, ASP.NET etc.
- Gets around the problem of host-independent resource description today by encouraging an "archive it if you need it" mindset.

There are also a few issues that would need to be solved:

- User-agents would need to be updated to support this extension.
- This works for HTTP; similar mechanisms would need to be built for other content delivery protocols.
- Static content would need the support of the server to correctly flag whether or not it is volatile - more management overhead.
- Archiving content from sites might introduce some legal issues - but they were always there.

From a user's perspective, I think this could become quite an intuitive process. I already do something similar today: when I purchase something online I archive the page, because I know that if I bookmark the site it's not going to be there when I go back.
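As a minimal sketch of the user-agent side, assuming the hypothetical "Volatility" header above with a "<kind> <uri-valid-seconds>" value, the archiving decision could be as simple as the following (all names are invented for illustration, and Python is used only to make the idea concrete):

import time

def parse_volatility(value):
    """Parse a value such as 'Dynamic 2592000' into ('Dynamic', 2592000)."""
    kind, seconds = value.split()
    return kind, int(seconds)

def should_archive(volatility_value, bookmarked_at, now=None):
    """Decide whether the user-agent should pull down and archive a bookmark.

    Archive when the content is flagged Dynamic (it may differ on the next
    request) or when the URI is approaching the end of its stated lifetime.
    """
    now = time.time() if now is None else now
    kind, uri_valid_for = parse_volatility(volatility_value)
    uri_expires_soon = (now - bookmarked_at) > 0.9 * uri_valid_for
    return kind == "Dynamic" or uri_expires_soon

# A page served with "Volatility: Dynamic 2592000" (dynamic content, URI
# valid for thirty days) would be archived straight away under this policy.
print(should_archive("Dynamic 2592000", bookmarked_at=time.time()))

Whether to archive could equally hinge on the user preferences mentioned above; the point is only that the header gives the user-agent enough information to decide for itself.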
Anyway, that's my proposal. I'm not even sure this is the right place to bring it up, but it's out there now; feel free to critique it constructively.

----------------------------------------
Mitch Denny
mitch.denny@warbyte.com
+61 (414) 610-141

-----Original Message-----
From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of Peter Pappamikail
Sent: Monday, 17 December 2001 9:23 AM
To: www-tag@w3.org
Subject: Resource discovery: limits of URIs

I'm flagging up an issue as a private individual rather than in my official capacity as Head of Information Resources Management in the European Parliament, although the issue I address has been considered in my professional work as well as in my private research and work on XML implementation issues.

My concern is the mechanisms available to "translate" information about a uniquely identifiable artefact into an addressable URI. Please accept my apologies in advance if the issue is not appropriate for this list.

As an "ordinary user" I can "identify" or name a particular information artefact: a book, a document, etc. With a URL, I can address it. The URL will give me an address that usually combines details of an originating authority, a content identifier, sometimes a language version, and an application format (MIME extension). However, with the exception of the language version - which might, depending on the server infrastructure, be served up according to the preferences set in my browser - the "discovery" of the full URL cannot be deduced algorithmically from the content identifier.

A couple of examples to demonstrate my concern more clearly:

- "bookmark rot": I mark a set of resources from a particular site, only to find a year later that all the references are rotten because the .htm extension has been replaced by .php throughout the site, although no single item of content has changed;
- I reference an item found via a WAP service, knowing that a more complete version of the same content is available in HTML on a parallel web site: the URLs, however, are completely different despite referring to the same artefact;
- I copy a URL from a site, only to discover that the URL is not only attributed dynamically but is session-specific and sometimes personalised, and thus not re-usable;
- I'm listening to a voice-synthesised web page that contains links to resources that are available in both audio and text, but the hypertext link takes me to the text file.

In architectural terms, my concern is that more and more sites, in the absence of any clear mechanism for resolving addresses from identifiers, have increasingly complex interfaces with proprietary resolution mechanisms that practically render resource discovery impossible, except indirectly. A user should be able to indicate the minimum information that distinguishes a particular artefact uniquely (I'm not sure the URN does this, because it is still only a URI with a commitment to persistence) and not be bothered with whether it is the most recent version, which languages are available, or whether it is in PDF, HTML, XML or WML; the server should resolve all of this in a context-sensitive manner. The issue will become critical when XPointer starts to be used to identify resource fragments: in fact XPointer's potential weakness is precisely that the containing document may itself be poorly addressable.

My "ideal scenario" would be the replacement, in the hyperlink target data, of a URI - pointing as it does to a specific file - by a "UCI" (a "Uniform Content Identifier") that resolves to the specific components:

- a DNS entry or other service locator;
- on the server side, a URI appropriate to the client context, made up of the content identifier "wrapped" with language, version, format and other context-specific data.

If this sort of issue is handled elsewhere, I'd be happy to be pointed the way, but I feel the issue goes beyond the scope of current W3C activity on addressing and is too "instance specific" to be in the realm of RDF or other semantic resource discovery work: I believe the issue is analogous to HTTP language negotiation and warrants similar treatment.

Peter
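Purely as an illustrative sketch of the resolution step described above - with the identifier scheme, the table of representations and the function name all invented for the example - a server-side UCI lookup in the spirit of HTTP language negotiation might look like this:

REPRESENTATIONS = {
    "doc:annual-report-2001": [
        {"lang": "en", "format": "html", "uri": "/reports/2001/en/index.html"},
        {"lang": "en", "format": "pdf", "uri": "/reports/2001/en/report.pdf"},
        {"lang": "fr", "format": "html", "uri": "/reports/2001/fr/index.html"},
    ],
}

def resolve_uci(uci, accept_language="en", accept_format="html"):
    """Map a content identifier to a concrete URI for this client context."""
    candidates = REPRESENTATIONS.get(uci, [])
    for rep in candidates:
        if rep["lang"] == accept_language and rep["format"] == accept_format:
            return rep["uri"]
    # Fall back to any available representation rather than failing outright.
    return candidates[0]["uri"] if candidates else None

print(resolve_uci("doc:annual-report-2001", accept_language="fr"))
# -> /reports/2001/fr/index.html

The client would hand over only the content identifier plus its usual context headers; which concrete file comes back is the server's business, exactly as with language negotiation today.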
Received on Tuesday, 18 December 2001 05:03:41 UTC