Re: "Hash URIs" and content negotiation from Alan Ruttenberg on 2006-11-13 (semantic-web@w3.org from November 2006)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Mon, 13 Nov 2006 00:03:36 -0500
To: Karl Dubost <karl@w3.org>
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <33B1FDA1-9D8D-4AE1-B702-0F3BA94A3715@gmail.com>
On Nov 9, 2006, at 9:09 PM, Karl Dubost wrote:

>
> On 8 Nov 2006, at 01:32, Alan Ruttenberg wrote:
>> On Nov 7, 2006, at 10:50 AM, Dan Brickley wrote:
>>> You're very right of course, it's problematic to conneg in  
>>> context of such URIs. This is why I always preferred slash URIs!  
>>> Ah well...
>>
>> Personally, I can't tell why content negotiation is a good idea in
>> any context. To my mind it's hiding interesting information in the
>> innards of a network protocol instead of having it explicitly
>> available in, say, OWL or RDF.
>

First, thanks for your comments. I've responded with some of my  
thoughts.

> because there are cases where it is difficult to do in another way.  
> Content-Negotiation is a tough issue with many faces. It is not  
> only a linear list of choices, but a multi-dimensional matrix
> 	- languages (fr, en, ja, …)

As I have pointed out, the problem is one of naming. I argue that
these are not the "same" and so shouldn't be named the same. The
intent is certainly that they are rdf:about the same thing. But don't
you think that, in the context of teaching people what it would mean
for machines to understand the web by communicating in RDF, conneg
undermines the cause by confusing the difference between "x == y" and
"x about z & y about z"?
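
To make the distinction concrete, here is a rough Python sketch. The
example.org URIs, the "about" predicate name, and both helper functions
are my own illustrative assumptions, not part of any RDF vocabulary.

```python
# Hypothetical (subject, predicate, object) statements; the example.org URIs
# and the "about" predicate name are illustrative assumptions.
EN = "http://example.org/report.en.html"
FR = "http://example.org/report.fr.html"
TOPIC = "http://example.org/topics/report"

triples = {
    (EN, "about", TOPIC),
    (FR, "about", TOPIC),
}


def same_resource(x, y):
    """x == y: the two names identify one and the same resource."""
    return x == y


def about_same_thing(x, y, statements):
    """x about z & y about z: distinct resources sharing at least one subject z."""
    topics_x = {o for s, p, o in statements if s == x and p == "about"}
    topics_y = {o for s, p, o in statements if s == y and p == "about"}
    return bool(topics_x & topics_y)


print(same_resource(EN, FR))              # False
print(about_same_thing(EN, FR, triples))  # True
```

The two translations fail the first test and pass the second, which is
the difference I think a single negotiated name blurs.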

> 	- format of representation (png, gif, html, …)

Unless the HTML has no content other than an image tag for the PNG or
GIF, it cannot even be contemplated to be the "same". Possibly they
are both rdf:about the same thing.

> 	- format of transport (gzip)

This is the only one that seems uncontroversial to me.

> 	- …
>
> * Discoverability depending on formats
> For example, there is no obvious linking mechanism in PNG, GIF or  
> JPEG to list alternate URIs of the "same" content *inside* the  
> content. How do I say inside a PNG, that there is a GIF version.

Why can't our agents retrieve RDF (and RDF only) when they need
discovery information like this?

Here is another proposal. Conneg is vastly simplified: for a URI U, the
only option is to retrieve RDF instead of the resource. That RDF is
itself given a name (URI) distinct from U. The RDF returned describes
neither U nor what U is about (so if U names an RDF document, the
conneged RDF is not the same as the RDF resource U names). Rather, it
describes the set of documents that the server considers relevant to
serve in place of U. Each of these can either be rdf:about the same
subject (such as translations into different human languages) or the
same as the document, where "same as" means there is a lossless,
unambiguous, machine-implementable translation between the two
documents (same-size GIF -> PNG is OK; GIF -> JPEG at quality 8 is
not).

The RDF would use some standard set of relations to describe the
relationship of the proposed documents to U, along with various
properties of these documents (such as their file formats and
languages). Based on this information, the agent can decide whether
the original document or some other is relevant to retrieve for the
task at hand. In a browser, I would expect the address bar to show a
different URI if an alternate document was chosen.
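
A minimal sketch of how an agent might act on such a description, in
Python. Everything named here is an assumption for illustration: the
`Alternative` record, the two relation labels, the `choose` function,
and the example.org URIs. A real description would be RDF using
whatever standard relations were agreed on.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Hypothetical relation labels standing in for an agreed RDF vocabulary.
ABOUT_SAME_SUBJECT = "about-same-subject"  # e.g. a translation
SAME_AS = "same-as"                        # lossless, machine-translatable


@dataclass(frozen=True)
class Alternative:
    """One document the server offers in place of U."""
    uri: str
    relation: str
    media_type: str
    language: Optional[str] = None


def choose(original_uri: str,
           alternatives: Iterable[Alternative],
           accept_types: Set[str],
           accept_languages: Set[str]) -> str:
    """Return the first alternative the agent can use, else the original URI."""
    for alt in alternatives:
        type_ok = alt.media_type in accept_types
        lang_ok = alt.language is None or alt.language in accept_languages
        if type_ok and lang_ok:
            return alt.uri
    return original_uri


U = "http://example.org/report"

# What the RDF retrieved in place of U might say, flattened into records.
description = [
    Alternative("http://example.org/report.xhtml", SAME_AS,
                "application/xhtml+xml", "en"),
    Alternative("http://example.org/report.fr.html", ABOUT_SAME_SUBJECT,
                "text/html", "fr"),
    Alternative("http://example.org/report.ja.html", ABOUT_SAME_SUBJECT,
                "text/html", "ja"),
]

chosen = choose(U, description, {"text/html"}, {"ja"})
print(chosen)  # http://example.org/report.ja.html
```

The point of the sketch is that the choice is made by the agent, from
explicit statements, and the result is a distinct URI that the agent
can show to the user.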

> * Keeping up to date - Cost of management
> Another issue is updating. For example, with languages, if we got  
> let's say a version of our HTML document in French, then it has  
> been translated in Japanese and Korean later on. We have to update  
> 3 files,
>
>    <link rel="Alternate"
>          href="index.html.ja"
>          hreflang="ja"
>          title="Version japonaise"/>
>
> then when we add another one, we have now to update 4 files, and so  
> on. It becomes very difficult to update.

Unless the RDF in the HTML says: go to xxx for a comprehensive list
of translations of this document.

> * Wrong Content-Type
> I was wondering how the "hash URIs" proposal works in the
> context of a wrong content-type sent by the server. Has anyone
> played with this a bit, making cases with obviously bogus files
> and then thinking about which mechanisms we could put in place to
> recover from or notify about the problems?
>
> Something is missing that could help a Web site expose
> its information space map, a bit à la Google's sitemap.
>
> * On Linking Alternative Representations To Enable Discovery And  
> Publishing
>   http://www.w3.org/2001/tag/doc/alternatives-discovery
>   TAG Finding 1 November 2006
> * Transparent Negotiation - the Missing HTTP Feature
>   http://www.w3.org/QA/2006/10/missing_http_feature
>   QA Weblog 20 October 2006
> * Google Sitemap Gen
>   http://goog-sitemapgen.sourceforge.net/
>
>
> -- 
> Karl Dubost - http://www.w3.org/People/karl/
> W3C Conformance Manager, QA Activity Lead
>   QA Weblog - http://www.w3.org/QA/
>      *** Be Strict To Be Cool ***
Received on Monday, 13 November 2006 05:03:55 UTC