Important Change to HTTP semantics re. hashless URIs

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sun, 24 Mar 2013 13:39:23 -0400
Message-ID: <514F3A4B.8000301@openlinksw.com>
To: "public-lod@w3.org" <public-lod@w3.org>, "public-rww@w3.org" <public-rww@w3.org>, "public-webid@w3.org" <public-webid@w3.org>, "dbpedia-discussion@lists.sourceforge.net" <dbpedia-discussion@lists.sourceforge.net>
All,

Here is a key HTTP enhancement from the IETF draft "Hypertext Transfer 
Protocol (HTTP/1.1): Semantics and Content" [1].

"
    4.  If the response has a Content-Location header field and its
        field-value is a reference to a URI different from the effective
        request URI, then the sender asserts that the payload is a
        representation of the resource identified by the Content-Location
        field-value.  However, such an assertion cannot be trusted unless
        it can be verified by other means (not defined by HTTP).
"


Implications:

This means that when hashless (aka. slash) HTTP URIs are used to denote 
entities, a client can use the value of the Content-Location response 
header to distinguish the URI that denotes an Entity Description Document 
(Descriptor) from the URI of the Entity described by said document. 
Thus, if a client de-references the URI 
<http://dbpedia.org/resource/Barack_Obama> and gets a 200 OK from the 
server combined with <http://dbpedia.org/page/Barack_Obama> in the 
Content-Location response header, the client (user agent) can infer the 
following:

1. <http://dbpedia.org/resource/Barack_Obama> denotes the real-world 
entity 'Barack Obama'.
2. <http://dbpedia.org/page/Barack_Obama> denotes the Web Document that 
describes the real-world entity 'Barack Obama' -- by virtue of the fact 
that the server has explicitly *identified* said resource via the 
Content-Location header.
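
To make the inference above concrete, here is a minimal Python sketch of 
the client-side heuristic. The function name and its arguments are my own 
illustration (not taken from ODE or any other client), and it assumes the 
response status and headers have already been fetched. Note that, per the 
draft text quoted above, the result is only what the server asserts; it 
still has to be verified by other means before being trusted.

```python
from urllib.parse import urljoin


def infer_denotations(request_uri, status, headers):
    """Apply the Content-Location heuristic to an already-fetched response.

    Returns (entity_uri, document_uri) when the server asserts, via
    Content-Location, that the payload represents a different resource
    than the one requested. Returns None when the heuristic does not apply.
    """
    if status != 200:
        return None

    # HTTP header field names are case-insensitive.
    location = next(
        (value for name, value in headers.items()
         if name.lower() == "content-location"),
        None,
    )
    if location is None:
        return None

    # Content-Location may be a relative reference; resolve it
    # against the request URI.
    document_uri = urljoin(request_uri, location.strip())
    if document_uri == request_uri:
        # Same URI: no distinction between entity and document is asserted.
        return None

    return request_uri, document_uri
```

For example, calling it with 
"http://dbpedia.org/resource/Barack_Obama", status 200, and a 
Content-Location of "http://dbpedia.org/page/Barack_Obama" yields the 
pair (entity URI, document URI) from points 1 and 2 above.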

Basically, the Toucan Affair [2][3][4] has now been incorporated into 
HTTP, thereby providing an alternative to 303 redirection, which has 
troubled/challenged many folks trying to exploit Linked Data via 
hashless HTTP URIs.

Implementations:

As per my comments in the Toucan Affair thread, our ODE [5] Linked Data 
client has always supported this heuristic. In addition, I am going to 
propose implementing this heuristic in DBpedia, which will simply have 
the net effect of not sending a 303 to user agents that look up URIs in 
this particular Linked Data space.
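
A rough sketch of what the server side of such a change might look like, 
in Python. This is purely illustrative and is not DBpedia's actual 
implementation: the function name, the render_description callback, and 
the hard-coded /resource/ and /page/ prefixes are assumptions modelled on 
the example URIs above.

```python
RESOURCE_PREFIX = "/resource/"  # assumed path prefix for entity URIs
PAGE_PREFIX = "/page/"          # assumed path prefix for descriptor documents


def respond_without_303(request_path, render_description):
    """Build a (status, headers, body) triple for an entity URI look-up.

    Instead of redirecting with a 303, the description is returned
    directly with a 200 OK, and Content-Location names the document.
    render_description is a hypothetical callback that takes the entity
    name and returns the document body as bytes.
    """
    if not request_path.startswith(RESOURCE_PREFIX):
        return 404, {}, b""

    name = request_path[len(RESOURCE_PREFIX):]
    headers = {
        # Relative reference; clients resolve it against the request URI.
        "Content-Location": PAGE_PREFIX + name,
        "Content-Type": "text/html; charset=utf-8",
    }
    return 200, headers, render_description(name)
```

A look-up of /resource/Barack_Obama would then come back as a 200 OK 
carrying Content-Location: /page/Barack_Obama, with no redirect involved.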

Linked Data Client implementation suggestions:

I encourage clients to support this heuristic in addition to 303 with 
regard to Linked Data URI disambiguation. Implementation costs are 
minimal, while the upside is extremely high re. Linked Data 
comprehension, appreciation, and adoption.

Links:

1. http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#page-15 .
2. http://blog.iandavis.com/2010/11/04/is-303-really-necessary/ -- Is 
303 Really Necessary post by Ian Davis.
3. http://lists.w3.org/Archives/Public/public-lod/2010Nov/0090.html -- 
mailing list thread .
4. 
http://linkeddata.uriburner.com/about/html/http/iandavis.com/2010/303/toucan 
-- example of heuristic handling .
5. http://ode.openlinksw.com -- ODE Linked Data consumer service, 
bookmarklets, and cross-browser extensions.
6. http://bit.ly/YxW21k -- Illustrating Semiotic Triangle using 
DBpedia's Linked Data URIs .

-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen







Received on Sunday, 24 March 2013 17:39:48 GMT