Re: WebIDs and Content-Location from Norman Gray on 2012-04-15 (public-webid@w3.org from April 2012)

From: Norman Gray <norman@astro.gla.ac.uk>
Date: Sun, 15 Apr 2012 21:48:09 +0100
To: Henry Story <henry.story@bblfish.net>
Cc: Tim Berners-Lee <timbl@w3.org>, Linked Data community <public-lod@w3.org>, public-webid <public-webid@w3.org>, public-cwm-talk@w3.org, Owen Sacco <owen.sacco@deri.org>
Message-Id: <1A9B47CE-544D-49F0-850E-4D7E0CB491EB@astro.gla.ac.uk>
Greetings.

In Henry's case of (cropped):

>>> $ curl -i http://vmuss13.deri.ie/foafprofiles/hada
>>> HTTP/1.1 200 OK
>>> Date: Fri, 13 Apr 2012 12:04:11 GMT
>>> Server: Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny12 with Suhosin-Patch
>>> Content-Location: hada.rdf
>>> 
[...]
>>> <foaf:PersonalProfileDocument rdf:about="">
>>>    <foaf:maker rdf:resource="#me"/>
>>>    <foaf:primaryTopic rdf:resource="#me"/>
>>> </foaf:PersonalProfileDocument>

...I believe the full URI of the primaryTopic should be <http://vmuss13.deri.ie/foafprofiles/hada.rdf#me>, and not <.../hada#me>.  This is because RFC 2616 section 14.14 says: "The value of Content-Location also defines the base URI for the entity" (this is the second of only two mentions of "base URI" in the document), and "If the Content-Location is a relative URI, the relative URI is interpreted relative to the Request-URI."
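
A minimal sketch of that resolution, in Python with only the standard library. The helper name `effective_base` is hypothetical, and the function assumes the RFC 2616 section 14.14 reading given above; it is an illustration of the argument, not a description of what cwm or raptor actually do.

```python
from urllib.parse import urljoin


def effective_base(request_uri, content_location=None):
    """Return the base URI under the RFC 2616 section 14.14 reading.

    Assumption: a Content-Location header, when present, defines the base
    URI, and a relative value is resolved against the Request-URI.
    """
    if content_location is None:
        return request_uri
    return urljoin(request_uri, content_location)


request_uri = "http://vmuss13.deri.ie/foafprofiles/hada"

# Base URI when the response carries "Content-Location: hada.rdf".
base = effective_base(request_uri, "hada.rdf")
print(base)                          # http://vmuss13.deri.ie/foafprofiles/hada.rdf

# rdf:resource="#me" resolved against that base.
print(urljoin(base, "#me"))          # http://vmuss13.deri.ie/foafprofiles/hada.rdf#me

# The same reference resolved against the Request-URI alone, for comparison.
print(urljoin(request_uri, "#me"))   # http://vmuss13.deri.ie/foafprofiles/hada#me
```

The last two lines are the two candidate answers under discussion: the first follows from taking Content-Location as the base, the second from taking the Request-URI.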

I remember working through the logic here in fussy detail, for a librdf bug report -- see the notes and 'additional information' at <http://bugs.librdf.org/mantis/view.php?id=402>.

Sections 5.1.2 and 5.1.3 of RFC 3986 could be read as saying that the retrieval-URI should provide the base-URI, but they are more naturally consistent with the above interpretation, on the grounds that the 'content-location' header is part of the 'representation's retrieval context'.  The text in these sections could and should be clearer about this.
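
For illustration, the precedence order in RFC 3986 section 5.1 can be sketched as below (Python, standard library only). The function name `base_uri`, its parameter names, and the example.org base are hypothetical. Treating Content-Location as overriding the retrieval URI is the interpretation argued in this message, not something RFC 3986 states explicitly.

```python
from urllib.parse import urljoin


def base_uri(retrieval_uri, content_location=None, embedded_base=None):
    """Pick a base URI following the precedence in RFC 3986 section 5.1."""
    # 5.1.3: the retrieval URI is the fallback base.
    base = retrieval_uri
    # Assumption (the reading argued in this message): Content-Location is
    # part of the retrieval context and overrides the retrieval URI; a
    # relative value is resolved against the retrieval URI.
    if content_location:
        base = urljoin(base, content_location)
    # 5.1.1: a base embedded in the content (e.g. xml:base) has the highest
    # precedence; if relative, it is resolved against the outer base.
    if embedded_base:
        base = urljoin(base, embedded_base)
    return base


uri = "http://vmuss13.deri.ie/foafprofiles/hada"
print(base_uri(uri))
# http://vmuss13.deri.ie/foafprofiles/hada
print(base_uri(uri, content_location="hada.rdf"))
# http://vmuss13.deri.ie/foafprofiles/hada.rdf
print(base_uri(uri, content_location="hada.rdf",
               embedded_base="http://example.org/profiles/"))
# http://example.org/profiles/
```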

tl;dr: content-location trumps request-uri.

I don't think that librdf's reporting a relative URI is necessarily odd, as long as there's a well-defined base-URI at that point.

All the best,

Norman

[top-posting, to preserve the full exchange below, in case I'm missing a point in my quoting]

On 2012 Apr 13, at 18:02, Henry Story wrote:

> 
> On 13 Apr 2012, at 17:35, Tim Berners-Lee wrote:
> 
>> 
>> 1) The base address used for parsing an RDF document should be the request URI,
>> not the Content-Location: value.  Otherwise clients who can accept both
>> n3 and rdf/xml will randomly get <hada.rdf#me> or <hada.n3#me>, which is clearly
>> a bad idea.  (Imagining that there is a hada.n3 option.)
> 
> My thinking was that if a GET on <hada> has a Content-Location of <hada.rdf> then,
> if one follows that procedure, the graph for a GET on <hada> and the one returned
> by GETting <hada.rdf> will be different:
> one of them will have as foaf:primaryTopic <hada#me> and the other <hada.rdf#me>.
> That seems odd.
> 
> 
>> 
>> If there is a 302 Moved redirect, then that new Location: URI should be used
>> as the URI for the document and the base URI for parsing it.  (But NOT for 301).
>> 
>> 2) You say that rapper is outputting things as relative URIs, but I would
>> support that, as often the absolute URI of a bit of an RDF system has actually
>> been mapped through proxying, or comes from looking at the files on a server in file:// space,
>> and life is much easier if things default to 
>> 
>> I also like it that anyone can do
>>        $ echo '<#a> <#p> 123 .' | cwm --quiet
>> and get
>>        @prefix : <#> .
>>        :a     :p 123 .
>> 
>> without being cluttered with  lots of references to the working directory.
>> 
>> On 2012-04-13, at 08:54, Henry Story wrote:
>> 
>>> I have an issue about canonicalisation (de-relativisation?) of URLs. cwm and rapper
>>> don't return the same results, though cwm agrees with http://www.w3.org/RDF/Validator/
>>> 
>>> What is the full URL for the rdf:ID="me" in the XML returned below?
>>> Is it 
>>> 
>>> - <http://vmuss13.deri.ie/foafprofiles/hada#me> as cwm  ( cwm.py,v 1.198 2012-01-30) and the w3 validator state?
>>> or is it
>>> - <hada.rdf#me> as raptor 2.0.6 returns (bizarrely as a relative url though) and as I thought it should be.
>>> 
>>> 
>>> $ curl -i http://vmuss13.deri.ie/foafprofiles/hada
>>> HTTP/1.1 200 OK
>>> Date: Fri, 13 Apr 2012 12:04:11 GMT
>>> Server: Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny12 with Suhosin-Patch
>>> Content-Location: hada.rdf
>>> Vary: negotiate
>>> TCN: choice
>>> Last-Modified: Fri, 13 Apr 2012 11:26:38 GMT
>>> ETag: "8080-6af-4bd8dbeebb780;4bd8dbeebb780"
>>> Accept-Ranges: bytes
>>> Content-Length: 1711
>>> Content-Type: application/rdf+xml
>>> 
>>> <?xml version="1.0" encoding="ISO-8859-1"?>
>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>>>      xmlns:foaf="http://xmlns.com/foaf/0.1/"
>>>      xmlns:rsa="http://www.w3.org/ns/auth/rsa#"
>>>      xmlns:cert="http://www.w3.org/ns/auth/cert#"
>>>      xmlns:admin="http://webns.net/mvcb/">
>>> <foaf:PersonalProfileDocument rdf:about="">
>>>    <foaf:maker rdf:resource="#me"/>
>>>    <foaf:primaryTopic rdf:resource="#me"/>
>>> </foaf:PersonalProfileDocument>
>>> <foaf:Person rdf:ID="me">
>>>    <foaf:nick>HADAUser1</foaf:nick>
>>>    <foaf:givenName>Jane</foaf:givenName>
>>>    <foaf:familyName>Smith</foaf:familyName>
>>>    <foaf:workplaceHomepage rdf:resource="http://hhs.gov"/>	
>>>    <foaf:topic_interest rdf:resource="HEAR"/>
>>>    <foaf:topic_interest rdf:resource="Accounting"/>
>>> 
>>>    <cert:key>
>>>      <cert:RSAPublicKey rdf:ID="key1">
>>> 	<rdfs:label>HADA Admin</rdfs:label>
>>>        <cert:modulus rdf:datatype="http://www.w3.org/2001/XMLSchema#hexBinary">95052F88477A3F1ADC1964AFD1AB7438F34EADEF22D9C5BDB8739E671F4626A347A3031E9FD4A5E2176D3048DA52DCA6AFFD67C81588A27A088A7CD27E2F2CBA2FF83DA90700797BE75BB9122FE5375E13BCFA55BE5504176886B0AC0BBB792D5221FE5295C75A3654385B8490A478A64AA117430F88E42852061230CD1C32EE2F01CD5FDD9D6DD4B757163CC9C1DB29BAC3EA9605D82D76AD7D5BE26D53DC9EA7A6C87369F53B4C2BBA149406E4A0FD5B921338DCB5B355D0DBBA95A238924678211ED997657ABC7FEDD28A93F8A5A19B463E72A17EFD204A80BEAFC41B841B079AE49FDBD28B62D01B9675D3508B4BAC98B6BE972A17C27C2415281C650121</cert:modulus>
>>>        <cert:exponent rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">65537</cert:exponent>
>>>      </cert:RSAPublicKey>
>>>    </cert:key>
>>> </foaf:Person>
>>> </rdf:RDF>
>>> 
>>> In my code  on read-write-web I use the returned Content-Location to form the base URL
>>> 
>>>   148           val loc = headers("Content-Location").headOption match {
>>>   149             case Some(loc) =>  new URL(u,loc)
>>>   150             case None => new URL(u.getProtocol,u.getAuthority,u.getPort,u.getPath)
>>>   151           }
>>>   152           res>>{ in=> modelFromInputStream(in,loc,encoding) }
>>>   153 
>>> ( https://dvcs.w3.org/hg/read-write-web/file/c6520ef80d5c/src/main/scala/GraphCache.scala#l148 )
>>> 
>>> Where modelFromInputStream uses the Jena libraries like this:
>>> 
>>>   def modelFromInputStream(
>>>       is: InputStream,
>>>       base: URL,
>>>       lang: Lang): Validation[Throwable, Model] =
>>>     try {
>>>       val m = ModelFactory.createDefaultModel()
>>>       m.getReader(lang.jenaLang).read(m, is, base.toString)
>>>       m.success
>>>     } catch {
>>>       case t =>  {
>>>         logger.info("cought exception turning stream into model ",t)
>>>         t.fail
>>>       }
>>>     }
>>> ( https://dvcs.w3.org/hg/read-write-web/file/c6520ef80d5c/src/main/scala/util/package.scala )
>>> 
>>> 
>>> Social Web Architect
>>> http://bblfish.net/
>>> 
>> 
> 
> Social Web Architect
> http://bblfish.net/
> 

-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK
Received on Sunday, 15 April 2012 20:48:51 UTC