[Fwd: Re: Property for linking from a graph to HTTP connection meta-data?]

A real-world use case from Martin Hepp, forwarded from public-lod.

-------- Original Message --------
Subject: Re: Property for linking from a graph to HTTP connection meta-data?

Hi All:
Thanks for the very useful feedback!

Just to clarify what I want to do:

There are many valuable commerce data resources available in XML and
CSV on the Web. It is fairly straightforward to translate them into
RDF, e.g. using GoodRelations. Now, whenever I create an RDF
representation of that data in a new namespace, I may want to store
the meta-data of the original HTTP GET request with which I fetched
the XML or CSV file, and attach that meta-data to the resulting RDF
graph.

This allows for (1) nice analytics and (2) data cleansing entirely in
the SPARQL / RDF world later on.

So the subject to which the meta-data will be attached will not be the
resource URI (because there can naturally be multiple HTTP GET
requests for the same resource), but instead the resulting graph or
dataset.

I assume the same will be useful in many other application domains and
scenarios.

Here is an example:

Assume we have an XML file with company data at
    http://www.heppnetz.de/companies.xml

and we fetch that, convert it to RDF, and republish the data in the
namespace

    http://www.example.com/RDFizingResults/dataset1#

Let's further assume that the HTTP GET request to the XML file
returned the following HTTP header data:

# Meta-data from fetching the original data
# HTTP/1.1 200 OK
# Date: Mon, 17 Jan 2011 21:31:58 GMT
# Server: Apache
# Last-Modified: Mon, 25 Oct 2010 20:31:25 GMT
# Content-Length: 10971
# Content-Type: application/xml
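
A minimal sketch of how such a header block could be turned into
structured values before any RDF is written, assuming Python and only
its standard library. The names parse_response_head, http_date_to_xsd
and RAW_HEAD are hypothetical and not part of any vocabulary or tool
mentioned in this thread. HTTP dates are normalized to the lexical
form of xsd:dateTime.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# The response head from the example above (hypothetical constant name).
RAW_HEAD = """\
HTTP/1.1 200 OK
Date: Mon, 17 Jan 2011 21:31:58 GMT
Server: Apache
Last-Modified: Mon, 25 Oct 2010 20:31:25 GMT
Content-Length: 10971
Content-Type: application/xml
"""


def parse_response_head(raw):
    """Split a raw response head into (version, status code, header pairs)."""
    lines = [line for line in raw.splitlines() if line.strip()]
    # The status line looks like "HTTP/1.1 200 OK"; the reason phrase is optional.
    parts = lines[0].split(" ", 2)
    version = parts[0].split("/", 1)[1]
    status_code = int(parts[1])
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return version, status_code, headers


def http_date_to_xsd(value):
    """Convert an HTTP date to an xsd:dateTime lexical form in UTC."""
    dt = parsedate_to_datetime(value)
    if dt.tzinfo is None:
        # Assumption: a date without zone information is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    version, status_code, headers = parse_response_head(RAW_HEAD)
    print(version, status_code)
    for name, value in headers:
        if name in ("Date", "Last-Modified"):
            value = http_date_to_xsd(value)
        print(name, "=", value)
```

Keeping the headers as an ordered list of pairs rather than a dict
preserves repeated header fields, which HTTP allows.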


So the following seems possible and useful:

@prefix void: <http://rdfs.org/ns/void#> .
@prefix http: <http://www.w3.org/2006/http#> .
@prefix headers: <http://www.w3.org/2008/http-headers#> .
@prefix status: <http://www.w3.org/2008/http-statusCodes#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix gr: <http://purl.org/goodrelations/v1#> .
@prefix foo: <http://www.example.com/RDFizingResults/dataset1#> .

# Define an entity for the resulting dataset / graph
foo:dataset a void:Dataset .

# Link the graph to the HTTP header info from the data transformation
foo:dataset rdfs:seeAlso foo:ResponseMetaData .

# NOTE: My original question was which predicate to use for this
# statement. rdfs:seeAlso seems valid, but it may be suboptimal.
	
# Expose the meta-data from fetching the original data
foo:ResponseMetaData a http:Response ;
	http:httpVersion "1.1" ;
	dct:date "2011-01-17"^^xsd:date ;
	http:statusCodeNumber "200" ;
	http:sc status:statusCode200 ;
	http:headers [ a http:MessageHeader ;
		http:fieldName "Server" ;
		http:fieldValue "Apache" ] ;
	http:headers [ a http:MessageHeader ;
		http:fieldName "Last-Modified" ;
		http:fieldValue "2010-10-25T20:31:25Z"^^xsd:dateTime ] ;
	http:headers [ a http:MessageHeader ;
		http:fieldName "Content-Length" ;
		http:fieldValue 10971 ] ;
	http:headers [ a http:MessageHeader ;
		http:fieldName "Content-Type" ;
		http:fieldValue "application/xml" ] .

# Then comes the real instance data, derived from the original source
foo:ACME a gr:BusinessEntity ;
	rdfs:isDefinedBy foo:dataset .

foo:MillerInc a gr:BusinessEntity ;
	rdfs:isDefinedBy foo:dataset .
#  etc.	
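
The http:Response description above could also be generated rather
than written by hand. The following is only a sketch under stated
assumptions: Python with the standard library, hypothetical function
names (turtle_literal, response_to_turtle), and the property names
exactly as used in this message (http:httpVersion,
http:statusCodeNumber, http:sc, http:headers, http:fieldName,
http:fieldValue), which may differ in later drafts of the HTTP-in-RDF
vocabulary. All header values are emitted as plain string literals;
the prefix declarations are assumed to be written elsewhere.

```python
def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping the characters that need it."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def response_to_turtle(subject, version, status_code, headers):
    """Serialize response meta-data as Turtle, one blank node per header field."""
    lines = [
        f"{subject} a http:Response ;",
        f"\thttp:httpVersion {turtle_literal(version)} ;",
        f"\thttp:statusCodeNumber {turtle_literal(str(status_code))} ;",
        f"\thttp:sc status:statusCode{status_code}",
    ]
    for name, value in headers:
        # Close the previous predicate-object pair before adding the next one.
        lines[-1] += " ;"
        lines.append(
            "\thttp:headers [ a http:MessageHeader ; "
            f"http:fieldName {turtle_literal(name)} ; "
            f"http:fieldValue {turtle_literal(value)} ]"
        )
    # Terminate the whole statement.
    lines[-1] += " ."
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        response_to_turtle(
            "foo:ResponseMetaData",
            "1.1",
            200,
            [("Server", "Apache"), ("Content-Type", "application/xml")],
        )
    )
```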

Does that sound okay and useful for everybody?

Best
Martin



On 17.01.2011, at 21:21, Shadi Abou-Zahra wrote:

> Dear Martin, All,
>
> Just a reminder that you are looking at an old, outdated editor's
> draft of the HTTP-in-RDF Vocabulary. The latest Public Working Draft
> is here:
> - <http://www.w3.org/TR/HTTP-in-RDF10/>
>
> I seem to recall updates to the vocabulary that allow more
> flexibility to support uses such as the one described below by Martin
> (though I did not check back specifically for that case).
>
> Please note that the W3C/WAI Evaluation and Repair Tools Working
> Group welcomes comments and feedback on HTTP-in-RDF (despite the
> long-passed deadline). Please send comments to
> <public-earl10-comments@w3.org>.
>
> Best,
>  Shadi
>
>
> On 17.1.2011 20:23, Nathan wrote:
>> William Waites wrote:
>>> * [2011-01-17 16:39:27 +0100] Martin Hepp
>>> <martin.hepp@ebusiness-unibw.org> écrit:
>>>
>>> ] Does anybody know of a standard property for linking an RDF graph to a
>>> ] http:GetRequest, http:Connection, or http:Response instance? Maybe
>>> ] rdfs:seeAlso (@TBL: ;- ))?
>>>
>>> If you suppose that the name of the graph is the same as the
>>> request URI (it will not always be, of course) you can link
>>> in the other direction from http:Request using http:requestURI.
>>> I am not sure that http:requestURI has a standard inverse though.
>>
>> And remember, of course, that the headers are split into different
>> groups which relate to different things: many relate to the message
>> (in relation to the request), some relate to the server, some relate
>> to the entity (an encoded version of the representation for
>> messaging), a few (really not many) relate to the representation
>> itself, and a couple relate to the resource itself, the resource
>> being the thing the URI identifies.
>>
>> Best,
>>
>> Nathan
>>
>>
>
> -- 
> Shadi Abou-Zahra - http://www.w3.org/People/shadi/ |
>  WAI International Program Office Activity Lead   |
> W3C Evaluation & Repair Tools Working Group Chair |
>

Received on Monday, 17 January 2011 22:36:48 UTC