Re: Property for linking from a graph to HTTP connection meta-data? from Olaf Hartig on 2011-01-18 (public-lod@w3.org from January 2011)

From: Olaf Hartig <hartig@informatik.hu-berlin.de>
Date: Tue, 18 Jan 2011 10:30:00 +0100
To: Martin Hepp <martin.hepp@ebusiness-unibw.org>, semantic-web@w3.org
Cc: public-lod@w3.org
Message-Id: <201101181030.02746.hartig@informatik.hu-berlin.de>
Hey Martin,

What you describe seems to be exactly one of the use cases we developed the 
Provenance Vocabulary [1] for. The Provenance Vocabulary provides the class 
prv:DataAccess, which represents the execution of a data access on the Web. 
Using the property prvTypes:exchangedHTTPMessage you can associate instances 
of prv:DataAccess with the HTTP messages that have been exchanged. These 
HTTP messages can then be described using the W3C RDF vocabulary for HTTP. 
Here's an example:

   foo:DataAboutProduct1
             foaf:primaryTopic foo:Product1 ;
             prv:createdBy _:dc .

   _:dc
             a prv:DataCreation ;
             # ... additional information about the creation process ...
             prv:usedData _:xml .

   _:xml
             a prv:DataItem ;
             prv:retrievedBy _:da .

   _:da
             a prv:DataAccess ;
             prv:accessedResource <http://www.heppnetz.de/companies.xml> ;
             prvTypes:exchangedHTTPMessage _:m .

   _:m
             a http:Response ;
             http:httpVersion "1.1" ;
             # ...
             http:statusCodeNumber "200" .

(Needless to say, you may use URIs instead of the blank node identifiers 
that I used in the example for the sake of readability.)
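
As a rough sketch of that URI-based variant, the same chain could be attached 
directly to the foo:dataset resource from your example. The prv: and prvTypes: 
namespace URIs below are what I would expect from [1] and should be checked 
there; the http: and foo: prefixes are taken from your message; and 
foo:creation, foo:sourceXML, and foo:access are made-up names used only for 
illustration:

```turtle
@prefix prv:      <http://purl.org/net/provenance/ns#> .     # assumed, see [1]
@prefix prvTypes: <http://purl.org/net/provenance/types#> .  # assumed, see [1]
@prefix http:     <http://www.w3.org/2006/http#> .           # as in Martin's message
@prefix void:     <http://rdfs.org/ns/void#> .
@prefix foo:      <http://www.example.com/RDFizingResults/dataset1#> .

# The resulting dataset, linked to the process that created it.
foo:dataset
          a void:Dataset ;
          prv:createdBy foo:creation .

# Hypothetical URI for the conversion step (XML to RDF).
foo:creation
          a prv:DataCreation ;
          prv:usedData foo:sourceXML .

# Hypothetical URI for the retrieved copy of the XML file.
foo:sourceXML
          a prv:DataItem ;
          prv:retrievedBy foo:access .

# Hypothetical URI for this particular HTTP GET; a second fetch of the
# same file would get its own prv:DataAccess resource.
foo:access
          a prv:DataAccess ;
          prv:accessedResource <http://www.heppnetz.de/companies.xml> ;
          prvTypes:exchangedHTTPMessage foo:ResponseMetaData .

# The response description, named as in Martin's example.
foo:ResponseMetaData
          a http:Response ;
          http:httpVersion "1.1" ;
          http:statusCodeNumber "200" .
```

With this shape the meta-data hangs off the dataset rather than the resource 
URI, so repeated GET requests for the same file stay distinguishable.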

Our "Guide to the Provenance Vocabulary" contains another example in Section
"3.3.2 Related Vocabularies: HTTP Vocabulary in RDF" [2].

Greetings,
Olaf


[1] http://purl.org/net/provenance/
[2] http://purl.org/net/provenance/guide#HTTP_Vocabulary_in_RDF



On Monday 17 January 2011 23:09:01 Martin Hepp wrote:
> Hi All:
> Thanks for the very useful feedback!
> 
> Just to clarify what I want to do:
> 
> There are many valuable commerce data resources available in XML and
> CSV on the Web. It is fairly straightforward to translate them into
> RDF, e.g. using GoodRelations. Now, whenever I create an RDF
> representation of that data in a new namespace, I may want to store
> the meta-data of the original HTTP GET request with which I fetched
> the XML or CSV file, and attach that meta-data to the resulting RDF
> graph.
> 
> This allows for (1) nice analytics and (2) data cleansing entirely in
> the SPARQL / RDF world later-on.
> 
> So the subject to which the meta-data will be attached will not be the
> resource URI (because there can naturally be multiple HTTP GET
> requests for the same resource), but instead the resulting graph or
> dataset.
> 
> I assume the same will be useful in many other application domains and
> scenarios.
> 
> Here is an example:
> 
> Assume we have an XML file with company data at
>     http://www.heppnetz.de/companies.xml
> 
> and we fetch that, convert it to RDF, and republish the data in the
> namespace
> 
>     http://www.example.com/RDFizingResults/dataset1#
> 
> Let's further assume that the HTTP GET request to the XML file
> returned the following HTTP header data:
> 
> # Meta-data from fetching the original data
> # HTTP/1.1 200 OK
> # Date: Mon, 17 Jan 2011 21:31:58 GMT
> # Server: Apache
> # Last-Modified: Mon, 25 Oct 2010 20:31:25 GMT
> # Content-Length: 10971
> # Content-Type: application/xml
> 
> 
> So the following seems possible and useful:
> 
> @prefix void: <http://rdfs.org/ns/void#> .
> @prefix http: <http://www.w3.org/2006/http#> .
> @prefix headers: <http://www.w3.org/2008/http-headers#> .
> @prefix status: <http://www.w3.org/2008/http-statusCodes#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix gr: <http://purl.org/goodrelations/v1#> .
> @prefix foo: <http://www.example.com/RDFizingResults/dataset1#> .
> 
> # Define an entity for the resulting dataset / graph
> foo:dataset a void:Dataset .
> 
> # Link the graph to the HTTP header info from the data transformation
> foo:dataset rdfs:seeAlso foo:ResponseMetaData .
> 
> NOTE: My original question was which predicate to use for this
> statement. rdfs:seeAlso seems valid, but it may be suboptimal.
> 
> # Expose the meta-data from fetching the original data
> foo:ResponseMetaData a http:Response ;
> 	http:httpVersion "1.1" ;
> 	dct:date "2011-01-17"^^xsd:date ;
> 	http:statusCodeNumber "200" ;
>   	http:sc status:statusCode200  ;
> 	http:headers [ a http:MessageHeader ;
>   				   http:fieldName "Server" ;
> 				   http:fieldValue "Apache" ] ;
> 	http:headers [ a http:MessageHeader ;
>   				   http:fieldName "Last-Modified" ;
> 				   http:fieldValue "2010-10-25T20:31:25Z"^^xsd:dateTime ] ;
> 	http:headers [ a http:MessageHeader ;
>   				   http:fieldName "Content-Length" ;
> 				   http:fieldValue 10971 ] ;
> 	http:headers [ a http:MessageHeader ;
>   				   http:fieldName "Content-Type" ;
> 				   http:fieldValue "application/xml" ] .
> 
> # Then comes the real instance data, derived from the original source
> foo:ACME a gr:BusinessEntity ;
> 	rdfs:isDefinedBy foo:dataset .
> 
> foo:MillerInc a gr:BusinessEntity ;
> 	rdfs:isDefinedBy foo:dataset .
> #  etc.
> 
> Does that sound okay and useful for everybody?
> 
> Best
> Martin
> 
> PS: I omitted the prefix declarations in the example above:
> 
> @prefix void: <http://rdfs.org/ns/void#> .
> @prefix http: <http://www.w3.org/2006/http#> .
> @prefix headers: <http://www.w3.org/2008/http-headers#> .
> @prefix status: <http://www.w3.org/2008/http-statusCodes#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix gr: <http://purl.org/goodrelations/v1#> .
> @prefix foo: <http://www.example.com/RDFizingResults/dataset1#> .
> 
> On 17.01.2011, at 21:21, Shadi Abou-Zahra wrote:
> > Dear Martin, All,
> > 
> > Just a reminder that you are looking at an old, outdated editors
> > draft of the HTTP-in-RDF Vocabulary. The latest Public Working Draft
> > is here:
> > - <http://www.w3.org/TR/HTTP-in-RDF10/>
> > 
> > I seem to recall updates to the vocabulary that allow more
> > flexibility to support uses as the one described below by Martin
> > (though I did not check back specifically for that case).
> > 
> > Please note that the W3C/WAI Evaluation and Repair Tools Working
> > Group welcomes comments and feedback on HTTP-in-RDF (despite the
> > long-passed deadline). Please send comments to
> > <public-earl10-comments@w3.org>.
> > 
> > Best,
> > 
> >  Shadi
> > 
> > On 17.1.2011 20:23, Nathan wrote:
> >> William Waites wrote:
> >>> * [2011-01-17 16:39:27 +0100] Martin Hepp
> >>> <martin.hepp@ebusiness-unibw.org> wrote:
> >>> 
> >>> ] Does anybody know of a standard property for linking a RDF graph to a
> >>> ] http:GetRequest, http:Connection, or http:Response instance? Maybe
> >>> ] rdfs:seeAlso (@TBL: ;- ))?
> >>> 
> >>> If you suppose that the name of the graph is the same as the
> >>> request URI (it will not always be, of course) you can link
> >>> in the other direction from http:Request using http:requestURI.
> >>> I am not sure that http:requestURI has a standard inverse though.
> >> 
> >> And remember, of course, that the headers are split into different
> >> groups which relate to different things: many relate to the message
> >> (in relation to the request), some relate to the server, some relate
> >> to the entity (an encoded version of the representation for
> >> messaging), a few (really not many) relate to the representation
> >> itself, and a couple relate to the resource itself, the resource
> >> being the thing the URI identifies.
> >> 
> >> Best,
> >> 
> >> Nathan
Received on Tuesday, 18 January 2011 09:30:43 UTC