- From: Olaf Hartig <hartig@informatik.hu-berlin.de>
- Date: Tue, 18 Jan 2011 10:30:00 +0100
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>, semantic-web@w3.org
- Cc: public-lod@w3.org
Hey Martin,

What you describe seems to be exactly one of the use cases we developed the
Provenance Vocabulary [1] for:

The Provenance Vocabulary provides the class prv:DataAccess which represents
the execution of a data access on the Web. Using the property
prvTypes:exchangedHTTPMessage you can associate instances of prv:DataAccess
with the HTTP messages that have been exchanged. These HTTP messages can then
be described using the W3C RDF vocabulary for HTTP.

Here's an example:

  foo:DataAboutProduct1 foaf:primaryTopic foo:Product1 ;
                        prv:createdBy _:dc .
  _:dc a prv:DataCreation ;
       # ... additional information about the creation process ...
       prv:usedData _:xml .
  _:xml a prv:DataItem ;
        prv:retrievedBy _:da .
  _:da a prv:DataAccess ;
       prv:accessedResource <http://www.heppnetz.de/companies.xml> ;
       prvTypes:exchangedHTTPMessage _:m .
  _:m a http:Response ;
      http:httpVersion "1.1" ;
      # ...
      http:statusCodeNumber "200" .

(Needless to say that you may use URIs instead of the blank node identifiers
that I used in the example for the sake of readability.)

Our "Guide to the Provenance Vocabulary" contains another example in Section
"3.3.2 Related Vocabularies: HTTP Vocabulary in RDF" [2].

Greetings,
Olaf

[1] http://purl.org/net/provenance/
[2] http://purl.org/net/provenance/guide#HTTP_Vocabulary_in_RDF

On Monday 17 January 2011 23:09:01 Martin Hepp wrote:
> Hi All:
> Thanks for the very useful feedback!
>
> Just to clarify what I want to do:
>
> There are many valuable commerce data resources available in XML and
> CSV on the Web. It is fairly straightforward to translate them into
> RDF, e.g. using GoodRelations. Now, whenever I create an RDF
> representation of that data in a new namespace, I may want to store
> the meta-data of the original HTTP GET request with which I fetched
> the XML or CSV file, and attach that meta-data to the resulting RDF
> graph.
>
> This allows for (1) nice analytics and (2) data cleansing entirely in
> the SPARQL / RDF world later-on.
>
> So the subject to which the meta-data will be attached will not be the
> resource URI (because there can naturally be multiple HTTP GET
> requests for the same resource), but instead the resulting graph or
> dataset.
>
> I assume the same will be useful in many other application domains and
> scenarios.
>
> Here is an example:
>
> Assume we have an XML file with company data at
> http://www.heppnetz.de/companies.xml
>
> and we fetch that, convert it to RDF, and republish the data in the
> namespace
>
> http://www.example.com/RDFizingResults/dataset1#
>
> Let's further assume that the HTTP GET request to the XML file
> returned the following HTTP header data:
>
> # Meta-data from fetching the original data
> # HTTP/1.1 200 OK
> # Date: Mon, 17 Jan 2011 21:31:58 GMT
> # Server: Apache
> # Last-Modified: Mon, 25 Oct 2010 20:31:25 GMT
> # Content-Length: 10971
> # Content-Type: application/xml
>
>
> So the following seems possible and useful:
>
> @prefix void: <http://rdfs.org/ns/void#> .
> @prefix http: <http://www.w3.org/2006/http#> .
> @prefix headers: <http://www.w3.org/2008/http-headers#> .
> @prefix status: <http://www.w3.org/2008/http-statusCodes#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix gr: <http://purl.org/goodrelations/v1#> .
> @prefix foo: <http://www.example.com/RDFizingResults/dataset1#> .
>
> # Define an entity for the resulting dataset / graph
> foo:dataset a void:Dataset .
>
> # Link the graph to the HTTP header info from the data transformation
> foo:dataset rdfs:seeAlso foo:ResponseMetaData .
>
> NOTE: My original question was which predicate to use for this
> statement. rdfs:seeAlso seems valid, but it may be suboptimal.
>
> # Expose the meta-data from fetching the original data
> foo:ResponseMetaData a http:Response ;
>     http:httpVersion "1.1" ;
>     dct:date "2011-01-17"^^xsd:date ;
>     http:statusCodeNumber "200" ;
>     http:sc status:statusCode200 ;
>     http:headers [ a http:MessageHeader ;
>         http:fieldName "Server" ;
>         http:fieldValue "Apache" ] ;
>     http:headers [ a http:MessageHeader ;
>         http:fieldName "Last-Modified" ;
>         http:fieldValue "2010-10-25T20:31:25Z"^^xsd:dateTime ] ;
>     http:headers [ a http:MessageHeader ;
>         http:fieldName "Content-Length" ;
>         http:fieldValue 10971 ] ;
>     http:headers [ a http:MessageHeader ;
>         http:fieldName "Content-Type" ;
>         http:fieldValue "application/xml" ] .
>
> # Then comes the real instance data, derived from the original source
> foo:ACME a gr:BusinessEntity ;
>     rdfs:isDefinedBy foo:dataset .
>
> foo:MillerInc a gr:BusinessEntity ;
>     rdfs:isDefinedBy foo:dataset .
> # etc.
>
> Does that sound okay and useful for everybody?
>
> Best
> Martin
>
> PS: I omitted the prefix declarations in the example above:
>
> @prefix void: <http://rdfs.org/ns/void#> .
> @prefix http: <http://www.w3.org/2006/http#> .
> @prefix headers: <http://www.w3.org/2008/http-headers#> .
> @prefix status: <http://www.w3.org/2008/http-statusCodes#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> @prefix dct: <http://purl.org/dc/terms/> .
> @prefix gr: <http://purl.org/goodrelations/v1#> .
> @prefix foo: <http://www.example.com/RDFizingResults/dataset1#> .
>
> On 17.01.2011, at 21:21, Shadi Abou-Zahra wrote:
> > Dear Martin, All,
> >
> > Just a reminder that you are looking at an old, outdated editors
> > draft of the HTTP-in-RDF Vocabulary.
> > The latest Public Working Draft is here:
> > - <http://www.w3.org/TR/HTTP-in-RDF10/>
> >
> > I seem to recall updates to the vocabulary that allow more
> > flexibility to support uses such as the one described below by Martin
> > (though I did not check back specifically for that case).
> >
> > Please note that the W3C/WAI Evaluation and Repair Tools Working
> > Group welcomes comments and feedback on HTTP-in-RDF (despite the
> > long passed deadline). Please send comments to
> > <public-earl10-comments@w3.org>.
> >
> > Best,
> >
> > Shadi
> >
> > On 17.1.2011 20:23, Nathan wrote:
> >> William Waites wrote:
> >>> * [2011-01-17 16:39:27 +0100] Martin Hepp
> >>> <martin.hepp@ebusiness-unibw.org> wrote:
> >>>
> >>> ] Does anybody know of a standard property for linking a RDF graph to a
> >>> ] http:GetRequest, http:Connection, or http:Response instance? Maybe
> >>> ] rdfs:seeAlso (@TBL: ;- ))?
> >>>
> >>> If you suppose that the name of the graph is the same as the
> >>> request URI (it will not always be, of course) you can link
> >>> in the other direction from http:Request using http:requestURI.
> >>> I am not sure that http:requestURI has a standard inverse though.
> >>
> >> And remember of course, that the headers are split into different
> >> groups which relate to different things, many relate to the message
> >> (in relation to the request), some relate to the server, some relate
> >> to the entity (an encoded version of the representation for
> >> messaging) a few (really not many) relate to the representation
> >> itself, and a couple relate to the resource itself, the resource
> >> being the thing the URI identifies.
> >>
> >> Best,
> >>
> >> Nathan
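
The http:Response description discussed in this thread could also be generated mechanically from the raw response head. What follows is only a rough sketch in Python (standard library only), not something any of the posters proposed: the function names headers_to_turtle and turtle_literal are made up for illustration, the default subject foo:ResponseMetaData is borrowed from Martin's example, every value is emitted as a plain string literal (no typed dates or integers), and the http: terms are the ones used in the thread, which Shadi notes come from an outdated draft.

```python
from email.parser import Parser

# Response head from Martin's example (status line plus header fields).
RAW_RESPONSE_HEAD = """\
HTTP/1.1 200 OK
Date: Mon, 17 Jan 2011 21:31:58 GMT
Server: Apache
Last-Modified: Mon, 25 Oct 2010 20:31:25 GMT
Content-Length: 10971
Content-Type: application/xml
"""


def turtle_literal(value):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def headers_to_turtle(raw_head, subject="foo:ResponseMetaData"):
    """Render a raw HTTP response head as a Turtle description.

    Hypothetical helper: the subject name and the choice to emit plain
    string literals for every field value are assumptions of this sketch.
    """
    # Split the status line from the header fields.
    status_line, _, header_block = raw_head.partition("\n")
    version, status_code = status_line.split()[:2]
    version = version.split("/", 1)[1]  # "HTTP/1.1" -> "1.1"

    # The stdlib email parser understands RFC 822 style header fields.
    headers = Parser().parsestr(header_block, headersonly=True)

    lines = [
        f"{subject} a http:Response ;",
        f"    http:httpVersion {turtle_literal(version)} ;",
        f"    http:statusCodeNumber {turtle_literal(status_code)}",
    ]
    for name, value in headers.items():
        lines[-1] += " ;"  # the previous statement continues
        lines.append("    http:headers [ a http:MessageHeader ;")
        lines.append(f"        http:fieldName {turtle_literal(name)} ;")
        lines.append(f"        http:fieldValue {turtle_literal(value)} ]")
    lines[-1] += " ."  # close the description
    return "\n".join(lines)


if __name__ == "__main__":
    print(headers_to_turtle(RAW_RESPONSE_HEAD))
```

The output still needs the @prefix declarations from Martin's message in front of it, and it leaves open the question of which predicate links the dataset to the description; it only covers the header-to-triples step.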
Received on Tuesday, 18 January 2011 09:32:15 UTC