- From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- Date: Sat, 16 Feb 2002 04:11:50 +0000
- To: Ronald Daniel <rdaniel@interwoven.com>
- Cc: w3c-rdfcore-wg@w3.org
Ron,

I think your particular example is not affected by the choice of
'stating' vs. 'statement'. There are no additional properties of the
rdf:Statement resource (?_S in your query) to be affected by the
corresponding entailment (cf. Q1 of [1]).

It's interesting to note a difference in information modelling style:
I have assumed that provenance would be additional properties applied
to the reification of some statement whose provenance is described.
Your approach (which as far as I can tell is perfectly valid) is that
the assertion of provenance is itself a reified statement, and the
statements whose provenance is described are indicated by
<someArticleURI>.

#g
--

[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Feb/0359.html

At 11:20 AM 2/15/02 -0800, Ronald Daniel wrote:
>As I mentioned, I'm not real clear on just what the impact of
>deciding on 'stating' vs. 'statement' would be, so here's an
>example of the sorts of things I think PRISM's users might want
>to do re. provenance.
>
>
>Problem to be solved:
>=====================
>
>PDQ Publications sends electronic copies of their content
>to an aggregator/distributor who makes them accessible along
>with lots of other articles. Each article has a metadata
>record that provides information including the embargo date
>for the article. One of the records contains:
>
><rdf:RDF ... namespace decls...>
> <rdf:Description rdf:about="someArticleURI">
>  <dc:rights rdf:parseType="Resource">
>   <prism:releaseTime>2002-02-14T18:00:00-05:00</prism:releaseTime>
>   <!-- 6:00 PM EST on Valentine's Day -->
>  </dc:rights>
>  <dc:title>Valentine's Day Ideas</dc:title>
>  <dc:publisher>PDQ Publications</dc:publisher>
> </rdf:Description>
></rdf:RDF>
>
>Someone connected with the publisher does a search on the
>distributor's site and finds the article before they think
>it should have been available. They call the distributor
>to demand an explanation. The distributor needs to be
>able to search and find out a few things:
>  1) What is the embargo time in their system?
>  2) What embargo time was in the original metadata provided
>     by the publisher?
>  3) If they differ, who changed it and when - did the publisher
>     send an updated metadata record or was the change made by
>     someone at the distributor?
>
>
>Possible solution:
>==================
>(This does not have to be the way it happens, but this shows the
>sort of stuff I think needs to be represented.)
>
>To enable such searches, the Distributor archives not only
>the articles but also the incoming metadata descriptions of
>the articles. It also maintains a simple 'provenance' database
>recording the association between things, and the time the
>records were received. The triples below are part of that
>provenance database. They show the initial reception of the
>article and metadata, and also that the metadata for the article
>was updated by the publisher the next morning.
>
>   <someArticleURI> <D:describedBy> <MD-record-URI>
>   <MD-record-URI> <prism:receptionTime> "2002-02-13T18:00:27.43"
>   <MD-record-URI> <dc:source> "PDQ Publications"
>   <someArticleURI> <prism:receptionTime> "2002-02-13T18:00:27.45"
>   <someArticleURI> <D:describedBy> <MD-record2-URI>
>   <MD-record2-URI> <prism:receptionTime> "2002-02-14T09:12:34.56"
>   <MD-record2-URI> <dc:source> "PDQ Publications"
>
>
>Checking the releaseTime provenance is a common operation, so the
>person who gets the irate phone call brings up the web form for checking
>it. They ask the caller for info like the title and publisher,
>then search to get info on the article. That web form is converted
>into a query:
>
>   ?X <dc:title> "Valentine's Day Ideas"
>   ?X <dc:publisher> "PDQ Publications"
>
>   => X = <someArticleURI>
>
>(of course, exact match searching would not be used, and the user
>would probably have to pick the article from a list of results,
>but I digress)
>
>and then the system queries the provenance database (plus the
>stored metadata records, probably no need to extract all their
>triples into the prov. DB, but that's an implementation detail):
>
>   <someArticleURI> <D:describedBy> ?M     # M is metadata record URI
>   ?M <prism:receptionTime> ?Tgot
>   ?M <dc:source> ?FromWho
>   ?M <x:contains> ?_S                     # _S is statement URI
>   ?_S <rdf:type> <rdf:Statement>
>   ?_S <rdf:subject> <someArticleURI>
>   ?_S <rdf:predicate> <prism:releaseTime>
>   ?_S <rdf:object> ?Trel
>
>   => [
>        M = <MD-record-URI>
>        Tgot = "2002-02-13T18:00:27.43"
>        FromWho = "PDQ Publications"
>        _S = <intStmtId1234567>
>        Trel = "2002-02-14T18:00:00-05:00"
>      ]
>      [
>        M = <MD-record2-URI>
>        Tgot = "2002-02-14T09:12:34.56"
>        FromWho = "PDQ Publications"
>        _S = <intStmtId987654321>
>        Trel = "2002-02-14T09:12:33.01-05:00"
>      ]
>
>
>Those results are formatted for display. They show something like
>
>   Title: Valentine's Day Ideas
>   Records:
>     Initial reception: Feb. 13, 2002, 6:00:27.43 PM EST
>     Source: PDQ Publications
>     Stated embargo time: Feb. 14, 2002, 6:00 PM EST
>
>     First update received: Feb. 14, 2002, 9:12:34.56 AM EST
>     Source: PDQ Publications
>     Stated embargo time: Feb. 14, 2002, 9:12:33.01 AM EST
>
>Thus it looks like when the publisher sent an update for the
>article's metadata the release time was replaced by the
>current time on their end. (We don't know exactly when the
>publisher sent something, we just know when we got it and
>what it contained. But since the embargo time in the second
>record is a second before the time we got it, it seems a
>reasonable assumption for a person to make.)
>
>The user handling the complaint asks for the caller's email
>address. The report above, as well as copies of the two
>received metadata records, are emailed to the caller.
>
>
>Requirements:
>=============
>
>1) It must be possible for the receiver to give an identity to a
>   received RDF serialization and store it for later queries.
>2) It must be possible for the receiver to search stored RDF
>   records for statements with particular subjects, objects, and/or
>   predicates.
>   a) It must be possible to give at least a temporary identity
>      to each 'statement' in a received RDF record.
>      (By 'statement' I do not mean to imply anything about Brian's
>      'stating' vs. 'statement' terminology.)
>   b) It must be possible to indicate which serialized RDF document
>      contains a statement and vice versa.
>   c) It should be possible for creators of metadata records to
>      assign IDs to statements.
>
>Thanks,
>
>Ron Daniel Jr.
>Standards Architect
>Interwoven, Inc.
>Tel: 408-530-5922
>Cell: 925-368-8371
>Email: rdaniel@interwoven.com
>
>Visit www.interwoven.com
>The Leader in Enterprise Content Management

------------------------------------------------------------
Graham Klyne
MIMEsweeper Group
Strategic Research
<http://www.mimesweeper.com>
<Graham.Klyne@MIMEsweeper.com>
Received on Friday, 15 February 2002 23:35:57 UTC
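
Editorial sketch: the provenance query in the quoted message can be
tried out with a few lines of Python. This is only a minimal
illustration, assuming an in-memory list of triples and naive pattern
matching rather than any real RDF store or query language. The helper
names (reify, match, query) are hypothetical, and the <x:contains> and
reification triples are reconstructed from the query results shown in
the message, since the message does not list them explicitly.

```python
ARTICLE = "<someArticleURI>"

def reify(record, stmt_id, subj, pred, obj):
    """Return triples for one reified statement contained in a record."""
    return [
        (record, "<x:contains>", stmt_id),
        (stmt_id, "<rdf:type>", "<rdf:Statement>"),
        (stmt_id, "<rdf:subject>", subj),
        (stmt_id, "<rdf:predicate>", pred),
        (stmt_id, "<rdf:object>", obj),
    ]

# Provenance triples from the message (URIs in <>, literals as plain strings).
TRIPLES = [
    (ARTICLE, "<D:describedBy>", "<MD-record-URI>"),
    ("<MD-record-URI>", "<prism:receptionTime>", "2002-02-13T18:00:27.43"),
    ("<MD-record-URI>", "<dc:source>", "PDQ Publications"),
    (ARTICLE, "<prism:receptionTime>", "2002-02-13T18:00:27.45"),
    (ARTICLE, "<D:describedBy>", "<MD-record2-URI>"),
    ("<MD-record2-URI>", "<prism:receptionTime>", "2002-02-14T09:12:34.56"),
    ("<MD-record2-URI>", "<dc:source>", "PDQ Publications"),
]
# Assumption: one reified releaseTime statement per stored metadata record.
TRIPLES += reify("<MD-record-URI>", "<intStmtId1234567>",
                 ARTICLE, "<prism:releaseTime>", "2002-02-14T18:00:00-05:00")
TRIPLES += reify("<MD-record2-URI>", "<intStmtId987654321>",
                 ARTICLE, "<prism:releaseTime>", "2002-02-14T09:12:33.01-05:00")

# The query from the message; terms starting with "?" are variables.
PATTERNS = [
    (ARTICLE, "<D:describedBy>", "?M"),
    ("?M", "<prism:receptionTime>", "?Tgot"),
    ("?M", "<dc:source>", "?FromWho"),
    ("?M", "<x:contains>", "?_S"),
    ("?_S", "<rdf:type>", "<rdf:Statement>"),
    ("?_S", "<rdf:subject>", ARTICLE),
    ("?_S", "<rdf:predicate>", "<prism:releaseTime>"),
    ("?_S", "<rdf:object>", "?Trel"),
]

def match(pattern, triple, binding):
    """Unify one pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or reject if it is already bound differently.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def query(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                nb = match(pattern, t, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return bindings

results = query(PATTERNS, TRIPLES)
for r in results:
    print(r["?M"], r["?Tgot"], r["?FromWho"], r["?_S"], r["?Trel"])
```

Each result row corresponds to one stored metadata record, so comparing
the ?Trel values across rows shows how the stated embargo time changed
between the initial record and the update.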