
Re: Provenance usage scenario - releaseTime

From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
Date: Sat, 16 Feb 2002 04:11:50 +0000
To: Ronald Daniel <rdaniel@interwoven.com>
Cc: w3c-rdfcore-wg@w3.org

I think your particular example is not affected by the choice of 'stating' 
vs. 'statement'.  There are no additional properties of the rdf:Statement 
resource (?_S in your query) to be affected by the corresponding entailment 
(cf. Q1 of [1]).

It's interesting to note a difference in information modelling style:  I 
have assumed that provenance would be expressed as additional properties 
applied to the reification of the statement whose provenance is described.  
Your approach (which as far as I can tell is perfectly valid) is that the 
assertion of provenance is itself a reified statement, and the statements 
whose provenance is described are indicated by <someArticleURI>.


[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Feb/0359.html

At 11:20 AM 2/15/02 -0800, Ronald Daniel wrote:
>As I mentioned, I'm not real clear on just what the impact of
>deciding on 'stating' vs. 'statement' would be, so here's an
>example of the sorts of things I think PRISM's users might want
>to do re. provenance.
>Problem to be solved:
>PDQ Publications sends electronic copies of their content
>to an aggregator/distributor who makes them accessible along
>with lots of other articles. Each article has a metadata
>record that provides information including the embargo date
>for the article. One of the records contains:
><rdf:RDF  ... namespace decls...>
>  <rdf:Description rdf:about="someArticleURI">
>   <dc:rights rdf:parseType="Resource">
>    <prism:releaseTime>2002-02-14T18:00:00-05:00</prism:releaseTime>
>    <!-- 6:00 PM EST on Valentine's Day -->
>   </dc:rights>
>   <dc:title>Valentine's Day Ideas</dc:title>
>   <dc:publisher>PDQ Publications</dc:publisher>
>  </rdf:Description>
></rdf:RDF>
>Someone connected with the publisher does a search on the
>distributor's site and finds the article before they think
>it should have been available. They call the distributor
>to demand an explanation. The distributor needs to be
>able to search and find out a few things:
>   1) What is the embargo time in their system?
>   2) What embargo time was in the original metadata provided
>      by the publisher?
>   3) If they differ, who changed it and when - did the publisher
>      send an updated metadata record or was the change made by
>      someone at the distributor?
>Possible solution:
>(This does not have to be the way it happens, but this shows the
>sort of stuff I think needs to be represented.)
>To enable such searches, the Distributor archives not only
>the articles but also the incoming metadata descriptions of
>the articles. It also maintains a simple 'provenance' database
>recording the association between things, and the time the
>records were received. The triples below are part of that
>provenance database. They show the initial reception of the
>article and metadata, and also that the metadata for the article
>was updated by the publisher the next morning.
>   <someArticleURI> <D:describedBy> <MD-record-URI>
>   <MD-record-URI>  <prism:receptionTime> "2002-02-13T18:00:27.43"
>   <MD-record-URI>  <dc:source>  "PDQ Publications"
>   <someArticleURI> <prism:receptionTime> "2002-02-13T18:00:27.45"
>   <someArticleURI> <D:describedBy> <MD-record2-URI>
>   <MD-record2-URI>  <prism:receptionTime> "2002-02-14T09:12:34.56"
>   <MD-record2-URI>  <dc:source>  "PDQ Publications"
>Checking the releaseTime provenance is a common operation, so the
>person who gets the irate phone call brings up the web form for checking
>it. They ask the caller for info like the title and publisher,
>then search to get info on the article. That web form is converted
>into a query:
>      ?X  <dc:title> "Valentine's Day Ideas"
>      ?X  <dc:publisher> "PDQ Publications"
>       => X = <someArticleURI>
>(of course, exact match searching would not be used, and the user
>would probably have to pick the article from a list of results,
>but I digress)
>and then the system queries the provenance database (plus the
>stored metadata records, probably no need to extract all their
>triples into the prov. DB, but that's an implementation detail):
>       <someArticleURI> <D:describedBy> ?M     # M is metadata record URI
>       ?M   <prism:receptionTime>  ?Tgot
>       ?M   <dc:source>  ?FromWho
>       ?M   <x:contains> ?_S                   # _S is statement URI
>       ?_S  <rdf:type>  <rdf:Statement>
>       ?_S  <rdf:subject> <someArticleURI>
>       ?_S  <rdf:predicate> <prism:releaseTime>
>       ?_S  <rdf:object> ?Trel
>       => [
>           M = <MD-record-URI>
>           Tgot = "2002-02-13T18:00:27.43"
>           FromWho = "PDQ Publications"
>           _S = <intStmtId1234567>
>           Trel = "2002-02-14T18:00:00-05:00"
>          ]
>          [
>           M = <MD-record2-URI>
>           Tgot = "2002-02-14T09:12:34.56"
>           FromWho = "PDQ Publications"
>           _S = <intStmtId987654321>
>           Trel = "2002-02-14T09:12:33.01-05:00"
>          ]
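
A minimal sketch of how the query above could be evaluated, assuming all
triples sit in one in-memory list. The provenance triples and the result
values are taken from the message; the x:contains and rdf:Statement triples
are reconstructed from the query and its results, and the match() and
query() helpers are hypothetical names, not part of any RDF toolkit.

```python
# Provenance triples from the message, plus reified statements reconstructed
# from the query results. Keeping them all in one list is an assumption.
TRIPLES = [
    ("someArticleURI", "D:describedBy", "MD-record-URI"),
    ("MD-record-URI", "prism:receptionTime", "2002-02-13T18:00:27.43"),
    ("MD-record-URI", "dc:source", "PDQ Publications"),
    ("someArticleURI", "prism:receptionTime", "2002-02-13T18:00:27.45"),
    ("someArticleURI", "D:describedBy", "MD-record2-URI"),
    ("MD-record2-URI", "prism:receptionTime", "2002-02-14T09:12:34.56"),
    ("MD-record2-URI", "dc:source", "PDQ Publications"),
    ("MD-record-URI", "x:contains", "intStmtId1234567"),
    ("intStmtId1234567", "rdf:type", "rdf:Statement"),
    ("intStmtId1234567", "rdf:subject", "someArticleURI"),
    ("intStmtId1234567", "rdf:predicate", "prism:releaseTime"),
    ("intStmtId1234567", "rdf:object", "2002-02-14T18:00:00-05:00"),
    ("MD-record2-URI", "x:contains", "intStmtId987654321"),
    ("intStmtId987654321", "rdf:type", "rdf:Statement"),
    ("intStmtId987654321", "rdf:subject", "someArticleURI"),
    ("intStmtId987654321", "rdf:predicate", "prism:releaseTime"),
    ("intStmtId987654321", "rdf:object", "2002-02-14T09:12:33.01-05:00"),
]

# The query from the message; terms starting with "?" are variables.
QUERY = [
    ("someArticleURI", "D:describedBy", "?M"),
    ("?M", "prism:receptionTime", "?Tgot"),
    ("?M", "dc:source", "?FromWho"),
    ("?M", "x:contains", "?_S"),
    ("?_S", "rdf:type", "rdf:Statement"),
    ("?_S", "rdf:subject", "someArticleURI"),
    ("?_S", "rdf:predicate", "prism:releaseTime"),
    ("?_S", "rdf:object", "?Trel"),
]


def match(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, else None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns by nested-loop matching."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


results = query(QUERY, TRIPLES)
for row in results:
    print(row)
```

The nested-loop evaluation is chosen only for brevity; a real store would
index triples rather than scan the whole list for every pattern.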
>Those results are formatted for display. They show something like
>   Title:  Valentine's Day Ideas
>   Records:
>     Initial reception:  Feb. 13, 2002, 6:00:27.43 PM EST
>     Source: PDQ Publications
>     Stated embargo time:  Feb. 14, 2002, 6:00 PM EST
>     First update received: Feb. 14, 2002, 9:12:34.56  AM EST
>     Source: PDQ Publications
>     Stated embargo time: Feb. 14, 2002, 9:12:33.01  AM EST
>Thus it looks like, when the publisher sent an update for the
>article's metadata, the release time was replaced by the
>current time on their end. (We don't know exactly when the
>publisher sent the update; we just know when we got it and
>what it contained. But since the embargo time in the second
>record is a second or so before the time we got it, that seems
>a reasonable inference for a person to make.)
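
The inference above can be sketched as a timestamp comparison. This is an
illustration only: it assumes the reception times, which carry no zone in
the provenance triples, are EST (as the formatted report suggests), and the
5-second window and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: reception times in the provenance database are EST (UTC-5).
EST = timezone(timedelta(hours=-5))


def parse_received(text):
    """Parse a zone-less reception time and tag it as EST."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=EST)


def parse_release(text):
    """Parse a release time that carries its own UTC offset."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in text else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(text, fmt)


def lag(received, release):
    """How long after the stated release time the record was received."""
    return parse_received(received) - parse_release(release)


def looks_like_send_time(received, release, window=timedelta(seconds=5)):
    """True if the release time falls just before reception (hypothetical test)."""
    return timedelta(0) <= lag(received, release) <= window


first = ("2002-02-13T18:00:27.43", "2002-02-14T18:00:00-05:00")
second = ("2002-02-14T09:12:34.56", "2002-02-14T09:12:33.01-05:00")

print(lag(*second))                    # gap for the updated record
print(looks_like_send_time(*first))    # original record
print(looks_like_send_time(*second))   # updated record
```

Only the second record has a release time that falls inside the window just
before its reception time, which matches the conclusion drawn in the text.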
>The user handling the complaint asks for the caller's email
>address. The report above, as well as copies of the two
>received metadata records, are emailed to the caller.
>Requirements:
>1) It must be possible for the receiver to give an identity to a
>    received RDF serialization and store it for later queries.
>2) It must be possible for the receiver to search stored RDF
>    records for statements with particular subjects, objects, and/or
>    predicates.
>    a) It must be possible to give at least a temporary identity
>       to each 'statement' in a received RDF record.
>       (By 'statement' I do not mean to imply anything about Brian's
>       'stating' vs. 'statement' terminology.)
>    b) It must be possible to indicate which serialized RDF document
>       contains a statement and vice versa.
>    c) It should be possible for creators of metadata records to
>       assign IDs to statements.
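
One possible shape for a store meeting points 1, 2, 2a and 2b is sketched
below. The RecordStore class, its method names, and the "stmt-N" identifiers
are all hypothetical, and the release-time triples are simplified to use the
article as subject, as the query in the message does.

```python
from itertools import count


class RecordStore:
    """Hypothetical store: names received records and the statements in them."""

    def __init__(self):
        self._ids = count(1)
        self.statements = {}  # statement ID -> (subject, predicate, object)
        self.record_of = {}   # statement ID -> ID of the containing record
        self.contents = {}    # record ID -> list of statement IDs

    def store(self, record_id, triples):
        """Keep a received record and give each statement a temporary ID."""
        ids = []
        for triple in triples:
            stmt_id = f"stmt-{next(self._ids)}"
            self.statements[stmt_id] = tuple(triple)
            self.record_of[stmt_id] = record_id
            ids.append(stmt_id)
        self.contents.setdefault(record_id, []).extend(ids)
        return ids

    def find(self, subject=None, predicate=None, obj=None):
        """Return IDs of statements matching the given parts (None = any)."""
        wanted = (subject, predicate, obj)
        return [
            stmt_id
            for stmt_id, triple in self.statements.items()
            if all(w is None or w == v for w, v in zip(wanted, triple))
        ]


store = RecordStore()
store.store("MD-record-URI", [
    ("someArticleURI", "prism:releaseTime", "2002-02-14T18:00:00-05:00"),
    ("someArticleURI", "dc:title", "Valentine's Day Ideas"),
])
store.store("MD-record2-URI", [
    ("someArticleURI", "prism:releaseTime", "2002-02-14T09:12:33.01-05:00"),
])

hits = store.find(subject="someArticleURI", predicate="prism:releaseTime")
for stmt_id in hits:
    print(stmt_id, store.record_of[stmt_id], store.statements[stmt_id][2])
```

Point (c), IDs assigned by the creator of the metadata, is not covered here;
the sketch only shows receiver-assigned identities.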
>Ron Daniel Jr.
>Standards Architect
>Interwoven, Inc.
>Tel: 408-530-5922
>Cell: 925-368-8371
>Email: rdaniel@interwoven.com
>Visit www.interwoven.com
>The Leader in Enterprise Content Management

Graham Klyne                    MIMEsweeper Group
Strategic Research              <http://www.mimesweeper.com>
Received on Friday, 15 February 2002 23:35:57 UTC
