- From: Libby Miller <Libby.Miller@bristol.ac.uk>
- Date: Sun, 3 Mar 2002 12:35:00 +0000 (GMT)
- To: Leo Obrst <lobrst@mitre.org>
- cc: W3C Web Ontology WG <www-webont-wg@w3.org>
Hi Leo,

I don't have a good *definition* to hand, but I can tell you how some people have been using provenance in our RDF implementations.

In the RDF world I think the term is used to represent some indication of the source of information found on the web. So if you had an RDF database that collected information from many different sources, it is often useful to be able to keep track of the source so that you can delete all information from that source, or update it as necessary.

Currently many databases which store RDF have some form of provenance storage *outside* of the RDF model. So the fact that information came from a particular source is not represented as RDF triples but as an extra field in the database. I think you could represent this kind of information within the reification model, but preserving information about the source of triples outside the RDF model is useful for database management, and it avoids getting into issues of whether such statements are asserted or not, or whether a join on such data is a join on the reified versions of the data or on the asserted versions. (As an aside, it would be useful to treat reified data as asserted sometimes, while also retaining the information that it is reified...)

So an identifier for the source could have all sorts of information associated with it, including information about when it was found or updated, and RDF properties indicating whether the information is signed, for example.

I'm not totally happy with moving in and out of the RDF model like this, especially since many RDF APIs and databases don't yet support provenance information, but being able to trace where each triple came from, and when, and associate other information with the triple is necessary, I think.

It's worth asking this question on the RDF interest group list: I'm sure there are many different views and implementations.
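To make the "extra field in the database" idea a bit more concrete, here is a rough sketch in Python of a store that keeps a source identifier beside each triple, outside the RDF model itself. Everything in it (the `ProvenanceStore` class, its method names, the example URLs and property names) is made up for illustration and is not taken from any particular RDF API or database.

```python
from collections import namedtuple
from datetime import datetime, timezone

# A triple plus one extra field naming where it came from.
# The source is NOT itself an RDF statement; it is bookkeeping kept beside the triple.
Quad = namedtuple("Quad", "subject predicate obj source")


class ProvenanceStore:
    """Hypothetical in-memory store: triples tagged with their source."""

    def __init__(self):
        self._quads = set()
        self._source_info = {}  # source identifier -> metadata about that source

    def load(self, source, triples, fetched=None):
        """Record triples from a source, replacing anything held from it before."""
        self.delete_source(source)
        for subject, predicate, obj in triples:
            self._quads.add(Quad(subject, predicate, obj, source))
        # Information hung off the source identifier, e.g. when it was fetched.
        self._source_info[source] = {
            "fetched": fetched or datetime.now(timezone.utc)
        }

    def delete_source(self, source):
        """Drop every triple that came from this source; return how many went."""
        before = len(self._quads)
        self._quads = {q for q in self._quads if q.source != source}
        self._source_info.pop(source, None)
        return before - len(self._quads)

    def sources_of(self, subject, predicate, obj):
        """Trace a triple back to every source that stated it."""
        return {
            q.source
            for q in self._quads
            if (q.subject, q.predicate, q.obj) == (subject, predicate, obj)
        }

    def triples(self):
        """The plain RDF view: the triples with the provenance field stripped."""
        return {(q.subject, q.predicate, q.obj) for q in self._quads}


store = ProvenanceStore()
store.load("http://example.org/a.rdf", [("ex:libby", "foaf:name", "Libby")])
store.load(
    "http://example.org/b.rdf",
    [("ex:libby", "foaf:name", "Libby"), ("ex:leo", "foaf:name", "Leo")],
)
```

Because the same triple can be recorded once per source, deleting one source leaves the triple in place if another source also stated it. That is the database-management convenience described above, and it sidesteps the question of whether the provenance statements themselves are asserted.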
Libby

On Sat, 2 Mar 2002, Leo Obrst wrote:

> Excuse my ignorance: can someone give me a definition of "Provenance"?
> I've seen it in the RDF discussions and recently in WOW-G/OWL.
>
> The definition (as far as I can determine) is something along the lines
> of (from http://www.mtholyoke.edu/offices/library/arch/def.htm):
>
> "Provenance:
>
> The place of origin of an object or document(s). In archival terms, this
> refers to the administrative office of origin of a given record, group
> of records, or files. In the case of manuscript collections, provenance
> refers to the person, family, firm or other source from which the
> materials were obtained. Provenance can also refer to information about
> the successive transfers of ownership and custody of a particular, book,
> object, or document."
>
> This smacks to me of the legal and museum worlds (not that there's
> anything wrong with that).
>
> Do they really mean by this what I know as "product metadata" or "data
> lineage" from the database world? Or do they mean ontological/semantic
> properties associated with an instance/individual that change over time?
> Big difference.
>
> Is this the same as "claims", per, e.g., SHOE? It does always seem to be
> related to indirect discourse or modal issues, as far as I can tell:
> "John believed <blah>." So, reification?
>
> I am sorry for my confusion; perhaps this is common parlance in the
> Semantic Web?
>
> And, by the way, what is the model semantics of this notion of
> "provenance"? Metadata tag or time-dependent property?
>
> Leo
>
> --
> _____________________________________________
> Dr. Leo Obrst The MITRE Corporation
> mailto:lobrst@mitre.org Intelligent Information Management/Exploitation
> Voice: 703-883-6770 7515 Colshire Drive, M/S W640
> Fax: 703-883-1379 McLean, VA 22102-7508, USA
Received on Sunday, 3 March 2002 07:38:37 UTC