- From: Jim McCusker <james.mccusker@yale.edu>
- Date: Wed, 27 Mar 2013 15:42:34 -0400
- To: Bob Futrelle <bob.futrelle@gmail.com>
- Cc: Rafael Richards <rafaelrichards@jhu.edu>, Oliver Ruebenacker <curoli@gmail.com>, David Booth <david@dbooth.org>, "<public-semweb-lifesci@w3.org>" <public-semweb-lifesci@w3.org>
- Message-ID: <CAAtgn=Q9uBA2Kva+vbjgC6J+H_a-rE_Uz3JrVwtGqpV8QN_26w@mail.gmail.com>
Which is why PROV exists. Now we have a floor to work from. I've already
integrated it into a number of projects.

Jim

On Wed, Mar 27, 2013 at 3:39 PM, Bob Futrelle <bob.futrelle@gmail.com> wrote:

> Provenance techniques/tools/systems are nowhere near what they could be.
> Each provenance system or "standard" ends up being unique so the
> information is not interoperable.
>
> One example among the many: http://openprovenance.org/
>
> These days, I'm more focused on NLP than serious knowledge systems.
> But I find that logging and versioning can allow me to generate provenance
> graphs if I really need them. Often a shift in design is enough to blur
> earlier designs that did have some good ideas that shouldn't be lost.
>
> - Bob Futrelle
> BioNLP.org
>
> On Wed, Mar 27, 2013 at 1:31 PM, Rafael Richards
> <rafaelrichards@jhu.edu> wrote:
>
>> This has been a very prolific thread, but did we discuss provenance?
>>
>> A slideshare on owl:sameAs - Harmful to Provenance is here:
>>
>> http://www.slideshare.net/jpmccusker/owlsameas-considered-harmful-to-provenance
>>
>> Presentation Abstract:
>> GOTO was once a standard operation in most computer programming
>> languages. Edsger Dijkstra argued in 1968 that GOTO is a low level
>> operation that is not appropriate for higher-level programming languages,
>> and advocated structured programming in its place. Arguably, owl:sameAs in
>> its current usage may be poised to go through a similar discussion and
>> transformation period. In biomedical research, the provenance of
>> information gathered is nearly as important as, and sometimes even more
>> important than, the information itself. owl:sameAs allows someone to state
>> that two separate descriptions really refer to the same entity. Currently
>> that means that operational systems merge the descriptions and at the same
>> time, merge the provenance information, thus losing the ability to retrieve
>> where each individual description came from.
>> This merging of provenance can be problematic or even catastrophic in
>> biomedical applications that demand access to provenance information.
>> Based on our knowledge of integration issues of data in biomedicine, we
>> give examples as use cases of this issue in biospecimen management and
>> experimental metadata representations. We suggest that systems using any
>> construct like owl:sameAs must provide an option to preserve the
>> provenance of the entities and ground assertions related to those
>> entities in question.
>>
>> Rafael
>>
>> Rafael M. Richards, M.D., M.S.
>> Assistant Professor, Anesthesiology & Critical Care Medicine
>> Faculty, Division of Health Science Informatics
>> Johns Hopkins School of Medicine
>> Baltimore, MD 2224-2760
>> rafaelrichards [at] jhu edu
>>
>> On Mar 27, 2013, at 11:02 AM, Oliver Ruebenacker <curoli@gmail.com>
>> wrote:
>>
>> Hello David,
>>
>> So if I understand your view correctly, then it could be expressed
>> in a language close to yours as:
>>
>> "Some people believe that if a URI occurs twice within a graph or
>> statement, it refers to the same thing. But this is a myth! RDF never
>> guarantees that two occurrences of the same URI mean the same thing."
>>
>> Take care
>> Oliver
>>
>> On Wed, Mar 27, 2013 at 9:37 AM, David Booth <david@dbooth.org> wrote:
>>
>> Hi Oliver,
>>
>> On 03/25/2013 04:02 PM, Oliver Ruebenacker wrote:
>>
>> Hello David,
>>
>> We agree that there are different interpretations. But you haven't
>> shown that the boundaries between interpretations are graph
>> boundaries (others, including me, think that each interpretation is
>> global).
>>
>> I don't know what you mean by "boundaries between interpretations".
>> An interpretation may be applied to any graph or statement to determine
>> its truth value (or to a URI to determine the resource to which it is
>> bound in that interpretation).
>>
>> The notion of a graph boundary is purely a matter of convenience and
>> utility. A graph can consist of *any* set of RDF triples. If you wanted,
>> you could apply an interpretation to a graph consisting of three randomly
>> selected triples from each RDF document on the web, but it probably
>> wouldn't be very useful to do so, because you probably would not care
>> about the truth value of that graph. We generally only apply an
>> interpretation to a graph whose truth value we care about.
>>
>> An interpretation corresponds to the *use* of a graph. Suppose I have a
>> graph that "ambiguously" uses the same URI to denote both a toucan and its
>> web page, without asserting that toucans cannot be web pages:
>>
>> @prefix : <http://example/> .
>> :tweety a :Toucan .
>> :tweety a :WebPage .
>>
>> When a conforming RDF application takes that RDF graph as input, assumes
>> it is true, and produces some output such as "Tweety is a toucan", in
>> effect the application has chosen a particular interpretation to apply to
>> that graph. In effect, the choice of interpretation causes the app to
>> produce that particular output. For example, the app might categorize
>> animals into species, choosing an interpretation that maps :tweety to a
>> kind of bird. But a different conforming RDF application that only cares
>> about web page authorship might take that *same* RDF graph as input and
>> choose a different interpretation that maps :tweety to a web page, instead
>> outputting "Tweety is a web page". In effect, the app has chosen an
>> interpretation that is appropriate for its purpose.
>>
>> If the graph had also asserted :Toucan owl:disjointWith :WebPage, then the
>> graph cannot be true under OWL semantics, and the graph (as is) would be
>> unusable to both apps.
>>
>> That makes me wonder whether you consider it in conformance with the
>> specs to choose different boundaries?
>>
>> For example, would you consider it conforming to apply a different
>> interpretation to each statement? Or how about a different
>> interpretation for each node of a statement? Do you see anything in
>> the specs against doing so?
>>
>> Sure it is in conformance with the spec. An interpretation can be applied
>> to any graph or any RDF statement. And certainly you could determine the
>> truth value of N different statements according to N different
>> interpretations. But would it be useful to do so? Probably not.
>> Furthermore, if two statements are true under two different
>> interpretations, that would not tell you whether a graph consisting of
>> those two statements would be true under a single interpretation.
>>
>> OTOH, it *is* useful to apply different interpretations to different
>> graphs, and one reason is that you may be using those graphs for different
>> applications, each app in effect applying its own interpretation. But the
>> fact that those graphs may be true under different interpretations does
>> *not* tell you whether the merge of those graphs will be true under a
>> single interpretation.
>>
>> The RDF Semantics spec only tells you how to compute the truth value of
>> one <interpretation, graph> pair at a time, but you can certainly apply it
>> to as many <interpretation, graph> pairs as you want -- in full
>> conformance with the intent of the spec. This is the same as if I define a
>> function f of two arguments, such that f(x,y) = x+y: that function
>> definition only tells you how to compute f(x,y) for one pair of numbers at
>> a time, but you can certainly apply it to as many pairs as you want,
>> without in any way violating the intent of f's definition.
>>
>> David
>>
>> --
>> IT Project Lead at PanGenX (http://www.pangenx.com)
>> The purpose is always improvement
>>
>

--
Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
james.mccusker@yale.edu | (203) 785-4436
http://krauthammerlab.med.yale.edu

PhD Student
Tetherless World Constellation
Rensselaer Polytechnic Institute
mccusj@cs.rpi.edu
http://tw.rpi.edu
Received on Wednesday, 27 March 2013 19:43:24 UTC
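
David Booth's point in the quoted thread, that truth is computed for one <interpretation, graph> pair at a time and that results for separate graphs say nothing about their merge, can be sketched in a few lines of Python. This is a toy model only, not the model theory of the RDF Semantics spec: an interpretation here is just a URI-to-resource mapping plus a set of members per class, and the names `Interpretation`, `is_true`, `bird_view`, `page_view`, and `both_view` are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """Toy stand-in for an interpretation (an assumption, not the spec's definition)."""
    denotes: dict    # URI -> the resource it is bound to
    extension: dict  # class URI -> set of resources that belong to the class


def is_true(interp, graph):
    """Truth value of ONE <interpretation, graph> pair.

    `graph` is a set of (subject, class) pairs, each read as "subject a class".
    """
    return all(interp.denotes[s] in interp.extension[c] for s, c in graph)


# Two worlds where toucans and web pages happen to be different things.
bird_view = Interpretation({":tweety": "bird"},
                           {":Toucan": {"bird"}, ":WebPage": {"page"}})
page_view = Interpretation({":tweety": "page"},
                           {":Toucan": {"bird"}, ":WebPage": {"page"}})
# A world where one resource is both; nothing in the graphs rules this out.
both_view = Interpretation({":tweety": "thing"},
                           {":Toucan": {"thing"}, ":WebPage": {"thing"}})

g1 = {(":tweety", ":Toucan")}
g2 = {(":tweety", ":WebPage")}
merged = g1 | g2

# Like f(x, y) = x + y, is_true can be applied to as many pairs as we like.
results = {
    "g1 under bird_view": is_true(bird_view, g1),
    "g2 under page_view": is_true(page_view, g2),
    "merged under bird_view": is_true(bird_view, merged),
    "merged under page_view": is_true(page_view, merged),
    "merged under both_view": is_true(both_view, merged),
}
for label, value in results.items():
    print(f"{label}: {value}")
```

In this sketch each single-statement graph is true under its own interpretation, the merge is false under both of those, and it is true under a third. The truth of the merge has to be checked against a single interpretation directly; it cannot be read off from the separate results.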