- From: Rafael Richards <rafaelrichards@jhu.edu>
- Date: Wed, 27 Mar 2013 17:31:11 +0000
- To: Oliver Ruebenacker <curoli@gmail.com>
- CC: David Booth <david@dbooth.org>, "<public-semweb-lifesci@w3.org>" <public-semweb-lifesci@w3.org>
- Message-ID: <29C2E86EB9143E4397BDFA62B6F6E748120A10F9@BAYEXCH-CL-4.win.ad.jhu.edu>
This has been a very prolific thread, but did we discuss provenance?

A slideshare on owl:sameAs - Harmful to Provenance is here:
http://www.slideshare.net/jpmccusker/owlsameas-considered-harmful-to-provenance

Presentation Abstract:

GOTO was once a standard operation in most computer programming languages. Edsger Dijkstra argued in 1968 that GOTO is a low level operation that is not appropriate for higher-level programming languages, and advocated structured programming in its place. Arguably, owl:sameAs in its current usage may be poised to go through a similar discussion and transformation period. In biomedical research, the provenance of information gathered is nearly as important as, and sometimes even more important than, the information itself. owl:sameAs allows someone to state that two separate descriptions really refer to the same entity. Currently that means that operational systems merge the descriptions and at the same time, merge the provenance information, thus losing the ability to retrieve where each individual description came from. This merging of provenance can be problematic or even catastrophic in biomedical applications that demand access to provenance information. Based on our knowledge of integration issues of data in biomedicine, we give examples as use cases of this issue in biospecimen management and experimental metadata representations. We suggest that systems using any construct like owl:sameAs must provide an option to preserve the provenance of the entities and ground assertions related to those entities in question.

Rafael

Rafael M. Richards, M.D., M.S.
Assistant Professor, Anesthesiology & Critical Care Medicine
Faculty, Division of Health Science Informatics
Johns Hopkins School of Medicine
Baltimore, MD 2224-2760
rafaelrichards [at] jhu edu

On Mar 27, 2013, at 11:02 AM, Oliver Ruebenacker <curoli@gmail.com> wrote:

Hello David,

So if I understand your view correctly, then it could be expressed in a language close to yours as:

"Some people believe that if a URI occurs twice within a graph or statement, it refers to the same thing. But this is a myth! RDF never guarantees that two occurrences of the same URI mean the same thing."

Take care
Oliver

On Wed, Mar 27, 2013 at 9:37 AM, David Booth <david@dbooth.org> wrote:

Hi Oliver,

On 03/25/2013 04:02 PM, Oliver Ruebenacker wrote:

> Hello David,
>
> We agree that there are different interpretations. But you haven't shown that the boundaries between interpretations are graph boundaries (others, including me, think that each interpretation is global).

I don't know what you mean by "boundaries between interpretations". An interpretation may be applied to any graph or statement to determine its truth value (or to a URI to determine the resource to which it is bound in that interpretation).

The notion of a graph boundary is purely a matter of convenience and utility. A graph can consist of *any* set of RDF triples. If you wanted, you could apply an interpretation to a graph consisting of three randomly selected triples from each RDF document on the web, but it probably wouldn't be very useful to do so, because you probably would not care about the truth value of that graph. We generally only apply an interpretation to a graph whose truth value we care about. An interpretation corresponds to the *use* of a graph.

Suppose I have a graph that "ambiguously" uses the same URI to denote both a toucan and its web page, without asserting that toucans cannot be web pages:

  @prefix : <http://example/> .
  :tweety a :Toucan .
  :tweety a :WebPage .
When a conforming RDF application takes that RDF graph as input, assumes it is true, and produces some output such as "Tweety is a toucan", in effect the application has chosen a particular interpretation to apply to that graph. In effect, the choice of interpretation causes the app to produce that particular output. For example, the app might categorize animals into species, choosing an interpretation that maps :tweety to a kind of bird.

But a different conforming RDF application that only cares about web page authorship might take that *same* RDF graph as input and choose a different interpretation that maps :tweety to a web page, instead outputting "Tweety is a web page". In effect, the app has chosen an interpretation that is appropriate for its purpose.

If the graph had also asserted :Toucan owl:disjointWith :WebPage, then the graph could not be true under OWL semantics, and the graph (as is) would be unusable to both apps.

> That makes me wonder whether you consider it in conformance with the specs to choose different boundaries? For example, would you consider it conforming to apply a different interpretation to each statement? Or how about a different interpretation for each node of a statement? Do you see anything in the specs against doing so?

Sure, it is in conformance with the spec. An interpretation can be applied to any graph or any RDF statement. And certainly you could determine the truth value of N different statements according to N different interpretations. But would it be useful to do so? Probably not. Furthermore, if two statements are true under two different interpretations, that would not tell you whether a graph consisting of those two statements would be true under a single interpretation.

OTOH, it *is* useful to apply different interpretations to different graphs, and one reason is that you may be using those graphs for different applications, each app in effect applying its own interpretation.
But the fact that those graphs may be true under different interpretations does *not* tell you whether the merge of those graphs will be true under a single interpretation.

The RDF Semantics spec only tells you how to compute the truth value of one <interpretation, graph> pair at a time, but you can certainly apply it to as many <interpretation, graph> pairs as you want -- in full conformance with the intent of the spec. This is the same as if I define a function f of two arguments such that f(x,y) = x+y: that function definition only tells you how to compute f(x,y) for one pair of numbers at a time, but you can certainly apply it to as many pairs as you want, without in any way violating the intent of f's definition.

David

--
IT Project Lead at PanGenX (http://www.pangenx.com)
The purpose is always improvement
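
Editorial sketch (not part of the original messages): the toucan example quoted above can be made concrete with a toy model. This is a minimal sketch, assuming an interpretation is reduced to a map from URIs to resources plus a map from class URIs to sets of resources, which is far simpler than the model theory in the RDF Semantics spec. The names `Interpretation`, `is_true`, `satisfiable_if_disjoint`, and the resources "bird" and "page" are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from typing import NamedTuple

RDF_TYPE = "rdf:type"  # the predicate that Turtle's "a" abbreviates


class Interpretation(NamedTuple):
    """Toy stand-in for an RDF interpretation (hypothetical, simplified)."""
    denotes: dict    # URI -> resource in the domain of discourse
    extension: dict  # class URI -> set of resources belonging to that class


def is_true(graph, interp):
    """True if every rdf:type triple in the graph holds under interp.

    Triples with any other predicate are ignored in this toy model.
    """
    return all(
        interp.denotes[subject] in interp.extension[cls]
        for subject, predicate, cls in graph
        if predicate == RDF_TYPE
    )


IS_TOUCAN = (":tweety", RDF_TYPE, ":Toucan")
IS_PAGE = (":tweety", RDF_TYPE, ":WebPage")
GRAPH = {IS_TOUCAN, IS_PAGE}

# No disjointness asserted: each app picks its own referent for :tweety,
# and the whole graph is true under either choice.
species_app = Interpretation(
    {":tweety": "bird"}, {":Toucan": {"bird"}, ":WebPage": {"bird"}})
authorship_app = Interpretation(
    {":tweety": "page"}, {":Toucan": {"page"}, ":WebPage": {"page"}})

# Interpretations that keep the two classes disjoint: each makes one
# statement true, but neither makes the two-statement graph true.
strict_bird = Interpretation(
    {":tweety": "bird"}, {":Toucan": {"bird"}, ":WebPage": {"page"}})
strict_page = Interpretation(
    {":tweety": "page"}, {":Toucan": {"bird"}, ":WebPage": {"page"}})


def subsets(items):
    """All subsets of a small collection, as a list of sets."""
    items = sorted(items)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfiable_if_disjoint(graph, domain):
    """Search the toy domain for an interpretation that makes the graph
    true while keeping the :Toucan and :WebPage extensions disjoint."""
    for referent in domain:
        for toucans, pages in product(subsets(domain), repeat=2):
            if toucans & pages:
                continue  # violates the disjointness constraint
            candidate = Interpretation(
                {":tweety": referent},
                {":Toucan": toucans, ":WebPage": pages})
            if is_true(graph, candidate):
                return True
    return False


if __name__ == "__main__":
    print(is_true(GRAPH, species_app), is_true(GRAPH, authorship_app))
    print(is_true({IS_TOUCAN}, strict_bird), is_true({IS_PAGE}, strict_page))
    print(satisfiable_if_disjoint(GRAPH, {"bird", "page"}))
```

The brute-force search only covers the two-element toy domain, so it illustrates the point rather than proving it: once the classes are required to be disjoint, no single choice of referent for :tweety can sit in both extensions, even though each statement is true under some interpretation on its own.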
Received on Wednesday, 27 March 2013 17:32:04 UTC