- From: Hugh Glaser <hg@ecs.soton.ac.uk>
- Date: Tue, 31 Jul 2007 20:25:42 +0100
- To: Chris Bizer <chris@bizer.de>, Pat Hayes <phayes@ihmc.us>
- CC: Tim Berners-Lee <timbl@w3.org>, <semantic-web@w3.org>, Linking Open Data <linking-open-data@simile.mit.edu>
Pat, Chris,

I think we share a view that there are some issues here, at least with ontology design, that might benefit from wider awareness, perhaps even in the Linked Data Tutorial.

On 31/7/07 09:33, "Chris Bizer" <chris@bizer.de> wrote:

> Hi Hugh,
>
>>> If you put all this in one triplestore, with the owl:sameAs assertions, then
>>> it will not be possible to distinguish where facts came from, or rather
>>> which facts are associated with which others.
>>
>> Whoa, careful. It will probably be >>possible<< to distinguish this,
>> in fact. It might be that unwanted consequences are entailed by the
>> combination of the various RDF graphs and the sameAs, but a careful
>> querying process should be able to determine which of the various triples
>> are present and even whether they are linked. One simple way is to query
>> under sub-OWL entailment, for example, which can be little more than a
>> direct syntactic matching process (see SPARQL).
>
> Some practical backup for Pat's argumentation. Within applications like the
> DISCO Semantic Web browser or the Semantic Web Client Library, we use the
> Named Graphs data model to represent RDF data that has been retrieved from
> the Web. This allows us to clearly keep track where information came from
> and which facts are associated with each other.

Yes, it is possible to distinguish. This begs the question: if I need to use Named Graphs for the simplest query about Tim's three roles, effectively bypassing the sameAs inference, was sameAs the right thing to use?

>
> Besides this, I think Semantic Web clients have to take two other things
> into account before they start reasoning over retrieved data:
> Trustworthiness and vocabulary mappings. Think about what you are doing in
> the offline world when you read some political newspapers: First you will
> try to align the different terminology used by the authors in your head to
> get a consistent model.
> Afterwards you will decide which articles to trust
> and which to consider untrustworthy. Only after these two steps, you will
> start to reason about the consequences of what you have read.
>
> I think it would be a good idea for Semantic Web clients to do the same.
> Therefore, I think it is a bit naive to throw lots of RDF data from the Web
> straight into a single RDF model and then wonder that reasoning over this
> data leads to unintended consequences.

Trust is a big issue (and especially motivates Named Graphs), but I don't think it illuminates this case. I am not describing a situation where I am throwing lots of RDF into a triplestore. The situation is that I want to do some querying, say about people at W3C. I find Tim's URI, and retrieve the RDF, and his associated sameAs URIs -> RDF, and put it all into a triplestore cache, so that I can conveniently do some work on it. Since it all starts from Tim's page, I don't see that there is much of a trust issue here either. This is a straightforward bit of SW business.

>
> I also think that it would not be harmful if OWL tutorials and best practice
> guides would state this fact more clearly so that they do not raise wrong
> expectations.

That would be good. So what is the recommended best practice? Either on the querying side, to use the Named Graphs model all the time; or on the representation side, as I said in my original message (which seemed to get lost off the end of Pat's reply):

> This means that the ontologies have to be much more carefully constructed
> than they appear to be at present, taking cognisance of the consequences of
> others making such sameAs statements, in our open world.

Hugh

>
> In the light of the current Semantic Web layer cake discussion, I have been
> wondering for years why the trust layer is up that far in the layer cake.
> It is obvious that you will only get junk if you try to reason over data from
> the web before applying some heuristics to determine trustworthiness and
> filter out low quality information. Therefore, I think the trust layer
> should be positioned lower in the cake. Maybe below Unifying Logic? If this
> is the point where things change from representation to reasoning.
>
> Cheers
>
> Chris
>
>
> --
> Chris Bizer
> Freie Universität Berlin
> +49 30 838 54057
> chris@bizer.de
> www.bizer.de
> ----- Original Message -----
> From: "Pat Hayes" <phayes@ihmc.us>
> To: "Hugh Glaser" <hg@ecs.soton.ac.uk>
> Cc: "Tim Berners-Lee" <timbl@w3.org>; "Chris Bizer" <chris@bizer.de>;
> <www-tag@w3.org>; <semantic-web@w3.org>; "Linking Open Data"
> <linking-open-data@simile.mit.edu>
> Sent: Monday, July 30, 2007 9:49 PM
> Subject: Re: Terminology Question concerning Web Architecture and Linked Data
>
>
>>
>>> I am trying hard to keep up (I suspect like many), and was hoping someone
>>> would address a concern I have; forgive me if I missed it somewhere in the
>>> discussion.
>>> I have hung this off this message from Tim, which seems the most relevant.
>>> And congratulations on the Linked Data Tutorial - a really useful
>>> document.
>>>
>>> So here we go:
>>>
>>> On 25/7/07 14:35, "Tim Berners-Lee" <timbl@w3.org> wrote:
>>>
>>>>
>>>> (Going back to the original question, as it is much simpler than much
>>>> which follows!)
>>>>
>>>> On 2007-07-07, at 08:43, Chris Bizer wrote:
>>>>
>>>>
>>>>> Question 3: Depending on the answer to question 1, is it correct to
>>>>> use owl:sameAs [6] to state that
>>>>> http://www.w3.org/People/Berners-Lee/card#i and
>>>>> http://dbpedia.org/resource/Tim_Berners-Lee refer to
>>>>> the same thing as it is done in Tim's profile.
>>>>
>>>> Yes.
>>>>
>>> So Tim is absolutely right.
>>> This is an entirely logical thing to say.
>>> These two NIRs (Non-Information Resources) should be considered the same.
>>
>> (Aside) I wish folk would not say 'two' when there is only one. Two NIRs
>> should never be considered the same: rather, two names may refer to the
>> same, single, NIR.

Thanks. Sorry.

>>
>>> But it is important to consider how this statement will be used, and worry
>>> whether there may be unexpected consequences.
>>> As we now know, the URIs should be resolvable, and so interesting Semantic
>>> Web applications will use the URI to get the Description (or whatever we
>>> call it), probably going via a 303.
>>> So my SW app will get the RDF of them both, and add it to my triplestore,
>>> along with all the other linked data.
>>>
>>> Tim, as often, is a good example.
>>> Consider the places Tim works (W3C, MIT, Southampton, I guess).
>>> It is likely that each will publish RDF about him, hopefully using an
>>> agreed ontology (one day!).
>>> Now comes the rub.
>>> If you put all this in one triplestore, with the owl:sameAs assertions, then
>>> it will not be possible to distinguish where facts came from, or rather
>>> which facts are associated with which others.
>>
>> Whoa, careful. It will probably be >>possible<< to distinguish this,
>> in fact. It might be that unwanted consequences are entailed by the
>> combination of the various RDF graphs and the sameAs, but a careful
>> querying process should be able to determine which of the various triples
>> are present and even whether they are linked. One simple way is to query
>> under sub-OWL entailment, for example, which can be little more than a
>> direct syntactic matching process (see SPARQL).
>>
>>> Perhaps 3 job titles, 3 telephone numbers and 3 institution addresses will
>>> be returned from the appropriate SPARQL queries, and there will be no
>>> (legal) way of working out which corresponds to which.
>>
>> That would be a symptom of poor RDF/OWL usage, though. Assertions in RDF
>> are not supposed to be local-context-sensitive in the way you seem to be
>> assuming.
>> So for example it would be a mistake to simply assert, in the
>> w3c page, that Tim's status WAS Director. It ought to say that a
>> relationship holds between him and the entity he is the Director of, i.e.
>> the W3C; so that this stays true even when it is moved somewhere else on
>> the Web. In fact, I suggest that a basic, fundamental principle of any
>> 'web logic' is that assertions in it should have the same meaning wherever
>> they occur on the Web (see for example
>> http://www.ihmc.us:16080/users/phayes/IKL/GUIDE/GUIDE.html#LogicForInt)
>>
>>> So I can infer that the person http://www.w3.org/People/Berners-Lee/card#i
>>> is a Professor at MIT, or a Senior Research Scientist at W3C, or Director
>>> at Southampton, none of which we consider true.
>>> (Of course, this was the intention of the sameAs assertion.)
>>>
>>> I suggest that this is a bad state of affairs
>>
>> It would be, yes, but it should not arise if the RDF is written properly.
>>
>>> , and applies to any NIR, not
>>> just people.
>>
>> It applies to any R, I or NI. It's really nothing to do with the nature of
>> the thing named.
>>
>> Pat Hayes
>> --
>> ---------------------------------------------------------------------
>> IHMC                 (850)434 8903 or (650)494 3973 home
>> 40 South Alcaniz St. (850)202 4416 office
>> Pensacola            (850)202 4440 fax
>> FL 32502             (850)291 0667 cell
>> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
>>
>>
>
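
The disagreement in this thread can be sketched in a few lines of plain Python (no RDF library). This is only an illustration under stated assumptions: the MIT and Southampton URIs, the `ex:` property names, the `g:` graph names, and the phone values are hypothetical, and the job titles are placeholder values, not claims about who holds which post. It shows that merging subjects under sameAs while discarding the source yields three unpaired titles and three unpaired phone numbers, whereas keeping a graph name per statement (the Named Graphs approach Chris describes) keeps each title with its phone number.

```python
from collections import defaultdict

TIM_W3C = "http://www.w3.org/People/Berners-Lee/card#i"
TIM_MIT = "http://example.org/mit/people#timbl"      # hypothetical URI
TIM_SOTON = "http://example.org/soton/people#timbl"  # hypothetical URI

# Quads: (subject, predicate, object, source graph). All values are illustrative.
QUADS = [
    (TIM_W3C, "ex:jobTitle", "Title A", "g:w3c"),
    (TIM_W3C, "ex:phone", "tel:w3c", "g:w3c"),
    (TIM_MIT, "ex:jobTitle", "Title B", "g:mit"),
    (TIM_MIT, "ex:phone", "tel:mit", "g:mit"),
    (TIM_SOTON, "ex:jobTitle", "Title C", "g:soton"),
    (TIM_SOTON, "ex:phone", "tel:soton", "g:soton"),
]
SAME_AS = [(TIM_W3C, TIM_MIT), (TIM_W3C, TIM_SOTON)]


def make_find(same_as):
    """Return a function mapping each URI to one representative of its sameAs class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in same_as:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
    return find


def merged_view(quads, find):
    """Single-store view: subjects merged under sameAs, graph names dropped."""
    view = defaultdict(set)
    for s, p, o, _graph in quads:
        view[(find(s), p)].add(o)
    return view


def per_graph_view(quads, find, who):
    """Named-graph view: statements about `who`, still grouped by source graph."""
    target = find(who)
    view = defaultdict(dict)
    for s, p, o, graph in quads:
        if find(s) == target:
            view[graph][p] = o
    return view


find = make_find(SAME_AS)
merged = merged_view(QUADS, find)
by_graph = per_graph_view(QUADS, find, TIM_MIT)

print(sorted(merged[(TIM_W3C, "ex:jobTitle")]))  # three titles, pairing with phones lost
print(by_graph["g:mit"])                          # title and phone from one source together
```

Pat's alternative would change the data rather than the query: each title would be stated as a relationship to the institution itself, so the pairing would survive the merge without any graph names.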
Received on Tuesday, 31 July 2007 19:27:34 UTC