- From: Michael Schneider <schneid@fzi.de>
- Date: Wed, 5 Sep 2007 11:47:08 +0200
- To: "Bijan Parsia" <bparsia@cs.man.ac.uk>, "Richard Cyganiak" <richard@cyganiak.de>
- Cc: "K-fe bom" <u9x3n_15so@hotmail.com>, <semantic-web@w3.org>
[sorry, this has again become a very long mail]

Hi, Richard and Bijan!

>-----Original Message-----
>From: semantic-web-request@w3.org
>[mailto:semantic-web-request@w3.org] On Behalf Of Bijan Parsia
>Sent: Tuesday, September 04, 2007 6:51 PM
>To: Richard Cyganiak
>Cc: Michael Schneider; K-fe bom; semantic-web@w3.org
>Subject: Re: statements about a graph (Named Graphs, reification)
>
>
>On 4 Sep 2007, at 17:30, Richard Cyganiak wrote:
>
>> Michael,
>>
>> On 4 Sep 2007, at 15:29, Michael Schneider wrote:
>>> Ok, then let's discuss more practical issues (leaving this subtle RDF
>>> semantics stuff to the academic world). Until now, we had the only
>>> usecase
>>> that someone wanted to annotate a complete RDF document,
>
>Sorry to be jumping in, but do you mean "in this thread"?

Yes. I tried to be at least a little on-topic. ;-)

>Because other use cases are prevalent.
>
>>> which already exist
>>> somewhere having an URI. This is certainly the easiest case to
>>> handle in
>>> practice.
>>
>> Yes. I think it's also by far the most common case.
>
>I think almost certainly not. Consider EARL:
> http://www.w3.org/TR/EARL10-Schema/
>
>Or annotation axioms in OWL 1.1.
>
>Or Swoop Change Sets (which do chunk out, so they are a little
>different).
>
>>> But there will probably often be the more demanding situation,
>>> where I want to make assertions about some ad hoc set of RDF
>>> triples, which
>>> is not yet published as a special RDF document anywhere.
>>
>> To be honest, I'm not sure that this case occurs *that* much in
>> practice.
>
>Quite often (or will). I want to record when an axiom in my owl
>ontology has been last modified. Do I have extract that axiom and
>publish it in a separate document?

For quite a while now I have been pondering a specific scenario that I have not yet seen discussed elsewhere, and I would like to know what you think about it.
I will try to present this scenario in the form of a little story, because this will make things easier to understand.

Assume there is Alice, who owns a homepage that is enriched with some additional RDF. One of the statements within her homepage is

  me:alice foaf:knows he:bob .

by which Alice tries to tell the world that she knows some other person, Bob.

Now there is Charly, who is an old friend of both Alice and Bob. He knows that Alice has known Bob since 1998. Charly also owns an RDF'ed homepage, and so he would like to make this knowledge explicit by stating something like

  "Alice knows Bob" dc:date 1998 .

Charly does not have access to Alice's homepage, so he cannot simply put this statement into Alice's triple store, or even turn Alice's foaf:knows triple into some n-tuple. But even if he could, he would not want to do this: it is actually he who asserts this statement, so this information should really go into his own triple store. What he wants to ensure in any case is that this statement is "visible" on the Semantic Web. This means that if anyone (or any Semantic Web crawler) should stumble over this statement, he/it should, with pretty high confidence, be able to understand that this is really a statement which annotates Alice's foaf:knows statement, rather than just being some arbitrary RDF triple.

Last, there is Dave. Dave has recently found Alice's homepage with her "foaf:knows" statement in it. Dave does not know Alice personally, but he is very interested in social relationships between arbitrary people. Moreover, he is interested in what others have to say about such social relationships. :) So he wonders whether there are any additional statements about Alice's foaf:knows statement anywhere on the Semantic Web.
Dave has already installed a copy of the Semantic Web Client Library [1], so he has at least a good chance of having access to some larger portions of the Semantic Web (let's suppose for a moment that we are already a few years in the future, where there is already satisfactory linking between existing data). Now, what SPARQL query should he execute? He wants to find as many assertions about Alice's foaf:knows statement as possible, but he also wants to avoid too many false positives, of course.

So, this example demonstrates the scenario. On the one hand, there are parties (the Alices) who create information on the SemWeb, encoded in triple form. There are other parties (the Charlies) who want to create annotations for these triples in separate stores. These parties are interested in having their stored annotations encoded in a searchable way. And there are yet other parties (the Daves) who would like to search for such triple annotations.

Now, the above example is a little oversimplified, I admit. But it is not hard for me to imagine professional mashup services ("Charly 2.0" :)), which crawl the whole Semantic Web for triple data of a specific kind (e.g. social relationships), and then enrich the data they find with additional annotations. This will provide quite new views on the original data. For these mashup services it will be of utmost importance that their triple annotations are effectively searchable. And then, there will also be general Semantic Web search services (the professional Daves). The value of these search services for their users will increase greatly if the services also take the triple annotations of the diverse mashup services into account.

So, there are two questions here, which turn out to be closely related:

* How should triple annotations be encoded on the public Semantic Web, so that they can easily be detected and identified as really being triple annotations?
* What should queries for triple annotations look like on the Semantic Web?

First, it is clear that if Charly uses some special custom method to encode his triple annotations, there is no realistic prospect that his data will be found. "Custom reification" methods can be completely reasonable for use within specific applications, or for closed user groups. But for a searcher like Dave, who wants to broadly query the whole Semantic Web for data created by possibly lots of different, unknown, and unrelated parties, this is certainly not an option. And even if Dave really is going to include specialized encoding schemes in his query, these will only be the published schemes of very important parties. So there is no hope then for Charly (and many other normal users or "small players" in the Semantic Web) of getting their data found.

So what will happen in such a situation? If no standard encoding scheme already exists, a few encoding schemes will probably emerge, rapidly introduced by some first-to-market organisations (simply because these orgs need such a scheme AFAP), and everyone else will then use these few schemes. And after some years of usage, the W3C would step in and make a standard based on those encoding schemes which have survived until then.

But in the case of RDF, I think that people will rather adopt RDF reification, for several reasons:

* It's already there, ready for use, and it's part of the official RDF standard.

* It is just more triple data, so it can simply be put into the existing triple stores. And every RDF-aware piece of software out there will be able to handle this kind of data out of the box.

* It seems reasonably easy to understand and use for non-expert people (I have experienced this when I tried to explain RDF reification to a complete RDF novice).
* There is existing tool support (like in Topbraid Composer [2]).

* At least in the beginning, Charly will probably think: "Well, whoever searches for triple annotations will certainly at least come up with the idea of searching for rdf:Statements. I don't have any clue what else he will search for, so I use RDF reification for my encoding. This will be the safest path, if any." I would call this line of argument a "maximum likelihood estimation". :)

* And Dave will think: "Well, at least I should search for rdf:Statements, because this is the most obvious thing people will think of when they encode their triple annotations." Again some maximum likelihood estimation. And a corresponding SPARQL query is pretty simple:

  construct { $stmt $p $o }
  where {
    $stmt a rdf:Statement;
          rdf:subject me:alice;
          rdf:predicate foaf:knows;
          rdf:object he:bob;
          $p $o .
  }

Well, not nice, but it works for Dave, and that is the important point.

And anticipating one of the most likely objections to my argument: I don't believe that any of the "ordinary Semantic Web users" out there, who are actually interested in putting triple annotations into the SemWeb or searching for them, will really be interested in debates about "non-existing" or "broken semantics" of RDF reification. I, personally, like such debates, but this is in the end just ivory tower bosh. So I won't bother these people with questions like: "Hey, don't you know that talking about the insertion date of a triple into an RDF store is something semantically completely different from talking about the date since which Alice has known Bob?" These people do not need the academic world to give them lessons in philosophy. :) What they really need from the academic community is a pragmatic tool which serves their needs, so they can start to do their most important job: Filling the SemWeb with content!
And RDF reification actually provides such a tool, when it is regarded only as a common vocabulary which makes it technically possible to associate a URI with some RDF triple. (Sorry, this paragraph has gone a little flamy, but I really couldn't resist. ;-))

The third candidate is NamedGraphs. But in order to estimate whether this approach can be used for the above scenario, I need to know more about it. This was the reason why I asked in my last mail "How do named graph data get published into the Semantic Web?". If it is possible (with reasonable effort), for instance, to search for the URIs of all NamedGraphs of the form

  :g { me:alice foaf:knows he:bob }

then NamedGraphs work as well as reification for this purpose, because I can then, in a second step, query for all those triples in the SemWeb which have the found NamedGraph's URI as their subject. And NamedGraphs would bring the big advantage that they can talk about more than a single triple (though I have difficulty seeing how this helps me in my usecase above. Perhaps other people will be able to find an example where searching for annotated "multi-triples" would really make sense).

But we must not conceal that NamedGraphs have a very serious disadvantage in comparison with reification, anyway: NamedGraphs are not a standard. And if this approach does not get into RDF, or at least into common use, very soon, it will possibly lose its chance to become a player, at least in the above scenario.

/This/ will of course only be a topic /if/ the above scenario is relevant at all. My whole argument in favour of RDF reification depends on the assumption that the above scenario is a really relevant usecase (of course with mashup and search services instead of Charlies and Daves :)). If this is not the case, then I won't speak for RDF reification any longer, because I then see no real use for it anymore. (At least, until another scenario comes to my mind ;-)).

So what do you think?
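P.S. To make the two lookups discussed above a bit more tangible, here is a minimal sketch in Python. It is only an illustration under loud assumptions: plain tuples stand in for RDF triples, the stores are in-memory sets rather than anything crawled from the Web, and every name in it (the ch: prefix, ch:stmt1, ch:g1, and all function names) is made up for the example. It is neither the Semantic Web Client Library nor any real NamedGraphs API.

```python
# Illustrative sketch only: tuples stand in for RDF triples, and all
# identifiers (ch:stmt1, ch:g1, function names) are hypothetical.

RDF_TYPE = "rdf:type"
REIFICATION_PROPS = {RDF_TYPE, "rdf:subject", "rdf:predicate", "rdf:object"}

# Charly's store: his annotation encoded with the RDF reification vocabulary.
charly_store = {
    ("ch:stmt1", RDF_TYPE, "rdf:Statement"),
    ("ch:stmt1", "rdf:subject", "me:alice"),
    ("ch:stmt1", "rdf:predicate", "foaf:knows"),
    ("ch:stmt1", "rdf:object", "he:bob"),
    ("ch:stmt1", "dc:date", "1998"),
    # An unrelated triple that must not be reported as an annotation.
    ("ch:charly", "foaf:knows", "me:alice"),
}


def find_statement_nodes(store, s, p, o):
    """Return the nodes that reify the triple (s, p, o) in the store."""
    required = {
        (RDF_TYPE, "rdf:Statement"),
        ("rdf:subject", s),
        ("rdf:predicate", p),
        ("rdf:object", o),
    }
    described = {}
    for subj, pred, obj in store:
        described.setdefault(subj, set()).add((pred, obj))
    return {node for node, pairs in described.items() if required <= pairs}


def find_annotations(store, s, p, o):
    """Return the non-reification triples attached to reifications of (s, p, o)."""
    nodes = find_statement_nodes(store, s, p, o)
    return sorted(
        t for t in store if t[0] in nodes and t[1] not in REIFICATION_PROPS
    )


# Named-graph variant: graph name -> set of triples, plus assertions about names.
named_graphs = {"ch:g1": {("me:alice", "foaf:knows", "he:bob")}}
graph_assertions = {("ch:g1", "dc:date", "1998")}


def find_graph_annotations(graphs, assertions, triple):
    """Step 1: find graphs holding exactly this triple. Step 2: find triples about them."""
    names = {name for name, triples in graphs.items() if triples == {triple}}
    return sorted(t for t in assertions if t[0] in names)


target = ("me:alice", "foaf:knows", "he:bob")
reified = find_annotations(charly_store, *target)
graphed = find_graph_annotations(named_graphs, graph_assertions, target)
print(reified)
print(graphed)
```

Both lookups do the same two things: first identify the node (or graph name) that stands for Alice's triple, then collect whatever else is said about that node. The only difference is where the identifying information lives.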
Cheers,
Michael

[1] http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/
[2] http://www.topbraidcomposer.com/

--
Dipl.-Inform. Michael Schneider
FZI Forschungszentrum Informatik Karlsruhe
Abtl. Information Process Engineering (IPE)
Tel  : +49-721-9654-726
Fax  : +49-721-9654-727
Email: Michael.Schneider@fzi.de
Web  : http://www.fzi.de/ipe/eng/mitarbeiter.php?id=555

FZI Forschungszentrum Informatik an der Universität Karlsruhe
Haid-und-Neu-Str. 10-14, D-76131 Karlsruhe
Tel.: +49-721-9654-0, Fax: +49-721-9654-959
Stiftung des bürgerlichen Rechts
Az: 14-0563.1 Regierungspräsidium Karlsruhe
Vorstand: Rüdiger Dillmann, Michael Flor, Jivka Ovtcharova, Rudi Studer
Vorsitzender des Kuratoriums: Ministerialdirigent Günther Leßnerkraus
Received on Wednesday, 5 September 2007 09:47:18 UTC