- From: Yves Raimond <yves.raimond@gmail.com>
- Date: Fri, 20 Jun 2008 10:37:19 +0100
- To: martin.hepp@uibk.ac.at
- Cc: "Giovanni Tummarello" <giovanni.tummarello@deri.org>, "Hausenblas, Michael" <michael.hausenblas@joanneum.at>, public-lod@w3.org, "Semantic Web" <semantic-web@w3.org>
Hello Martin!

On Thu, Jun 19, 2008 at 2:08 PM, Martin Hepp <martin.hepp@uibk.ac.at> wrote:
>>However, there are some cases where you can't really afford that, for
>>example when "looking inside" takes too much time - for example
>>because of the size of "inside".
>
> But how do you decide which part of the "inside" is contained in the
> "outside" description? If you want all details from the inside in the
> outside, then you have to replicate the whole inside - which does not gain
> anything. And if the outside is just a subset (or even: proper abstraction)
> of the inside, then you will face "false positive" (the outside indicates
> something would be inside, but it actually isn't) and "false negative"
> (there is something inside which the outside does not tell) situations. Now
> for me the whole discussion boils down to the question on whether one can
> produce good descriptions that are (1) substantially shorter than the inside
> data and (2), on average, keep the false positive and false negative cases
> low. So you would have to find a proper trade-off and then show by means of
> a quantitative evaluation that there are relevant situations in which your
> approach increases retrieval performance.

I gave it a try, and described my experience at:

http://blog.dbtune.org/post/2008/06/12/Describing-the-content-of-RDF-datasets

I am not suggesting this is the best way to do it, but at least it was
quite simple. Also, the object of void:example I used in this post is
just the result of a DESCRIBE query on the end-point, so it can be
synchronised with the actual content of the dataset.

Cheers!
y
Received on Friday, 20 June 2008 09:37:59 UTC