- From: Hamish Harvey <hamish@hamishharvey.com>
- Date: Thu, 27 Jul 2006 12:11:47 +0100
- To: semantic-web@w3c.org
On 26/07/06, Knud Hinnerk Möller <knud.moeller@deri.org> wrote:
> On 26.07.2006 at 10:35, Hamish Harvey wrote:
> > Presumably you can strongly encourage people (e.g. in the FOAF docs; I
> > haven't looked) to prefer pages from e.g. Wikipedia over some
> > arbitrary, even if apparently canonical, page. If the page isn't there,
> > add it.
>
> Yes, you could, and such a convention would be nice. However, it will be
> hard to get everyone to follow it.

Of course.

> Again, the heart of the problem I
> mentioned (the "URI crisis") is that a lot of things can get messed up if
> people use the URLs of physical things on the internet (e.g. a web page) to
> _also_ denote an abstract concept that this URL is somehow related to.

Of course. But, as you point out, the URI crisis has been designed out in
this case.

> If
> you only have one homogeneous source of data or community, an informal
> convention like the one you suggest might be enough. However, if you imagine
> an agent traversing the whole wide SW, collecting and integrating data from
> all kinds of sources (and that's what would be so cool about the SW), then a
> solution like that could easily break.

Any reasoning applied over data collected from all over the Internet had
better be able to cope with all sorts of incompleteness and inconsistency.
There are no 100% solutions at that scale.

If you start coining URLs to identify the foaf:topics uniquely, how are you
going to persuade people to use *those*? The problem isn't solved; if
anything it's magnified, as now you have an n:m problem over the foaf:topic
relation. You're going to have to start using imprecise methods (page
content similarity, for example) to get anywhere.

> As said before, this problem does not
> arise in the foaf:interest example, due to the way this predicate is defined
> (i.e. the object _is_ a foaf:Document).

I was interested to see that.
This seems to introduce the approach used in Topic Maps of explicitly
marking the use of a URI as a subject indicator. So at least here you don't
have to contend with a problem baked into the Semantic Web at the RDF level.
Of course, encouraging the use of this approach cannot guarantee that it is
used, so, as you note above, it can alleviate the problems relating to the
URI crisis but can't solve them.

Topic Maps don't add the next layer of saying formally "this document
describes *this* concept". Probably because TMs came from a world populated
by people who already knew that to attempt this could lead only to madness.
Concepts can't, in general, be pinned down like that.

> > Besides, what's wrong with saying, without using another URI:
> >
> > <http://en.wikipedia.org/wiki/Resource_Description_Framework>
> >     foaf:topic _:rdf .
> > <http://www.w3.org/RDF/> foaf:topic _:rdf .
> >
> > ?
>
> Using a blank node only works if these two statements were made in the same
> model. If they come from different places, you need an explicit URI.

Which of course they would be, though not necessarily in the same place as
the foaf:interest statements. You have statements from (up to) three
documents, one of which contains these two statements establishing the
identity of the concepts described in the two web pages. It then becomes
important to know *who says* that the topics/subjects are the same and
whether you trust them, to deal somehow with conflicts, and so on.

In the big messy world of the SW, perhaps finding such an assertion
somewhere would be worth a little more than finding that the documents
referred to in foaf:interest statements are textually similar. Looking at
the pages yourself and making that assertion is then worth much more again.

Bernard picks up a more technical problem with using foaf:topic, but also
suggests a solution.

> > Another concern is the drift in meaning of assertions you'll get using
> > something like Wikipedia to provide a "controlled" vocabulary.
> > When I assert
> >
> > _:me foaf:interest <http://en.wikipedia.org/wiki/...> .
> >
> > I really mean I have an interest which is described by that page at
> > the time the assertion was made.
> >
> > Probably not such an issue in geek space with well defined technologies.
>
> No, I think that is indeed an issue - the well defined technologies do not
> define very well what using a URL like
> <http://en.wikipedia.org/wiki/Resource_Description_Framework>
> actually means!

That seems to be a different issue. My point was that the concept "RDF" is
much better defined and less contentious than, say, "communism". Although
"Semantic Web", which describes an idea, not a technology, probably comes
closer to the latter than the former (even to the extent of questions about
whether it will ever actually exist ;).

The problem I was getting at has nothing to do with technologies being
precise about the way in which a URI carries meaning. In the end, you have
to ground your symbols (URIs) somewhere, and that somewhere has to be
outside what is defined by your technology standards; it always comes back
to natural language.

Wikipedia is a handy collection of stable symbols with natural language
descriptions of subjects, so it offers itself as a grounding mechanism.
Using it as such has some compelling advantages, but as a semantic
foundation it is also rather like quicksand: although the symbols are
stable, the descriptions are not, because the natural language content of
the pages can change at any time. That problem remains whatever the details
of the technology you are using, and it certainly isn't solved by making
specific foaf:topic (or skos:primarySubject) assertions about Wikipedia
pages.

Cheers,
Hamish

--
Hamish Harvey
Research Associate, School of Civil Engineering and Geosciences,
Newcastle University
Received on Thursday, 27 July 2006 11:13:49 UTC