- From: Chris Bizer <chris@bizer.de>
- Date: Tue, 12 Jun 2007 10:29:10 +0200
- To: "Pat Hayes" <phayes@ihmc.us>, "Sandro Hawke" <sandro@w3.org>
- Cc: <semantic-web@w3.org>, "Linking Open Data" <linking-open-data@simile.mit.edu>
Hi Sandro and Pat,

> My advice here is, I confess, not widely followed. But I hear more and
> more people converging on the idea that this is both practical and
> likely to be sufficiently effective.

Sandro: Just to back up your claim that more and more people are converging with some hard facts:

Within the W3C SWEO Linking Open Data project, people are collaborating to publish and interlink huge amounts of RDF data on the Web according to Tim's Linked Data principles:
http://www.w3.org/DesignIssues/LinkedData.html

Currently, this collaborative effort has "specified the meaning" (if you want to see it this way) of maybe 10 million URIs covering topics like geographic locations, books, publications, music, ... The descriptions altogether amount to a dataset of about one billion RDF triples. Any of these 10 million URIs can be looked up over HTTP to retrieve a description of its meaning.

Some example URIs from DBpedia (http://dbpedia.org/docs/), which forms part of the Linking Open Data project:

URI denoting the concept of Berlin as a town in Germany:
http://dbpedia.org/resource/Berlin

RDF description of Berlin, which you get by dereferencing the URI above with the MIME type application/rdf+xml:
http://dbpedia.org/data/Berlin

Human-readable HTML description of Berlin, which you get by dereferencing the URI above with the MIME type text/html:
http://dbpedia.org/page/Berlin

As you can see, the meaning of the term is pretty clearly defined by putting it into several SKOS categories, making several rdf:type statements about it, and describing it in 10 different natural languages. All other 1 600 000 DBpedia terms are described in a similar way.
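
The lookup by MIME type described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the helper names build_request and dereference are hypothetical, and it assumes the server answers the Accept header in the way described for the Berlin example.

```python
import urllib.request
from typing import Tuple

RDF_XML = "application/rdf+xml"
HTML = "text/html"

def build_request(uri: str, mime_type: str) -> urllib.request.Request:
    """Prepare a GET request asking for one representation of `uri`."""
    return urllib.request.Request(uri, headers={"Accept": mime_type})

def dereference(uri: str, mime_type: str) -> Tuple[str, bytes]:
    """Send the request, follow any redirects, return (final URL, body).

    Needs network access; which final URL comes back depends on the server.
    """
    with urllib.request.urlopen(build_request(uri, mime_type)) as response:
        return response.geturl(), response.read()

# Same URI, two different representations requested.
rdf_request = build_request("http://dbpedia.org/resource/Berlin", RDF_XML)
html_request = build_request("http://dbpedia.org/resource/Berlin", HTML)
```

Calling dereference("http://dbpedia.org/resource/Berlin", RDF_XML) would then be expected to end up at the RDF description, and the same call with HTML at the human-readable page, provided the server still behaves as it did when this was written.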

An overview of the other 8 million concepts with dereferenceable URIs that were created in the project is given in
http://linkeddata.org/documents/eswc2007-poster-linking-open-data.pdf
and on the project website
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

> So when users paste that URI into their browser, they get the official
> documentation about it.

This behavior can be demonstrated with Semantic Web browsers like Tabulator, DISCO or the OpenLink Data browser. Just click on a link below to start exploring the meaning of terms using DISCO.

The WWW 2006 conference:
http://www4.wiwiss.fu-berlin.de/rdf_browser/?browse_uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdblp%2Fresource%2Frecord%2Fconf%2Fwww%2F2006

The Tetris computer game:
http://www4.wiwiss.fu-berlin.de/rdf_browser/?browse_uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FTetris

Tim Berners-Lee:
http://www4.wiwiss.fu-berlin.de/rdf_browser/?browse_uri=http%3A%2F%2Fwww.w3.org%2FPeople%2FBerners-Lee%2Fcard%23i

Concerning "practical and sufficiently effective", I liked a recent paper by Google about their plans for the Web of Data:

"Web-scale Data Integration: You can only afford to Pay As You Go"
http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p40.pdf

The basic line of argument is that we don't need completely unambiguous terms and schemata to provide useful services to the end user. Even if the answers are only approximate, they will be useful to the user.

Google seems to handle this by using uncertainty on all levels of their architecture, including information extraction, schema matching and query routing. In the end, this uncertainty goes into their ranking algorithm, and as the experience from the Web shows, users are very happy with ranked approximate results where high-quality stuff tends to show up at the beginning of the list.

Cheers

Chris

--
Chris Bizer
Freie Universität Berlin
+49 30 838 54057
chris@bizer.de
www.bizer.de

----- Original Message -----
From: "Sandro Hawke" <sandro@w3.org>
To: "Pat Hayes" <phayes@ihmc.us>
Cc: <semantic-web@w3.org>
Sent: Tuesday, June 12, 2007 12:11 AM
Subject: homonym URIs (Re: What if an URI also is a URL)

>
> Pat Hayes <phayes@ihmc.us> writes:
>> Tim, as this discussion gets to the heart of what
>> I've been trying to argue for several years,
>> please take the comments below as intended in a
>> spirit of analysis rather than just pins and
>> angels.
>
> Pat, I'm going to jump in here, if you don't mind. I think my position
> on these issues is pretty much the same as Tim's but I could be wrong.
> I don't argue that John's "dance" isn't required, just that part of the
> Semantic Web version of the dance is: don't make your URIs unnecessarily
> ambiguous. One might even say: don't pun.
>
>> And what about a URI
>> that I own and wish it to denote, say, the planet
>> Venus, or my pet cat? What do I do, to attach the
>> URI to my intended referent for it?
>
> You publish a document (an ontology) so it's available through that URI.
> If it's a hash URI, you publish the ontology at the non-hash version.
> If it's a slash URI, you publish the ontology at the far end of a 303
> redirect. And you content-negotiate HTML and RDF.
>
> So when users paste that URI into their browser, they get the official
> documentation about it.
>
> And when RDF software dereferences that URI, it gets some logical
> formulas which should be understood (like the HTML) to be asserted by the
> URI's owner/host/publisher. Those formulas constrain the possible
> meanings of that URI, relative to other URIs. They can't nail a URI to
> Venus, but they can use other ontologies to provide useful (and possibly
> very constraining) information, like that it's an astronomical body with
> a mass of about 5e+24kg.
>
> My advice here is, I confess, not widely followed. But I hear more and
> more people converging on the idea that this is both practical and
> likely to be sufficiently effective.
>
>> The point surely is that URIs used to refer (not
>> as in HTTP, but as in OWL) do *not* have a
>> standardized meaning. Standards are certainly a
>> chore to create, but they only go so far. OWL
>> defines the meanings of the OWL namespace, but it
>> does not define the meanings of the FOAF
>> vocabulary,
>
> No, that's up to the owner(s) of the FOAF terms.
>
>> or the URIrefs used in, say,
>> ontologies published by the NIH or by JPL.
>
> And that's up to the NIH and JPL, respectively.
>
>> The
>> only way those meanings can be specified is by
>> writing ontologies: and finite ontologies do not
>> - cannot possibly - nail down referents
>> *uniquely*.
>
> Ah -- there we go. There must be a long history of this subject in
> philosophy. Can things ever be nailed down uniquely? I haven't a clue.
> But that's the wrong question. In this thread, I don't think we're
> talking about whether we can really be sure what we mean when we say
> such a URI denotes Venus. Instead, we're talking about whether it's a
> good practice to use a single URI to denote clearly distinct things,
> such as:
> (1) the second rock from the sun
> (2) the Roman goddess of love
> (3) a star tennis player
> (4) ... etc
> The term "ambiguity" covers both these issues, but we don't need to
> combine them. The first is a kind of imprecision, a fuzziness, while
> the second is the re-use of a word for a second meaning, a homonym.
> (Homonyms seem to be called "overloading" in computer programming.)
>
> I think we know how to work with homonyms, but since we're engineering a
> new system, it seems like a good design decision to forbid them, doesn't
> it?
>
> -- Sandro
>
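
The hash/slash rule in the quoted message (a hash URI is described at its non-hash version, a slash URI at the far end of a 303 redirect) can be sketched as follows. This is a minimal Python illustration, not anyone's actual implementation; the function name hash_document_uri is hypothetical, and the two URIs are the ones used as examples earlier in this message.

```python
from typing import Optional
from urllib.parse import urldefrag

def hash_document_uri(uri: str) -> Optional[str]:
    """Return the non-hash document URI for a hash URI, else None.

    None means the URI is a slash URI: the location of its description
    cannot be computed locally and has to be discovered by following
    the server's 303 redirect.
    """
    document, fragment = urldefrag(uri)
    return document if fragment else None

# Hash URI: the description lives at the URI without the fragment.
print(hash_document_uri("http://www.w3.org/People/Berners-Lee/card#i"))
# prints http://www.w3.org/People/Berners-Lee/card

# Slash URI: nothing to strip, so the client must ask the server.
print(hash_document_uri("http://dbpedia.org/resource/Berlin"))
# prints None
```

The asymmetry is the design point: a client can resolve a hash URI on its own, because HTTP never sends the fragment to the server, while a slash URI always costs one extra round trip for the redirect.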
Received on Tuesday, 12 June 2007 08:29:33 UTC