- From: Hugh Glaser <hg@ecs.soton.ac.uk>
- Date: Sat, 7 Feb 2009 13:23:44 +0000
- To: "public-lod@w3.org" <public-lod@w3.org>
My proposal:
*We should not permit any site to be a member of the Linked Data cloud if it does not provide a simple way of finding URIs from natural language identifiers.*

Rationale:
One aspect of our Linking Data (not to mention our Linking Open Data) world is that we want people to link to our data - that is, I have published some stuff about something, with a URI, and I want people to be able to use that URI.

So my question to you, the publisher, is: "How easy is it for me to find the URI your users want?"
My experience suggests it is not always very easy.

What is required at the minimum, I suggest, is a text search, so that if I have a (boring string version of a) name that refers in my mind to something, I can hope to find an (exciting Linked Data) URI of that thing. I call this a projection from the Web to the Semantic Web. rdfs:label or equivalent usually provides the other one.

At the risk of being seen as critical of the amazing efforts of all my colleagues (if not also myself), this is rarely an easy thing to do.

Some recent experiences:

OpenCalais: as in my previous message on this list, I tried hard to find a URI for Tim, but failed.

dbtune: Saw a Twine message about dbtune, trundled over there, and tried to find a URI for a Telemann, but failed.

dbpedia: wanted Tim again. After clicking on a few web pages, none of which seemed to provide a search facility, I resorted to my usual method:- look it up in wikipedia and then hack the URI and hope it works in dbpedia.

(Sorry to name specific sites, guys, but I needed a few examples. And I am only asking for a little more, so that the fruits of your amazing labours can be more widely appreciated!)

wordnet: [2] below

So I have access to Linked Data sites that I know (or at least strongly suspect) have URIs I might want, but I can't find them. How on earth do we expect your average punter to join this world?

What have I missed?
Searching, such as Sindice: Well yes, but should I really have to go off to a search engine to find a dbpedia URI? And when I look up "Telemann dbtune" I don't get any results. And I wanted the dbtune link, not some other link.

Did I miss some links on web pages? Quite probably, but the basic problem still stands.

SPARQL: Well, yes. But we cannot seriously expect our users to formulate a SPARQL query simply to find out the dbpedia URI for Tim. What is the regexp I need to put in? (see below [1])

A foaf file: Well Tim's dbpedia URI is probably in his foaf file (although possibly there are none of Tim's URIs in his foaf file), if I can actually find the file; but for some reason I can't seem to find Telemann's foaf file.

If you are still doubting me, try finding a URI for Telemann in dbpedia without using an external link, just by following stuff from the home page. I managed to get a Telemann by using SPARQL without a regexp (it times out on any regexp), but unfortunately I get the asteroid.

Again, my proposal:
*We should not permit any site to be a member of the Linked Data cloud if it does not provide a simple way of finding URIs from natural language identifiers.*

Otherwise we end up in a silo, and the world passes us by.

Very best
Hugh

[And since we have to take our own medicine, I have added a "Just search" box right at the top level of all the rkbexplorer.com domains, such as http://wordnet.rkbexplorer.com/ ]

[1] Dbtune finding of Telemann:

    SELECT * WHERE {?s ?p ?name . FILTER regex(?name, "Telemann$") }

I tried

    SELECT * WHERE {?s ?p ?name . FILTER regex(?name, "telemann$", "i") }

first, but got no results - not sure why.

[2] <rant>
I cannot believe just how frustrating this stuff can be when you really try to use it.

Because I looked at Sindice for telemann, I know that it is a word in wordnet ( http://sindice.com/search?q=Telemann reports loads of http://wordnet.rkbexplorer.com/ links).
Great, he thinks, I can get a wordnet link from a "proper" wordnet publisher (ie not me).

Goes to http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData to find wordnet. The link there is dead.

Strips off the last bit, to get to the home princeton wordnet page, and clicks on the browser link I find - also dead.

Go back and look on the http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets page, and find the link to http://esw.w3.org/topic/WordNet , but that doesn't help.

So finally, I do the obvious - google "wordnet rdf". Of course I get lots of pages saying how available it is, and how exciting it is that we have it, and how it was produced; and somewhere in there I find a link: "Wordnet-RDF/RDDL Browser" at www.openhealth.org/RDDL/wnbrowse

Almost unable to contain myself with excitement, I click on the link to find a text box, and with trembling hands I type "Telemann" and click submit.

If I show you what I got, you can come some way to imagining my devastation:

"Using org.apache.xerces.parsers.SAXParser
Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException: White spaces are required between publicId and systemId.
org.xml.sax.SAXParseException: White spaces are required between publicId and systemId."

Does the emperor have any clothes at all?
</rant>
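
The queries in [1] match any predicate and any object, so the store has to run the regexp over every triple, which is one plausible reason such queries are slow. What follows is a minimal sketch, not a tested fix, of how a site-level "just search" box might turn a typed name into a narrower query. It assumes the dataset exposes names as rdfs:label literals and that the endpoint supports the SPARQL regex "i" flag; neither has been checked against dbtune or dbpedia, and whether it avoids the empty result or the timeouts described above is unknown. The function names (escape_regex, escape_sparql_literal, build_label_query) are hypothetical.

```python
# Characters with special meaning in the regex dialect used by SPARQL's regex().
_REGEX_META = set("\\.^$*+?()[]{}|")


def escape_regex(text: str) -> str:
    """Backslash-escape regex metacharacters so the name matches literally."""
    return "".join("\\" + ch if ch in _REGEX_META else ch for ch in text)


def escape_sparql_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a "..." literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_label_query(name: str, limit: int = 20) -> str:
    """Build a case-insensitive label search for a natural language name.

    Assumption: names live in rdfs:label. Swap in foaf:name, dc:title or
    whatever predicate the dataset actually uses.
    """
    pattern = escape_sparql_literal(escape_regex(name.strip()))
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?s ?label WHERE {\n"
        "  ?s rdfs:label ?label .\n"
        # str() strips any language tag or datatype before matching.
        f'  FILTER regex(str(?label), "{pattern}", "i")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_label_query("Telemann"))
```

Restricting the pattern to one predicate and wrapping the value in str() are design guesses: the first shrinks what the filter has to scan, and the second means typed or language-tagged literals are compared as plain strings. A result list like this would still return both the composer and the asteroid, so the search page needs to show enough context for a person to pick the right URI.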
Received on Saturday, 7 February 2009 13:24:38 UTC