- From: Giovanni Tummarello <g.tummarello@gmail.com>
- Date: Sat, 7 Feb 2009 21:04:48 +0000
- To: Yves Raimond <yves.raimond@gmail.com>
- Cc: Hugh Glaser <hg@ecs.soton.ac.uk>, "public-lod@w3.org" <public-lod@w3.org>
Yves,

Just on the side: yes, there is not much dbtune in Sindice, just a few results:
http://sindice.com/search?q=dbtune&qt=term

If you have an RDF dump of the site, or of part of it, and you describe it in a semantic sitemap, you would be fully indexed in a very short time. Otherwise, we should have the new crawler entering service in a few days, and that should make a notable difference.

Thanks,
Giovanni

On Sat, Feb 7, 2009 at 1:39 PM, Yves Raimond <yves.raimond@gmail.com> wrote:
>
> Hello!
>
> On Sat, Feb 7, 2009 at 1:23 PM, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:
>>
>> My proposal:
>> *We should not permit any site to be a member of the Linked Data cloud if it
>> does not provide a simple way of finding URIs from natural language
>> identifiers.*
>>
>> Rationale:
>> One aspect of our Linking Data (not to mention our Linking Open Data) world
>> is that we want people to link to our data - that is, I have published some
>> stuff about something, with a URI, and I want people to be able to use that
>> URI.
>>
>> So my question to you, the publisher, is: "How easy is it for me to find the
>> URI your users want?"
>>
>> My experience suggests it is not always very easy.
>> What is required at the minimum, I suggest, is a text search, so that if I
>> have a (boring string version of a) name that refers in my mind to
>> something, I can hope to find an (exciting Linked Data) URI of that thing.
>> I call this a projection from the Web to the Semantic Web.
>> rdfs:label or equivalent usually provides the other one.
>>
>> At the risk of being seen as critical of the amazing efforts of all my
>> colleagues (if not also myself), this is rarely an easy thing to do.
>>
>> Some recent experiences:
>> OpenCalais: as in my previous message on this list, I tried hard to find a
>> URI for Tim, but failed.
>> dbtune: Saw a Twine message about dbtune, trundled over there, and tried to
>> find a URI for a Telemann, but failed.
>> dbpedia: wanted Tim again.
>> After clicking on a few web pages, none of which
>> seemed to provide a search facility, I resorted to my usual method:- look it
>> up in wikipedia and then hack the URI and hope it works in dbpedia.
>> (Sorry to name specific sites, guys, but I needed a few examples.
>> And I am only asking for a little more, so that the fruits of your amazing
>> labours can be more widely appreciated!)
>> wordnet: [2] below
>>
>> So I have access to Linked Data sites that I know (or at least strongly
>> suspect) have URIs I might want, but I can't find them.
>> How on earth do we expect your average punter to join this world?
>>
>> What have I missed?
>> Searching, such as Sindice: Well yes, but should I really have to go off to
>> a search engine to find a dbpedia URI? And when I look up "Telemann dbtune"
>> I don't get any results. And I wanted the dbtune link, not some other link.
>> Did I miss some links on web pages? Quite probably, but the basic problem
>> still stands.
>> SPARQL: Well, yes. But we cannot seriously expect our users to formulate a
>> SPARQL query simply to find out the dbpedia URI for Tim. What is the regexp
>> I need to put in? (see below [1])
>> A foaf file: Well Tim's dbpedia URI is probably in his foaf file (although
>> possibly there are none of Tim's URIs in his foaf file), if I can actually
>> find the file; but for some reason I can't seem to find Telemann's foaf
>> file.
>>
>> If you are still doubting me, try finding a URI for Telemann in dbpedia
>> without using an external link, just by following stuff from the home page.
>> I managed to get a Telemann by using SPARQL without a regexp (it times out
>> on any regexp), but unfortunately I get the asteroid.
>>
>> Again, my proposal:
>> *We should not permit any site to be a member of the Linked Data cloud if it
>> does not provide a simple way of finding URIs from natural language
>> identifiers.*
>> Otherwise we end up in a silo, and the world passes us by.
>>
>
>
> I think this is a really dangerous idea. Most "web-scale" identifiers,
> eg Musicbrainz GUIDs and BBC PIDs are not human readable (for a lot of
> reasons, and mainly because human-readable identifiers are not unique
> enough!!), but both provide really easy-to-use lookup service.
> Such lookups, for other sites, can be provided by semantic web search
> engines. It is exactly as in the document web: web identifiers are
> mostly opaque, but search engines are here to provide the help needed.
>
> So my proposal is: let's not confuse everything. Some people's job is
> to make datasets available out there and as linked as possible to
> others. Some other people make lookup services (eg Sindice), and I
> think this separation of concerns works quite well.
>
> Best,
> y
>
>
>> Very best
>> Hugh
>>
>> [And since we have to take our own medicine, I have added a "Just search"
>> box right at the top level of all the rkbexplorer.com domains, such as
>> http://wordnet.rkbexplorer.com/ ]
>>
>>
>> [1]
>> Dbtune finding of Telemann:
>> SELECT * WHERE {?s ?p ?name .
>> FILTER regex(?name, "Telemann$") }
>>
>> I tried
>> SELECT * WHERE {?s ?p ?name .
>> FILTER regex(?name, "telemann$", "i") }
>> first, but got no results - not sure why.
>>
>> [2]
>> <rant>
>> I cannot believe just how frustrating this stuff can be when you really try
>> to use it.
>> Because I looked at Sindice for telemann, I know that it is a word in
>> wordnet ( http://sindice.com/search?q=Telemann reports loads of
>> http://wordnet.rkbexplorer.com/ links).
>> Great, he thinks, I can get a wordnet link from a "proper" wordnet publisher
>> (ie not me).
>> Goes to
>> http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
>> to find wordnet.
>> The link there is dead.
>> Strips off the last bit, to get to the home princeton wordnet page, and
>> clicks on the browser link I find - also dead.
>> Go back and look on the
>> http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
>> page, and find the link to http://esw.w3.org/topic/WordNet , but that
>> doesn't help.
>> So finally, I do the obvious - google "wordnet rdf".
>> Of course I get lots of pages saying how available it is, and how exciting
>> it is that we have it, and how it was produced; and somewhere in there I
>> find a link: "Wordnet-RDF/RDDL Browser" at www.openhealth.org/RDDL/wnbrowse
>> Almost unable to contain myself with excitement, I click on the link to find
>> a text box, and with trembling hands I type "Telemann" and click submit.
>> If I show you what I got, you can come some way to imagining my devastation:
>> "Using org.apache.xerces.parsers.SAXParser
>> Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException:
>> White spaces are required between publicId and systemId.
>> org.xml.sax.SAXParseException: White spaces are required between publicId
>> and systemId."
>>
>> Does the emperor have any clothes at all?
>> </rant>
>>
>>
>>
>
>
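
For reference, the queries in [1] match the object of every triple, whatever the predicate. A narrower variant of the same lookup might restrict the match to labels, along the lines of the "rdfs:label or equivalent" remark in the quoted message. This is only a sketch under assumptions: it presumes the endpoint exposes names through rdfs:label (a given dataset may use another naming property instead), and the LIMIT value is arbitrary. It is not known whether it would behave any differently from the queries in [1] on the dbtune or dbpedia endpoints.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Hypothetical label lookup: find resources whose label contains "telemann".
# Assumes names are published as rdfs:label; swap in the dataset's own
# naming property if it uses a different one.
SELECT DISTINCT ?s ?label
WHERE {
  ?s rdfs:label ?label .
  # str() takes the lexical form, so the match does not depend on any
  # language tag or datatype attached to the literal.
  FILTER regex(str(?label), "telemann", "i")
}
LIMIT 20
```

Binding the predicate gives the endpoint less to scan than the `?s ?p ?name` pattern, although an unanchored regex filter can still be slow on a large store.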
Received on Saturday, 7 February 2009 21:05:31 UTC