- From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
- Date: Sun, 8 Feb 2009 16:48:47 +0100
- To: "Michael Hausenblas" <michael.hausenblas@deri.org>, "Andraz Tori" <andraz@zemanta.com>, "Hugh Glaser" <hg@ecs.soton.ac.uk>
- Cc: "Linked Data community" <public-lod@w3.org>
Hi Michael,

> Looking forward to finding and using a respective voiD description
> for DBpedia ;)

Very sorry for being picky here, but that is just such a good example of my
argument: why should I invest my time in providing a voiD description for
DBpedia? Of course, I would be doing the community a favor, but I don't see
any other reason. If there were a great application that consumed voiD and
nicely displayed, say, Musicbrainz and Geonames data, but displayed no
DBpedia data because DBpedia's voiD description is missing, then I would
have an incentive to provide one.

No offence, I'm just trying to emphasize my point. Sure, you'll get me with
the community-favor approach, but I strongly doubt that you will get
others...

Cheers,
Georgi

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com
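[For concreteness, a minimal voiD description for DBpedia might look
roughly like the sketch below. It uses terms from the voiD guide Michael
links to in his mail further down; the license URI, the example resource,
and the linkset target are illustrative placeholders, not authoritative
values.]

@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .

# What the dataset is about, where it lives, and under which license.
<http://dbpedia.org/void/Dataset> a void:Dataset ;
    dcterms:title "DBpedia" ;
    dcterms:description "RDF data extracted from Wikipedia" ;
    foaf:homepage <http://dbpedia.org/> ;
    dcterms:license <http://www.gnu.org/copyleft/fdl.html> ;  # placeholder
    void:sparqlEndpoint <http://dbpedia.org/sparql> ;
    void:exampleResource <http://dbpedia.org/resource/Berlin> ;
    void:vocabulary <http://xmlns.com/foaf/0.1/> .

# How the dataset is interlinked with another one (target is a placeholder).
<http://dbpedia.org/void/DBpedia-Geonames> a void:Linkset ;
    void:target <http://dbpedia.org/void/Dataset> ,
                <http://sws.geonames.org/> ;
    void:linkPredicate owl:sameAs .

[An application in Georgi's scenario could decide from a file like this
alone whether and how to display DBpedia data, which is exactly the
consumer-side incentive the thread is asking for.]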
> -----Original Message-----
> From: Michael Hausenblas [mailto:michael.hausenblas@deri.org]
> Sent: Sunday, February 08, 2009 4:29 PM
> To: Georgi Kobilarov; Andraz Tori; Hugh Glaser
> Cc: Linked Data community
> Subject: Re: Can we lower the LD entry cost please (part 1)?
>
> Georgi, All,
>
> > If we don't reward the Linked Data publishers who provide clean data
> > and penalize those who don't, there will never be an incentive to do
> > it right.
>
> I couldn't agree more. I have contemplated that recently (p16 in [1])
> and, yes, one goal of voiD is to help publishers concisely express what
> their dataset is about, under which license it is available, which
> vocabularies are used, or how many triples one can expect [2], and on
> the other hand how the dataset is linked with other datasets [3].
>
> Looking forward to finding and using a respective voiD description
> for DBpedia ;)
>
> Cheers,
> Michael
>
> [1] http://www.talis.com/nodalities/pdf/nodalities_issue4.pdf
> [2] http://rdfs.org/ns/void-guide#sec_1_Describing_Datasets
> [3] http://rdfs.org/ns/void-guide#sec_2_Describing_Dataset_Interlink
>
> --
> Dr. Michael Hausenblas
> DERI - Digital Enterprise Research Institute
> National University of Ireland, Lower Dangan,
> Galway, Ireland, Europe
> Tel. +353 91 495730
> http://sw-app.org/about.html
>
> > From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
> > Date: Sun, 8 Feb 2009 15:56:23 +0100
> > To: Andraz Tori <andraz@zemanta.com>, Hugh Glaser <hg@ecs.soton.ac.uk>
> > Cc: Linked Data community <public-lod@w3.org>
> > Subject: RE: Can we lower the LD entry cost please (part 1)?
> > Resent-From: Linked Data community <public-lod@w3.org>
> > Resent-Date: Sun, 08 Feb 2009 14:57:12 +0000
> >
> > Hi Andraz,
> >
> > I disagree: those two goals are not so completely different that
> > different groups should address them separately. I had a delightful
> > conversation about this with Andreas Harth of SWSE a week ago in
> > Berlin. Search engines can't clean up other people's mess; it's even
> > harmful if they try. Data providers need incentives to provide clean
> > data. See the Google example: Google started indexing the web, and
> > the webpages with clean markup and site structure showed up in its
> > search. And Google's search provided real benefit to end-users.
> >
> > Hence web publishers started to do SEO (search engine optimization),
> > so that their stuff shows up in Google as well (or ranked higher). If
> > we don't reward the Linked Data publishers who provide clean data and
> > penalize those who don't, there will never be an incentive to do it
> > right.
> >
> > Cheers,
> > Georgi
> >
> > --
> > Georgi Kobilarov
> > Freie Universität Berlin
> > www.georgikobilarov.com
> >
> >> -----Original Message-----
> >> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org]
> >> On Behalf Of Andraz Tori
> >> Sent: Saturday, February 07, 2009 4:02 PM
> >> To: Hugh Glaser
> >> Cc: public-lod@w3.org
> >> Subject: Re: Can we lower the LD entry cost please (part 1)?
> >>
> >> Hi Hugh,
> >>
> >> I think you are mixing two completely different goals.
> >>
> >> Why can't one set of people provide the data while another set of
> >> people provides search technologies over that data?
> >>
> >> It takes two completely different technologies, processes, etc.
> >>
> >> BTW: an easy way to search is also to write a meaningful sentence or
> >> paragraph (using the phrase/entity/concept) and put it into Zemanta
> >> or Calais. You will usually get properly disambiguated URIs back.
> >>
> >> bye
> >> andraz
> >>
> >> On Sat, 2009-02-07 at 13:23 +0000, Hugh Glaser wrote:
> >>> My proposal:
> >>> *We should not permit any site to be a member of the Linked Data
> >>> cloud if it does not provide a simple way of finding URIs from
> >>> natural language identifiers.*
> >>>
> >>> Rationale:
> >>> One aspect of our Linking Data (not to mention our Linking Open
> >>> Data) world is that we want people to link to our data - that is, I
> >>> have published some stuff about something, with a URI, and I want
> >>> people to be able to use that URI.
> >>>
> >>> So my question to you, the publisher, is: "How easy is it for me to
> >>> find the URI your users want?"
> >>>
> >>> My experience suggests it is not always very easy.
> >>> What is required at the minimum, I suggest, is a text search, so
> >>> that if I have a (boring string version of a) name that refers in
> >>> my mind to something, I can hope to find an (exciting Linked Data)
> >>> URI of that thing. I call this a projection from the Web to the
> >>> Semantic Web; rdfs:label or equivalent usually provides the other
> >>> one.
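[The minimal lookup Hugh describes can be sketched as a single SPARQL
query against a dataset's endpoint - untested, and assuming the dataset
attaches rdfs:label (or an equivalent) to its resources:]

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Project a plain-text name onto the Linked Data URIs carrying that label.
SELECT DISTINCT ?uri WHERE {
  ?uri rdfs:label ?name .
  # str() strips language tags, so labels like "..."@en match as well.
  FILTER (str(?name) = "Georg Philipp Telemann")
}

[Unlike a regex scan over all literals, an exact-match label lookup like
this can be answered from an index, which is why a publisher can afford to
put it behind a plain search box.]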
> >>> At the risk of being seen as critical of the amazing efforts of
> >>> all my colleagues (if not also myself), this is rarely an easy
> >>> thing to do.
> >>>
> >>> Some recent experiences:
> >>> OpenCalais: as in my previous message on this list, I tried hard to
> >>> find a URI for Tim, but failed.
> >>> dbtune: Saw a Twine message about dbtune, trundled over there, and
> >>> tried to find a URI for Telemann, but failed.
> >>> dbpedia: wanted Tim again. After clicking on a few web pages, none
> >>> of which seemed to provide a search facility, I resorted to my
> >>> usual method: look it up in wikipedia and then hack the URI and
> >>> hope it works in dbpedia.
> >>> (Sorry to name specific sites, guys, but I needed a few examples.
> >>> And I am only asking for a little more, so that the fruits of your
> >>> amazing labours can be more widely appreciated!)
> >>> wordnet: [2] below
> >>>
> >>> So I have access to Linked Data sites that I know (or at least
> >>> strongly suspect) have URIs I might want, but I can't find them.
> >>> How on earth do we expect your average punter to join this world?
> >>>
> >>> What have I missed?
> >>> Searching, such as Sindice: Well yes, but should I really have to
> >>> go off to a search engine to find a dbpedia URI? And when I look up
> >>> "Telemann dbtune" I don't get any results. And I wanted the dbtune
> >>> link, not some other link.
> >>> Did I miss some links on web pages? Quite probably, but the basic
> >>> problem still stands.
> >>> SPARQL: Well, yes. But we cannot seriously expect our users to
> >>> formulate a SPARQL query simply to find out the dbpedia URI for
> >>> Tim. What is the regexp I need to put in? (see below [1])
> >>> A foaf file: Well, Tim's dbpedia URI is probably in his foaf file
> >>> (although possibly none of Tim's URIs are in his foaf file), if I
> >>> can actually find the file; but for some reason I can't seem to
> >>> find Telemann's foaf file.
> >>>
> >>> If you are still doubting me, try finding a URI for Telemann in
> >>> dbpedia without using an external link, just by following stuff
> >>> from the home page. I managed to get a Telemann by using SPARQL
> >>> without a regexp (it times out on any regexp), but unfortunately I
> >>> get the asteroid.
> >>>
> >>> Again, my proposal:
> >>> *We should not permit any site to be a member of the Linked Data
> >>> cloud if it does not provide a simple way of finding URIs from
> >>> natural language identifiers.*
> >>> Otherwise we end up in a silo, and the world passes us by.
> >>>
> >>> Very best
> >>> Hugh
> >>>
> >>> [And since we have to take our own medicine, I have added a "Just
> >>> search" box right at the top level of all the rkbexplorer.com
> >>> domains, such as http://wordnet.rkbexplorer.com/ ]
> >>>
> >>> [1]
> >>> Dbtune finding of Telemann:
> >>> SELECT * WHERE {?s ?p ?name .
> >>> FILTER regex(?name, "Telemann$") }
> >>>
> >>> I tried
> >>> SELECT * WHERE {?s ?p ?name .
> >>> FILTER regex(?name, "telemann$", "i") }
> >>> first, but got no results - not sure why.
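[A guess at Hugh's "not sure why": regex() is only defined over string
literals, so bindings of ?name to URIs or typed literals make the filter
error out rather than match, and some endpoints of the time ignored or
rejected the case-insensitivity flag altogether. An untested variant that
casts to a string first and only scans labels, which also gives the store
far less to chew on than the unconstrained pattern ?s ?p ?name:]

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?name WHERE {
  ?s rdfs:label ?name .
  # Casting with str() makes language-tagged labels safe to regex over.
  FILTER regex(str(?name), "telemann$", "i")
}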
> >>>
> >>> [2]
> >>> <rant>
> >>> I cannot believe just how frustrating this stuff can be when you
> >>> really try to use it.
> >>> Because I looked at Sindice for telemann, I know that it is a word
> >>> in wordnet (http://sindice.com/search?q=Telemann reports loads of
> >>> http://wordnet.rkbexplorer.com/ links).
> >>> Great, he thinks, I can get a wordnet link from a "proper" wordnet
> >>> publisher (i.e. not me).
> >>> Goes to
> >>> http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
> >>> to find wordnet.
> >>> The link there is dead.
> >>> Strips off the last bit, to get to the home princeton wordnet page,
> >>> and clicks on the browser link I find - also dead.
> >>> Go back and look on the
> >>> http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
> >>> page, and find the link to http://esw.w3.org/topic/WordNet , but
> >>> that doesn't help.
> >>> So finally, I do the obvious - google "wordnet rdf".
> >>> Of course I get lots of pages saying how available it is, and how
> >>> exciting it is that we have it, and how it was produced; and
> >>> somewhere in there I find a link: "Wordnet-RDF/RDDL Browser" at
> >>> www.openhealth.org/RDDL/wnbrowse
> >>> Almost unable to contain myself with excitement, I click on the
> >>> link to find a text box, and with trembling hands I type "Telemann"
> >>> and click submit.
> >>> If I show you what I got, you can come some way to imagining my
> >>> devastation:
> >>> "Using org.apache.xerces.parsers.SAXParser
> >>> Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException:
> >>> White spaces are required between publicId and systemId.
> >>> org.xml.sax.SAXParseException: White spaces are required between
> >>> publicId and systemId."
> >>>
> >>> Does the emperor have any clothes at all?
> >>> </rant>
> >>
> >> --
> >> Andraz Tori, CTO
> >> Zemanta Ltd, London, Ljubljana
> >> www.zemanta.com
> >> mail: andraz@zemanta.com
> >> tel: +386 41 515 767
> >> twitter: andraz, skype: minmax_test
Received on Sunday, 8 February 2009 15:49:32 UTC