- From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
- Date: Sun, 8 Feb 2009 16:48:47 +0100
- To: "Michael Hausenblas" <michael.hausenblas@deri.org>, "Andraz Tori" <andraz@zemanta.com>, "Hugh Glaser" <hg@ecs.soton.ac.uk>
- Cc: "Linked Data community" <public-lod@w3.org>
Hi Michael,

> Looking forward to finding and using a respective voiD description
> for DBpedia ;)

Very sorry for being picky here, but that is just such a good example of my
argument: why should I invest my time in providing a voiD description for
DBpedia? Of course, I would be doing the community a favor, but I don't see
any other reason. If there were a great application that consumed voiD and
nicely displayed, say, Musicbrainz and Geonames data, but displayed no
DBpedia data because DBpedia's voiD description is missing, then I would
have an incentive to provide one.

No offence, I'm just trying to emphasize my point. Sure, you'll get me with
the community-favor approach, but I strongly doubt that you will get
others...

Cheers,
Georgi

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com
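[For concreteness, a minimal voiD description for DBpedia might look
roughly like the sketch below. It uses terms from the voiD guide Michael
links to in his mail further down; the license URI, the example resource,
and the linkset target are illustrative placeholders, not authoritative
values.]

@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .

# What the dataset is about, where it lives, and under which license.
<http://dbpedia.org/void/Dataset> a void:Dataset ;
    dcterms:title "DBpedia" ;
    dcterms:description "RDF data extracted from Wikipedia" ;
    foaf:homepage <http://dbpedia.org/> ;
    dcterms:license <http://www.gnu.org/copyleft/fdl.html> ;  # placeholder
    void:sparqlEndpoint <http://dbpedia.org/sparql> ;
    void:exampleResource <http://dbpedia.org/resource/Berlin> ;
    void:vocabulary <http://xmlns.com/foaf/0.1/> .

# How the dataset is interlinked with another one (target is a placeholder).
<http://dbpedia.org/void/DBpedia-Geonames> a void:Linkset ;
    void:target <http://dbpedia.org/void/Dataset> ,
                <http://sws.geonames.org/> ;
    void:linkPredicate owl:sameAs .

[An application in Georgi's scenario could decide from a file like this
alone whether and how to display DBpedia data, which is exactly the
consumer-side incentive the thread is asking for.]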
> -----Original Message-----
> From: Michael Hausenblas [mailto:michael.hausenblas@deri.org]
> Sent: Sunday, February 08, 2009 4:29 PM
> To: Georgi Kobilarov; Andraz Tori; Hugh Glaser
> Cc: Linked Data community
> Subject: Re: Can we lower the LD entry cost please (part 1)?
>
> Georgi, All,
>
> > If we don't reward the Linked Data publishers who provide clean data
> > and penalize those who don't, there will never be an incentive to do
> > it right.
>
> I couldn't agree more. I have contemplated that recently (p16 in [1])
> and, yes, one goal of voiD is to help publishers concisely express what
> their dataset is about, under which license it is available, which
> vocabularies are used, or how many triples one can expect [2], and on
> the other hand how the dataset is linked with other datasets [3].
>
> Looking forward to finding and using a respective voiD description
> for DBpedia ;)
>
> Cheers,
> Michael
>
> [1] http://www.talis.com/nodalities/pdf/nodalities_issue4.pdf
> [2] http://rdfs.org/ns/void-guide#sec_1_Describing_Datasets
> [3] http://rdfs.org/ns/void-guide#sec_2_Describing_Dataset_Interlink
>
> --
> Dr. Michael Hausenblas
> DERI - Digital Enterprise Research Institute
> National University of Ireland, Lower Dangan,
> Galway, Ireland, Europe
> Tel. +353 91 495730
> http://sw-app.org/about.html
>
> > From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
> > Date: Sun, 8 Feb 2009 15:56:23 +0100
> > To: Andraz Tori <andraz@zemanta.com>, Hugh Glaser <hg@ecs.soton.ac.uk>
> > Cc: Linked Data community <public-lod@w3.org>
> > Subject: RE: Can we lower the LD entry cost please (part 1)?
> > Resent-From: Linked Data community <public-lod@w3.org>
> > Resent-Date: Sun, 08 Feb 2009 14:57:12 +0000
> >
> > Hi Andraz,
> >
> > I disagree: those two goals are not so completely different that
> > different groups should address them separately. I had a delightful
> > conversation about this with Andreas Harth of SWSE a week ago in
> > Berlin. Search engines can't clean up other people's mess; it's even
> > harmful if they try. Data providers need incentives to provide clean
> > data. See the Google example: Google started indexing the web, and
> > the webpages with clean markup and site structure showed up in its
> > search. And Google's search provided real benefit to end-users.
> >
> > Hence web publishers started to do SEO (search engine optimization),
> > so that their stuff shows up in Google as well (or ranked higher). If
> > we don't reward the Linked Data publishers who provide clean data and
> > penalize those who don't, there will never be an incentive to do it
> > right.
> >
> > Cheers,
> > Georgi
> >
> > --
> > Georgi Kobilarov
> > Freie Universität Berlin
> > www.georgikobilarov.com
> >
> >> -----Original Message-----
> >> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org]
> >> On Behalf Of Andraz Tori
> >> Sent: Saturday, February 07, 2009 4:02 PM
> >> To: Hugh Glaser
> >> Cc: public-lod@w3.org
> >> Subject: Re: Can we lower the LD entry cost please (part 1)?
> >>
> >> Hi Hugh,
> >>
> >> I think you are mixing two completely different goals.
> >>
> >> Why can't one set of people provide the data while another set of
> >> people provides search technologies over that data?
> >>
> >> It takes two completely different technologies, processes, etc.
> >>
> >> BTW: an easy way to search is also to write a meaningful sentence or
> >> paragraph (using the phrase/entity/concept) and put it into Zemanta
> >> or Calais. You will usually get properly disambiguated URIs back.
> >>
> >> bye
> >> andraz
> >>
> >> On Sat, 2009-02-07 at 13:23 +0000, Hugh Glaser wrote:
> >>> My proposal:
> >>> *We should not permit any site to be a member of the Linked Data
> >>> cloud if it does not provide a simple way of finding URIs from
> >>> natural language identifiers.*
> >>>
> >>> Rationale:
> >>> One aspect of our Linking Data (not to mention our Linking Open
> >>> Data) world is that we want people to link to our data - that is, I
> >>> have published some stuff about something, with a URI, and I want
> >>> people to be able to use that URI.
> >>>
> >>> So my question to you, the publisher, is: "How easy is it for me to
> >>> find the URI your users want?"
> >>>
> >>> My experience suggests it is not always very easy.
> >>> What is required at the minimum, I suggest, is a text search, so
> >>> that if I have a (boring string version of a) name that refers in
> >>> my mind to something, I can hope to find an (exciting Linked Data)
> >>> URI of that thing. I call this a projection from the Web to the
> >>> Semantic Web; rdfs:label or equivalent usually provides the other
> >>> one.
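[The minimal lookup Hugh describes can be sketched as a single SPARQL
query against a dataset's endpoint - untested, and assuming the dataset
attaches rdfs:label (or an equivalent) to its resources:]

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Project a plain-text name onto the Linked Data URIs carrying that label.
SELECT DISTINCT ?uri WHERE {
  ?uri rdfs:label ?name .
  # str() strips language tags, so labels like "..."@en match as well.
  FILTER (str(?name) = "Georg Philipp Telemann")
}

[Unlike a regex scan over all literals, an exact-match label lookup like
this can be answered from an index, which is why a publisher can afford to
put it behind a plain search box.]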
> >>> At the risk of being seen as critical of the amazing efforts of
> >>> all my colleagues (if not also myself), this is rarely an easy
> >>> thing to do.
> >>>
> >>> Some recent experiences:
> >>> OpenCalais: as in my previous message on this list, I tried hard to
> >>> find a URI for Tim, but failed.
> >>> dbtune: Saw a Twine message about dbtune, trundled over there, and
> >>> tried to find a URI for Telemann, but failed.
> >>> dbpedia: wanted Tim again. After clicking on a few web pages, none
> >>> of which seemed to provide a search facility, I resorted to my
> >>> usual method: look it up in wikipedia and then hack the URI and
> >>> hope it works in dbpedia.
> >>> (Sorry to name specific sites, guys, but I needed a few examples.
> >>> And I am only asking for a little more, so that the fruits of your
> >>> amazing labours can be more widely appreciated!)
> >>> wordnet: [2] below
> >>>
> >>> So I have access to Linked Data sites that I know (or at least
> >>> strongly suspect) have URIs I might want, but I can't find them.
> >>> How on earth do we expect your average punter to join this world?
> >>>
> >>> What have I missed?
> >>> Searching, such as Sindice: Well yes, but should I really have to
> >>> go off to a search engine to find a dbpedia URI? And when I look up
> >>> "Telemann dbtune" I don't get any results. And I wanted the dbtune
> >>> link, not some other link.
> >>> Did I miss some links on web pages? Quite probably, but the basic
> >>> problem still stands.
> >>> SPARQL: Well, yes. But we cannot seriously expect our users to
> >>> formulate a SPARQL query simply to find out the dbpedia URI for
> >>> Tim. What is the regexp I need to put in? (see below [1])
> >>> A foaf file: Well, Tim's dbpedia URI is probably in his foaf file
> >>> (although possibly none of Tim's URIs are in his foaf file), if I
> >>> can actually find the file; but for some reason I can't seem to
> >>> find Telemann's foaf file.
> >>>
> >>> If you are still doubting me, try finding a URI for Telemann in
> >>> dbpedia without using an external link, just by following stuff
> >>> from the home page. I managed to get a Telemann by using SPARQL
> >>> without a regexp (it times out on any regexp), but unfortunately I
> >>> get the asteroid.
> >>>
> >>> Again, my proposal:
> >>> *We should not permit any site to be a member of the Linked Data
> >>> cloud if it does not provide a simple way of finding URIs from
> >>> natural language identifiers.*
> >>> Otherwise we end up in a silo, and the world passes us by.
> >>>
> >>> Very best
> >>> Hugh
> >>>
> >>> [And since we have to take our own medicine, I have added a "Just
> >>> search" box right at the top level of all the rkbexplorer.com
> >>> domains, such as http://wordnet.rkbexplorer.com/ ]
> >>>
> >>> [1]
> >>> Dbtune finding of Telemann:
> >>> SELECT * WHERE {?s ?p ?name .
> >>> FILTER regex(?name, "Telemann$") }
> >>>
> >>> I tried
> >>> SELECT * WHERE {?s ?p ?name .
> >>> FILTER regex(?name, "telemann$", "i") }
> >>> first, but got no results - not sure why.
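[A guess at Hugh's "not sure why": regex() is only defined over string
literals, so bindings of ?name to URIs or typed literals make the filter
error out rather than match, and some endpoints of the time ignored or
rejected the case-insensitivity flag altogether. An untested variant that
casts to a string first and only scans labels, which also gives the store
far less to chew on than the unconstrained pattern ?s ?p ?name:]

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?name WHERE {
  ?s rdfs:label ?name .
  # Casting with str() makes language-tagged labels safe to regex over.
  FILTER regex(str(?name), "telemann$", "i")
}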
> >>>
> >>> [2]
> >>> <rant>
> >>> I cannot believe just how frustrating this stuff can be when you
> >>> really try to use it.
> >>> Because I looked at Sindice for telemann, I know that it is a word
> >>> in wordnet (http://sindice.com/search?q=Telemann reports loads of
> >>> http://wordnet.rkbexplorer.com/ links).
> >>> Great, he thinks, I can get a wordnet link from a "proper" wordnet
> >>> publisher (i.e. not me).
> >>> Goes to
> >>> http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
> >>> to find wordnet.
> >>> The link there is dead.
> >>> Strips off the last bit, to get to the home princeton wordnet page,
> >>> and clicks on the browser link I find - also dead.
> >>> Go back and look on the
> >>> http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
> >>> page, and find the link to http://esw.w3.org/topic/WordNet , but
> >>> that doesn't help.
> >>> So finally, I do the obvious - google "wordnet rdf".
> >>> Of course I get lots of pages saying how available it is, and how
> >>> exciting it is that we have it, and how it was produced; and
> >>> somewhere in there I find a link: "Wordnet-RDF/RDDL Browser" at
> >>> www.openhealth.org/RDDL/wnbrowse
> >>> Almost unable to contain myself with excitement, I click on the
> >>> link to find a text box, and with trembling hands I type "Telemann"
> >>> and click submit.
> >>> If I show you what I got, you can come some way to imagining my
> >>> devastation:
> >>> "Using org.apache.xerces.parsers.SAXParser
> >>> Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException:
> >>> White spaces are required between publicId and systemId.
> >>> org.xml.sax.SAXParseException: White spaces are required between
> >>> publicId and systemId."
> >>>
> >>> Does the emperor have any clothes at all?
> >>> </rant>
> >>
> >> --
> >> Andraz Tori, CTO
> >> Zemanta Ltd, London, Ljubljana
> >> www.zemanta.com
> >> mail: andraz@zemanta.com
> >> tel: +386 41 515 767
> >> twitter: andraz, skype: minmax_test
Received on Sunday, 8 February 2009 15:49:32 UTC