RE: Semantic Web pneumonia and the Linked Data flu (was: Can we lower the LD entry cost please (part 1)?) from Georgi Kobilarov on 2009-02-09 (public-lod@w3.org from February 2009)

From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
Date: Mon, 9 Feb 2009 15:13:22 +0100
To: "Juan Sequeda" <juanfederico@gmail.com>, "Hugh Glaser" <hg@ecs.soton.ac.uk>
Cc: "Yves Raimond" <yves.raimond@gmail.com>, <public-lod@w3.org>
Message-ID: <180C011CD4FF654AB4B73A9A5AD7472C0A4F7B@aristoteles.zuhause.lan>

Wait a second. Publishing Linked Data from relational databases? 

D2R-Server, Virtuoso Relational Mappings? Juan, you should be familiar
with that stuff...

 

Easy linking of data? Not quite solved yet. But wait for the Linked Data
Workshop at WWW2009...

 

Georgi

 

--

Georgi Kobilarov

Freie Universität Berlin

www.georgikobilarov.com

 

From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
Behalf Of Juan Sequeda
Sent: Monday, February 09, 2009 3:02 PM
To: Hugh Glaser
Cc: Yves Raimond; public-lod@w3.org
Subject: Re: Semantic Web pneumonia and the Linked Data flu (was: Can we
lower the LD entry cost please (part 1)?)

 

This is a point I have always brought up... it is hard! It is hard to
produce LD and hard to consume LD. No sane person will want to
maintain this. Yves just explained everything he goes through and it is
way too much! The majority of the data on the web is stored in RDBMSs.
Therefore, IMO, it is crucial to develop automatic ways of creating RDF
from relational data and linking it automatically. If this is not going
to happen, the whole web that runs on RDBMSs will not have an incentive
to create LD. This is my futuristic position.
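
As a rough illustration of what "automatic" RDF generation from a relational
table could look like, here is a minimal sketch. It is not D2R Server,
Virtuoso, or any existing tool; the base URI, the table, and the column names
are made up, and the mapping (one subject URI per row, one predicate per
column, every value a plain literal) is an assumption chosen for brevity.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI for minted resources


def escape_literal(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    text = str(value)
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def table_to_ntriples(conn, table, key):
    """Yield one N-Triples line per non-key, non-NULL cell of `table`.

    `table` and `key` are interpolated directly, so they are assumed to be
    trusted identifiers, not user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


# Tiny in-memory database standing in for a real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO artist VALUES (1, 'Nina Simone')")
triples = list(table_to_ntriples(conn, "artist", "id"))
```

This covers only the "creating RDF" half; linking the result to other
datasets automatically is the harder part and is not attempted here.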


Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org



On Mon, Feb 9, 2009 at 5:45 AM, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:


YES!
Now I don't have to spend my time writing Part 2.
(You did notice the (part 1) in the subject line?)
I was wondering if anyone would ask me what part 2 was.
Well, this was it.
Pretty exactly, and very nicely put.
Many thanks.

Despite what I have said about providing a search facility, I think we
need to ensure it is easy to join the LD, and make medium-size-ish (or
any) dataset publishers welcome, whatever the perceived paucity of
facilities or components.
Maybe I am thinking two opposite things at the same time? I hope not.


On 09/02/2009 10:40, "Yves Raimond" <yves.raimond@gmail.com> wrote:



Hello!

Just to jump on the last thread, something has been bugging me lately.
Please don't take the following as a rant against technologies such as
voiD, Semantic Sitemaps, etc.; these are extremely useful pieces of
technology - my rant is more about the order of our priorities, and
about the growing cost (and I insist on the word "growing") of
publishing linked data.

There are a lot of things the community asks linked data publishers to do
(semantic sitemaps, stats on the dataset homepages, example SPARQL
queries, voiD description, and now a search function), and I really tend
to think this makes linked data publishing much, much more
costly. Richard just mentioned that it should just take 5 minutes to
write such a search function, but 5 minutes + 5 minutes + 5 minutes +
... takes a long time. Maintaining a linked dataset is already *lots*
of work: server maintenance, dataset maintenance, minting of new
links, keeping up-to-date with the data sources, it *really* takes a
lot of time to do properly.
Honestly, I begin to be quite frustrated, as a publisher of about 10
medium-size-ish datasets. I really have the feeling the work I
invested in them is never enough, every time there seems to be
something missing to make all these datasets a "real" part of the
linked data cloud.

Now for the most tedious part of my rant :-) Most of the datasets
published in the linked data world atm are using open source
technologies (easy enough to send a patch over to the data publisher).
Some of them provide SPARQL endpoints. What's missing for the
advocates of new technologies or requirements to fulfill their goals
themselves? After all, that's what we all did with this project since
the beginning! If someone really wants a smallish search engine on top
of some dataset, wrapping a SPARQL query, or a call to the web service
that the dataset wraps should be enough. I don't see how the data
publisher is required for achieving that aim. The same thing holds for
voiD and other technologies. Detailed statistics are available on most
dataset homepages, which (I think) provides enough data to write a
good enough voiD description.
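
To make the "wrapping a SPARQL query" idea concrete, here is a minimal sketch
of a keyword search that a third party could build against an existing
endpoint. The endpoint URL in the usage note is hypothetical, matching on
rdfs:label is an assumption about how the dataset labels its resources, and
the keyword is passed to regex() as-is, so regex metacharacters in it are
interpreted as such.

```python
from urllib.parse import urlencode


def build_search_query(keyword, limit=10):
    """Return a SPARQL SELECT that matches `keyword` against rdfs:label."""
    # Escape backslashes and double quotes for the SPARQL string literal.
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  FILTER regex(str(?label), "{escaped}", "i")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_search_url(endpoint, keyword, limit=10):
    """Return a GET URL that sends the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": build_search_query(keyword, limit)})
```

For example, build_search_url("http://example.org/sparql", "simone") gives a
URL that can be fetched with any HTTP client; nothing here needs to run on
the data publisher's side.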

To sum up, I am just increasingly concerned that we are building
requirements on top of requirements for the sake of lowering an "LD
entry cost", whereas I have the feeling that this cost is really
getting higher and higher... And all that doesn't make the data more linked
:-)

Cheers!
y

Received on Monday, 9 February 2009 14:14:06 UTC