Re: Semantic Web pneumonia and the Linked Data flu (was: Can we lower the LD entry cost please (part 1)?)

Yves, I share your views.

The point about RDF is that it is self-descriptive, so one can count
how many of this, how many of that... and you have to do it anyway if you
provide an index (a search index, a URI index, ...).

So while voiD can be useful as a way to exchange these stats, e.g.
for distributed query optimization, it is a very technical thing and
not something I'd "ask" publishers to expose. (Sindice will hopefully
produce voiD for all datasets soon.)


SITEMAPS, however, are a different story.  There is no other way one can
distinguish between a site which publishes a single RDF model and one
that publishes plenty (e.g. many FOAF files).

So if you DO publish as LOD, you really should have a Semantic Sitemap
saying that this is the case. (Or else? Or else we'll have to do some
statistical guessing: "mm, we think it's LOD" - bah.) (Plus there would
be no way to find the SPARQL endpoint.)
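Concretely, such a sitemap is just a regular sitemap.xml carrying the Semantic Sitemaps extension. A minimal sketch (all URLs are placeholders, and the element names are as I recall them from the extension schema - check the spec before relying on them):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <!-- One sc:dataset block per dataset published on the site. -->
  <sc:dataset>
    <sc:datasetLabel>My example dataset</sc:datasetLabel>
    <sc:linkedDataPrefix>http://example.org/resource/</sc:linkedDataPrefix>
    <sc:sparqlEndpointLocation>http://example.org/sparql</sc:sparqlEndpointLocation>
    <sc:dataDumpLocation>http://example.org/dumps/all.rdf</sc:dataDumpLocation>
  </sc:dataset>
</urlset>
```

That one file tells a crawler where the linked data lives, where the endpoint is, and where to get a dump - exactly the things statistical guessing cannot recover.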

Giovanni

On Mon, Feb 9, 2009 at 10:40 AM, Yves Raimond <yves.raimond@gmail.com> wrote:
>
> Hello!
>
> Just to jump on the last thread, something has been bugging me lately.
> Please don't take the following as a rant against technologies such as
> voiD, Semantic Sitemaps, etc. - these are extremely useful pieces of
> technology. My rant is more about the order of our priorities, and
> about the growing cost (and I insist on the word "growing") of
> publishing linked data.
>
> There are a lot of things the community asks linked data publishers to do
> (semantic sitemaps, stats on the dataset homepages, example SPARQL
> queries, voiD descriptions, and now a search function), and I really tend
> to think this makes linked data publishing much, much more
> costly. Richard just mentioned that it should take just 5 minutes to
> write such a search function, but 5 minutes + 5 minutes + 5 minutes +
> ... takes a long time. Maintaining a linked dataset is already *lots*
> of work: server maintenance, dataset maintenance, minting new
> links, keeping up to date with the data sources - it *really* takes a
> lot of time to do properly.
> Honestly, I am beginning to get quite frustrated, as a publisher of about 10
> medium-size-ish datasets. I really have the feeling that the work I
> invested in them is never enough; every time, there seems to be
> something missing to make all these datasets a "real" part of the
> linked data cloud.
>
> Now for the most tedious part of my rant :-) Most of the datasets
> published in the linked data world atm are built on open source
> technologies (so it's easy enough to send a patch over to the data
> publisher). Some of them provide SPARQL endpoints. What's stopping the
> advocates of new technologies or requirements from fulfilling their
> goals themselves? After all, that's what we have all done with this
> project since the beginning! If someone really wants a smallish search
> engine on top of some dataset, wrapping a SPARQL query, or a call to
> the web service that the dataset wraps, should be enough. I don't see
> why the data publisher is needed to achieve that aim. The same thing
> holds for voiD and other technologies. Detailed statistics are
> available on most dataset homepages, which (I think) provide enough
> data to write a good enough voiD description.
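[For what it's worth, such a search wrapper really can be tiny. A sketch in Python using only the standard library - the endpoint URL in the usage note is a placeholder, and the label-regex query is just one illustrative way to do "search" over SPARQL:]

```python
import json
import urllib.parse
import urllib.request

def build_search_query(term, limit=20):
    # Escape backslashes and quotes so the term is safe inside a SPARQL string literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?s ?label WHERE {\n"
        "  ?s rdfs:label ?label .\n"
        f'  FILTER regex(str(?label), "{escaped}", "i")\n'
        f"}} LIMIT {limit}"
    )

def search(endpoint, term):
    # Ask the endpoint for SPARQL JSON results and return (uri, label) pairs.
    url = endpoint + "?" + urllib.parse.urlencode({"query": build_search_query(term)})
    req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(req) as resp:
        results = json.load(resp)
    return [(b["s"]["value"], b["label"]["value"])
            for b in results["results"]["bindings"]]

# Usage (endpoint is a placeholder):
# for uri, label in search("http://example.org/sparql", "Paris"):
#     print(uri, label)
```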
>
> To sum up, I am just increasingly concerned that we are building
> requirements on top of requirements for the sake of lowering the "LD
> entry cost", whereas I have the feeling that this cost is getting
> higher and higher... And none of that makes the data more linked
> :-)
>
> Cheers!
> y
>
>

Received on Monday, 9 February 2009 11:05:39 UTC