Re: Best way for exposing Linked Open Data. Wrapper vs scrape from Luca Matteis on 2013-05-28 (public-lod@w3.org from May 2013)

From: Luca Matteis <lmatteis@gmail.com>
Date: Tue, 28 May 2013 11:22:36 +0200
To: j.jakobitsch@semantic-web.at
Cc: Linked Data community <public-lod@w3.org>
Message-ID: <CALp38EPgJXapB16aNUBzCH8z=scL7bD=VTunHyOU80azf99G8w@mail.gmail.com>

Thanks, Jürgen. Are you at #eswc2013? Maybe we can talk about this face to
face :-)
But anyway my two points were related to (i) letting my users do the work
of publishing LOD or (ii) doing the work myself by aggregating their data.

Cheers,
Luca


On Tue, May 28, 2013 at 11:07 AM, Jürgen Jakobitsch SWC <
j.jakobitsch@semantic-web.at> wrote:

> :-) experience shows that the technical aspect of your endeavor is
> probably the simplest and you'll have a lot of time to think about it
> until every group settles on a uri pattern and the vocabularies to be
> used unless you go north-korean and impose such things...
> when you have a couple of datasets the probability of one single
> solution that fits all parties is very low.
> such decisions depend on a lot of non-technical factors like willingness
> to move to the rdf/semweb/linkeddata world, are there current workflows
> that groups of people are using.
>
> technically it depends on things like dataset size, use cases (is it
> enough to simply make this data dereferenceable, is there need to make
> the data queryable (what kinds of queries, there are certain parts that
> are quite difficult to implement with sparql to sql, limit and top
> in certain cases))
>
> i guess the => fastest <= (not necessarily the best) way would be to
> create dumps (custom scripts, rdb2rdf) and put these into a virtuoso or
> a triple store of your choice in combination with tools like
> "pubby" [2]. then use "limes" [1] or another tool to create links to other
> lod sources. that way the change of people's behaviour is not a
> requirement for success.
>
> wkr jürgen
>
> [1] http://aksw.org/Projects/LIMES.html
> [2] http://wifo5-03.informatik.uni-mannheim.de/pubby/
>
> On Tue, 2013-05-28 at 10:18 +0200, Luca Matteis wrote:
> > Here's my scenario: I have several different datasets. Most in MySQL
> > databases. Some in PostgreSQL. Others in MS Access. Many in CSV. Each
> > one of these datasets is maintained by its own group of people.
> >
> >
> > Now, my end goal is to have all these datasets published as 5-star
> > Linked Open Data. But I am in doubt between these two solutions:
> >
> >
> > 1) Give a generic wrapper tool to each of these groups of people, that
> > would basically convert their datasets to RDF, and allow them to
> > publish this data as LOD automatically. This tool would allow them to
> > publish LOD on their own, using their own server (does such a generic
> > tool even exist? Can it even be built?).
> >
> >
> > 2) Scrape these datasets, which are at times simply published on the
> > Web as HTML paginated tables, or published as dumps on their server,
> > for example a .CSV dump of their entire database. Then I would
> > aggregate all these various datasets myself, and publish them as
> > Linked Data.
> >
> >
> > Pros and cons for each of these methods? Any other ideas?
> >
> >
> > Thanks!
>
> --
> | Jürgen Jakobitsch,
> | Software Developer
> | Semantic Web Company GmbH
> | Mariahilfer Straße 70 / Neubaugasse 1, Top 8
> | A - 1070 Wien, Austria
> | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22
>
> COMPANY INFORMATION
> | web       : http://www.semantic-web.at/
> | foaf      : http://company.semantic-web.at/person/juergen_jakobitsch
> PERSONAL INFORMATION
> | web       : http://www.turnguard.com
> | foaf      : http://www.turnguard.com/turnguard
> | g+        : https://plus.google.com/111233759991616358206/posts
> | skype     : jakobitsch-punkt
> | xmlns:tg  = "http://www.turnguard.com/turnguard#"
>
>

Received on Tuesday, 28 May 2013 09:23:11 UTC
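
For readers wondering what the "custom scripts" route to an RDF dump could look like for the CSV datasets mentioned in the thread, here is a minimal sketch. It is an illustration only, not something posted by either author. The `http://example.org/` namespace, the `vocab/` predicate pattern, the function names and the sample data are all assumptions; a real deployment would use whatever URI pattern and vocabularies the groups agree on. It uses only the Python standard library and emits plain string literals in N-Triples syntax.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the thread


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, dataset: str, id_column: str) -> list[str]:
    """Turn each CSV row into one resource, with one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Subject URI is built from the dataset name and the row's identifier.
        subject = f"<{BASE}{dataset}/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the identifier itself, empty cells, and unnamed extra cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}vocab/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = "id,name,country\n1,Oryza sativa,IT\n2,Zea mays,\n"
    for line in csv_to_ntriples(sample, "crops", "id"):
        print(line)
```

The resulting lines could then be loaded into a triple store of your choice. Link generation and URI dereferencing would still need separate tooling such as the tools named in the thread.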