Re: Silk - Link Discovery Framework Version 2.4 release

From: Mischa Tuffield <mmt04r@ecs.soton.ac.uk>
Date: Wed, 1 Jun 2011 20:00:02 +0100
Cc: paoladimaio10@googlemail.com, Sören Auer <auer@informatik.uni-leipzig.de>, Robert Isele <robertisele@googlemail.com>, Linking Open Data <public-lod@w3.org>, SW-forum Web <semantic-web@w3.org>, marta.nagy-rothengass@ec.europa.int, "VAN ORANJE-NASSAU Constantijn (CAB-KROES)" <Constantijn.Van-Oranje-Nassau@ec.europa.eu>
Message-ID: <EMEW3|26d6777921be59ba3d9e539eda6c5ec9n50K1S06mmt04r|ecs.soton.ac.uk|C15DB387-BA2C-48D1-82A0-EC39D7D7FBAD@ecs.soton.ac.uk>
To: adam.saltiel@gmail.com
Hello All, 

I have little to no idea about academic funding processes. I was lucky enough to be exposed to such things when I was a PhD student, but I left academia before wrapping up my studies and now work in the private sector. Nonetheless, I feel a part of the semantic web community and effort.

A simple web search for the LOD2 project led me to the following two pages:

http://cordis.europa.eu/fetch?CALLER=PROJ_ICT&ACTION=D&CAT=PROJ&RCN=95562 

which in turn links to their 2010 annual report: 

http://ec.europa.eu/information_society/apps/projects/logos/3/257943/080/publishing/readmore/LOD2_Annual_Report_2010_final.pdf

Again, I am not sure what transparency with regard to EU funding and project outputs has to do with the Linked Data or Semantic Web interest group mailing lists. I too am all for transparency; it is definitely not a bad thing. But why should people who are new to our community have to experience such finger-pointing on mailing lists about technology? Surely there is an "EU funding mailing list" which would be a more suitable place for such comments!

A final 2 cents from me: I am a firm believer in engineering and in how important it is to science and progress in general. I see nothing wrong with a project funded by taxpayer money which sets out to build upon and improve (through solid engineering) the semantic web technology and infrastructure. IMHO good solid engineering is key to the success of any new technology.

Regards, 

Mischa *goes back to lurking now ....

On 1 Jun 2011, at 18:58, adam.saltiel@gmail.com wrote:

> On the other hand I completely agree with this tack. Academic research goals can be opaque, and that can leave the feeling that there are some funding shenanigans going on while, if in the public domain, there is frustration for exactly the reasons you draw attention to: not enough to go on to form an independent evaluation.
> Sent using BlackBerry® from Orange
> 
> -----Original Message-----
> From: Paola Di Maio <paola.dimaio@gmail.com>
> Sender: semantic-web-request@w3.org
> Date: Wed, 1 Jun 2011 18:33:31 
> To: Sören Auer<auer@informatik.uni-leipzig.de>
> Reply-To: paoladimaio10@googlemail.com
> Cc: Mischa Tuffield<mmt04r@ecs.soton.ac.uk>; Robert Isele<robertisele@googlemail.com>; <public-lod@w3.org>; SW-forum<semantic-web@w3.org>; <marta.nagy-rothengass@ec.europa.int>; VAN ORANJE-NASSAU Constantijn (CAB-KROES)<Constantijn.Van-Oranje-Nassau@ec.europa.eu>
> Subject: Re: Silk - Link Discovery Framework Version 2.4 release
> 
> Sören
> 
> Thanks for additional info and clarification.
> 
> Since most of the LOD2 technology stack is made of software and tools
> already existing, (some of which developed with public funding from
> previous calls), it would be nice to have a clearer representation (a
> table perhaps?) of what LOD2 is expected to yield, and how this is
> going to be achieved.
> 
> For example, what functionalities/features are going to be new, and
> what requirements do they fulfil? (I am sure there is a rationale, I
> am just saying it is not openly shared)
> 
> This would help the public understand and evaluate how much
> development time/resources are required to deliver the new
> functionalities, and make sure that these correspond to real
> requirements, and increase public confidence in the work being done.
> 
> At the moment this is not immediately clear (to me, at least) from the
> deliverables tab
> (ie, the LOD2 project page does not provide this kind of knowledge,
> but fragmentary information that does not relate to the overall
> project schedule, for example)
> 
> Also, following up your keynote @WIMS, which I enjoyed, you talked
> mostly about already existing LOD technologies, and said indeed that
> LOD2 is about providing integration of various existing layers and
> tools. But no hints as to 'how' this is going to be achieved (the
> interesting research issues related to systems integration, for
> example),  or if the approach being followed is feasible at all
> 
> A project proposal overview/project schedule document, or a summary
> thereof, may be helpful. (just a suggestion)
> 
> The good will of the community can only be increased by transparent
> project governance :-)
> 
> At your convenience,
> 
> Cheers
> 
> PDM
> 
> 
> 
> 
> 
>> As you can easily see in the Version History substantial development has
>> happened with SILK since the LOD2 project was started:
>> 
>> http://www4.wiwiss.fu-berlin.de/bizer/silk/#history
>> 
>> You are right, SILK existed before LOD2 as did some other tools which
>> are further developed during the LOD2 project. However, there are also a
>> number of new developments such as DBpedia Spotlight [1], the Digital
>> Agenda Scoreboard [2] or LIMES [3].
>> 
>> The main development of LOD2 is actually an integrated stack of new
>> *and* substantially improved tools for Linked Data life-cycle
>> management. We are currently heavily working on the stack, which will be
>> released in September.
>> 
>> I understand your concern wrt. the efficiency of research, but as LOD2
>> coordinator I can assure you, that we do our best to give the European
>> taxpayer as much bang for the buck as possible. In fact you can easily
>> keep track of all the LOD2 activities on the LOD2 blog, publications and
>> deliverables posted at: http://lod2.eu
>> 
>> Best,
>> 
>> Sören
>> 
>> [1] http://dbpedia.org/spotlight
>> [2] http://ec.europa.eu/information_society/digital-agenda/scoreboard/graphs/index_en.htm
>> [3] http://aksw.org/Projects/LIMES/
>> 
>> PS: The two 2010 dates you spotted on top in the SILK version history
>> were clearly typos and are now corrected.
>> 
>> Am 01.06.2011 17:48, schrieb Paola Di Maio:
>>> Thanks Mischa
>>> 
>>> I agree
>>> 
>>> I took a screenshot, attached, for future reference.
>>> 
>>> Since SILK according to the information provided in the link was
>>> released last year, and LOD2 funding started after the release date, I
>>> am just asking for a clarification (or clearer project information?).
>>> 
>>> It is important for these clarifications to be made in public fora,
>>> and that people concerned are kept in the loop, for the benefit of
>>> everyone involved
>>> 
>>> Looking forward to learning more about SILK and LOD2!
>>> 
>>> Cheers
>>> 
>>> PDM
>>> 
>>> On Wed, Jun 1, 2011 at 4:35 PM, Mischa Tuffield <mmt04r@ecs.soton.ac.uk> wrote:
>>>> Hi,
>>>> I don't usually write to this list, and have no idea what SILK is about
>>>> (Sorry SILK people!), but I found the below email to be incredibly harsh.
>>>> Look at the git history of the project (which was 1 click away from the email
>>>> I am referring to below!), it does seem to be in active development, with a
>>>> number of committers:
>>>> http://www.assembla.com/code/silk/git/node/logs?page=1  (apache license 2.0)
>>>> And the page DOES seem to reflect this:
>>>> http://www4.wiwiss.fu-berlin.de/bizer/silk/
>>>> Perhaps there was a bug in the HTML(?), I don't know - but I would give
>>>> people the benefit of the doubt before pointing fingers in public. I do
>>>> think a personal email to Robert would probably have sufficed, but perhaps I
>>>> am just that way inclined.
>>>> I have recently unsubscribed from a few of the SW based mailing lists
>>>> because of trolling and people being incredibly rude - and I hope I don't
>>>> have to remove myself from any others. The Semantic Web community is full of
>>>> a great number of nice, helpful, intelligent people, and I find it a
>>>> pleasure and an honour to be involved with this international community of
>>>> awesome....  Lots of people put lots of time and effort into writing open
>>>> specs and open-source code - and I don't see how finger pointing helps
>>>> anyone!
>>>> Mischa
>>>> http://mmt.me.uk/
>>>> On 1 Jun 2011, at 16:16, Paola Di Maio wrote:
>>>> 
>>>> Robert
>>>> 
>>>> thanks a lot for the update, I look forward to trying it out
>>>> 
>>>> I see from this page
>>>> http://www4.wiwiss.fu-berlin.de/bizer/silk/
>>>> 
>>>> that SILK V 2.4, announced on this list today was actually released
>>>> last year: See the snippet below
>>>> 
>>>> 2010-06-01: Version 2.4 released including the new Silk Workbench, a
>>>> web application which guides the user through the process of
>>>> interlinking different data sources.
>>>> 
>>>> I also seem to understand from the project page that much of the LOD2
>>>> software consists of tools developed in previous years (ie, nothing new!)
>>>> 
>>>> Am I reading something wrong?
>>>> 
>>>> In the past decade or so, millions of euros of taxpayers' money have
>>>> been paid for projects for which the codebase had already been
>>>> developed, either by funded projects from prior calls (ie, for which
>>>> the taxpayer had already paid) or by other companies.
>>>> 
>>>> In essence, as has already been pointed out, the public has been
>>>> paying for the same semantic web tools to be rebranded over and over,
>>>> and each time it has cost lots of public money, and each time it has
>>>> not delivered the semantic web functionality the public is waiting for
>>>> (ie, a useable web based application layer)
>>>> 
>>>> Since LOD2 has become a funded EU project in September 2010, I would
>>>> be grateful if you could explain what part of the tool/functionality
>>>> has been developed after September 2010, and for what part of this
>>>> development the public funding is being used.
>>>> 
>>>> 
>>>> Thanks a lot in advance
>>>> 
>>>> PDM
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, Jun 1, 2011 at 3:35 PM, Robert Isele <robertisele@googlemail.com>
>>>> wrote:
>>>> 
>>>> Hi all,
>>>>
>>>> we are happy to announce version 2.4 of the Silk - Link Discovery
>>>> Framework for the Web of Data.
>>>>
>>>> The central idea of the Web of Data is to interlink data items using
>>>> RDF links. However, in practice most data sources are not sufficiently
>>>> interlinked with related data sources. The Silk Link Discovery
>>>> Framework addresses this problem by providing tools to generate links
>>>> between data items based on user-provided link specifications. It can
>>>> be used by data publishers to generate links between datasets as well
>>>> as by Linked Data consumers to augment Web data with additional RDF
>>>> links.
>>>>
>>>> Link specifications can either be written manually or developed using
>>>> the new Silk Workbench. The Silk Workbench is a web application which
>>>> guides the user through the process of interlinking different data
>>>> sources. It’s being shipped with the 2.4 version of Silk.
>>>>
>>>> The Silk Workbench offers the following features:
>>>>
>>>> - It enables the user to manage different sets of data sources and
>>>> linking tasks.
>>>> - It offers a graphical editor which enables the user to easily create
>>>> and edit link specifications.
>>>> - As finding good linking heuristics is usually an iterative
>>>> process, the Silk Workbench makes it possible for the user to quickly
>>>> evaluate the links which are generated by the current link
>>>> specification.
>>>> - It allows the user to create and edit a set of reference links used
>>>> to evaluate the current link specification.
>>>>
>>>> The Silk Link Discovery Framework includes three applications to
>>>> execute the link specifications which address different use cases:
>>>>
>>>> 1. Silk Single Machine is used to generate RDF links on a single
>>>> machine. The datasets that should be interlinked can either reside on
>>>> the same machine or on remote machines which are accessed via the
>>>> SPARQL protocol. Silk Single Machine provides multithreading and
>>>> caching. In addition, the performance can be further enhanced using an
>>>> optional blocking feature.
>>>>
>>>> 2. Silk Server can be used as an identity resolution component within
>>>> applications that consume Linked Data from the Web. Silk Server
>>>> provides an HTTP API for matching instances from an incoming stream of
>>>> RDF data while keeping track of known entities. It can be used for
>>>> instance together with a Linked Data crawler to populate a local
>>>> duplicate-free cache with data from the Web.
>>>>
>>>> 3. Silk MapReduce is used to generate RDF links between datasets using
>>>> a cluster of multiple machines. Silk MapReduce is based on Hadoop and
>>>> can for instance be run on Amazon Elastic MapReduce. Silk MapReduce
>>>> enables Silk to scale out to very big datasets by distributing the
>>>> link generation to multiple machines.
>>>>
>>>> More information about the Silk framework, the Silk Link Specification
>>>> Language, as well as several examples that demonstrate how Silk is
>>>> used to set links between different data sources in the LOD cloud is
>>>> found at:
>>>>
>>>> http://www4.wiwiss.fu-berlin.de/bizer/silk/
>>>>
>>>> The Silk framework is provided under the terms of the Apache License,
>>>> Version 2.0 and can be downloaded from
>>>>
>>>> http://www4.wiwiss.fu-berlin.de/bizer/silk/releases/
>>>>
>>>> The development of Silk was supported by Vulcan Inc. as part of its
>>>> Project Halo (www.projecthalo.com) and by the EU FP7 project LOD2 -
>>>> Creating Knowledge out of Interlinked Data (http://lod2.eu/, Ref. No.
>>>> 257943).
>>>>
>>>> Thanks to Christian Becker, Michal Murawicki and Andrea Matteini for
>>>> contributing to the Silk Workbench.
>>>>
>>>> Happy linking,
>>>>
>>>> Robert Isele, Anja Jentzsch and Chris Bizer
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
> 


Received on Wednesday, 1 June 2011 19:02:01 UTC
