Re: Reproducible software experiments through semantic configurations from Hugh Glaser on 2017-05-19 (public-lod@w3.org from May 2017)

From: Hugh Glaser <hugh@glasers.org>
Date: Fri, 19 May 2017 16:42:29 +0100
To: Idafen Santana Pérez <isantana@fi.upm.es>
Cc: Ruben Taelman <ruben.taelman@ugent.be>, public-lod <public-lod@w3.org>
Message-Id: <94DDCF64-14E3-4743-8C94-934D3D3D3791@glasers.org>
In case the cultural memory has been lost and people don't remember this:

https://www.myexperiment.org is a Workflow management tool that supports Linked Data; it ran for many years, finishing a while ago, and seems to be impressive in that it is still being used (see https://www.myexperiment.org/workflows/4984 for the latest).
I have nothing to do with it, so I may be wrong about the following:
It morphed into the Taverna project, which is now in Apache - https://en.wikipedia.org/wiki/Apache_Taverna, but I think the Linked Data may have got a bit lost on the way.
However, the latest developments (https://taverna.incubator.apache.org/documentation/scufl2/ ) seem to suggest they are getting Linked Data capabilities again.



> On 19 May 2017, at 16:15, Idafen Santana Pérez <isantana@fi.upm.es> wrote:
> 
> Hi Ruben,
> 
> thanks for sharing your paper. The basic idea of publishing an LD dataset containing descriptions of software packages and their dependencies is really helpful for dealing with reproducibility, especially from the developer's point of view, which I assume is the main target of your approach, as discussed in the paper. During my PhD I also explored how semantic technologies can be applied to experimental reproducibility, focusing on scientific workflows [1], but we didn't cover their publication as proper Linked Data, which in my opinion is a really strong point of your work.
> 
> In our case we developed a more generic approach, not restricted to one technology or software framework, so in the end we had to rely on scripts for deployment and on generic parameters for the infrastructure configuration. We developed a set of ontologies [2] for describing concepts at a higher level, as we assumed that, in general, most scientists using computational tools don't have the required development skills. We applied them to several computational experiments from different scientific areas, testing how well they allow the experiments to be reproduced.
> 
> As I said, I think this is a great contribution to supporting semantic descriptions of experiments, and I would like to see more papers using this kind of initiative in the future, not only within our community, but also in communities not related to the semantic web or to computational science in general.
> 
> I will also try to add some comments/questions on specific parts of the paper itself.
> 
> Regards,
> Idafen
> 
> [1] http://dx.doi.org/10.1016/j.future.2015.12.017
> [2] http://purl.org/net/wicus
> 
> On Fri, May 19, 2017 at 9:55 AM, Ruben Taelman <ruben.taelman@ugent.be> wrote:
> Dear all,
> 
> Some of you may recognise the following problem:
> 
> Let’s say you just read an article that is based on a software-driven experiment,
> and want to reproduce the results.
> While the article mentions what software it uses,
> it doesn’t mention the versions of that software and its dependencies,
> or the configuration that was used to run the experiment.
> Yet, these are essential details for reproducing experimental results,
> as a slightly different configuration might lead to significantly different results.
> 
> That is why we (Joachim Van Herwegen, Sarven Capadisli, Ruben Verborgh and myself)
> decided to eat our own dogfood by describing and publishing these software configurations as Linked Data.
> 
> In reply to the ISWC 2017 call for in-use papers,
> we wrote an article titled:
>  
>  Reproducible software experiments through semantic configurations
> 
> In this work, we introduce ontologies to describe software components
> and their configurations to facilitate reproducible software experiments.
> For semantic interlinking between these components and their configurations,
> we publish the metadata of all 480,000+ JavaScript libraries on npm as 174,000,000+ RDF triples [1].
> Furthermore, we introduce a dependency injection framework [2] that understands these configurations
> and is able to instantiate software based on this.
> 
> This article is self-published on:
> https://linkedsoftwaredependencies.org/articles/reproducibility/
> 
> Public reviews, feedback or other comments on the article itself are welcome.
> You can do so by signing in with your WebID and commenting; the commenting is powered by dokieli [3].
> 
> [1] https://fragments.linkedsoftwaredependencies.org/npm
> [2] https://github.com/LinkedSoftwareDependencies/Components.js
> [3] https://dokie.li/
> 
> Kind regards,
> Ruben Taelman
> 
> 
> 
> -- 
> PhD, Ontology Engineering Group
>
Received on Friday, 19 May 2017 15:43:02 UTC