Re: Introduction: Jeremy J. Carroll from M. Scott Marshall on 2013-03-16 (public-semweb-lifesci@w3.org from March 2013)

From: M. Scott Marshall <mscottmarshall@gmail.com>
Date: Sat, 16 Mar 2013 02:42:23 +0100
To: Jeremy J Carroll <jjc@syapse.com>
Cc: "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <CACHzV2NSm0j8zm96i5XLWKFboE5c8wg=c91zAmUvW1ikymq_cA@mail.gmail.com>

Hello Jeremy,

Thank you for introducing yourself. It would be an honor to have you
join any of the activities at HCLS. Your expertise will surely
strengthen them.

On Monday, we will have a teleconference about common metadata (about
an RDF dataset in a named graph) at 3PM CET (10AM ET), as mentioned by
Michel. The time was chosen in an attempt to include several
timezones, in particular people in PST and JST. Unlike many previous
HCLS teleconferences, we will use fuzebox for this one. Details will
follow.

(Straw Man) Agenda for LLD:

* Past: Review of how we exited Biohackathon11 - Scott
* Current: Discussion of progress in the meantime (during a few
teleconference calls and mail threads, Bio2RDF, other?) - Scott,
Michel
* Future: Identify work to be done - All

Yes, the Pharmacogenomics task force should be interesting to you.
There, the focus is currently more on the modeling, for example of
genetic variants. In contrast, the Linked Life Data (LLD) task force
is looking at how to create on-the-fly federated queries based on
descriptions of datasets. You are more than welcome to join the LLD
task force discussion; I am confident that you can add to any
discussion about how to describe what is in a named graph that
contains a scientific dataset.

Kind regards,
Scott

On Fri, Mar 15, 2013 at 10:34 PM, Jeremy J Carroll <jjc@syapse.com> wrote:
>
> While it doesn't seem to be the convention in this group, now that I have graduated from lurking to participating, I thought I should say why I am here.
>
> I have a new job with a genomics company in Silicon Valley, and my remit is to work out how best to represent both scientific knowledge about genomics, and also the per-patient knowledge found in, say, a VCF file ….
>
> Having been working in SemWeb for over a decade now (RDF Concepts, OWL Tests, Jena, Named Graphs, TopBraid Suite), my bias is to use SemWeb technologies; but, particularly with the variant information, the back-of-an-envelope numbers look fairly large, see:
>
> http://lists.w3.org/Archives/Public/semantic-web/2013Mar/0048.html
>
> (That's 1B triples for science, and 1B triples per vcf file uploaded)
>
> The Clinical Pharmacogenomics TF seems like the most relevant part; unfortunately, the 7.15 - 8.15 slot for the call coincides with some parental duties for me …
>
> I am still on the lookout for some documentation by people who have been tackling similar problems.
>
> As far as I can tell, many people in the field are bought into an ontological approach, but the penetration of even pretty basic SemWeb good practices is low: e.g. use of URIs for concepts, RDF and OWL … and interoperating with various large datasets (e.g. dbsnp) is necessary, but most are not natively OWL
>
> Any thoughts or suggestions would be very welcome.
>
> Jeremy
>
>

-- 
M. Scott Marshall, PhD
MAASTRO clinic, http://www.maastro.nl/en/1/
http://eurecaproject.eu/
https://plus.google.com/u/0/114642613065018821852/posts
http://www.linkedin.com/pub/m-scott-marshall/5/464/a22

Received on Saturday, 16 March 2013 01:42:50 UTC