Task: Identify best practices and gaps in currently available tools for converting structured data into RDF.

Task Objectives:

1) This HCLSIG task is designed to provide guidance, examples, and shared experiences gathered from using GRDDL to expose collections of HCLSIG-related content for use on the Semantic Web.

2) To build a life sciences demo using RDF and OWL in order to better understand the work required, to explore the effectiveness of current tools, and to document our findings to help accelerate the adoption of the Semantic Web by others.

Task Participants: Olivier Bodenreider, Kei Cheung, Roger Cutler, Brian Gilman, Joanne Luciano, Brian Osborne, Alan Ruttenberg, Matt Shanahan, Susie Stephens, Charles Tilford. @@ who else @@

Use case context:

A neuroscience knowledge integration portal?

Problem statement for this use case: (What problem does this task solve, and how does solving it demonstrate the value of Semantic Web technologies in the context of this use case?)

Deliverable(s)

A demonstration of a use case with the following features:

· The demo will primarily take advantage of data sets from the NCBI. However, additional data sets will be required over time, as the ultimate goal is to build a demo that spans from ‘bench to bedside’.

· The demo will focus on the domain area of neuroscience, in order to be able to build upon the ongoing work in Scientific Commons and SWANS.

· The main focus of the demo is on the integration of many data sets, although scalability is also of interest.

· A report on tool gaps or best practices?

Related resources:

- Gleaning Resource Descriptions from Dialects of Languages (GRDDL), Dan Connolly, Dominique Hazaël-Massieux, World Wide Web Consortium (W3C)

- XML and RDF with GRDDL, Dominique Hazaël-Massieux, World Wide Web Consortium (W3C)

- GRDDL How To Guide

- XSLT, RDB access

Task supports and dependencies:

- @@ user group 1 @@, @@ user group 2 @@

Tools and Services:

- W3C online GRDDL demo service

Timeline for Task Completion 

(below should outline sequence and owners for each step)

Stage 1 (3-month goals)

· Identify the initial data sources to be used in the demo.

· Explore additional data sources that would be required for the demo to span ‘bench to bedside’.

· Learn about GRDDL, SPARQL, OWL, etc.

· Increase knowledge of neuroscience.

· Set up a wiki for communication.

Stage 2 (6-month goals)

· Transform data from Word, Excel, XML, relational databases, and other sources into RDF.

· Analyze semantic requirements (coordinate with the ontology sub-group).

· Move from screen scraping to an API.

· Create documents that describe the work undertaken and the observations made.
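
To make the first Stage 2 goal concrete, here is a minimal sketch of one possible route from tabular data (for example, a spreadsheet exported as CSV) to RDF serialized as N-Triples. It is not a tool chosen by the task: the namespace http://example.org/hcls/, the function names, the column names, and the sample rows are all hypothetical and invented purely for illustration. It also assumes every value can be treated as a plain string literal, which a real conversion would need to revisit.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real project would choose its own URIs.
BASE = "http://example.org/hcls/"


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text, id_column):
    """Emit one triple per non-empty, non-ID cell of each CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The ID cell becomes the subject URI (percent-encoded).
        subject = "<%srecord/%s>" % (BASE, quote(row[id_column], safe=""))
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the ID itself and empty cells
            # Each column header becomes a predicate URI.
            predicate = "<%sproperty/%s>" % (BASE, quote(column, safe=""))
            triples.append('%s %s "%s" .' % (subject, predicate,
                                             escape_literal(value)))
    return triples


# Made-up sample table, for illustration only.
SAMPLE = "id,label,organism\nr1,example record one,human\nr2,example record two,\n"

if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE, "id"):
        print(line)
```

Mapping every column directly to a predicate is the simplest possible choice. Deciding which columns should instead link to shared ontology terms is exactly the kind of question the semantic requirements analysis above would need to answer.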

Stage 3 (12-month goals)

· Use ontologies with the demo.

· Use the demo to answer scientific questions and, ideally, to glean new scientific insights.

· Validate the effectiveness of the data integration.