RE: [BioRDF] Project Proposal - Warning - NOT TERSE from Cutler, Roger (RogerCutler) on 2006-04-20 (public-semweb-lifesci@w3.org from April 2006)

From: Cutler, Roger (RogerCutler) <RogerCutler@chevron.com>
Date: Thu, 20 Apr 2006 12:43:39 -0500
To: "Brian Gilman" <gilmanb@pantherinformatics.com>, public-semweb-lifesci@w3.org
cc: "Eric Miller" <em@w3.org>, "Eric K. Neumann" <eneumann@teranode.com>, "Tonya Hongsermeier" <THONGSERMEIER@partners.org>, "Brian Osborne" <osborne1@optonline.net>, "Tom Stambaugh" <tms@stambaugh-inc.com>, "Susie Stephens" <susie.stephens@oracle.com>
Message-ID: <0C237C50B244FD44BE47B8DCE23A3052011C64B8@HOU150NTXC2MC.hou150.chevrontexaco.net>

I would like to highlight the analysis below of "blocking factors" (in
the second Q&A).  IMO this is the most succinct, accurate and useful
statement of this sort I have seen.  I would particularly like to agree
with "the community needs to know why RDF is needed above and beyond
traditional RDBM technology", since relational DB technology is so
ubiquitous and highly developed.  As I have been saying in some off-line
correspondence, any Semantic Web technique that involves only relational
databases is, at some point, going to have to address a rather wide
field of very well developed alternatives (such as data warehousing,
mining tools, etc).

-----Original Message-----
From: public-semweb-lifesci-request@w3.org
[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Brian Gilman
Sent: Wednesday, April 19, 2006 6:48 PM
To: public-semweb-lifesci@w3.org
Cc: Eric Miller; Eric K. Neumann; Tonya Hongsermeier; Brian Osborne; Tom
Stambaugh; Susie Stephens
Subject: [BioRDF] Project Proposal - Warning - NOT TERSE


Hello Everyone,

	Sorry to be lurking so much lately. I'm taking this opportunity
to update you on my activities to date and propose a project that I
think would show the utility of RDF in the "wild". I've been playing
with Ruby and Ruby On Rails for about 1 month now and have come to
understand the reason why it has been getting so much attention in the
development community. Ruby on Rails (www.rubyonrails.org) makes web
application development a breeze. Rails is a web framework that puts
into practice the tenets of "agile" development methodologies. I don't
want to proselytize too much here so I'll end with a few metrics from 2
projects I recently embarked on at Panther Informatics.

	Ported proteomics application from Java -> Ruby (single lines of
code down by 45%)
	No Changes to database
	Exact same performance when porting

	New Development Using Rails
	Developed simple web based project sign up page with e-mail and
database backend
	30 mins into production, completely working, all requirements met
(amazing)

  	Why am I writing to this list about these projects? I've found a
very interesting project called ActiveRDF (http://m3pe.org/activerdf/).

This toolkit abstracts away the complexities of RDF much like
ActiveRecord (Rails' Object Relational (OR) mapping framework) abstracts
away the complexities of OR mapping. The implementation utilizes both
librdf (Redland framework - librdf.org) and a Java-based triple store
called YARS (Yet Another RDF Store - http://sw.deri.org/2004/06/yars/).
I've used this library with librdf and have to say that it is quite
nice.

	Proposal outlined as a set of questions (comes from Panther's
internal project documents):

	Q: What problems are we solving with the project?

	A: The BioRDF group is tasked with producing a set of documents
that show how to produce RDF from common data formats. However, the
community needs to know why RDF is needed above and beyond traditional
RDBM technology. I propose a web-based tool that shows this utility by
taking common biological data formats (Excel, BioPAX, MAGE-ML, etc.),
transforming them into RDF, and allowing query and storage through an
intuitive user interface.
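
	A minimal sketch of the kind of transformation described above,
in plain Ruby with no RDF library. It assumes tab-separated input with
a header row and URI-safe identifiers in the first column. The base
URI, the "gene" and "vocab" path segments, and the sample row are
hypothetical and only there for illustration:

```ruby
# Hypothetical base URI for minted resources; not part of the proposal.
BASE = "http://example.org/biordf/"

# Escape a value for use as an N-Triples string literal
# (backslashes first, then double quotes).
def literal(value)
  escaped = value.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
  %("#{escaped}")
end

# Turn tab-separated text (header row first) into N-Triples lines.
# The first column is treated as the row identifier and becomes the
# subject; every other column becomes one predicate/object pair.
def table_to_ntriples(tsv)
  header, *rows = tsv.lines.map { |line| line.chomp.split("\t") }
  rows.flat_map do |row|
    subject = "<#{BASE}gene/#{row.first}>"
    header.zip(row).drop(1).map do |column, value|
      "#{subject} <#{BASE}vocab/#{column}> #{literal(value)} ."
    end
  end
end

sample = "id\tsymbol\torganism\nG1\tTP53\tHomo sapiens\n"
puts table_to_ntriples(sample)
```

	Each spreadsheet cell becomes one triple, so adding a column to
the source data adds triples rather than requiring a schema change.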

	Q: What impact will this project have in terms of customer
awareness/ community awareness around the problem/issue you are solving?

	A: While I don't have any quantitative data to support my claim,
an informal survey of top/mid level engineers and managers in the
Bioinformatics domain shows a low adoption rate of this technology.  
Blocking factors for these individuals are: 1) lack of experience with
the technology, 2) Unsure how/why this supersedes/complements current
RDBM technology,  3) No public resource showing how to implement real
systems using current technology,  and 4) Don't understand how adoption
impacts development in terms of implementation, maintenance, security,
and complexity  (what is the impact to a project execution timeline and
cost?)

	Q: How do you propose to implement this project?

	A: I would like to work with the BioRDF group to produce a Ruby
On Rails based application that mimics the BioDash thick client first.  
I'd then like to work with people to take data sets that they have
already available in RDF and link them into the web dashboard. From
there (or in parallel) we can take other datasets that have been
transformed into RDF and show how data just "snaps in" (Eric Miller
likes to call this "recombinant data").
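
	One way to picture "snapping in", as a sketch rather than
anything ActiveRDF or BioDash actually does: if each dataset is a set
of subject/predicate/object triples, merging two datasets is a set
union, and a query sees both without any schema migration. The triples
and prefixed names below are hypothetical:

```ruby
require "set"

# Each triple is a [subject, predicate, object] array.
# Both datasets are made up for illustration.
expression = Set[
  ["gene:TP53", "expressedIn", "tissue:liver"]
]
pathways = Set[
  ["gene:TP53", "participatesIn", "pathway:apoptosis"]
]

# Merging two graphs is a set union of their triples.
merged = expression | pathways

# Collect every predicate/object pair known about one subject,
# regardless of which dataset the triple came from.
def describe(graph, subject)
  graph.to_a
       .select { |s, _p, _o| s == subject }
       .map { |_s, p, o| [p, o] }
       .sort
end

p describe(merged, "gene:TP53")
```

	The two datasets only need to agree on the identifier for the
shared subject; neither has to know about the other's predicates.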

	Q: What is the duration of the project (show metrics if you have
any)?

	A: This project should take less than 3 months of time - based
on metrics stated above

	Q: How many people are required to meet your timeline (show
metrics/ data if any)?

	A: I believe 3 people would be ideal for this project, 2
developers and 1 scientific lead to show the utility of the data sets
that have been integrated - have no data to support this claim.

	Q: Are there any hurdles/risks/blocking factors you can identify
upfront?

	A: Yes, ActiveRDF is still under development, although I've
produced a proof of concept application to aid in mitigating this risk.

Ideally we'd use Oracle's RDF storage engine for this project.  
Unfortunately, ActiveRDF does not support Oracle at the moment
(I wonder if Susie Stephens could help me here?). I'm not sure how many
people (other than myself) are aware of Ruby On Rails or have experience
implementing applications using this technology. This may extend the
timeline suggested above.


	I look forward to hearing back from interested parties. Please
e-mail me and the list. My e-mail:

	gilmanb@mac.com

						Best Regards,

	
-Brian

--
Brian Gilman
President Panther Informatics Inc.
E-Mail: gilmanb@pantherinformatics.com
         gilmanb@jforge.net
AIM: gilmanb1

01000010 01101001 01101111
01001001 01101110 01100110
01101111 01110010 01101101
01100001 01110100 01101001
01100011 01101001 01100001
01101110

Received on Thursday, 20 April 2006 17:44:51 UTC