[BioRDF] Project Proposal - Warning - NOT TERSE from Brian Gilman on 2006-04-19 (public-semweb-lifesci@w3.org from April 2006)

From: Brian Gilman <gilmanb@pantherinformatics.com>
Date: Wed, 19 Apr 2006 19:47:58 -0400
To: public-semweb-lifesci@w3.org
Cc: Eric Miller <em@w3.org>, "Eric K. Neumann" <eneumann@teranode.com>, Tonya Hongsermeier <THONGSERMEIER@partners.org>, Brian Osborne <osborne1@optonline.net>, Tom Stambaugh <tms@stambaugh-inc.com>, Susie Stephens <susie.stephens@oracle.com>
Message-Id: <C070C672-5450-4F6B-BA4F-C09B38C90B6F@pantherinformatics.com>

Hello Everyone,

	Sorry to be lurking so much lately. I'm taking this opportunity to  
update you on my activities to date and propose a project that I  
think would show the utility of RDF in the "wild". I've been playing
with Ruby and Ruby on Rails for about a month now and have come to
understand why they have been getting so much attention in
the development community. Ruby on Rails (www.rubyonrails.org) makes
web application development a breeze. Rails is a web framework that
puts into practice the tenets of "agile" development methodologies. I
don't want to proselytize too much here so I'll end with a few  
metrics from 2 projects I recently embarked on at Panther Informatics.

	Port of a proteomics application from Java -> Ruby
	Lines of code down by 45%
	No changes to the database
	Same performance after porting

	New development using Rails
	Simple web-based project sign-up page with e-mail and database
backend
	In production in 30 mins, completely working, all requirements met
(amazing)

	Why am I writing to this list about these projects? I've found a very
interesting project called ActiveRDF (http://m3pe.org/activerdf/).
This toolkit abstracts away the complexities of RDF much like
ActiveRecord (the Rails object-relational (OR) mapping framework)
abstracts away the complexities of OR mapping. The implementation
utilizes both librdf (Redland framework, librdf.org) and a Java-based
triple store called YARS (Yet Another RDF Store,
http://sw.deri.org/2004/06/yars/). I've used this library with librdf
and have to say that it is quite nice.
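
	A minimal sketch of the kind of abstraction described above, assuming
nothing about ActiveRDF's real API: the TripleStore and Resource
classes, the example.org URIs, and the vocabulary are all hypothetical.
It only illustrates how predicates can be exposed as ordinary Ruby
method calls, in the way ActiveRecord exposes table columns.

```ruby
# Hypothetical sketch: this is NOT ActiveRDF's actual API.
# A tiny in-memory triple store plus a wrapper that exposes
# predicates as plain Ruby methods, ActiveRecord-style.

class TripleStore
  def initialize
    @triples = []
  end

  # Store one (subject, predicate, object) statement.
  def add(subject, predicate, object)
    @triples << [subject, predicate, object]
    self
  end

  # Return all objects recorded for a subject and predicate.
  def objects(subject, predicate)
    @triples.select { |s, p, _| s == subject && p == predicate }
            .map { |_, _, o| o }
  end
end

class Resource
  # namespace is prepended to a method name to form the predicate URI.
  def initialize(store, uri, namespace)
    @store = store
    @uri = uri
    @namespace = namespace
  end

  # Translate resource.name into a lookup of <namespace>name.
  # Unknown properties fall through to the normal NoMethodError.
  def method_missing(name, *args)
    values = args.empty? ? @store.objects(@uri, "#{@namespace}#{name}") : []
    values.empty? ? super : values
  end

  def respond_to_missing?(name, include_private = false)
    !@store.objects(@uri, "#{@namespace}#{name}").empty? || super
  end
end

store = TripleStore.new
store.add("http://example.org/protein/P1", "http://example.org/vocab#name", "p53")
store.add("http://example.org/protein/P1", "http://example.org/vocab#organism", "Homo sapiens")

protein = Resource.new(store, "http://example.org/protein/P1", "http://example.org/vocab#")
puts protein.name.inspect
puts protein.organism.inspect
```

	Each property returns an array because an RDF subject may carry
several values for the same predicate.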

	The proposal is outlined as a set of questions (taken from Panther's
internal project documents):

	Q: What problems are we solving with the project?

	A: The BioRDF group is tasked with producing a set of documents that
show how to produce RDF from common data formats. However, the
community needs to know why RDF is needed above and beyond
traditional RDBMS technology. I propose a web-based tool that shows
this utility by taking common biological data formats (Excel, BioPAX,
MAGE-ML etc.), transforming them into RDF, and allowing query and
storage through an intuitive user interface.
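
	A minimal sketch of the transformation step, assuming a spreadsheet
exported as tab-separated text with a header row and an identifier in
the first column. The BASE and VOCAB URIs, the column names, and the
helper names are hypothetical, and identifiers are assumed to be safe
to place in a URI as-is. Output is in N-Triples syntax.

```ruby
# Hypothetical base URI and vocabulary, chosen only for illustration.
BASE  = "http://example.org/gene/"
VOCAB = "http://example.org/vocab#"

# Escape backslashes and double quotes for an N-Triples literal.
# (Cell values here are single-line, so newlines are not handled.)
def literal(value)
  escaped = value.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
  "\"#{escaped}\""
end

# Convert tab-separated text (first row = header, first column = id)
# into N-Triples lines, one triple per non-empty cell.
def table_to_ntriples(text)
  rows = text.lines.map { |line| line.chomp.split("\t", -1) }
  header, *records = rows
  records.flat_map do |cells|
    subject = "<#{BASE}#{cells[0]}>"
    triples = header.each_with_index.drop(1).map do |column, i|
      value = cells[i].to_s
      next if value.empty?
      "#{subject} <#{VOCAB}#{column}> #{literal(value)} ."
    end
    triples.compact
  end
end

sample = "id\tsymbol\torganism\nG1\tTP53\tHomo sapiens\nG2\tBRCA1\t\n"
puts table_to_ntriples(sample)
```

	Empty cells simply produce no triple, so sparse columns need no
placeholder values.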

	Q: What impact will this project have in terms of customer/community
awareness around the problem/issue you are solving?

	A: While I don't have any quantitative data to support my claim, an
informal survey of top/mid level engineers and managers in the
bioinformatics domain shows a low adoption rate of this technology.
Blocking factors for these individuals are: 1) lack of experience
with the technology, 2) uncertainty about how/why it supersedes or
complements current RDBMS technology, 3) no public resource showing
how to implement real systems using current technology, and 4) no
understanding of how adoption impacts development in terms of
implementation, maintenance, security, and complexity (what is the
impact on a project's execution timeline and cost?)

	Q: How do you propose to implement this project?

	A: I would like to work with the BioRDF group to produce a Ruby On  
Rails based application that mimics the BioDash thick client first.  
I'd then like to work with people to take data sets that they have  
already available in RDF and link them into the web dashboard. From  
there (or in parallel) we can take other datasets that have been  
transformed into RDF and show how data just "snaps in" (Eric Miller  
likes to call this "recombinant data").
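
	A minimal sketch of what "snaps in" can mean in practice, assuming
two independently produced datasets that happen to use the same URI
for the same gene. All URIs and the describe helper are hypothetical.
Merging the two graphs is a plain set union of their triples.

```ruby
require "set"

# Each dataset is a set of [subject, predicate, object] triples.
# All URIs below are hypothetical.
expression = Set[
  ["http://example.org/gene/TP53", "http://example.org/vocab#expressedIn", "liver"]
]
pathways = Set[
  ["http://example.org/gene/TP53", "http://example.org/vocab#memberOf", "http://example.org/pathway/apoptosis"],
  ["http://example.org/pathway/apoptosis", "http://example.org/vocab#label", "Apoptosis"]
]

# Merging two RDF graphs is a set union: no schema migration is needed.
merged = expression | pathways

# Collect everything known about one subject across both sources.
def describe(graph, subject)
  graph.select { |s, _, _| s == subject }
       .map { |_, p, o| [p, o] }
       .sort
end

describe(merged, "http://example.org/gene/TP53").each do |predicate, object|
  puts "#{predicate} -> #{object}"
end
```

	The shared subject URI is what links the two sources; neither
dataset had to know about the other's structure beforehand.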

	Q: What is the duration of the project (show metrics if you have any)?

	A: This project should take less than 3 months of time - based on  
metrics stated above

	Q: How many people are required to meet your timeline (show
metrics/data if any)?

	A: I believe 3 people would be ideal for this project: 2 developers
and 1 scientific lead to show the utility of the data sets that have
been integrated. I have no data to support this estimate.

	Q: Are there any hurdles/risks/blocking factors you can identify  
upfront?

	A: Yes, ActiveRDF is still under development, although I've produced
a proof of concept application to aid in mitigating this risk.
Ideally we'd use Oracle's RDF storage engine for this project.
Unfortunately, ActiveRDF does not support Oracle at the moment
(wonder if Susie Stephens could help me here?). I'm not sure how many
people (other than myself) are aware of Ruby on Rails or have
experience implementing applications using this technology. This may
extend the timeline suggested above.


	I look forward to hearing back from interested parties. Please
e-mail me and the list. My e-mail:

	gilmanb@mac.com

						Best Regards,

									-Brian

--
Brian Gilman
President Panther Informatics Inc.
E-Mail: gilmanb@pantherinformatics.com
         gilmanb@jforge.net
AIM: gilmanb1

01000010 01101001 01101111
01001001 01101110 01100110
01101111 01110010 01101101
01100001 01110100 01101001
01100011 01101001 01100001
01101110

Received on Wednesday, 19 April 2006 23:48:05 UTC