RE: [BIORDF] edit of Top Level Task - scalability from Kashyap, Vipul on 2006-03-27 (public-semweb-lifesci@w3.org from March 2006)

From: Kashyap, Vipul <VKASHYAP1@PARTNERS.ORG>
Date: Mon, 27 Mar 2006 09:56:45 -0500
To: "M. Scott Marshall" <marshall@science.uva.nl>, <public-semweb-lifesci@w3.org>
Message-ID: <2BF18EC866AF0448816CDB62ADF65381033C316E@PHSXMB11.partners.org>

+1

RDF wrappers and interfaces are the best way to go.

As we deploy this technology in the real world, IT folks will resist
moving and redesigning data repositories, but will be more amenable to any
approach that doesn't require them to relocate or redesign the data, or
disrupt the applications that currently use it.

Also, creating an RDF data warehouse essentially destroys the "incremental low
cost" value proposition that Eric Miller talks about.

I would propose that the BIORDF group explore best practices for creating RDF
wrappers for a variety of data sources.

Cheers,

---Vipul

=======================================
Vipul Kashyap, Ph.D.
Senior Medical Informatician
Clinical Informatics R&D, Partners HealthCare System
Phone: (781)416-9254
Cell: (617)943-7120
http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik 
 
To keep up you need the right answers; to get ahead you need the right questions
---John Browning and Spencer Reiss, Wired 6.04.95
> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org
> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of M. Scott Marshall
> Sent: Monday, March 27, 2006 9:50 AM
> To: public-semweb-lifesci@w3.org
> Subject: [BIORDF] edit of Top Level Task - scalability
> 
> 
> After an e-mail exchange with Susie, I have edited the BioRDF Top Level
> Task [1] to reflect some of the scalability issues.
> 
> Some of my comments to Susie were:
> > I can imagine 'collecting' data into an RDF repository for a demo but
> > we should keep in mind that this approach won't scale. Example: One of
> > the data files that we imported was 53 MB. Once transformed into RDF,
> > it grew to ~800 MB. Obviously, this is survivable for reasonably
> > small datasets, but...
> >
> > That's why HCLSIG should hope to eventually have RDF export
> > functionality "on demand" at the data source (instigate widespread
> > adoption of SW values by omics database managers?). But lately I
> > think that, in the long run, query mapping/rewriting approaches such
> > as D2RQ [2] could be more effective than converting legacy databases
> > into RDF repositories or exporting from them. Also,
> > federation/p2p/broker approaches could help to consolidate biobase
> > interfaces for the user. Does this say anything to you?
> 
> -scott
> 
> [1] http://esw.w3.org/topic/BioRDF_Top_Level_Task
> [2] http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rq/
> --
> M. Scott Marshall
> tel. +31 (0) 20 525 7765
> http://staff.science.uva.nl/~marshall
> http://integrativebioinformatics.nl/
> Integrative Bioinformatics Unit, University of Amsterdam
> 
>

Received on Monday, 27 March 2006 14:56:51 UTC