RE: SemWeb BOF @ ISMB 04

You might be interested in the Data Format Description Language (DFDL)
being standardized in the Global Grid Forum
(http://forge.gridforum.org/projects/dfdl-wg/). It is essentially a way
to describe ASCII or binary content in terms of an XML schema, allowing
you to reference data inside binary files via XPath, etc. (That is, DFDL
is meant to describe existing formats rather than to be a way of encoding
XML in binary.) DFDL itself is still in flux, but a number of efforts
have started in that direction, including BinX (www.edikt.org/binx/) and
BFD (http://collaboratory.emsl.pnl.gov/sam/bfd/). In the not too distant
future (months), I expect a DFDL 1.0 standard, Java-based parser(s), and
eventually web/grid data virtualization services. If files have LSIDs,
you have an easy way to define LSIDs for the structures inside them and
will be able to use standard parsers to 'resolve' the data and retrieve
the desired sub-structure(s).
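
As a rough illustration of the idea only (this is not DFDL syntax, which
is still in flux), here is a minimal Python sketch: a declarative layout
describes an existing binary format, and a tiny XPath-like path picks out
one sub-structure. The record layout, the field names (x, y, intensity),
and the path syntax are all assumptions made up for the example.

```python
import struct

# Hypothetical layout description (NOT DFDL syntax): a header holding a
# spot count, followed by fixed-size spot records.
HEADER = struct.Struct("<I")      # little-endian uint32: number of spots
SPOT = struct.Struct("<ffH")      # x (float32), y (float32), intensity (uint16)
SPOT_FIELDS = ("x", "y", "intensity")


def parse(data: bytes) -> dict:
    """Expose the binary content as a nested, XML-like tree."""
    (count,) = HEADER.unpack_from(data, 0)
    spots = []
    for i in range(count):
        values = SPOT.unpack_from(data, HEADER.size + i * SPOT.size)
        spots.append(dict(zip(SPOT_FIELDS, values)))
    return {"gel": {"spot": spots}}


def resolve(tree: dict, path: str):
    """Resolve a tiny XPath-like path such as 'gel/spot[2]/x' (1-based index)."""
    node = tree
    for step in path.strip("/").split("/"):
        name, _, index = step.partition("[")
        node = node[name]
        if index:
            node = node[int(index.rstrip("]")) - 1]
    return node


if __name__ == "__main__":
    # Build a small binary file image with two spots, then query into it.
    blob = HEADER.pack(2) + SPOT.pack(1.5, 2.5, 100) + SPOT.pack(3.0, 4.0, 200)
    tree = parse(blob)
    print(resolve(tree, "gel/spot[2]/x"))
    print(resolve(tree, "gel/spot[1]/intensity"))
```

Under the same assumptions, an identifier for a sub-structure could be
formed from the file's own identifier plus such a path, so that a generic
parser can retrieve just that piece.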

  Jim

James D. Myers
Chief Scientist, Scientific Computing Environments Group
Computational Science and Mathematics Department
Pacific Northwest National Laboratory
Phone: 610-355-0994
Fax:     208-474-4616
Jim.Myers@pnl.gov <mailto:Jim.Myers@pnl.gov> 


> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org 
> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of wangxiao
> Sent: Monday, July 26, 2004 11:49 PM
> To: public-semweb-lifesci@w3.org
> Subject: RE: SemWeb BOF @ ISMB 04
> 
> 
> 
> Greetings all:
> 
> I am glad to see this lifesci thread getting life now :-).  
> 
> I believe one of the main problems in life science is the same 
> as the original problem that Eric Neumann posted: how to 
> refer to part of a non-RDF document.  
> 
> I am working on a project aimed at using RDF to represent 
> 2D-gel data.  I understand why Mr. Peter Murray-Rust did not 
> accept Tim Berners-Lee's suggestion to create a CML ontology, 
> because I have the same concern about the overhead of using 
> either RDF or XML to represent gel data.  In my case, the 
> concern is this.  For some applications, such as data 
> submission and retrieval, I actually want the gel encoded 
> in a "compact" form.  Personally, I don't think XML is a good 
> solution because there is more overhead than actual 
> payload.  Neither is RDF a good solution, for the same 
> reason.  It is not that difficult to create an arbitrary 
> format to encode the gel data in a compact text-based format, 
> but then I don't know how a URI should be assigned to each 
> individual spot.  The URI for the spot is important because, 
> if a spot is identified or used to perform MS, the URI makes 
> it easy to associate other types of descriptions with it. 
> 
> I guess this problem - assigning a URI to part of another 
> resource - is more general than the life sciences 
> alone.  I would be very glad to hear if your "BOF" meeting 
> comes up with a good solution.
> 
> By the way, I think RDF lacks a graphical modelling language 
> like UML.  One of the main RDF activities will be designing 
> ontologies, so a graphical language somewhat like UML would 
> be very useful during the design phase and for 
> discussion and presentation.  I don't think UML is suitable 
> for presenting RDF/OWL, and I Googled the Internet but didn't 
> find anything useful, so I created one.  It helps me and it 
> might be useful to you too.  An introduction to the 
> language, which I call DLG2, is at 
> http://bioinformatics.musc.edu/cc/tools/dlg2/
> and detailed 
> documentation is at 
> http://bioinformatics.musc.edu/cc/docs/dlg2/dlg2.htm.  Please 
> note that I have just started writing these documents, so 
> there must be many problems.  Please forgive my careless 
> writing; any suggestions and criticism are welcome!
> 
> Also, our project - Charleston Core 
> (http://bioinformatics.musc.edu/cc/) - is similar to Eric Jain's 
> but focuses on proteomic data.  If any of you have suggestions 
> or want to collaborate, please feel free to contact me.
> 
> Xiaoshu Wang
> 
> 

Received on Tuesday, 27 July 2004 09:20:15 UTC