
RE: SemWeb BOF @ ISMB 04

From: Myers, James D <jim.myers@pnl.gov>
Date: Tue, 27 Jul 2004 07:12:01 -0700
To: public-semweb-lifesci@w3.org
Message-id: <67AF35FA07A89948AFA88E64793A1DB35B4302@pnlmse35.pnl.gov>

I guess the quickest answer would be that you could use DFDL to view any
of the legacy hex formats discussed in the SHF spec as SHF (assuming
someone who knows the legacy format creates a DFDL file describing it).

To take a different example (same topic), DFDL will allow you to state
that a file contains, for example, two 32-bit integers in little-endian
format that should be named x and y, followed by an array (population)
of floating-point numbers (count) that has dimensions x by y. You could
also state that the units used in the array are organisms/km**2
(information that is not in the file - just a constant for this file
type). Thus, the binary data (two ints and some floats) could be
handled as though it were:
<DFDL>
<x>3</x>
<y>4</y>
<population units="organisms/km**2">
<count>2</count><count>42</count><count>34</count>...
</population>
</DFDL>

Thus, rather than defining how to encode an array in XML, DFDL allows
you to map from your binary data to the structure (potentially a
community standard schema) you desire.
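
As a rough illustration of the mapping described above, here is a minimal
Python sketch that reads that layout by hand and produces the XML view.
It is not DFDL and not the output of any DFDL parser; the function name
view_as_xml, the 32-bit float width, the signed header integers and the
sample values are all assumptions made for the example.

```python
import struct
from xml.sax.saxutils import quoteattr


def view_as_xml(data: bytes, units: str = "organisms/km**2") -> str:
    """Present the binary layout from the example as XML text."""
    # Header: two 32-bit little-endian integers, named x and y.
    x, y = struct.unpack_from("<ii", data, 0)
    # Body: x * y floats. The 32-bit width is an assumption; the
    # description only says "floating point numbers".
    counts = struct.unpack_from(f"<{x * y}f", data, 8)
    cells = "".join(f"<count>{c:g}</count>" for c in counts)
    # The units are a constant of the file type, not stored in the file.
    return (
        f"<DFDL><x>{x}</x><y>{y}</y>"
        f"<population units={quoteattr(units)}>{cells}</population></DFDL>"
    )


# Hypothetical sample file: x=3, y=4, then 12 floats.
sample = struct.pack("<ii", 3, 4) + struct.pack(
    "<12f", 2.0, 42.0, 34.0, *([0.0] * 9)
)
xml_text = view_as_xml(sample)
print(xml_text)
```

A real DFDL description would carry the same facts (field names, widths,
byte order, units) declaratively rather than in code.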

DFDL parsers could realize the XML and hand it back or, if you only
wanted the value at [2,2], the parser could compute the offset and simply
return the requested <count/> element. One might also use a parser
within an application and start with a DOM tree, without ever realizing
the XML text.
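
The offset computation mentioned above can be sketched as follows. This
assumes zero-based [i,j] indices, row-major storage and 32-bit
little-endian floats, none of which the description fixes; count_at is a
hypothetical helper, not part of any DFDL parser.

```python
import struct

HEADER_SIZE = 8  # two 32-bit integers, x and y
FLOAT_SIZE = 4   # assumed 32-bit floats


def count_at(data: bytes, i: int, j: int) -> float:
    """Return the value at [i, j] without decoding the whole array."""
    x, y = struct.unpack_from("<ii", data, 0)
    if not (0 <= i < x and 0 <= j < y):
        raise IndexError(f"[{i},{j}] is outside a {x} by {y} array")
    # Row-major layout (an assumption): row i starts i * y elements in.
    offset = HEADER_SIZE + (i * y + j) * FLOAT_SIZE
    (value,) = struct.unpack_from("<f", data, offset)
    return value


# Hypothetical sample file: x=3, y=4, with values 0.0 through 11.0.
sample = struct.pack("<ii", 3, 4) + struct.pack(
    "<12f", *[float(n) for n in range(12)]
)
print(count_at(sample, 2, 2))
```

Only the 8-byte header and the 4 bytes at the computed offset are read,
which is the point of addressing the binary data directly.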

  Jim



> -----Original Message-----
> From: Hammond, Tony [mailto:T.Hammond@nature.com] 
> Sent: Tuesday, July 27, 2004 9:37 AM
> To: Myers, James D; public-semweb-lifesci@w3.org
> Subject: RE: SemWeb BOF @ ISMB 04
> 
> 
> Hi Jim:
> 
> How does DFDL relate to the proposed Standard Hex Format
> 
> 	 
> http://www.ietf.org/internet-drafts/draft-strombergson-shf-01.txt
> 
> which would also (I expect) allow binary data to be addressed 
> using Xpath?
> 
> Cheers,
> 
> Tony
> 
> 
> Tony Hammond
> 
> New Technology, Nature Publishing Group
> 4 Crinan Street, London N1 9XW, UK 
> 
> tel:+44-20-7843-4659
> mailto:t.hammond@nature.com
> 
> 
> ====
> Standards Track                                    J. Strombergson
> Internet-Draft                                       InformAsic AB
> Expires: December 28, 2004                              L. Walleij
>                                                     Ledasa Rangers
>                                                       P. Faltstrom
>                                                  Cisco Systems Inc
>                                                      June 29, 2004
> 
> 
>                         The Standard Hex Format
>                      draft-strombergson-shf-01.txt
> Abstract
> 
>    This document specifies the Standard Hex Format (SHF), a new,
>    XML-based open format for describing hexadecimal data. SHF provides
>    the ability to describe both small and large, simple and complex
>    hexadecimal data dumps in an open, modern, transport and vendor
>    neutral format.
> 
> ====
> 
> > -----Original Message-----
> > From: public-semweb-lifesci-request@w3.org
> > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of 
> > Myers, James D
> > Sent: 27 July 2004 14:20
> > To: public-semweb-lifesci@w3.org
> > Subject: RE: SemWeb BOF @ ISMB 04
> > 
> > 
> > 
> > You might be interested in the Data Format Description
> > Language (DFDL) being standardized in the Global Grid Forum 
> > (http://forge.gridforum.org/projects/dfdl-wg/). It is 
> > essentially a way to describe ASCII or binary content in 
> > terms of an XML schema, allowing you to reference data inside 
> > binary files via Xpath, etc. (i.e. DFDL is meant to describe 
> > existing formats rather than being a way to encode XML in 
> > binary.) DFDL itself is still in flux, but there are a number 
> > of efforts that have started in that direction including BinX 
> > (www.edikt.org/binx/), and BFD 
> > (http://collaboratory.emsl.pnl.gov/sam/bfd/). In the not too 
> > distant future (months), I expect a DFDL 1.0 standard, 
> > Java-based parser(s), and eventually web/grid data 
> > virtualization services. If files have LSIDs, you have an 
> > easy way to define LSIDs for structures inside and will be 
> > able to use standard parsers to 'resolve' the data and 
> > retrieve the desired sub-structure(s).
> > 
> >   Jim
> > 
> > James D. Myers
> > Chief Scientist, Scientific Computing Environments Group
> > Computational Science and Mathematics Department
> > Pacific Northwest National Laboratory
> > Phone: 610-355-0994
> > Fax:     208-474-4616
> > Jim.Myers@pnl.gov <mailto:Jim.Myers@pnl.gov> 
> > 
> > 
> > > -----Original Message-----
> > > From: public-semweb-lifesci-request@w3.org
> > > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of wangxiao
> > > Sent: Monday, July 26, 2004 11:49 PM
> > > To: public-semweb-lifesci@w3.org
> > > Subject: RE: SemWeb BOF @ ISMB 04
> > > 
> > > 
> > > 
> > > Greeting all:
> > > 
> > > I am glad to see this lifesci thread getting life now :-).
> > > 
> > > I believe one of the main problems in life science is the same as
> > > the original problem that Eric Neumann posted: how to refer to
> > > part of a non-RDF document.
> > > 
> > > I am working on a project aimed at using RDF to represent 2D-gel
> > > data. I understand why Mr. Peter Murray-Rust did not accept Tim
> > > Berners-Lee's suggestion to create a CML ontology, because I have
> > > the same concern about the overhead of using either RDF or XML to
> > > represent gel data. For some applications, such as data submission
> > > and retrieval, I actually want the gel encoded in a "compact"
> > > form. Personally, I don't think XML is a good solution, because
> > > there is more overhead than actual payload. Neither is RDF a good
> > > solution, for the same reason. It is not that difficult to create
> > > an arbitrary compact text-based format to encode the gel data, but
> > > then I don't know how a URI should be assigned to each individual
> > > spot. The URI for a spot is important because, if a spot is
> > > identified or used to perform MS, the URI makes it easy to
> > > associate other types of descriptions with it.
> > > 
> > > I guess this problem - assigning a URI to part of another
> > > resource - is more general than the life sciences alone. I would
> > > be very glad to hear if your "BOF" meeting comes up with a good
> > > solution.
> > > 
> > > By the way, I think RDF lacks a graphical modelling language like
> > > UML. One of the main RDF activities will be designing ontologies,
> > > so a graphical language somewhat like UML would be very useful
> > > during the design phase and for discussion and presentation. I
> > > don't think UML is suitable for presenting RDF/OWL, and I googled
> > > the internet but didn't find anything useful. I therefore created
> > > one. It helps me and it might be useful to you too. An
> > > introduction to the language, which I call DLG2, is at
> > > http://bioinformatics.musc.edu/cc/tools/dlg2/
> > > and detailed documentation is at
> > > http://bioinformatics.musc.edu/cc/docs/dlg2/dlg2.htm. Please
> > > note that I have just started writing these documents, so there
> > > must be many problems. Please forgive my careless writing; any
> > > suggestions and criticism are welcome!
> > > 
> > > Also, our project - Charleston Core
> > > (http://bioinformatics.musc.edu/cc/) - is similar to Eric Jain's
> > > but focuses on proteomic data. If any of you have suggestions
> > > or want to collaborate, please feel free to contact me.
> > > 
> > > Xiaoshu Wang
> > > 
> > > 
> > 
> 
> 
> 
> 
> 
Received on Tuesday, 27 July 2004 10:12:43 GMT
