- From: <matt@biomedcentral.com>
- Date: Thu, 23 Feb 2006 18:12:23 -0000
- To: <lists@hubmed.org>, <public-semweb-lifesci@w3.org>
Not sure if this is the same question, but I wonder how it might be most
appropriate to express, within an XHTML TH header (and the equivalent
within a journal XML DTD), the URI of the datatype that the content
values in that table column represent.

This could also be used to render CSV data in a form suitable for easy,
unambiguous scraping, with the datatypes identified.

Matt

> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org
> [mailto:public-semweb-lifesci-request@w3.org]On Behalf Of Alf Eaton
> Sent: 23 February 2006 18:01
> To: public-semweb-lifesci@w3.org
> Subject: Re: [BIORDF] Re: Unstructured vs. Structured (was: HL7 and
> patient records in RDF/OWL?)
>
> To follow up on this, do you think it would be possible to create a
> generic GRDDL transformation that would extract information from any
> well-structured XHTML table, using the scoped <th> row and column
> headers?
>
> alf.
>
> On 19 Feb 2006, at 15:07, Alf Eaton wrote:
>
> > I've been trying to decide on a good way to provide tabular data in
> > papers using XHTML, for presentation online. The best options seem
> > to be either just embedding the data as an array using JSON, or
> > using tables with class and id markup and allowing them to be
> > processed with GRDDL or Javascript to transform the data. Has there
> > been any work on presenting spreadsheets in XHTML?
> >
> > alf.
> >
> > On 19 Feb 2006, at 12:17, Eric Neumann wrote:
> >
> >> Matt,
> >>
> >> Spreadsheets are indeed useful as formatted sources that can be
> >> readily converted into RDF. We've used them as the primary source
> >> of expression data for BioDash (see attached averages; full
> >> GeneLogic data at
> >> http://www.samsi.info/200304/dmml/web-internal/bio/data/data_rsvd.xls ).
> >> It almost seems a mapping tool could be written to take any excel
> >> files, a GRDDL-like conversion of column headers, row-headers, and
> >> cells, to produce RDF from these (see the example).
> >>
> >> In our example, we wrote the conversion scripts directly into the
> >> excel file. The resulting (adenine/N3) file is shown as well, with
> >> symbol strings mapped to URIs. The cool thing here is that if
> >> you add a DB query using the symbol strings (we did this within
> >> BioDash), you can take the returned gene information, convert it
> >> to RDF, and connect it to the expression graph through the probes
> >> for each row (see resulting adenine file).
> >>
> >> Perhaps the BIORDF group should include using sdf sources as part
> >> of their overall strategy for producing RDF from current
> >> structured files (e.g., gene expression, screening, and clinical
> >> data in sdf). Many published papers have data tables, and this
> >> would be a great way to auto convert them to RDF!
> >>
> >> Eric
> >>
> >> --- Matthew Cockerill <matt@biomedcentral.com> wrote:
> >>
> >>> I couldn't agree more.
> >>>
> >>> Spreadsheets (and equivalently, CSV files) are a large fraction of
> >>> the 'additional datafiles' that BioMed Central receives from
> >>> authors.
> >>>
> >>> What would be great would be to be able to define some simple
> >>> standards and/or templates which authors could follow in their
> >>> spreadsheets, to allow the automatic recognition of key life
> >>> science identifiers, and quantitative attributes, and so the
> >>> generation of RDF.
> >>>
> >>> From my point of view, that's the most basic, practical and
> >>> prevalent example of the whole semi-structured data, and so seems
> >>> like a good starting point.
> >>>
> >>> Matt
> >>>
> >>> On 15 Feb 2006, at 5:42, Cutler, Roger (RogerCutler) wrote:
> >>>
> >>>> That's too deep for me. I'll be satisfied, at least in an
> >>>> immediate sense, with a demonstration of how to generate RDF
> >>>> from an Excel spreadsheet. I think I'll just start saying
> >>>> "Excel spreadsheet" and forget about the term that we use
> >>>> internally to categorize the kinds of problems we have.
> >>>> Spreadsheets are pretty much the 80-20 of that problem, so why
> >>>> not call a spade a spade. I'm really not very good at
> >>>> generalizing and categorizing.
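
To make the question at the top of this message concrete, here is a minimal sketch of one way it could work. It is an illustration only, under stated assumptions: the `datatype` attribute on `<th>` is a hypothetical convention (the thread does not settle on any markup), and the sample table, the `typed_cells` function name, and the probe values are invented for the example. It reads an XHTML table with scoped `<th>` headers and pairs each cell value with its row header, its column header, and the datatype URI declared for that column.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample table. The `datatype` attribute on <th> is an assumed
# convention for this sketch, not an agreed standard from the thread.
SAMPLE = """<table xmlns="http://www.w3.org/1999/xhtml">
  <tr>
    <th scope="col">probe</th>
    <th scope="col" datatype="http://www.w3.org/2001/XMLSchema#decimal">expression</th>
  </tr>
  <tr><th scope="row">probe_1</th><td>12.5</td></tr>
  <tr><th scope="row">probe_2</th><td>7.25</td></tr>
</table>"""


def qname(tag):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return "{%s}%s" % (XHTML_NS, tag)


def typed_cells(xhtml):
    """Yield (row_header, column_header, value, datatype_uri) per data cell.

    Assumes the first row holds the column headers and the first cell of
    every later row is that row's header.
    """
    table = ET.fromstring(xhtml)
    rows = table.findall(qname("tr"))
    # Column headers are the <th> cells of the first row, kept in order.
    columns = [
        (th.text.strip(), th.get("datatype"))
        for th in rows[0]
        if th.tag == qname("th")
    ]
    for tr in rows[1:]:
        cells = list(tr)
        row_header = cells[0].text.strip()
        # Skip the first column, which only labels the rows.
        for (name, datatype), cell in zip(columns[1:], cells[1:]):
            yield (row_header, name, cell.text.strip(), datatype)


if __name__ == "__main__":
    for record in typed_cells(SAMPLE):
        print(record)
```

Each yielded tuple carries enough to emit a typed literal in RDF, or to write the same data back out as CSV with the datatypes identified. A real GRDDL transformation would more likely be written in XSLT; Python is used here only to keep the sketch short.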
Received on Thursday, 23 February 2006 18:12:38 UTC