- From: Tom Stambaugh <tms@stambaugh-inc.com>
- Date: Wed, 1 Mar 2006 08:55:12 -0500
- To: <public-semweb-lifesci@w3.org>
I think Eric correctly observes

> ... most (bioinformatics) databases have data models that are such that
> nothing but a full blown programming language will do. (And a lot of
> manual clean up work may be required in addition.)

I fear that the discussion about XSLT specifically and conversion in general misses a fundamental aspect of how a "full blown programming language" is used in this context -- namely, that the RDF/OWL representations are used to *generate* models in various languages. I suspect that most of us won't write tools that read and write RDF/OWL to manipulate this data. Instead, I suspect that programmers will build "harnesses" that accept an RDF/OWL representation and emit a dynamic model in a specific language -- Java, Python, Javascript, whatever -- and scientists will then use the resulting program to, for example, access various databases.

In my view, the Achilles heel of XSLT and any similar *query* tool is that these tools are not designed to handle the dynamics of the information being modeled -- they have at best a very limited execution or process model. This is not to say that such tools are useless for us; it is instead to observe that they belong in the quiver of arrows that we use to analyze the *results* of running a model generated in a "full blown programming language" from an RDF/OWL representation.

Thus, while I agree that XQuery might "abolish the need to commit to a certain programming language", in my view it does so NOT because we'll all start writing our models in XQuery, but because we'll be able to write our models in whatever language we choose, relying on semantic web technology -- including XQuery -- to ensure their completeness, reliability, and accuracy. Sure, some of us will choose to read and write XQuery; I just don't see this as being particularly widespread. After all, some of us choose to read and write assembler.

> I agree with that, too.
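To make the "harness" idea concrete, here is a toy sketch in Python. The triple list, the `hasProperty` predicate, and the `Protein` class are illustrative assumptions standing in for a parsed RDF/OWL graph; a real harness would read the graph with an RDF library rather than from a hand-written list.

```python
# Toy "harness": consume an RDF-like class description and emit a dynamic
# Python model. The triple list below is an assumption standing in for a
# parsed RDF/OWL graph.

TRIPLES = [
    ("Protein", "rdf:type", "owl:Class"),
    ("Protein", "hasProperty", "accession"),
    ("Protein", "hasProperty", "sequence"),
    ("Protein", "hasProperty", "organism"),
]

def generate_model(triples):
    """Build one Python class per owl:Class found in the triples."""
    classes = {}
    for subj, pred, obj in triples:
        if pred == "rdf:type" and obj == "owl:Class":
            classes.setdefault(subj, [])
    for subj, pred, obj in triples:
        if pred == "hasProperty" and subj in classes:
            classes[subj].append(obj)

    model = {}
    for name, props in classes.items():
        def __init__(self, _props=tuple(props), **kw):
            # Accept only the properties the RDF description declared.
            for p in _props:
                setattr(self, p, kw.get(p))
        model[name] = type(name, (object,),
                           {"__init__": __init__, "properties": tuple(props)})
    return model

model = generate_model(TRIPLES)
Protein = model["Protein"]
p = Protein(accession="P12345", organism="H. sapiens")
print(p.accession)  # prints: P12345
```

The scientist then works with `Protein` as an ordinary class in the host language; the RDF description only drove its generation.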
> This might be a major problem for the transition from non-RDF- to
> RDF-based bioinformatics. People will not switch to RDF from one day to
> the next, so you need a transitional period where you offer your data
> both in non-RDF and in RDF format (like Reactome does, for example).
> The problem is that most databases are growing steadily, and you have
> to keep both versions updated. This is severely complicated because of
> the inevitable need for manual clean-up work that has to be done prior
> to the conversion to RDF.

I don't see RDF as an *alternative* to a database. It might be an alternative serialization of a database, but I'm not sure about even that. Won't we use RDF/OWL to emit SQL that we'll then use to query our databases? I'm thus not sure that we ever "switch to RDF" -- don't we instead begin using RDF to qualify, validate, and optimize our database representations?

It seems to me that RDF helps us describe and model the structure of our data. In my view, we'll then *use* this RDF-derived description and model to build relational databases that hold said data. In this worldview, the existence of the RDF description then helps us keep the dynamic models -- written in Java, Python, or whatever -- in sync with the underlying relational descriptions, kept in relational DBs like MySQL and Oracle.

Perhaps I'm the one who's fundamentally mistaken about all this, though.

Thanks,
Tom
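A toy sketch of using an RDF-derived description to *emit* SQL rather than replace the database. The triple list and the uniform TEXT column typing are illustrative assumptions, not any real schema; the generated DDL is run against an in-memory SQLite database just to show it is well-formed.

```python
import sqlite3

# Assumed RDF-like description of one class; stands in for a parsed graph.
TRIPLES = [
    ("Protein", "rdf:type", "owl:Class"),
    ("Protein", "hasProperty", "accession"),
    ("Protein", "hasProperty", "sequence"),
]

def emit_create_table(triples):
    """Emit one CREATE TABLE statement per owl:Class in the triples."""
    tables = {}
    for s, p, o in triples:
        if p == "rdf:type" and o == "owl:Class":
            tables.setdefault(s, [])
    for s, p, o in triples:
        if p == "hasProperty" and s in tables:
            tables[s].append(o)

    statements = []
    for name, cols in tables.items():
        col_defs = ",\n  ".join(f"{c} TEXT" for c in cols)
        statements.append(
            f"CREATE TABLE {name.lower()} (\n"
            f"  id INTEGER PRIMARY KEY,\n"
            f"  {col_defs}\n);"
        )
    return statements

con = sqlite3.connect(":memory:")
for stmt in emit_create_table(TRIPLES):
    con.execute(stmt)  # the RDF-derived DDL builds the relational schema
    print(stmt)
```

The relational database remains the system of record; the RDF description is the source from which its schema (and the language-level models above it) are kept in sync.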
Received on Wednesday, 1 March 2006 13:55:31 UTC