- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Tue, 4 Apr 2006 22:13:30 +0200
- To: "Cutler, Roger (RogerCutler)" <RogerCutler@chevron.com>
- Cc: public-semweb-lifesci@w3.org
On 4/4/06, Cutler, Roger (RogerCutler) <RogerCutler@chevron.com> wrote:

My feeling is that until there are scalability issues that can be analysed, it's rather premature to try to solve them. Having said that -

> 3 - Limit the amount of information that is actually put into RDF to
> some sort of descriptive metadata and keep pointers to the real data,
> which is in some other format.

A contract I'm currently working on involves a large amount of geospatial data. Ideally this would all be represented in RDF, the data structures being highly irregular. But before I joined the project some feasibility studies had been done and it was decided (with good reason) that this wasn't a realistic option at this point in time because of the quantity of data.

The overall strategy, I suppose, is to exploit proven tech wherever possible, in the interests of "just make it work". Incoming raw data will be handled as (streamed) XML, with a cluster of relational DBs for storage. RDF-everywhere will be approximated by layering: raw data (XML/relational); metadata (RDF(S)/OWL and XML Schema); meta-metadata (RDF(S)/OWL).

As an aside, this has thrown up some interesting problems relating to validation at the cusp of XML and RDF/OWL. Essentially, XML validation doesn't cover enough; the notion of validation doesn't make much sense in the context of RDF(S); and OWL consistency checking is pretty much a non-starter for performance reasons (and probably quite strange modelling would be needed to get the requisite checks). Right now I'm looking at a bit of a Frankenstein setup; rules are probably going to figure highly. If anyone has pointers to related material I'd be grateful.

Cheers,
Danny.

--
http://dannyayers.com
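
The layering described in the message above (raw data in a relational store, descriptive metadata as triples that point at it, plus a rule-style check across the two) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual design: the `feature` table, the `ex:` names, the `sql:` pointer scheme, and the helper functions are all hypothetical, an in-memory SQLite database stands in for the cluster of relational DBs, and a plain set of tuples stands in for an RDF store.

```python
import sqlite3

# Layer 1: raw data in a relational store. In-memory SQLite is a stand-in;
# the table name and columns are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE feature (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
db.executemany(
    "INSERT INTO feature VALUES (?, ?, ?)",
    [(1, 45.07, 7.69), (2, 43.77, 11.25)],
)

# Tables that metadata pointers are allowed to reference (table names cannot
# be bound as SQL parameters, so they are checked against this set instead).
KNOWN_TABLES = {"feature"}

# Layer 2: descriptive metadata as (subject, predicate, object) triples.
# ex:storedAt is a made-up predicate acting as the pointer to the real data.
triples = {
    ("ex:dataset1", "rdf:type", "ex:GeoDataset"),
    ("ex:dataset1", "ex:storedAt", "sql:feature"),
    ("ex:dataset1", "ex:recordCount", "2"),
}


def objects(subject, predicate):
    """Return all objects matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def table_for(subject):
    """Resolve the ex:storedAt pointer of a subject to a known table name."""
    (pointer,) = objects(subject, "ex:storedAt")
    table = pointer.split(":", 1)[1]
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table in pointer: {pointer}")
    return table


def fetch_raw(subject):
    """Follow the pointer and read the raw rows from the relational layer."""
    table = table_for(subject)
    return db.execute(f"SELECT id, lat, lon FROM {table} ORDER BY id").fetchall()


def check_record_count(subject):
    """Rule-style check: the declared ex:recordCount must match the row count."""
    (declared,) = objects(subject, "ex:recordCount")
    table = table_for(subject)
    (actual,) = db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return int(declared) == actual


rows = fetch_raw("ex:dataset1")
count_ok = check_record_count("ex:dataset1")
```

The point of the sketch is that the bulk data never enters the triple set; only the description and the pointer do, and a check such as `check_record_count` sits outside both XML validation and OWL consistency checking, which is roughly where a rules layer would fit.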
Received on Tuesday, 4 April 2006 20:13:39 UTC