Re: [tangle] getting the semweb exactly wrong

tim.glover@bt.com wrote:
> Frank,
> 
> 
>>I want the ability to take data in any form people find useful, for
>>whatever reason (relational tables, XML, whatever) and *interpreting
>>it* as RDF, without the need to necessarily store it that way.
> 
> 
> I tend to take the complementary view, that in an ideal world data
> should be stored as triples, and then accessed using multiple XML,
> relational model, or RDF "views".  

I tend to disagree, but generally for the reason you mention below. 
What you describe is essentially the ANSI/SPARC 3-schema architecture of 
storage, conceptual, and application levels.  I see RDF (or OWL, or 
something still-richer) as being in the central (conceptual) position, 
supporting multiple application views.  What the storage level looks 
like shouldn't necessarily be determined by what makes sense either as 
an application model or a conceptual model, but rather by efficiency and 
other implementation-level considerations.

Sometimes those implementation considerations may involve keeping an 
older application level model.  E.g., if you have lots of relational 
data, and find it efficient to process it that way for lots of your 
apps, it should be possible to retain relational storage, even though 
you have an RDF conceptual model.  If you're generating lots of XML 
documents, it should be possible to store those directly and still have 
an RDF conceptual model (many applications will want to access the XML 
structures directly, and letting them do so will improve efficiency by 
reducing transformations).  You will still have an RDF 
conceptual model, and be able to generate or interpret data in that form 
if you need it (for integrating your data with outside apps, for example).
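
As a rough illustration of keeping relational storage while exposing an 
RDF-style view, here is a minimal Python sketch.  Everything in it is an 
assumption made for the example: the `person` table, the `BASE` 
namespace, the `rows_as_triples` helper, and the naive row-to-triple 
mapping (one triple per non-key, non-NULL column).  It is not any 
standard mapping, just one way the idea could look.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def rows_as_triples(conn, table, key):
    """Yield (subject, predicate, object) for each non-key column of each row.

    The data stays in the relational table; triples are produced on demand.
    The table and key names are interpolated directly into the SQL, which is
    acceptable only because this is a sketch with trusted input.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    cols = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(cols, row))
        subject = f"{BASE}{table}/{record[key]}"
        for col, value in record.items():
            # Skip the key column (it is already in the subject) and NULLs.
            if col != key and value is not None:
                yield (subject, f"{BASE}{table}#{col}", value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Tim', NULL)")
conn.execute("INSERT INTO person VALUES (2, 'Frank', 'frank@example.org')")

triples = list(rows_as_triples(conn, "person", "id"))
```

Existing relational apps keep querying `person` with SQL as before; only 
a consumer that wants the conceptual model calls `rows_as_triples`.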

Implementation-level considerations may also involve storing the data in 
a form that looks like neither any of the application models nor 
triples (e.g., clustering by subject, plus various kinds of 
indexes).   But, it could also turn out that triples (or maybe variants 
like quads for supporting named graphs, provenance, or other similar 
add-ons) might be the right storage level structure.  That's the value 
of data independence;  the storage level can be defined independently of 
the other levels.  My point isn't that triples are the *wrong* way to 
store the data;  just that they aren't necessarily the only *right* way.
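
To sketch what data independence means here, the Python fragment below 
puts the same triple-matching interface over two different storage 
layouts: a flat set of triples, and a structure clustered by subject. 
The class names, the `match` method, and the `ex:` identifiers are all 
hypothetical, chosen only for this example; a real store would add 
indexes, persistence, and so on.

```python
class FlatStore:
    """Storage level: a plain set of (subject, predicate, object) tuples."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position.
        return {t for t in self._triples
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])}


class SubjectClusteredStore:
    """Storage level: subject -> predicate -> set of objects."""

    def __init__(self):
        self._by_subject = {}

    def add(self, s, p, o):
        self._by_subject.setdefault(s, {}).setdefault(p, set()).add(o)

    def match(self, s=None, p=None, o=None):
        # A known subject lets us jump straight to its cluster.
        subjects = [s] if s is not None else list(self._by_subject)
        out = set()
        for subj in subjects:
            for pred, objs in self._by_subject.get(subj, {}).items():
                if p in (None, pred):
                    out.update((subj, pred, obj) for obj in objs
                               if o in (None, obj))
        return out


DATA = [
    ("ex:frank", "ex:replies_to", "ex:tim"),
    ("ex:frank", "ex:name", "Frank"),
    ("ex:tim", "ex:name", "Tim"),
]

flat, clustered = FlatStore(), SubjectClusteredStore()
for triple in DATA:
    flat.add(*triple)
    clustered.add(*triple)

# The conceptual-level answer is the same whichever storage layout is used.
same_answer = flat.match(s="ex:frank") == clustered.match(s="ex:frank")
```

Callers see only `add` and `match`, so either layout can be swapped in 
on efficiency grounds without touching the conceptual or application 
levels.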

--Frank

> 
> The relational model was introduced as a way of decoupling applications
> from data storage, because a relational model can support different
> *application* views (and in my view XML is a mighty leap backwards in
> this respect!). As you point out, triples are the relational model taken
> to its logical conclusion - RDF triples can support different
> *relational* views (schemas).
> 
> Tim
> 

Received on Wednesday, 4 January 2006 13:39:26 UTC