Re: Myths of the Semantic Web - Popular Misconceptions for Why it Won't Work from Steve Harris on 2006-11-08 (public-sweo-ig@w3.org from November 2006)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 8 Nov 2006 17:29:57 +0000
To: <jeff.pollock@oracle.com>
Cc: <public-sweo-ig@w3.org>
Message-Id: <A47C2B9D-B6B5-4FA3-9977-B7AB395B36AC@garlik.com>
On 8 Nov 2006, at 15:35, Jeff Pollock wrote:

> <Devil's advocate>

Sure. There was a fair bit of that on my part too.

> a) depends on what we mean by "ontology" (personally, I am fairly
> liberal about it, to mean a model defined in OWL, RDF, RDFS, or a
> derivative eg: SKOS) ...sometimes in casual conversation I will
> allow for "any formal model is an ontology" sort of thinking, which
> is how many treat the term.  What do you mean by ontology?

So, in our company we do have an OWL ontology (I forget which  
flavour), but we just treat it as RDF data. However, I believe there  
are domains where even that is unnecessary, and maybe even undesirable.

> b) if you're doing *no* reasoning whatsoever, why not just put your
> model in XML?  There are more tools, and more widespread knowledge of
> how to use XSD's...or better yet, if you have SQL developers already,
> why not just put in relational tables and use an abstract
> denormalized schema?

We require a certain amount of flexibility in our data storage. We
acquire new data on a regular basis, and the information in it often
contains things we hadn't even considered; an example would be
affluence metrics for neighbourhoods. Our ontology does not contain
the concept of a neighbourhood, or an affluence rating, or any way in
which the two might be connected. However, we could still express that
stuff in RDF, and query it without perturbing the existing data, just
by adding triples of the form
    ?person :isIn ?neighbourhood
plus some appropriate statements about the neighbourhood.
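
The idea of adding new kinds of statements without disturbing existing
queries can be sketched in a few lines of Python. This is a toy
in-memory store, not the storage engine described in this message; the
match() helper and the names :alice, :bob, :nb1 and :affluenceRating
are all made up for illustration.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a variable."""
    return {
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }

store = {
    (":alice", ":name", "Alice"),
    (":bob", ":name", "Bob"),
}

# An "old" query, written before any neighbourhood data existed.
names_before = match(store, p=":name")

# New, unanticipated data arrives: no schema change, just more triples.
store |= {
    (":alice", ":isIn", ":nb1"),
    (":nb1", ":affluenceRating", 7),
}

# The old query returns the same answer; the new pattern also works.
names_after = match(store, p=":name")
who_is_where = match(store, p=":isIn")
```

The old query sees exactly the same results after the new triples are
added, which is the property that matters here.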

My (limited) experience of XML is that it's hard to add stuff while  
maintaining the behaviour of old queries, and the schema becomes  
baroque over time with extensive additions.

My (more extensive) experience of SQL is that you certainly can  
design a schema that is extensible to add unexpected data, but again,  
over time the schema becomes complex, and the queries get  
increasingly impenetrable.
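
As a rough illustration of how such queries grow, here is a sketch
using Python's sqlite3 module and a generic entity/attribute/value
table. The table layout and all names in it are assumptions made for
this example; it is only one of several ways to build an extensible
SQL schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One generic table that can absorb unexpected attributes.
conn.execute("CREATE TABLE fact (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO fact VALUES (?, ?, ?)",
    [
        ("alice", "name", "Alice"),
        ("alice", "isIn", "nb1"),
        ("nb1", "affluenceRating", "7"),
    ],
)

# Each extra hop through the data costs another self-join.
rows = conn.execute(
    """
    SELECT n.value, r.value
    FROM fact AS n
    JOIN fact AS w ON w.entity = n.entity AND w.attribute = 'isIn'
    JOIN fact AS r ON r.entity = w.value AND r.attribute = 'affluenceRating'
    WHERE n.attribute = 'name'
    """
).fetchall()
```

Asking for a person's name and the affluence rating of their
neighbourhood already needs three aliases of the same table.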

Once you've added a few dozen valuable, yet unexpected data sources
like that, you really get to appreciate RDF's monotonicity, and not
having to mess with a schema every time.

> c) you say that "advantages we get from representing our data in RDF
> are sufficient to justify the effort without any reasoning" -- but
> what are those advantages?  ...are they really technical, or business,
> advantages that couldn't be had with the proper Relational or XML
> schema?  Why not?

I don't believe there's anything you can do with RDF and co. that you
can't do with SQL; it's more a question of whether you would, or
could be bothered, or whether you'd go mad trying.

We think that using RDF for the representation layer gives us an edge  
in an industry that has very dynamic data needs, but it's not as if I  
can prove that.

More speculatively, I have a suspicion that it's easier to scale large
triple stores than large relational stores, but I don't have any
proof for that; it's just a hunch. We have two small clusters that
each hold 2 KBs (knowledge bases) of around 1.25 billion triples, with
a high churn rate. It wasn't especially difficult to develop the
storage and SPARQL query engine: it took less than one man-year of
effort, and the performance is good, but with room for improvement.

I don't believe that there are a large number of domains that can
benefit from ontology-less RDF, but I do think there are some. In the
end, nothing prevents you from retrofitting an ontology to your
existing data, should you want one. I could see us going down that
route in the future, once we have sufficient experience of the data
domain, and if inference would be helpful.
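
A minimal sketch of what retrofitting might look like, assuming the
data is held as plain (subject, predicate, object) tuples: the
ontology is just more triples added later, and a small loop applies
the RDFS subclass rule. The infer_types() helper and the class names
:Neighbourhood, :Place and :Thing are hypothetical.

```python
def infer_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(inferred):
            if p != "rdf:type":
                continue
            for (c, q, d) in list(inferred):
                # If s is of type o, and o is a subclass of d, s is of type d.
                if q == "rdfs:subClassOf" and c == o:
                    new = (s, "rdf:type", d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

# Existing data, stored long before any ontology was written.
data = {(":nb1", "rdf:type", ":Neighbourhood")}

# Ontology added after the fact; the existing data is not touched.
ontology = {
    (":Neighbourhood", "rdfs:subClassOf", ":Place"),
    (":Place", "rdfs:subClassOf", ":Thing"),
}

result = infer_types(data | ontology)
```

The original triples are all still present in the result; inference
only ever adds statements.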

I have always worked with ontologies (OWL or RDFS) in the past, but  
in Garlik we couldn't really see the need.

> </Devil's advocate>
>
> I believe that the area of data security and identity management is a
> space that will greatly benefit from the SW family of languages - so,
> in all seriousness, if you have the time to reply to the above
> prodding, I'd love to hear your thoughts.

Hmm... I seem to have subjected you to a brain-dump, hopefully it's  
of interest.

- Steve

> From: public-sweo-ig-request@w3.org [mailto:public-sweo-ig-request@w3.org]
> On Behalf Of Steve Harris
> Sent: Wednesday, November 08, 2006 7:15 AM
> To: jeff.pollock@oracle.com
> Cc: public-sweo-ig@w3.org
> Subject: Re: Myths of the Semantic Web - Popular Misconceptions for
> Why it Won't Work
>
>
>
> On 8 Nov 2006, at 13:06, Jeff Pollock wrote:
>>
>> There are plenty of "Myths" out there, such as:
>>
>> - Semantic Web makes you tag everything again
>> - Semantic Web requires a single global ontology
>
> Perhaps controversial, but I don't believe that all applications on
> the semantic web require ontologies at all. The application my
> company is deploying now has an ontology, but it's only used
> informatively, and we do no reasoning over it. The advantages we get
> from representing our data in RDF are sufficient to justify the
> effort without any reasoning, and it's easier for developers with an
> SQL background to grok. I expect there is data in the store which is
> not described by any ontological structures.
>
> - Steve
>
>
Received on Wednesday, 8 November 2006 17:29:48 UTC