Re: How will the semantic web emerge: SPARQL end point and $$€€ from Henry Story on 2005-12-21 (semantic-web@w3.org from December 2005)

From: Henry Story <henry.story@bblfish.net>
Date: Thu, 22 Dec 2005 00:30:51 +0100
To: Adrian Walker <adrianw@snet.net>
Cc: semantic-web@w3.org, Joshua Allen <joshuaa@microsoft.com>, tim.glover@bt.com, fugu13@mac.com, fmanola@acm.org, Ora Lassila <ora.lassila@nokia.com>
Message-Id: <6AB48696-36C3-414A-8ED5-8483538ADAD9@bblfish.net>

On 21 Dec 2005, at 21:12, Adrian Walker wrote:

> Hi Henry, Joshua and All --
>
> I have been lurking in the corner of your most interesting  
> discussion.  Time for my 2 cents worth....
>
> At 07:21 PM 12/21/2005 +0100, Henry wrote:
>> All of this comes out automatically from having a good ontology.
>> The query mechanism is immediately defined, and the whole thing is
>> self documenting.
>
> Well, yes, the query mechanism is defined.  But the whole thing  
> aint self-documenting at any kind of comprehensible end user level.

It is if you use URLs for your relations. As I pointed out for
foaf:Person, try clicking on
http://xmlns.com/foaf/0.1/Person
where you can get the result back either in HTML or, if you do the
right content negotiation, in RDF/XML.
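
For anyone who has not tried that content negotiation, here is a
minimal sketch in Python, standard library only. The helper name
build_request is my own invention, and whether a given server honours
the Accept header is up to that server. The request is only built
here, not sent.

```python
import urllib.request

FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

def build_request(url, want_rdf=True):
    """Build a GET request asking for RDF/XML or HTML (hypothetical helper)."""
    # application/rdf+xml is the media type for RDF/XML.
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return urllib.request.Request(url, headers={"Accept": accept})

req = build_request(FOAF_PERSON)
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
# To actually fetch the document: urllib.request.urlopen(req).read()
```

The same URL names the class and, depending on the Accept header,
can serve either the human-readable or the machine-readable
description.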

This is not the case for the UniProt ontology you mention below in
your [2], for reasons I don't know. But it has an excellent front end
to compensate: their web site makes it easy for those who understand
proteins to make sense of the ontology.

> For example, in our system -- online at the site below -- a few  
> simple rules like this one
>
> estimated demand some-id in some-region is for some-quantity gallons of some-finished-product in some-month of some-year
> for estimated demand that-id some-fraction of the order will be some-product from some-refinery
> that-quantity * that-fraction = some-amount
> ----------------------------------------------------------------------
> for demand that-id that-region for that-quantity that-finished-product we use that-amount that-product from that-refinery
>
>
>
> automatically compile down to a full page of SQL that would be much  
> too complex to write reliably by hand [1].

Yes. You can also ask some really complex queries using the AltaVista
search engine. That did not stop it being very successful.

> As another example [2], the free English question "Find genes  
> associated with human diseases" is written by hand as
>
> rulebase trans{
>   infer {[rdfs:subClassOf] ?a ?c}
>     from {[rdfs:subClassOf] ?a ?b} and {[rdfs:subClassOf] ?b ?c};
>   infer {[uni:organism] ?p ?o}
>     from {[rdfs:subClassOf] ?x ?o} and {[uni:organism] ?p ?x};
> }
> SELECT TOP 10 ?gene, ?name, ?text
> USING uniprot RULEBASE trans
> WHERE {[uni:organism] ?protein [urn:lsid:uniprot.org:taxonomy:9606]}
>   and {[rdf:type] ?protein [uni:Protein]}
>   and {[uni:annotation] ?protein ?annotation}
>   and {[rdf:type] ?annotation [uni:Disease_Annotation]}
>   and {[uni:encodedBy] ?protein ?gene}
>   and {[uni:name] ?gene ?name}
>   and {[rdfs:comment] ?annotation ?text}
>

This looks difficult partly because few people are specialists in
proteins. Give me a similar query with a good book ontology, and I
bet you it won't look a tenth as complicated.
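
To give a rough idea of what I mean, here is a sketch in Python. It
assumes the Dublin Core title and creator properties as the "book
ontology"; the example.org book URIs are made up for illustration,
and match() is a toy pattern matcher of my own, not a SPARQL engine.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Illustrative data as (subject, predicate, object) triples.
# The example.org URIs are invented for this sketch.
TRIPLES = {
    ("http://example.org/book/1", DC + "title", "Weaving the Web"),
    ("http://example.org/book/1", DC + "creator", "Tim Berners-Lee"),
    ("http://example.org/book/2", DC + "title", "A Semantic Web Primer"),
    ("http://example.org/book/2", DC + "creator", "Grigoris Antoniou"),
}

def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every pattern (toy matcher)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable: bind it, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                # A constant that does not match this triple.
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)

# Roughly: SELECT ?title WHERE { ?book dc:title ?title .
#                                ?book dc:creator "Tim Berners-Lee" }
QUERY = [
    ("?book", DC + "title", "?title"),
    ("?book", DC + "creator", "Tim Berners-Lee"),
]

titles = sorted(b["?title"] for b in match(QUERY, TRIPLES))
print(titles)
```

Two patterns are enough to ask for the titles of one author's books,
and anyone who knows what a title and a creator are can read them.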

>
> As programmers we can have endless fun and employment with long and  
> opaque SQL or SPARQL queries, but they are hardly "self  
> documenting" for scientists or business people -- ultimately the  
> people with the funding.

RDF is self-documenting, as I explained above, in a way that SQL will
not be. Also, given that it uses URLs, it lends itself much better to
cross-organisational queries. You cannot really know if two columns
named PERSON in two SQL databases mean the same thing. With RDF you
do. If they are the same URL they are the same; otherwise it's a bug.
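
A small sketch of the difference, again in Python. The table names,
column names and the a.example and b.example URIs are invented for
illustration; only the FOAF and rdf:type URIs are real.

```python
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Two SQL-style schemas (invented). Both have a PERSON column, but the
# shared name alone does not tell us that they mean the same thing.
sql_a = {"table": "STAFF", "columns": ["PERSON", "DEPT"]}
sql_b = {"table": "PATIENTS", "columns": ["PERSON", "WARD"]}
shared_names = set(sql_a["columns"]) & set(sql_b["columns"])

# Two RDF datasets from different organisations (invented URIs).
org_a = {("http://a.example/id/1", RDF_TYPE, FOAF_PERSON)}
org_b = {
    ("http://b.example/id/7", RDF_TYPE, FOAF_PERSON),
    ("http://b.example/id/8", RDF_TYPE, "http://b.example/schema#Person"),
}

# Merging is a plain set union; no schema mapping is needed.
merged = org_a | org_b

# Same class URL means same class, whichever organisation said it.
people = sorted(s for (s, p, o) in merged if p == RDF_TYPE and o == FOAF_PERSON)

print(shared_names)
print(people)
```

The b.example/schema#Person class is a different URL, so it is not
silently treated as foaf:Person, even though the local name matches.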

Investors, furthermore, are not here to write queries. Neither are
your mom and pop. Investors are here to give people money to create
tools that will make use of the data you have made available. Say you
are writing an organiser, or an add-on to an organiser, and the SNCF
or Deutsche Bahn has put all its train timetables online. You can now
rely on this to add features to your organiser that would otherwise
not have been possible. See what people are doing with Google Maps.
SPARQL just makes this a lot easier to do:

	http://blogs.sun.com/roller/page/bblfish?entry=sparql_to_ignite_web_2


> So, I have to disagree with your  "self-documenting" claim.   With  
> RDF and its data semantics, we are certainly better off than before.

A lot better.

> But until we fill the gap indicated in two examples above, with  
> application semantics (e.g as in [3]),  and with at least a  
> smidgeon of real world natural language semantics [4],

No need for that. That will be the role of some client application,
which you are free to write yourself. These client applications will,
if you wish, communicate with their end users in natural language to
use the data that the Semantic Web makes available. But that is a
different level in the stack. A very useful one too, if I may say. But
it is not required to get the whole ball rolling.

> uptake of the Semantic Web will likely be slow and painful [5].


My guess is that the Semantic Web will get rolling within the next
year. But as we say at Sun: if you don't want to innovate, be our
guest. Be the next Dell. Make our day ;-)


>                              -- Adrian
>
>
> INTERNET BUSINESS LOGIC (R)
> Online at www.reengineeringllc.com
> Shared, community use is FREE.
>
> Adrian Walker
> Reengineering
> PO Box 1412
> Bristol
> CT 06011-1412 USA
>
> Phone: USA 860 583 9677
> Cell:    USA  860 830 2085
> Fax:    USA  860 314 1029
>
>
>
> [1]  www.reengineeringllc.com/Oil_Industry_Supply_Chain_by_Kowalski_and_Walker.pdf
>
> [2]  http://labs.intellidimension.com/uniprot/
>
> [3] Backchain Iteration: Towards a Practical Inference Method that  
> is Simple Enough to be Proved Terminating, Sound and
> Complete. Journal of Automated Reasoning, 11:1-22.
>
> [4]  www.reengineeringllc.com/Internet_Business_Logic_e-Government_Presentation.pdf
>
> [5]  Understandability and Semantic Interoperability of Diverse  
> Rules Systems
> www.w3.org/2004/12/rules-ws/paper/19
Received on Wednesday, 21 December 2005 23:31:05 UTC