Re: Putting Government Data online from Sören Auer on 2009-06-24 (semantic-web@w3.org from June 2009)

From: Sören Auer <auer@informatik.uni-leipzig.de>
Date: Wed, 24 Jun 2009 13:18:10 -0400
To: Azamat <abdoul@cytanet.com.cy>
CC: "'SW-forum'" <semantic-web@w3.org>, "John F. Sowa" <sowa@BESTWEB.NET>
Message-ID: <4A425FD2.1080105@informatik.uni-leipzig.de>

Azamat wrote:
> As it is, Linked Data looks a big mess-up of data, 

That's the intention of Linked Data - create a big mess-up of data. ;-) 
Once a sufficient quantity of data has been messed up and search engines 
start allowing people to aggregate and interconnect this data, it will 
gain structure and coherence automatically, since data providers will 
strive to align their data with common schemas. I think only 
such a bottom-up approach (transitioning from quantity to quality, to 
speak in philosophical terms) will work on the Web, nothing else!

> http://linkeddata.org/, with low quality content and lack of any 
> knowledge structure or inference mechanism.

Lack of inference mechanisms might be considered a feature! I do not 
see any hope that comprehensive inference algorithms will achieve the 
scalability required for the Web. Keyword search on the Web scales 
well now. The next challenge is to make conjunctive querying 
(à la SQL/Datalog/SPARQL) Web-scale. Once we have solved that, we can 
look into reasoning (although some are already trying now ;-).
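
To make clear what I mean by conjunctive querying, here is a minimal 
sketch in Python. The data, the "ex:" names and the helper functions 
match() and query() are all made up for illustration; this is a naive 
nested-loop join over an in-memory set, not how a real SPARQL engine 
works and certainly not Web-scale.

```python
# Toy triple store: a set of (subject, predicate, object) triples.
# All "ex:" identifiers are hypothetical.
triples = {
    ("ex:alice", "ex:worksAt", "ex:uni"),
    ("ex:bob", "ex:worksAt", "ex:uni"),
    ("ex:carol", "ex:worksAt", "ex:lab"),
    ("ex:uni", "ex:locatedIn", "ex:cityA"),
}


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must be identical.
            return None
    return extended


def query(patterns, data):
    """Evaluate a conjunction of triple patterns; return all variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in data:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# "Who works at an organisation located in ex:cityA?"
results = query(
    [("?person", "ex:worksAt", "?org"), ("?org", "ex:locatedIn", "ex:cityA")],
    triples,
)
people = sorted(b["?person"] for b in results)
```

The shared variable ?org is what makes this a conjunction (a join) 
rather than two independent lookups. Doing exactly this over data 
spread across the whole Web is the open problem.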

> I share the concerns recently expressed by John Sowa on other forum:
> 
> "My major complaint about the Semantic Web is that they ignored all
> the development techniques that worked successfully for years, and
> they failed to provide a migration path.

I think that the initially overly dominant focus on reasoning and 
comprehensive knowledge representation hindered the deployment of the SW.
The misfortune of its late birth (in German I would phrase it 
"Ungnade der späten Geburt") also slowed the SW down: companies started 
converting their technologies to XML (and they are still busy with that) 
and do not want to switch to yet another technology stack any time soon, 
although the RDF stack would be much more appropriate, in particular 
for data-oriented applications.

> Following are some of the most egregious blunders:
> 
>  1. Ignoring the fact that every major web site is built on top
>     of a relational database.  The major sites use big commercial
>     databases.  Smaller sites are based on LAMP -- Linux, Apache,
>     MySQL, and Perl, Python, or PHP.

There was support quite early on for many of the scripting languages - 
cf. e.g. the Scripting for the Semantic Web workshop series [1], 
Powl/OntoWiki [2], RAP [3] etc.
Meanwhile there is also a large number of approaches for integrating 
DBs and RDF, cf. the RDB2RDF XG report [4] and Triplify [5] 
(which targets DB-backed Webapps).

>  2. Building RDF on top of triples, instead of the SQL n-tuples.

This is the best thing that could have happened (although I'm a big fan 
of RDBs and SQL). On the Web, however, it's all about interlinking and 
integrating data - n-tuples do not merge naturally, triples do!
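
A small sketch of what "merge naturally" means, in Python. The two 
sources, the "ex:" names and the merge() helper are invented for 
illustration, and blank nodes (which would need renaming in a real RDF 
merge) are left out.

```python
# Each source is just a set of (subject, predicate, object) triples.
# All "ex:" names and values are made up.
source_a = {
    ("ex:city1", "ex:name", "Exampletown"),
    ("ex:city1", "ex:population", "500000"),
}
source_b = {
    ("ex:city1", "ex:name", "Exampletown"),      # same fact as in source_a
    ("ex:city1", "ex:twinnedWith", "ex:city2"),  # a property source_a does not know
}


def merge(*graphs):
    """Merge triple sets by plain set union; no schema alignment step."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


merged = merge(source_a, source_b)

# As n-tuples the same data would be rows with different columns, e.g.
#   (id, name, population)   vs.   (id, name, twinned_with)
# and somebody has to agree on a combined table layout before merging.
```

The duplicate fact collapses and the unknown property simply comes 
along; neither source had to know the other's "schema" in advance.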

>  3. Failing to integrate their notations with UML diagrams, which
>     include type hierarchies and various notations for constraints.

I think the Semantic Web should rather focus on lightweight technologies 
such as REST, Webapps, Wikis etc. - these will be better enablers.

Cheers,

Sören

[1] http://www.semanticscripting.org/
[2] http://ontowiki.net
[3] http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/
[4] http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt
[5] http://Triplify.org

-- 

--------------------------------------------------------------
Sören Auer, AKSW/Computer Science Dept., University of Leipzig
http://www.informatik.uni-leipzig.de/~auer,  Skype: soerenauer

Received on Wednesday, 24 June 2009 17:19:12 UTC