Re: Vote for my Semantic Web presentation at SXSW from Sampo Syreeni on 2011-08-18 (public-lod@w3.org from August 2011)

From: Sampo Syreeni <decoy@iki.fi>
Date: Thu, 18 Aug 2011 04:30:38 +0300 (EEST)
To: ProjectParadigm-ICT-Program <metadataportals@yahoo.com>
cc: Kingsley Idehen <kidehen@openlinksw.com>, "public-lod@w3.org" <public-lod@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <Pine.LNX.4.64.1108180334520.9458@lakka.kapsi.fi>
On 2011-08-17, ProjectParadigm-ICT-Program wrote:

> Google just bought Motorola Mobility and Microsoft is rumored to buy 
> Nokia. The killer apps for the semantic web will be apps for mobile 
> devices.

But once again, is that because you cheer for SemWeb, or because you 
have some specific application in mind which would be better served by, 
say, RDF, than the existing technology like RDBMS+CSV? If you have the 
latter in mind, why aren't you rich already?

Again, I really do like the idea of a Semantic Web (architecture) and 
Linked Data (data). But even after I mentioned some FOAF derivative 
being a potential "killer", the only real proposal for an application 
turned out to be "structured profiles". That is, a FOAF derivative. As 
for linked data, it was shown that yes, it is as useful as ever. But I 
didn't see a *hint* of a real life application where some other, 
existing technology couldn't fare as well as or better than the current 
W3C sanctioned SemWeb framework. Nothing I would invest in because it 
lowers the costs, gets things done, brings happiness to the masses, or 
even holds any heretofore undiscovered functionality or bling over the 
competition.

This might be a tired topic already, but it's going to stay relevant 
till we actually have something to show the world; or until the whole 
idea just dies a slow death. If I had some real, final answers here, I 
too would already be rich. But I'm not. So my ideas, too, stay rather 
(wannabe-)academic. Here they are:

1)  URI based naming of shared concepts is the biggest part. A shared,
     extensible, completely distributed and unambiguous namespace is
     something new and *highly* valuable. This is pretty much the only
     new part we're delivering, so let's concentrate on that.

2)  RDF/XML is just bad. The folks who came up with that should be shot.
     Repeatedly. NTriples is more like it for an early adopter, if even
     that.

2a) Standards only help if there is just one. All of the slower, messier
     and "more correct" ones should be dropped wholesale once a simpler
     one shows signs of catching on.

3)  Triples are a neat model for semistructured data. What we actually
     need though is structured data. There, n-ary relations instead of
     binary ones (yes, RDF is basically binary, and not ternary) work
     much better.

3a) This is reflected in the current query language, SPARQL. It's a
     total mess for any query you'd usually use for Big Data. For the
     latter you'd *always* use some variant of relational algebra, not
     the equivalent path query. That's just wrong, since SemWeb + Linked
     Data was supposed to deal with formally interpretable data overall,
     and not just the easiest kind of human-produced metadata, like
     manually input bibliographic references mandated by an academic's
     superior.

4)  We're about semantics, so why do we not preferentially target the
     problem areas where semantics are and have been a problem in the
     past? One simple problem I've bumped into in my daily database work
     is that it's amazingly difficult and time-consuming to import and
     export stuff from/to an RDBMS, because even the lowest level type
     semantics can't be carried by most export formats. Where's the
     SemWeb solution to that? That's for certain a problem that is being
     experienced every day by at least tens of thousands of people, it
     has to do with (granted, low level) semantics, yet there is no
     commonly accepted solution.

     You'll probably have many other examples like that. Which is good.
     What is bad is that we don't seem to be targeting/solving them right
     now. Even now, it seems to be more about the infrastructure than
     the final application.

5)  As another example of how SemWeb could make a difference, it's
     pretty high on distributed extensibility compared to the
     alternatives like plain XML, and in particular most of the lesser
     protocols. Can we not find the *concrete* fields where that is in
     demand? EAV/CR already pretty much addressed that with polymorphic
     medical records, very much in the vein of heterogeneous
     triple relations. So why aren't we following and bettering that
     approach, actively?

6)  If we're doing metadata, why can't we do meta-metadata and beyond
     more effectively? Why is the reification issue so bogged down? I
     mean, there's a huge use case out there for temporal (even
     bitemporal) data, provenance, (cryptographically certified, or
     PKI/WoT-derived) trust, disjunctive knowledge representation, or
     whatnot.

     I sort of think, after the quad vs. triple debates, that much of
     this could be dissolved simply by abandoning the triple model, while
     staying with a shared, distributed vocabulary for predicates
     (triples)/column headers (the n-ary relational model).
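
One way to read that last point, as a minimal Python sketch under stated 
assumptions: the example.org URIs, the column names and both helper 
functions are made up for illustration and come from no existing 
vocabulary or library. A fact is kept as one n-ary row whose column 
headers are shared URIs, and the sketch shows the binary decomposition 
needed to squeeze the same fact into triples.

```python
from itertools import count

# Hypothetical vocabulary URIs, used only for illustration.
EX = "http://example.org/vocab#"
HEADERS = (
    EX + "employee",
    EX + "employer",
    EX + "validFrom",   # temporal column
    EX + "source",      # provenance column
)

# One n-ary fact: a row whose columns are named by shared URIs.
rows = [
    (
        "http://example.org/id/alice",
        "http://example.org/id/acme",
        "2009-01-01",
        "http://example.org/doc/hr-export",
    ),
]

_blank_ids = count(1)


def row_to_triples(headers, row):
    """Decompose one n-ary row into binary triples on a fresh blank node."""
    node = f"_:row{next(_blank_ids)}"
    return [(node, header, value) for header, value in zip(headers, row)]


def triples_to_row(headers, triples):
    """Reassemble a row; assumes all triples share a single subject node."""
    by_predicate = {p: o for _, p, o in triples}
    return tuple(by_predicate[h] for h in headers)


triples = row_to_triples(HEADERS, rows[0])
for s, p, o in triples:
    print(s, p, o)
```

In the row form, adding a validity date or a source is just one more 
column. In the triple form, the same fact needs an intermediate node 
plus one triple per column, which is roughly where the reification and 
quad discussions start.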

And so on. I'm pretty sure that we could do better even at the 
infrastructure level of SemWeb. It's just that first and foremost we'd 
need some real applications which are well targeted, and can then drive 
the rest of the work. Both in money, and in user feedback. Not perhaps 
"killer apps" per se, but useful apps which uniquely leverage the 
semantic web and couldn't exist without it.
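
To make the import/export problem from point 4 concrete, here is a 
minimal Python sketch, assuming a made-up convention in which the second 
line of a CSV export names each column's datatype by its XML Schema 
datatype URI. The convention, the function names and the sample data are 
illustrative assumptions, not an existing standard; NULLs and empty 
tables are deliberately not handled.

```python
import csv
import io
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Python type -> (datatype URI, parser for the lexical form).
TYPES = {
    int: (XSD + "integer", int),
    Decimal: (XSD + "decimal", Decimal),
    date: (XSD + "date", date.fromisoformat),
    str: (XSD + "string", str),
}
PARSERS = {uri: parse for uri, parse in TYPES.values()}

HEADER = ["id", "price", "shipped", "name"]
ROWS = [(1, Decimal("19.90"), date(2011, 8, 18), "widget")]


def export_typed(header, rows):
    """Write CSV with an extra second line of datatype URIs.

    Assumes at least one row; column types are taken from the first row.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerow([TYPES[type(value)][0] for value in rows[0]])
    for row in rows:
        writer.writerow([str(value) for value in row])
    return buf.getvalue()


def import_typed(text):
    """Read the CSV back, parsing each cell by its column's datatype URI."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    parsers = [PARSERS[uri] for uri in next(reader)]
    rows = [tuple(p(cell) for p, cell in zip(parsers, row)) for row in reader]
    return header, rows


text = export_typed(HEADER, ROWS)
header, rows = import_typed(text)
```

A plain `csv.reader` pass over the same text would hand back every cell 
as a string; here the only thing carrying the type semantics across is 
the line of shared datatype URIs.
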
-- 
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Thursday, 18 August 2011 01:31:43 UTC