- From: Sampo Syreeni <decoy@iki.fi>
- Date: Thu, 18 Aug 2011 04:30:38 +0300 (EEST)
- To: ProjectParadigm-ICT-Program <metadataportals@yahoo.com>
- cc: Kingsley Idehen <kidehen@openlinksw.com>, "public-lod@w3.org" <public-lod@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>
On 2011-08-17, ProjectParadigm-ICT-Program wrote:
> Google just bought Motorola Mobility and Microsoft is rumored to buy
> Nokia. The killer apps for the semantic web will be apps for mobile
> devices.
But once again, is that because you cheer for SemWeb, or because you
have some specific application in mind which would be better served by,
say, RDF, than the existing technology like RDBMS+CSV? If you have the
latter in mind, why aren't you rich already?
Again, I really do like the idea of a Semantic Web (architecture) and
Linked Data (data). But even after I mentioned some FOAF derivative
being a potential "killer", the only real proposal for an application
turned out to be "structured profiles". That is, a FOAF derivative. As
for linked data, it was shown that yes, it is as useful as ever. But I
didn't see a *hint* of a real life application where some other,
existing technology couldn't fare as much or better than the current W3C
sanctioned SemWeb framework. Nothing I would invest in, because it
lowers the costs, gets things done, brings happiness to the masses, or
even holds any heretofore undiscovered functionality or bling over the
competition.
This might be a tired topic already, but it's going to stay relevant
till we actually have something to show the world; or until the whole
idea just dies a slow death. If I had some real, final answers here, I
too would already be rich. But I'm not. So my ideas, too, stay rather
(wannabe-) academic. They are:
1) URI based naming of shared concepts is the biggest part. A shared,
extensible, completely distributed and unambiguous namespace is
something new and *highly* valuable. This is pretty much the only
new part we're delivering, so let's concentrate on that.
2) RDF/XML is just bad. The folks who came up with that should be shot.
Repeatedly. NTriples is more like it for an early adopter, if even
that.
2a) Standards only help if there is just one. All of the slower, messier
and "more correct" ones should be dropped wholesale once a simpler
one shows signs of catching on.
3) Triples are a neat model for semistructured data. What we actually
need, though, is structured data. There, n-ary instead of binary (yes,
RDF is basically binary, and not ternary) works much better.
3a) This is reflected in the current query language, SPARQL. It's a
total mess for any query you'd usually use for Big Data. For the
latter you'd *always* use some variant of relational algebra, not
the equivalent path query. That's just wrong, since SemWeb + Linked
Data was supposed to deal with formally interpretable data overall,
and not just the easiest kind of human-produced metadata, like
manually input bibliographic references mandated by an academic's
superior.
4) We're about semantics, so why do we not preferentially target the
problem areas where semantics are and have been a problem in the
past? One simple problem I've bumped into in my daily database work
is that it's amazingly difficult and time-consuming to import and
export stuff from/to an RDBMS, because even the lowest level type
semantics can't be carried by most export formats. Where's the
SemWeb solution to that? That's for certain a problem that is being
experienced every day by at least tens of thousands of people, it
has to do with (granted, low level) semantics, yet there is no
commonly accepted solution.
You'll probably have many other examples like that. Which is good.
What is bad is that we don't seem to be targeting/solving them right
now. Even now, it seems to be more about the infrastructure than
the final application.
5) As another example of how SemWeb could make a difference, it's
pretty high on distributed extensibility compared to the
alternatives like plain XML, and in particular most of the lesser
protocols. Can we not find the *concrete* fields where that is in
demand? EAV/CR already pretty much addressed that with polymorphic
medical records, very much in the heterogeneous triple-relation
vein. So why aren't we following and bettering that approach,
actively?
6) If we're doing metadata, why can't we do meta-metadata and beyond
more effectively? Why is the reification issue so bogged down? I
mean, there's a huge use case out there for temporal (even
bitemporal) data, provenance, (cryptographically certified, or
PKI/WoT-derived) trust, disjunctive knowledge representation, or
whatnot.
I sort of think, after the quad vs. triple debates, that much of
this could be dissolved simply by abandoning the triple model, while
staying with a shared, distributed vocabulary for predicates
(triples)/column headers (the n-ary relational model).
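As a rough illustration of that last idea, here is a minimal Python
sketch of one n-ary fact whose column headers are shared URIs, next to
the same fact flattened into binary statements hanging off an
artificial row node. The example.org URIs, the column names and the
two helper functions are all made up for this sketch; nothing here
comes from an existing vocabulary or library.

```python
from itertools import count

# Hypothetical vocabulary URIs, invented purely for this illustration.
EX = "http://example.org/vocab#"
PERSON, EMPLOYER, VALID_FROM = EX + "person", EX + "employer", EX + "validFrom"

# One n-ary fact: a single row whose column headers are shared URIs.
row = {
    PERSON: "http://example.org/people/alice",
    EMPLOYER: "http://example.org/orgs/acme",
    VALID_FROM: "2011-08-18",
}

_row_ids = count(1)


def row_to_triples(row):
    """Flatten one n-ary row into binary statements about a fresh row node."""
    node = f"_:row{next(_row_ids)}"
    return [(node, column, value) for column, value in sorted(row.items())]


def triples_to_rows(triples):
    """Regroup statements by subject to recover the original n-ary rows."""
    rows = {}
    for subject, predicate, obj in triples:
        rows.setdefault(subject, {})[predicate] = obj
    return list(rows.values())


triples = row_to_triples(row)
recovered = triples_to_rows(triples)
```

The row form states the temporal qualifier (validFrom) directly as one
more column, while the triple form needs the extra row node just to
hold the three statements together. The shared URIs survive unchanged
in both representations.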
And so on. I'm pretty sure that we could do better even at the
infrastructure level of SemWeb. It's just that first and foremost we'd
need some real applications which are well targeted, and can then drive
the rest of the work. Both in money, and in user feedback. Not perhaps
"killer apps" per se, but useful apps which uniquely leverage the
semantic web and couldn't exist without it.
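To make the import/export complaint in point 4 concrete, the following
is a small Python sketch, not a proposal for an actual format. It
assumes a hypothetical convention where each CSV header cell carries
the column name followed by a datatype URI (the XML Schema datatype
URIs are real, the header convention and the function names are
invented here), so that low-level type semantics survive a round trip.
NULL handling is deliberately left out.

```python
import csv
import io
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Python type -> (datatype URI, parser). This mapping is a choice made
# for the sketch, not something defined by any standard.
TYPES = {
    int: (XSD + "integer", int),
    Decimal: (XSD + "decimal", Decimal),
    date: (XSD + "date", date.fromisoformat),
    str: (XSD + "string", str),
}
PARSERS = {uri: parse for uri, parse in TYPES.values()}


def export_typed(columns, rows):
    """Write CSV whose header cells are 'name datatype-URI' (assumed convention)."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([f"{name} {TYPES[typ][0]}" for name, typ in columns])
    for r in rows:
        writer.writerow([str(value) for value in r])
    return out.getvalue()


def import_typed(text):
    """Read the CSV back, using the URIs in the header to restore the types."""
    reader = csv.reader(io.StringIO(text))
    header = [cell.rsplit(" ", 1) for cell in next(reader)]
    parsers = [PARSERS[uri] for _name, uri in header]
    return [[parse(cell) for parse, cell in zip(parsers, r)] for r in reader]


columns = [("id", int), ("price", Decimal), ("shipped", date), ("note", str)]
rows = [[1, Decimal("19.90"), date(2011, 8, 18), "a, b"]]
text = export_typed(columns, rows)
round_tripped = import_typed(text)
```

Read with a plain CSV reader, every cell of the data row comes back as
a string; read through import_typed, the values compare equal to the
originals, types included.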
--
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Thursday, 18 August 2011 01:31:43 UTC