Re: Comments on Data 3.0 manifesto from Kingsley Idehen on 2010-04-19 (public-lod@w3.org from April 2010)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 19 Apr 2010 11:28:33 -0400
To: Leigh Dodds <leigh.dodds@talis.com>
CC: Richard Cyganiak <richard@cyganiak.de>, public-lod <public-lod@w3.org>
Message-ID: <4BCC76A1.9090407@openlinksw.com>
Leigh Dodds wrote:
> Hi,
>
> On 17 April 2010 17:37, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>   
>> ...
>> RDF has inadvertently caused mass distraction away from the fact that a
>> common Data Model is the key to meshing heterogeneous data sources.
>> People just don't "buy" or "grok" the data model aspect of RDF, so why
>> continue fighting this battle, when all we want is mass comprehension,
>> however we get there.
>>     
>
> As you've noted elsewhere, its the natural fixation of people on some
> elements of semantic web technology, like RDF/XML, that have been
> distracting people from the core message. Although personally I think
> that's changing, and focusing on Linked Data has helped.
>   
Yes, but not really changing fast enough, bearing in mind it's 2010 and
we should be demonstrating "one click" demos from a plethora of existing 
apps that showcase why Linked Data changes everything, in a very good way.

I can replicate every killer ODBC demo I gave circa 1993 with Linked
Data just by opening up a Descriptor URL. But, I can't really pull that 
off today smoothly because my effort will ultimately get hijacked by RDF 
issues like:

1. What is this thing you opened via a URL from Access or Excel?
2. Why do those LINKs do the wonderful things we see (e.g., polymorphic
resultsets, i.e., the pattern you see in snorql or isparql query results
tables)?
3. etc.

The distraction in the scenario above will either come from a confused 
user or a Semantic Web aficionado.  Ironically, both are equally 
confused, all of the time, but one party doesn't know it :-(

> I've personally found that initially concentrating on the underlying
> model does help people begin to appreciate the benefits. 

Yes!

> But whether
> you begin from RDF, or generalise further and describe EAV largely
> depends on your audience and intent. 

Yes, and most of the time (due to RDF's unpopularity), you don't have an
audience that groks or likes RDF.

> If you're aiming to illustrate
> how RDF is aligned with other data models then I can see how
> positioning it with respect to EAV will help.
>   

RDF would be a chapter 2 item (at best) in my book about Linked Data
(should I ever get round to writing one).

You have to understand Structured Data first: how is data structured,
and what does structure give you? Most important of all, have "RDF
Linked Data the Tweak of EAV" at the back of your mind, since that's the
ultimate destination.


RDF inadvertently conflates Data Model and Data Representation Formats. 
This is an old snafu from the first coming of RDF, and sadly we can't 
fix that in 2010. Simply stating: "RDF is based on a Graph Model..." 
isn't enough. What Graph Model are we talking about? One that dropped 
upon us from Space? Or one that we've used since the start of time?


You can never have a conversation about what is but an aspect of DBMS 
technology without illuminating the role of Data Models.

People like to claim they grok the fact that Resource Description
Framework is a Graph Data Model plus a collection of associated Data
Representation Formats, but in the same breath all attention is paid to
the latter. Even worse, RDF/XML is still pitched as the only official
variant of the latter (even in 2010). Look at how long it's taken RDFa to
emerge, and the amount of pressure it's taken to get it this far.

Remember, this community didn't even warm to RDFa, irrespective of its
ability to simplify the process of publishing Structured Entity
Descriptions based on the EAV model. Ditto Microdata: Ian Hickson and
Co. opted to make the same religious errors due to their fixation with
"No RDF", yet Microdata provides yet another route to EAV model based
Structured Entity Descriptions with HTTP based Identifiers intact,
especially once they grok that Description Documents reside at URLs,
and that said documents have Subjects named using Generic HTTP scheme
based Identifiers.

> But not with all audiences.
>   

I certainly haven't implied "all audiences".

I am trying to build a bridge of coherence via a long overdue "Linked 
Data" prequel meme. One that reinserts "Linked Data" into an innovation 
continuum.

> For some audiences we need to emphasise different aspects of the
> technology.

Naturally.

But I am more interested in a broader audience of people that already 
understand many of the things that get mangled in the realm of RDF 
gospel preaching.

>  There isn't a single pitch that captures the value because
> different audiences and communities have different focus and different
> problems to solve. 

I don't believe there is a single pitch for anything.

I believe you simply have to be willing to learn to tell the same story 
in different ways, bearing in mind your audience. My audience of 
interest today is a broader community of people that "switch off" the 
moment they encounter the letters "R-D-F".

> Starting from the data model is a useful common
> denominator approach but ultimately value descriptions will be more
> nuanced.
>   

Naturally.

> Even for technical audiences, beginning from EAV or RDF model doesn't
> always help. 

Well, if the technical audience in question doesn't make the connection
between the DBMS realm and Linked Data, of course not. Likewise, if they
don't make the connection between standards based Data Access and Linked
Data, of course not. The Data 3.0 manifesto or emphasis on EAV
cannot resonate with said audience, and it's not who I am actually trying
to speak to either.

> There's a whole community of developers out there who
> don't begin by framing things in terms of abstract models, in fact
> they may not even be familiar with the abstract models behind the
> tools they're using; they're just interested in how to get things
> done: clear practitioner advice and simple illustrations of the power
> of the technology. Adding another level of generalisation to the
> description isn't going to win them over.
>   

See my comments above.

I am much more interested in people that already work with data via
tools, without writing a single line of code. They are used to painting
and sharing queries. Their only problems to date take the following forms:

1. shared queries reside in proprietary binary files
2. applications aren't HTTP savvy, which makes sharing queries very
difficult (in addition to #1).

Examples:

1. MS Query (or any other tool that enables visual query construction)
2. MS Access
3. Entire universe of Report Writers and Business Intelligence Tools
(also MDM in some quarters these days).

Simple contemporary examples:

1. Dabble -- http://dabbledb.com/
2. Indicee -- http://www.indicee.com/
3. Others.

These are products from people that grok EAV, and that have built products
in the past. Add them to RDBMS founders that are only now grokking
Linked Data courtesy of EAV, and I hope you can see where my interests
lie, i.e., I shouldn't have to give these folks an RDF sermon when all
they need to know is EAV + Generic HTTP scheme Identifiers for Names.

> Personally I think we ought to be collectively spending less time on
> worrying about how to pitch the technology and more effort on helping
> practitioners get things done by giving them the tools, guidance and
> support they need.
>   
Well I am not writing about how to pitch Linked Data, and I have no clue 
as to what you mean by a "Linked Data Practitioner". I simply know that 
there are people that need to access heterogeneously shaped data across 
disparate data sources.

I am simply saying to the audience above:

1. We have Structured Data
2. Here is how you make Structured Data (i.e. the underlying model)
3. Here is how you share Structured Data (via Descriptor Documents on an 
HTTP network).

When people understand 1-3 (in many cases making links to what they 
already grok), they can get on with exploiting the kind of 
individual/enterprise Agility levels that real Open (standards 
compliant) Data Access and Integration accords.
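
To make 1-3 a little more concrete, here is a rough Python sketch of EAV
statements whose Names are Generic HTTP scheme Identifiers. Everything in
it is an assumption made up for illustration: the example.com URLs, the
attribute names, and the describe() helper are hypothetical and do not
come from any real product or vocabulary.

```python
from collections import defaultdict

# Hypothetical HTTP identifiers, used purely for illustration.
ALICE = "http://example.com/people/alice#this"
ACME = "http://example.com/orgs/acme#this"
NAME = "http://example.com/schema/name"
WORKS_FOR = "http://example.com/schema/worksFor"
EMAIL = "http://example.com/schema/email"

# Two independent sources, each just a list of
# (entity, attribute, value) statements.
source_a = [
    (ALICE, NAME, "Alice"),
    (ALICE, WORKS_FOR, ACME),
]
source_b = [
    (ACME, NAME, "Acme Corp"),
    (ALICE, EMAIL, "alice@example.com"),
]


def describe(entity, *sources):
    """Gather every attribute/value pair stated about one entity."""
    description = defaultdict(list)
    for source in sources:
        for e, attribute, value in source:
            if e == entity:
                description[attribute].append(value)
    return dict(description)


# Because both sources use the same identifier for Alice, their
# statements mesh without any source-specific glue code.
alice = describe(ALICE, source_a, source_b)
```

The point of the sketch is only that a shared model plus shared HTTP
Names lets the two sources combine; a Descriptor Document would be some
serialization of a result like `alice`, published at a URL.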

It simply isn't Rocket Science, far from it! If anything, it's a simple
context switch at best: a lot of this stuff has been done before (way
back in the '80s). The only thing that's new and novel is the broader
context, courtesy of HTTP ingenuity and ubiquity. No more, no
less IMHO.


BTW - have you done a Google Trends style analysis on "Semantic Web" vs
"Linked Data" vs "OData" vs "GData" [1]? There is a very important
trend line there, and it had a lot to do with why I wrote the Data 3.0
manifesto.

Again, I am more interested in building EAV + Generic HTTP Identifiers
based bridges re. "Linked Data" than preaching RDF Religion.

Links:

1. http://www.google.com/trends?q=%22Semantic+Web%22%2C+%22linked+data%22%2Codata%2Cgdata&ctab=0&geo=all&date=all&sort=1
--  Google Trends for "Semantic Web" vs "Linked Data" vs "OData" vs
"GData".
> Cheers,
>
> L.
>
>   

When all is said and done, I am a LINKER (bridge builder) not a FIGHTER
(aka Religious Zealot) :-)

-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Monday, 19 April 2010 15:29:06 UTC