Re: is linked data about RDF or EAV or just structured data? from Reza B'Far (Oracle) on 2012-08-08 (public-ldp-wg@w3.org from August 2012)

From: Reza B'Far (Oracle) <reza.bfar@oracle.com>
Date: Wed, 08 Aug 2012 11:19:43 -0700
To: public-ldp-wg@w3.org
Message-ID: <5022ADBF.7030300@oracle.com>
Andy -

See below -

[Andy]
Finding the balance of flexibility and commonality is important.  I 
don't think we have advanced things if we end up with a metamodel but 
isolated islands of connected apps because different groupings use 
different serializations.
[Reza]
I think this is a key statement that I don't agree with and would like 
to see how many of the people here do or do not agree with. My point is 
not religious:  I don't care about RDF one way or the other.  What I'm 
saying is that it's completely impractical if we're proposing that all 
the people in the world who have existing data in much more prevalent 
formats are to convert their serialization models to RDF/XML.  I think 
this, by itself, will make the probability of wide adoption of this 
standard go close to 0.  It's like asking people to go convert all their 
data.  It's impractical.  Won't happen.  Costs too much.  Are you saying 
that we're going to have lots of large data providers all of a sudden say 
"wow, there is this cool new standard, let me go spend a billion dollars 
to convert all my data to it so that it can be linked to and I can go 
link to other people"?  I think we need to consider the adoption 
ramifications and think about why Sem Web standards have experienced 
less success than other W3C/TimBL standards such as HTTP/HTML/etc.

I use Prov-DM as an example.  I think Prov and HLCS have both done an 
incredible job in building a semantic standard that has a good 
probability of being adopted.  Prov, for example, is not saying "don't 
do RDF"... it just separates the conceptual model (Prov-DM) from the 
serialization, provides 2 serializations (Prov-N and Prov-O), and says 
you can have your own serialization if you want. I would say 90% of the 
value is still in Prov-DM which provides a conceptual model so that 
implementers think about the structural design of their systems, how to 
deal with time and space complexity (real problems), and other 
ramifications that the implementation of the standard will impose on 
their systems/products/etc.  This wasn't something that I came up with, 
rather something that the academics on Prov came up with, but I give 
them huge props for that [and incidentally, can we get some academics to 
make comments on this thread?  Any time you get only commercial folks 
talking on threads, I become very wary about motivations, etc.]
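
As a rough illustration of what separating a conceptual model from its serializations can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the Statement class, its field names, and the two serializer functions are hypothetical and are not taken from Prov-DM, Prov-N, Prov-O, or the LDP submission. The second serializer only imitates the shape of an N-Triples line and does no escaping.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical conceptual model: one statement about a resource.
# The model says nothing about how it is written to the wire.
@dataclass(frozen=True)
class Statement:
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    value: str      # plain literal value


def to_json(statements):
    """One possible serialization: a JSON array of objects."""
    return json.dumps([asdict(s) for s in statements])


def from_json(text):
    """Read the JSON serialization back into the same model."""
    return [Statement(**d) for d in json.loads(text)]


def to_ntriples_like(statements):
    """Another possible serialization: N-Triples-style lines.

    Simplified for illustration: no escaping, literals only.
    """
    return "\n".join(
        f'<{s.subject}> <{s.predicate}> "{s.value}" .' for s in statements
    )
```

Both serializers work from the same Statement objects, so a provider holding JSON today could adopt the model first and add a second serialization later.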
Best.

On 8/8/12 6:33 AM, Andy Seaborne wrote:
> (maybe out of date ...)
>
> On 07/08/12 21:25, Reza B'Far (Oracle) wrote:
>> Ok. Thanks.  I think we're making progress.  Here is my case:
>>
>> The largest repositories of data that exist today are NOT provided as
>> RDF so far as I know (clearly, this is a subjective statement and open
>> to being corrected).  For example, Data.gov which is a giant repository
>> of data, is not in RDF.  While RPI does provide facilities to make it
>> into RDF, there are licensing issues, etc. around it
>> (http://creativecommons.org/licenses/by-sa/3.0/ for RPI's version).
>>
>> So, my belief, at least, is that it would actually be more beneficial to
>> the entire effort and adoption of the spec if RDF/XML is made optional
>> because it will entice implementers to "back-into" full implementation
>> versus having to do waterfall-style full implementation which may be
>> impractical for smaller entities and/or projects that need to show value
>> to get traction.  So, the concern is rooted in at least one practical
>> perspective.  If we only have a handful of large entities and niche
>> players implement a standard, it's not a very successful standard.
>>
>> There are huge benefits to a standard without requiring a serialization
>> format: namely removal of all abstraction impedance mismatches that can
>> lead to data loss, etc.  I see that as the main benefit of a standard
>> for Linked Data versus the serialization recommendation.  In practical
>> terms, I may need to hook into different systems and write parsers, but
>> the parsers are trivial.  I'm not advocating that we encourage this, but
>> that we allow it.
>>
>> Best.
>
> Finding the balance of flexibility and commonality is important. I 
> don't think we have advanced things if we end up with a metamodel but 
> isolated islands of connected apps because different groupings use 
> different serializations.
>
> Imposing some concrete commonality is a cost/restriction in 
> flexibility and for other reasons (including getting people to agree 
> on the common data format!)
>
> The question is whether that commonality enables a larger ecosystem 
> and gets its pay-back that way.
>
> And I'm not proposing everything is in RDF - if just the 'record' part 
> is (and not the 'item' part), then generic services can be provided 
> which do not depend on adapters/extensions/translators.
>
> Where the balance of universally understood and payload-specific is, I 
> don't know.
>
> The submission has RDF as both 'record' and 'item'.
>
> Another example would be an image library.  The items are JPEGs. 
> POSTing a JPEG to a collection automatically generates the record; 
> there is no need for all interactions to explicitly include record and 
> item. The JPEGs may even be stored in one place (key-value store for 
> scale) and the info about them in another (hierarchical naming).
>
> But at least a common way to ask about an item by URI can be provided 
> by naming the record part.
>
>
> The submission UCR document makes interesting points about application 
> integration; only partial successes have occurred with:
>
> [[
> Implement an API for each application, and then, in each application, 
> implement “glue code” that exploits the APIs of other applications to 
> link them together.
>
> Design a single database to store the data of multiple applications, 
> and implement each of the applications against this database. In the 
> software development tools business, these databases are often called 
> “repositories.”
>
> Implement a central “hub” or “bus” that orchestrates the broader 
> business process by exploiting the APIs described previously.
> ]]
>
> The second is limited because of deciding the schema ... which makes 
> it (abstractly) little different to the agreed API approach.
>
> I see the Linked Data (with RDF) approach as an attempt to increase 
> the commonality and enable a new kind of integration.
>
>     Andy
>
> 'record' and 'item' aren't very good names but at least they get away 
> from metadata.
>
>
>
Received on Wednesday, 8 August 2012 18:21:01 UTC