
Re: Comments on Data 3.0 manifesto

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sun, 18 Apr 2010 16:15:42 -0400
Message-ID: <4BCB686E.7030409@openlinksw.com>
To: nathan@webr3.org
CC: public-lod@w3.org
Nathan wrote:
> afaict the focus is on introducing the benefits of EAV (and by extension
> rdf and linked data) as a data model that can be used by us all behind
> the scenes (reaping rich rewards), rather than just mapping our data to
> rdf / owl /linked data (and by implication EAV) at the publishing level.
>
> as more people consume and use linked data, rdf, owl, they come to
> realise the many advantages, and consider making the move to storing all
> data as rdf / linked data and working on it in the same way - I believe
> the Data 3.0 manifesto can be used to introduce the benefits of EAV
> behind the firewall (and help programmers get the model, data,
> understanding right in the first place) then go on to expose this all to
> the world as rdf / linked data.
>
> In many ways it's coming at linked data in the other direction, showing
> the EAV model, then the different ways of exposing the eav model (odata,
> gdata, rdf/linkedata) and people can work it out from there as to which
> is best.
>
> IMHO 1000 developers grokking EAV and adopting it will be of exponentially
> more benefit to the linked data / web 3 movement than 1000 developers
> all reading a few docs and mapping some rectangular data through to
> linked data without really understanding what's going on.
>   

Well said!

Basically, this is about "Linked Data" the prequel.

You can't have Structured Data without a Data Model.

We have a uniform (and innately universal) data model in EAV :-)

Kingsley
> Best,
>
> Nathan
>
> ps: John, that "Rectangular Data" thing is really catching on :-)
>
> pps: whilst talking about bridges, ask any bridge builder about the
> strength of triangles then relate to the trinity exposed via EAV.
>
>
> Jiří Procházka wrote:
>   
>> So essentially, all this is a cover-up maneuver to sell RDF to the
>> people masked as something else, more familiar?
>>
>> If so, I understand why you feel this is necessary, after all the goal
>> is not to sell the customer what he asked for, but what he really wanted
>> but didn't realize or could fully express (this time customer being tech
>> folk).
>> Anyway I'd rather use and try to market RDF as it is; maybe it's a bit too
>> fast for some, but I guess I haven't yet left enough people in utter
>> confusion to try such different ways :)
>>
>> But before proceeding with your plan to fix RDF + Linked Data marketing,
>> I ask you to consider also what was done right in marketing RDF, besides
>> what wasn't.
>>
>> For example, RDF has a clear name (Data 3.0? not a very good name IMHO), the
>> core model is very simple and has been explained very well many times.
>> On the other hand your manifesto sounds a bit too complex, more like a
>> spec than a manifesto. For the effect I think you are aiming for, you need
>> something very simple and striking...
>> Not to mention it is the first time I am hearing about the EAV model; we all
>> are from different backgrounds, so this terminology won't have much of an
>> impact I fear, though it is still good to introduce it to as yet distant
>> communities.... ;)
>>
>> For me the greatest value of RDF and Linked Data lies in semantics - the
>> ontologies (RDFS/OWL), which, as far as I understand it, the EAV model
>> doesn't touch at all, which in my eyes makes it only a bit better than
>> tabular data models ("rectangular" as someone nicely coined some time
>> ago somewhere).
>>
>> Overall it seems to me like building a sand island in the middle of a wide
>> river to ease construction of bridges across it... I guess you have
>> tried building a bridge without the island a few times and it collapsed
>> every time, so I understand why you are building the island. But maybe I
>> have better steel and my bridges would last... maybe...
>> On one hand I am glad we try these various ways, and on the other I keep
>> asking myself if the gain outweighs the price of fragmentation...
>>
>> Best,
>> Jiri Prochazka
>>
>> On 04/17/2010 10:51 PM, Kingsley Idehen wrote:
>>     
>>> John Erickson wrote:
>>>       
>>>> Hi Kingsley!
>>>>
>>>> Reading between the lines, I think I grok where you are trying to go
>>>> with your "manifesto." For it to be an effective, stand-alone document
>>>> I think a few pieces are needed:
>>>>
>>>> 1. What is your GOAL? It should be clearly stated, something like, "to
>>>> promote best-practices for standards-compliant access to structured
>>>> data object (or entity) descriptors by getting data architects to do X
>>>> instead of Y," etc.
>>>>   
>>>>         
>>> Okay, I'll see what I can do.
>>>
>>> This document is really a continuation of a document that's actually
>>> missing from the Web, sadly.
>>>
>>> A long time ago (start of Web 2.0), there was a Data 2.0 manifesto by
>>> Alex James (now at Microsoft), so in classic two-fer fashion I've opted
>>> to kill two birds with a single stone:
>>>
>>> 1. Linked Data incomprehension (Technical and Political)
>>>
>>> 2. Data 2.0 manifesto upgrade and update.
>>>
>>>       
>>>> 2. What is your MOTIVATION? I think this is implicit in your current
>>>> text --- your argument seems to be that TBL's "Four Principles" are
>>>> not enough --- but you need to make your motivations explicit and
>>>> JUSTIFY them. If TBL's principles are too nebulous, explain concisely
>>>> why and what the implications are. Keep in mind that they seem to be
>>>> "good enough" for many practitioners today. ;)
>>>>   
>>>>         
>>> My motivation is simply this: Get RDF out of the way!
>>> The "RDF incomprehension cloud" is only second to what's heading across
>>> Northern Europe from Iceland, re. obscuring a myriad of routes to Linked
>>> Data comprehension.
>>>
>>> How can we spend 12+ years on the basic issue of EAV + de-referencable
>>> identifiers? Compounded by poor monikers such as: Information Resource
>>> and Non-Information Resource. We have Data Objects (Entities, Data Items
>>> etc.) and their associated Descriptor Documents (Representation Carriers
>>> or Senses), it's always been so!
>>>
>>> Note,  RDF "the Data Model" doesn't exist in the minds of the broader
>>> Web audience (I am not sending an inbound meme to the Semantic Web
>>> Community, my meme is being beamed to a wider audience that's taking way
>>> too long to grok the essence of the Linked Data matter).
>>>
>>> I (and many others) am utterly fed up with trying to accentuate the
>>> fact that RDF is based on a Graph Data Model. The initial "RDF/XML is
>>> RDF" conflation has dealt a fatal blow to RDF re. broad audience
>>> communications.
>>>
>>> EAV has been with us forever, people already use applications that are
>>> based on this model, across all major operating systems. Why not
>>> triangulate from this position (top down) instead of bottom up (which
>>> ultimately reeks of NIH rather than a Cool Tweak)?
>>>
>>>       
>>>> 3. Be SPECIFIC about what practitioners must do moving forward. I
>>>> think you've made a good start on this, to the extent that you have
>>>> lots of "SHOULDS." I would argue that more specificity of a different
>>>> kind is needed; if data architects SHOULD be following more abstract
>>>> EAV conceptualizations, what exactly should they do in practice?
>>>>   
>>>>         
>>> Hmm.. will see what I can do.
>>>
>>> This is a seed document (I hope). Anyone (including yourself) should be
>>> able to add perspective to it etc..
>>>       
>>>> Finally, on the deeper question of motivation, I suggest that while a
>>>> historical argument can be made that RDF is likely a subset or special
>>>> case of EAV, the community has developed convenient and familiar
>>>> languages for expressing RDF (such as N3 and Turtle); practitioners
>>>> are much less familiar with EAV. Does the community really lose
>>>> anything by using RDF as its shorthand?
>>>>   
>>>>         
>>> RDF is a variant of EAV courtesy of Generic HTTP scheme Identifiers for
>>> Names.
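The "EAV plus HTTP identifiers" point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `row_to_eav` and `to_ntriples_line`, the `example.org` URIs, and the `#this` suffix are all hypothetical, every value is treated as a plain literal, and the escaping covers only backslashes, quotes, and line breaks.

```python
def row_to_eav(base, key, row):
    """Turn one rectangular record into (entity, attribute, value) tuples.

    The entity and attribute names are minted as HTTP URIs under `base`
    (an assumed naming scheme, not one taken from the thread).
    """
    entity = f"{base}/{row[key]}#this"
    return [
        (entity, f"{base}/schema/{column}", str(cell))
        for column, cell in row.items()
        if column != key and cell is not None  # skip the key and empty cells
    ]


def to_ntriples_line(entity, attribute, value):
    """Serialize one EAV tuple as an N-Triples line with a plain literal value."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{entity}> <{attribute}> "{escaped}" .'


# Hypothetical rectangular record; "fax" is empty and produces no tuple.
row = {"id": "42", "name": "Ada", "city": "London", "fax": None}

triples = row_to_eav("http://example.org/customers", "id", row)
lines = [to_ntriples_line(*t) for t in triples]

for line in lines:
    print(line)
```

The same list of tuples could just as well be rendered in another EAV-based format; only the last step is RDF-specific.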
>>>
>>> Nothing in what I am saying or seeking dislocates RDF from the big
>>> picture here. It just isn't the item for starting conversations with
>>> people outside the Semantic Web Community (still very small in the grand
>>> scale of things).
>>>
>>> I am simply seeking to extend the picture (coherently) without
>>> unnecessary RDF specificity.
>>>
>>> OData, GData, Core Data, are all EAV model based, in the very worst case
>>> they make RDF based Linked Data easier to generate, thus a win-win.
>>> Sadly, that isn't how these other EAV based approaches are perceived,
>>> the gut instinct is to pick them apart as not conforming to the RDF
>>> based Linked Data principles (btw -- when TimBL added RDF and SPARQL to
>>> the meme, he basically put a crack in the pot IMHO).
>>>
>>>       
>>>> Perhaps you can suggest a pattern within current RDF practice that
>>>> more strongly enforces EAV principles?
>>>>   
>>>>         
>>> RDF is all fine re. EAV.
>>> It's about getting other communities (e.g. Web 2.0) to adopt and exploit
>>> EAV via use of de-referencable Identifiers (Names).
>>>
>>> What I am hoping is that we just tweak how we introduce Linked Data,
>>> establish the fact that we have a common data model at the base, a model
>>> that we already use (in our heads) and across almost every application
>>> we've worked with to date. Then show how Linked Data is ultimately about
>>> deconstructing application data silos so that we have a much richer
>>> corpus of structured data, across a myriad of boundaries (application,
>>> OS, network etc..), that is amenable to "data meshing" rather than "data
>>> mashing".
>>>
>>> IMHO. The Linked Data Value Proposition and Elevator Pitch is simply
>>> this: Individual and/or Enterprise Agility via Data Silo Deconstruction.
>>>
>>> Kingsley
>>>       
>>>> John
>>>>
>>>> On Sat, Apr 17, 2010 at 12:37 PM, Kingsley Idehen
>>>> <kidehen@openlinksw.com> wrote:
>>>>  
>>>>         
>>>>> Richard Cyganiak wrote:
>>>>>    
>>>>>           
>>>>>> Hi Kingsley,
>>>>>>
>>>>>> Regarding your blog post at
>>>>>>
>>>>>> http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1624
>>>>>>
>>>>>>
>>>>>> Great job -- I like it a lot, it's not as fuzzy as Tim's four principles,
>>>>>> not as mired in detail as most of the concrete literature around linked
>>>>>> data, and on the right level of abstraction to explain why we need to do
>>>>>> certain things in linked data in a certain way. It's also great for
>>>>>> comparing the strengths and weaknesses of different data exchange stacks.
>>>>>>       
>>>>>>             
>>>>> Thanks, happy it's resonating.
>>>>>
>>>>> RDF has inadvertently caused mass distraction away from the fact that a
>>>>> common Data Model is the key to meshing heterogeneous data sources. People
>>>>> just don't "buy" or "grok" the data model aspect of RDF, so why continue
>>>>> fighting this battle, when all we want is mass comprehension, however we
>>>>> get there.
>>>>>    
>>>>>           
>>>>>> A few comments:
>>>>>>
>>>>>> 1. I'd like to see mentioned that identifiers should have global scope.
>>>>>>       
>>>>>>             
>>>>> Yes, will add that emphasis for sure. I guess "Network" might not
>>>>> necessarily emphasize that strongly enough.
>>>>>    
>>>>>           
>>>>>> 2. I'd prefer a list of the parts of a 3-tuple that reads:
>>>>>>
>>>>>>     - an Identifier that names an Entity
>>>>>>     - an Identifier that names an Attribute
>>>>>>     - an Attribute Value, which may be an Identifier or a Literal
>>>>>>       (typed or untyped).
>>>>>>
>>>>>>   This avoids using the new terms “Entity Identifier” and “Attribute
>>>>>>   Identifier”.
>>>>>>       
>>>>>>             
>>>>> No problem.
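A minimal sketch of the 3-tuple as listed above, written in Python. The class names `Identifier`, `Literal`, and `Triple`, the field names, and the `example.org` URIs are assumptions made for illustration; they do not come from the manifesto itself.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Identifier:
    """A globally scoped name, e.g. an HTTP URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain value; `datatype` stays None for an untyped literal."""
    value: str
    datatype: Optional[Identifier] = None


@dataclass(frozen=True)
class Triple:
    entity: Identifier                 # an Identifier that names an Entity
    attribute: Identifier              # an Identifier that names an Attribute
    value: Union[Identifier, Literal]  # an Identifier or a Literal


# Hypothetical example data; the URIs are placeholders.
alice = Identifier("http://example.org/people/alice#this")
bob = Identifier("http://example.org/people/bob#this")
name = Identifier("http://example.org/schema/name")
knows = Identifier("http://example.org/schema/knows")

triples = [
    Triple(alice, name, Literal("Alice")),  # value is an untyped Literal
    Triple(alice, knows, bob),              # value is another Identifier
]

# Tuples whose value is an Identifier are the ones that link entities together.
links = [t for t in triples if isinstance(t.value, Identifier)]
```

Keeping the value as either an `Identifier` or a `Literal` is what lets a consumer tell a link to another entity apart from a plain data value.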
>>>>>    
>>>>>           
>>>>>> 3. “Structured Descriptions SHOULD be borne by Descriptor Resources” -- I
>>>>>> think this one is incomprehensible, because “to bear” is such an unusual
>>>>>> verb and has no clear connotations in technical circles. I'd encourage a
>>>>>> different phrasing.
>>>>>>       
>>>>>>             
>>>>> Will think about that, getting the right phrase here is challenging, so I
>>>>> am naturally open to suggestions etc..
>>>>>
>>>>>    
>>>>>           
>>>>>> 3b. Any chance of talking you into using “Descriptor Document” rather than
>>>>>> “Descriptor Resource”?
>>>>>>       
>>>>>>             
>>>>> No problem, "Descriptor Document" it is :-)
>>>>>    
>>>>>           
>>>>>> 4. One thing that's left unclear: Can a Descriptor Resource carry multiple
>>>>>> Structured Entity Descriptions or just a single one?
>>>>>>       
>>>>>>             
>>>>> Descriptor Documents are compound in that they can describe a single Entity
>>>>> or a Collection.
>>>>>    
>>>>>           
>>>>>> 5. Putting each term in quotes when first introduced is a good idea and
>>>>>> helps -- you did it for the first few terms but then stopped.
>>>>>>       
>>>>>>             
>>>>> Writer's exhaustion I guess, will fix.
>>>>>
>>>>>    
>>>>>           
>>>>>> 6. I'm tempted to add somewhere, “Descriptor Resources are Entities
>>>>>> themselves.” But this might be a purposeful omission on your part?
>>>>>>       
>>>>>>             
>>>>> Yes, this is deliberate because I am trying to say: "Referent" is the
>>>>> "Thing" you describe by giving it a "Name" so, anything can be a "Referent"
>>>>> including a "Document" (which has always been problematic in general RDF
>>>>> realm work, e.g. the failure to make links between a ".rdf" Descriptor
>>>>> Document and the actual "Entity Descriptions" it contains via
>>>>> "primarytopic", "describedby", and other relations).
>>>>>    
>>>>>           
>>>>>> 7. The last point talks about a “Structured Representation” of the
>>>>>> Referent's Structured Description. The term hasn't been introduced.
>>>>>> Shouldn't this just read “Descriptor Resource carrying the Referent's
>>>>>> Structured Description”?
>>>>>>       
>>>>>>             
>>>>> Yes, so basically this is: s/bear/carry/g  :-)
>>>>>    
>>>>>           
>>>>>> What's your preferred name for the entire thing? I'm tempted to call it
>>>>>> “Kingsley's networked EAV model” or something like that. Do you insist on
>>>>>> “Data 3.0”?
>>>>>>       
>>>>>>             
>>>>> Well EAV is old, and one of my real inspirations for hamming its relevance
>>>>> to Linked Data is the fact that over the years I've spoken with too many
>>>>> people that grok EAV but never connected it to the Semantic Web Project, or
>>>>> the more recent Linked Data meme.
>>>>>
>>>>> Imagine talking to founders of companies like Ingres, Informix, MySQL etc..,
>>>>> and witnessing them not making the EAV model connection; especially when you
>>>>> can't actually write a DBMS engine without comprehension of EAV,
>>>>> Identifiers, and Data representation (simple or complex data structure). How
>>>>> ironic!
>>>>>
>>>>>    
>>>>>           
>>>>>> Best,
>>>>>> Richard
>>>>>>       
>>>>>>             
>>>>> Thanks for the great feedback, I think we're getting closer to the global
>>>>> epiphany we all seek !!
>>>>>
>>>>> -- 
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kingsley Idehen
>>>>> President & CEO, OpenLink Software
>>>>> Web: http://www.openlinksw.com
>>>>> Weblog: http://www.openlinksw.com/blog/~kidehen
>>>>> Twitter/Identi.ca: kidehen


-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Sunday, 18 April 2010 20:16:15 UTC
