
RE: Important Question re. WebID Verifiers & Linked Data

From: Peter Williams <home_pw@msn.com>
Date: Thu, 22 Dec 2011 05:00:56 -0800
Message-ID: <SNT143-W33D90D3AD9E2565779968292AA0@phx.gbl>
To: <kidehen@openlinksw.com>, "public-xg-webid@w3.org" <public-xg-webid@w3.org>


I remember, aged 20, choosing to do my undergrad project in an expert system of the day (from some UK university). All it had to do was enable me to represent some rules, and do a backward-chaining processing of rules that recognized when a certain coding of music harmony did or did not fit the style rules of Bach chorales (since my first love is formal musicology, not computing). Remember, in the Holy Roman Empire, musicians were under strict rules about what was proper (a bit like infamous astronomers later). They had to find beauty within the rules, knowing that actual or social torture was just around the corner for failing them.

I gave up after 3 weeks, and just went back to lex/yacc, and 1-token-lookahead grammars. The issue was the level of buy-in required, which was beyond me (and my time, more importantly). All I wanted to do was USE the rules, not learn about rule-based processing. If you were not a programmer of rule systems and rule engines, you could not be an effective user of rule expressions.

I see many parallels here, 30 years on (in the W3C-led semantic web world). Many of the same mistakes are being (were?) made. WebID is a forcing function, in my view. If the semantic web cannot do this (ping a file on the web for a key), then give up.

Date: Thu, 22 Dec 2011 07:41:26 -0500
From: kidehen@openlinksw.com
To: public-xg-webid@w3.org
Subject: Re: Important Question re. WebID Verifiers & Linked Data


  


    
  
  
    On 12/22/11 3:48 AM, Henry Story wrote:
    

      
        On 22 Dec 2011, at 04:42, Kingsley Idehen wrote:
        
        
          
           On 12/21/11 3:45 PM,
            Patrick Logan wrote:
            See below...

              

              On Wed, Dec 21, 2011 at 8:27 AM,
                Henry Story <henry.story@bblfish.net>
                wrote:

                
                  

                    On 21 Dec 2011, at 14:58, Kingsley Idehen wrote:

                    >
                   > Please understand that RDF !=
                    Linked Data. It's just one of the options for
                    creating and publishing Linked Data.

                    

                  
                  I think it would be very nice to have a formal spec on
                  what Linked Data is. We do have a few for RDF.

                
                

                
                Yes, please. I understand the RDF-related
                  specifications. And I understand the general notion of
                  "linked data". I am someone following the WebID
                  effort, and who is contemplating the costs and
                  benefits of supporting it in my products at some
                  point.
                

                
                Unless I see a WebID specification (or a more
                  general "world-wide linked data specification") for
                  how to support linked data beyond RDF, how can I
                  estimate the costs and benefits of supporting linked
                  data beyond RDF?
                

                
                Please understand that "publishing and consuming
                  any and all possible interpretations of 'linked data'"
                  is probably impossible. 
                

                
                So what are the specific requirements?
                

                
                -Patrick
                

                
              
            
            Patrick,

            

            Here are the fundamental requirements:

            

            1. a Data Item (Object or Entity) is uniquely identified by
            a URI based Name

            2. use de-referencable URIs as Names that resolve to
            *descriptor* documents (resources) that describe the URI's
            referent 

            3. descriptor documents (resources) should *consist of* or
            *bear* structured data in the form of eav/spo triple (or
            3-tuple) statements that collectively form a directed graph
            (pictorially) that coalesces around the description
            subject's URI.
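
As a rough illustration of the three requirements above, here is a minimal Python sketch. The example.org URIs and the helper name as_graph are assumptions made up for illustration; the two FOAF property URIs are real vocabulary terms. It shows only that eav/spo statements are plain 3-tuples and that grouping them by subject yields a directed graph around each URI. It is not a statement of how any WebID implementation stores data.

```python
from collections import defaultdict

# Hypothetical Names (URIs) under example.org, used only for illustration.
ME = "http://example.org/people/alice#me"
BOB = "http://example.org/people/bob#me"
# Real FOAF vocabulary terms.
KNOWS = "http://xmlns.com/foaf/0.1/knows"
NAME = "http://xmlns.com/foaf/0.1/name"

# Each statement is a 3-tuple: (entity/subject, attribute/predicate, value/object).
triples = [
    (ME, NAME, "Alice"),
    (ME, KNOWS, BOB),
    (BOB, NAME, "Bob"),
]

def as_graph(statements):
    """Group 3-tuples into a directed graph: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for s, p, o in statements:
        graph[s].append((p, o))
    return dict(graph)

graph = as_graph(triples)
# The edges leaving ME form the description that "coalesces around" that URI.
description_of_me = graph[ME]
```

Nothing in the sketch depends on a particular syntax or wire format; the same tuples could be written down in any notation that can carry three slots per statement.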

            

            You can make eav/spo statements using a variety
            of syntaxes. 

            You can serialize eav/spo bearing resources across the wire
            using a variety of data serialization formats. 

            You can leverage HTTP as a low cost and effective mechanism
            for:

            

            1. de-referencable URI based Names

            2. negotiation of data serialization formats between servers
            and clients (user agents).
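
The second point, negotiation of serialization formats, can be sketched with the Python standard library. This is only an illustration: the function name negotiated_request, the example.org URI, and the particular list of media types are assumptions, not anything the WebID spec requires. The request is built but never sent.

```python
from urllib.request import Request

def negotiated_request(uri, preferred=("text/turtle", "application/rdf+xml", "text/html")):
    """Build (but do not send) an HTTP GET whose Accept header ranks formats."""
    parts = []
    for rank, media_type in enumerate(preferred):
        # Earlier entries get higher q-values; never drop below 0.1.
        q = max(round(1.0 - 0.1 * rank, 1), 0.1)
        parts.append(f"{media_type};q={q}")
    return Request(uri, headers={"Accept": ", ".join(parts)})

# Hypothetical URI, for illustration only.
req = negotiated_request("http://example.org/people/alice")
accept_header = req.get_header("Accept")
```

The client states what it can consume and the server picks a representation, so the choice of serialization stays a matter between the two parties rather than something fixed in advance.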

            

            The WebID spec can require or suggest a number of common
            formats for eav/spo triple transmission as the basis for
            effective bootstrap. 

            The WebID spec should never (overtly or covertly) be a tool
            for fighting or prolonging syntax or format wars re.
            eav/spo triples. 

            The WebID spec should never leak the abstraction inherent in
            URIs by mandating http: scheme URIs.

            The WebID spec should never compromise the fidelity of
            Linked Data by favoring a particular style of
            de-referencable URI.

            The WebID spec is a spec. It shouldn't attempt to teach
            software engineering. 

            

            All of the above is possible without adversely affecting
            WebID. In short, all of this will make WebID attractive to
            much broader developer profiles that extend beyond the RDF
            based Semantic Web community. 

            

            What's the difference between RDF and EAV? 

            

            RDF explicitly (via spec) requires the use of URIs in S, P,
            and O (optionally) slots of triple statements. It also
            handles typed literals and language tags. Basically, it
            caters for locale issues and i18n. 

          
        
        

        
        In fact RDF semantics does not do this. It is just the RDF
          serialisations that do, and for not such bad reasons. RDF
          semantics allows numbers in predicate positions, though these
          don't have much use there. There is even an example in the
          spec about this here: http://www.w3.org/TR/rdf-mt/
      
    
    

    Put differently, RDF introduces (as part of its spec) URIs to EAV.
    That's the fundamental point.

    
      
        

        
        You can map every EAV to SPO, and can in fact infer from
          each. SPO is much cleaner theoretically than EAV as it is
          purely relational, and everything can be mapped down to
          relations in the end.
      
    
    

    S-P-O and E-A-V are 3-tuples with different letters. The use of URIs
    is where RDF starts to be more specific. The aforementioned
    specificity extends to typed literals, language tags, and i18n. 

    

    
      
        

        
        So really all that Kingsley is doing is changing the language
          because he thinks that it is the language that was the problem
          with RDF, where in fact I think there were quite a number of
          other issues that were the problem, such as no decent tooling
          and no understanding from the developers of even REST at the
          time. XML had just come out and it promised to do everything,
          which shows just how syntactically people were thinking at the
          time. 

        
      
    
    

    No, since I know myself pretty well, here is what I am doing:

    

    I am trying to make the issue of Linked Data (foundation layer of
    the Semantic Web Project) more accommodating. Achieving this
    accommodation means introducing missing genealogy to the RDF
    narrative. 

    

    

    
      
        

        
        In the mean time Java came out and .net and the idea that
          multiple languages could be used to program to the same
          machine. That the machine itself could be virtual. And so the
          notion of semantics is very widely spread in the procedural
          space.
      
    
    

    Please!

    

    Semantics has been in computer science since forever. RDF isn't the
    progenitor!

    

    
      
        

        
         So in my view this is the wrong time to shift. 
      
    
    

    The fact that you think I am seeking a shift is symptomatic of the
    fundamental issue we are having re. communications. Put differently,
    your arguments right now are exactly the same as the arguments you
    were attempting to make about slash vs hash URIs yesterday.
    Eventually, you found out that it was down to a bug.  Please re-read
    yesterday's thread. Until you found the bug you were making a
    contradictory argument. You are repeating it again (albeit different
    context) right now.

    

    
      
        It is as if you had asked people about Object Oriented
          programming in 1992. There would have been huge arguments
          about its complexity (C++) or academicity (Eiffel, ...), then
          Java came out and 5 years later you could not program without
          doing OO. 

        
      
    
    

    The problem with OO languages is that Object Theory isn't a
    programming language specific thing. Linked Data is about
    unshackling Object Theory from programming languages. 

    

    Data Objects can exist independent of any programming language,
    operating system, or dbms engine specificity. When all is said and
    done, this is what AWWW is delivering to the world, the final
    unshackling of Data from Code. 

    

    
      
        

        
        OWL is just OO for relations btw.
      
    
    

    OWL is about Semantic Fidelity for relations. These relations take
    the form of eav/spo 3-tuples (triples).

    

    
      

        
           

            EAV isn't as specific as RDF with regards to the items
            above. At the same time EAV is well known, and doesn't carry
            the political baggage of RDF. 

          
        
        

        
        I am not sure there is a Political baggage of RDF. 
      
    
    

    We view the world through completely different "context lenses". I
    know the letters R-D-F carry huge political baggage that continues
    to impede comprehension and adoption of the Semantic Web vision. 

    

    I can separate concepts from syntaxes (programming language or
    markup). You claim you can, but you are always talking about
    parsers. I don't think about parsers when working on WebID, Linked
    Data, etc. I think about: data objects, locations/addresses,
    negotiable representation, relations, and the semantic fidelity of
    relations. 

    

    
      
        All linked data does RDF and it is growing very fast.
          People are understanding that RDF is about semantics and
          relations, and I get very little negative feedback about RDF.
          Even Facebook is publishing Turtle now! 

        
      
    
    

    Again, here is where you are incorrect. Facebook is publishing
    Linked Data and that has zilch to do with Turtle, which is just
    another syntax for expressing eav/spo based triples as well as being
    yet another across-the-wire serialization format. When Facebook
    officially decided to go with Linked Data I wrote a note [1]
    deconstructing what they did, I encourage you to read it.

    

    
      

        
           

            What's the difference between RDF and Linked Data?

            There is nothing in the RDF spec (even as I write) that
            specifies the behavior of URIs used in SPO triples. For
            instance RDF doesn't explicitly distinguish between a URI
            that serves as a Name and a URI that serves as a Resource
            Locator (Address). 

          
        
        

        
        And that is very good. RDF is about semantics and logics.
          Linked Data is a pattern of publication of this data. Linked
          Data still has to be fully specified by the W3C, but it is
          obviously exactly how Tim Berners-Lee intended RDF to be used.
        
      
    
    

    TimBL did not intend for any specific syntax. He had a clear vision
    for InterWeb scale Linked Data. RDF happened to be a vehicle for
    vision manifestation. 

    

    You are still failing to accept that RDF isn't the progenitor. 

    

    
      
        This was not understood by the RDF logicians he employed
          necessarily because they had a different background.
        

        
        

        
          Linked Data is very
            specific about URI behavior. It expects URI based Names and
            Addresses to be distinct.  This is why it's always
            problematic to infer (overtly or covertly) that RDF ==
            Linked Data. The *truth* of the matter is that RDF is one
            approach to constructing directed graph based descriptor
            resources (comprised of eav/spo triples) that result in
            Linked Data at various scales (local area network or wide
            area network e.g., the InterWeb).

          
        
        

        
        RDF is mathematically defined, therefore it covers all ways
          of doing relational stuff. You can of course re-invent it, but
          then it would take no time to map that back to RDF, so your
          time would be wasted.
      
    
    

    Again, you are making a comment short on deeper information. If you
    want to understand what's going on with relations and data I suggest
    you digest an article I stumbled across recently titled: A
    co-Relational Model of Data for Large Shared Data Banks [2].

    
      
        

        
        I will therefore continue speaking in terms of RDF and
          Linked Data. 
      
    
    

    You can speak in terms of RDF and Linked Data. Just don't try to
    infer that RDF is the only option for producing Linked Data since
    that's an utter fallacy. Also, don't claim WebID is about Linked
    Data if it doesn't support Linked Data principles. Again, remember,
    you found a bug yesterday in your code. Prior to finding a bug in
    your verifier, you were heading down a very slippery slope re. your
    arguments about why Linked Data clients should be discriminating
    against a particular style of de-referencable URI. 

    

    
      
        Those are clearly understood, extremely well specced out,
          and there are now 12 years of accumulated knowledge in that
          space. We also have the tools to do things there. 
      
    
    

    I am no stranger to RDF tools. 

    
      
         Plus we are at the W3C here and we can use the vocabulary.
          NASA uses RDF, governments use RDF, companies are using it,
          social networks are using it. All in more and more linked data
          manners.
      
    
    

    And you feel you need to educate me about who is using RDF to do
    real work across industry? 

    

    
      
        

        
        Perhaps some crowds need to be talked to in terms of EAV,
          but we are not going to write a spec in Latin to make the
          Vatican happy either.
      
    
    

    I haven't asked you to put EAV into the WebID spec. I've asked you
    to make up your mind about WebID re. the following:

    

    1. Linked Data

    2. Architecture of the World Wide Web.

    

    Yesterday, you were basically "walking the plank" on both fronts.
    Now you've fixed the bugs in your service, and we just might be back
    on track re. the suggestions you make. Please remember, this thread
    is about WebID Verifiers and Linked Data (which is an application of
    AWWW). Yesterday, you made contradictory suggestions re:

    

    1. URIs  -- slash or hash based HTTP scheme URIs

    2. HTTP -- re. 303 redirection.
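
For readers unfamiliar with the two URI styles at issue, here is a minimal Python sketch of the general Linked Data convention. It does not reproduce the code or the bug discussed in this thread; the function name document_address and the example.org URIs are assumptions for illustration. With a hash URI, the address of the descriptor document is the Name minus its fragment. With a slash URI, the address is only known once the server answers, typically with a 303 See Other redirect.

```python
from urllib.parse import urldefrag

def document_address(name_uri):
    """Return (address, needs_lookup) for a URI used as a Name.

    Hash URI: the address is the URI without its fragment, known locally.
    Slash URI: the address must be discovered by dereferencing the Name,
    e.g. by following a 303 redirect, so needs_lookup is True.
    """
    address, fragment = urldefrag(name_uri)
    if fragment:
        return address, False
    return name_uri, True

# Hypothetical Names, for illustration only.
hash_case = document_address("http://example.org/alice#me")
slash_case = document_address("http://example.org/id/alice")
```

A client that handles both branches does not discriminate against either style of de-referencable URI.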

    

    Links:

    

    1. https://plus.google.com/112399767740508618350/posts/6cqa1Sxk5KV -- What
    Facebook Can Teach Us about Bootstrapping Linked Data at InterWeb Scales.

    2. http://queue.acm.org/detail.cfm?id=1961297 -- A co-Relational
    Model of Data for Large Shared Data Banks.

    

    Kingsley

    
      
        

        
        Henry
        

        
        
          
            Regards,

Kingsley Idehen	      
Founder & CEO 
OpenLink Software     
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen





          
        
      
      

      
        
            
                
                    
                        
                                  Social Web Architect

                                    http://bblfish.net/
                                
                      
                  
              
          
      
      

    
    

    





 		 	   		  
Received on Thursday, 22 December 2011 13:01:31 GMT
