Re: LinkedData != RDF ? from Gregg Kellogg on 2011-05-20 (public-linked-json@w3.org from May 2011)

From: Gregg Kellogg <gregg@kellogg-assoc.com>
Date: Thu, 19 May 2011 20:24:44 -0400
To: Sandro Hawke <sandro@w3.org>
CC: Kingsley Idehen <kidehen@openlinksw.com>, "public-linked-json@w3.org" <public-linked-json@w3.org>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <4E2C74C7-F3C5-4ECD-835B-E360B25F6E29@kellogg-assoc.com>
On May 19, 2011, at 4:20 PM, Sandro Hawke wrote:

> On Thu, 2011-05-19 at 17:11 -0400, Kingsley Idehen wrote:
>> On 5/19/11 4:37 PM, Gregg Kellogg wrote:
>>> Thinking about Manu's suggestion that JSON-LD might not necessarily need to be RDF I came up with some thoughts about what an alternative Linked Data representation might look like.
>>> 
>>> Given that the RDF WG is really just doing minor tweaks to RDF, and not addressing the problems with greater adoption of RDF in the web development community, this might indeed be the way to go. A new Linked Data model could really address the needs of developers without necessarily being constrained by the formalism of RDF and OWL. If people are interested in this, it would be good to discuss. Here are some of my ideas for an alternative to RDF:
>>> 
>>> * The model should be based on the notion of graphs, similar to RDF, but where the semantics are more Class/Object based, rather than predicate based (i.e., I define a class definition with specific properties and class inheritance/implements more similar to Ruby/Python, rather than being predicate based).
>>> 
>>> * The model should naturally be processed in web-like languages (JavaScript, Ruby, Python, etc.).
>>> 
>>> * The need to manage multiple ontologies/vocabularies is a real problem for adoption of RDF in many areas. Something that allows a local definition of a vocabulary where equivalence to OWL ontologies might also be defined, would work better with the actual practice of vocabularies in the wild. In JSON-LD, @vocab allows for local definition of these vocabularies, and the RDFa @profile mechanism might provide a way to define these remotely.
>>> 
>>> * RDFS/OWL entailment along with other entailment regimes can be difficult for people to master, and can be computationally expensive. A way to define property type and cardinality constraints directly within the vocabulary definition could also make this simpler for real world applications. Ruby on Rails ActiveRecord::Associations has_many, has_one, belongs_to type are probably sufficient for cardinality, and the @coerce semantics a reasonable way of defining type constraints. Other type identities might be defined by expressing equivalence with native language types (e.g., Integer, DateTime, etc.) or to a user-defined class. Properties of these classes would be defined using WebIDL.
>>> 
>>> * Do away with blank nodes by defining a process for generating hash references relative to @base. BNodes are really a bad idea, in my mind. In SPARQL, they're overloaded with non-distinguished variables. It would be better to define a mechanism for generating anonymous nodes when necessary, as frags off of @base. For example:
>>> 
>>> 	{ "@context":	{ "@genid":	"#_[a-z]{3}[0-9]+" } }
>>> 
>>> Could be a way to define a regular expression used to recognize and/or generate anonymous nodes (recognition might not be important for any new data model).
>>> 
>>> * I've promoted a different notion of lists myself for some time, something similar to the OrderedList ontology [1]. The fact is that graphs simply don't provide a way to define properties of edges (such as ordering). In many systems, the way to do this is through a helper class. For example, in ActiveRecord, consider the following:
>>> 
>>> 	has_many	:orderings
>>> 	has_many	:ordered_elements, :through =>  :orderings
>>> 
>>> is a way of defining that an "orderings" class is used to define the properties of "ordered_elements", such as an "order" attribute. We could potentially do this using the proposed list syntax, but leave the definition of the list semantics to something defined within @context:
>>> 
>>> {
>>> 	"@context":	{
>>> 		"@base": "..",
>>> 		"olo":	"http://purl.org/ontology/olo/core#",
>>> 		"ex":	"http://example.com/",
>>> 		"@list": {
>>> 			"@through":	"olo:Slot",
>>> 			"@index":	"olo:index",
>>> 			"@item":		"olo:item",
>>> 			"@property":	["ex:playlist"]
>>> 		},
>>> 		"@coerce": { "xsd:anyURI":	"ex:playlist" }
>>> 	},
>>> 	"@":		"",
>>> 	"dc:title":	"My Album",
>>> 	"ex:playlist":	[[ "#track1", "#track2" ]]
>>> }
>>> 
>>> This might provide some semantics for how the elements of a list are to be defined; the result could be something like the following:
>>> 
>>> 	<>  dc:title "My Album";
>>> 		ex:playlist
>>> 			[a olo:Slot; olo:index 1; olo:item <#track1>],
>>> 			[a olo:Slot; olo:index 2; olo:item <#track2>] .
>>> 
>>> Without defining @list on a property, the semantics would devolve to RDF List, for example. Conceivably, other list vocabularies could also be used, and we could define more properties on the list edges by adding property definitions to the join class definition.
>>> 
>>> RDF's formalism and logic-based roots are a double-edged sword. Finding a middle ground between a rigorous definition and a common-sense Linked Data definition will be difficult, but may be what's required to make these principles relevant to a wider community.
>>> 
>>> Gregg
>>> 
>>> [1]: http://smiy.sourceforge.net/olo/spec/orderedlistontology.html
>>> 
>>> 
>>> 
>> Gregg,
>> 
>> Yes, LinkedData != RDF.
> 
> Says you.
> 
>> That's the truth,
> 
> In your mind, perhaps.
> 
>> That was the state of affairs when TimBL penned his initial AWWW 
>> oriented Linked Data meme. Sadly, the meme was revised  and has 
>> regressed ever since, courtesy of RDF and SPARQL insertion as mandatory 
>> standards rather than optional implementation details.
> 
> When Tim minted the new term, "Linked Data", and began to popularize it,
> he avoided an explicit mention of RDF because there were a lot of people
> who were rejecting RDF without having a clue what it was.  He thought
> people might come to understand and accept the concepts if they come to
> them from a different direction.  Once they got the idea, they'd realize
> RDF itself wasn't a problem.
> 
> At some point, I noticed some people using the term "Linked Data" to
> refer to some non-RDF data.  I pointed this out to Tim, asking if he
> really meant to allow non-RDF data as part of his idea of "Linked Data".
> He said he meant it to be RDF-only, and at that point, to clarify, he
> added the reference to RDF.
> 
>> RDF didn't invent the triple.
> 
> Indeed.   Triples are pretty basic.
> 
>> All we need is a simple Graph Pictorial for expressing relations using 
>> 3-tuples (EAV/SPO).
>> 
>> The schema is Conceptual and based on Logic. Nothing to do with Syntax.
>> 
>> The logic is basic: Observation Subjects are Qualified by Names. Their 
>> descriptions take the form of Attribute=Value pairs that coalesce around 
>> a Qualified Subject. Observations Subjects exist in Quantifies.
>> 
>> JSON can be used to express Linked Data without any RDF overhead. That's 
>> always been the case.
> 
> What "RDF Overhead"?    RDF is just triples, with a few simple
> clarifications: 
>  - objects can optionally have ids
>  - when they do, those ids are URIs
>  - some objects can be data values (literals), with types, if you want
>  - some objects can be text with language tags, if you want
>  - the relationships/properties are identified with URIs
> 
> What's weird or complex or non-obvious about that, aside from the use of
> URIs, which is clearly part of Linked Data?   
> 
> I mean, maybe the language tags are a bit weird, but people who work on
> multilingual systems tell me they are very important.    Given I've
> spent the last two weeks in Luxembourg and Bilbao, each of which seems
> to have two primary languages which I don't really speak, I'm especially
> sympathetic.
> 
> (Yeah, and, okay, the handling of sequences in RDF is pretty awkward,
> but there are some workarounds, and maybe one of them will become
> a standard soon.)
> 
>> RDF is about Syntax for expressing Semantics. Trouble is that mangled 
>> narratives inadvertently project RDF as being the very progenitor of 
>> Semantic Graphs 
> 
> I'll certainly agree mangled narratives are a problem.....
> 
>> or the sole option for InterWeb scale Linked Data graphs.
> 
> I know I'm biased, but I think if you're going to propose something
> that's a lot like RDF, but subtly different, it has to be *extremely*
> well motivated.    This seems a bit like proposing your own version of
> ASCII with a few characters different.
> 
>> I am hoping JSON-LD arrives at syntax for EAV graphs pronto. It should 
>> be dead simple modulo RDF.
> 
> I'm all for dead simple.  RDF, especially if you use it in certain
> styles (what I've called "page mode" or "simplified RDF") is dead
> simple.
> 
>    -- Sandro

My thought was to provide a way of defining the semantics of a serialized graph through something more direct than OWL ontologies and entailments. These regimes just aren't useful to the average (or even sophisticated) web developer. Allowing for an implementation of the semantics, along with a way to map it back to OWL/RDF might provide a way forward.

In my mind, it should be easy to state the schematic rules of a given piece of JSON (or other serialization) using common methods as building blocks. For example, a core set of methods could be defined in WebIDL to implement core operations such as type entailment and attribute cardinality. Perhaps OWL-CL could be adapted as a way of formalizing the operational semantics of such methods. But, for a developer, it should be as easy as creating a JavaScript prototype or a Ruby/Python class definition to define the basic attributes and linkages of objects. Such a system SHOULD have a lossless mapping to RDF/OWL, but I believe that should be a secondary consideration. For example, a WebIDL description of a Playlist might be something such as the following:

    module OLO {
      interface OrderedList {
        attribute Slot[] slot;
        readonly attribute unsigned long length;
      };
      interface Slot {
        readonly attribute Slot? previous;
        readonly attribute unsigned long index;
        readonly attribute object item;
        readonly attribute Slot? next;
        readonly attribute OrderedList ordered_list;
      };
    };

    module Release {
      interface Playlist : OLO::OrderedList {
        attribute Track[] track; // Ordered tracks, implemented via slot indirection
      };
      interface Slot : OLO::Slot {
        readonly attribute Track item;
      };
      interface Track {
        attribute DOMString title;
        attribute unsigned long long duration;
        //...
      };
    };
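
As a rough illustration of the "as easy as a Python class definition" point, the interfaces above might correspond to plain classes along these lines. This is only a sketch under assumptions: the class names follow the IDL, the slot position is assumed to correspond to olo:index, the `from_tracks` helper is hypothetical, and the `ordered_list` back-reference is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    title: str
    duration: int  # "unsigned long long" in the IDL; the unit is not stated there


@dataclass
class Slot:
    index: int  # 1-based position, assumed to correspond to olo:index
    item: Track
    # Neighbour links are kept out of repr/eq so linked slots stay easy to print
    previous: Optional["Slot"] = field(default=None, repr=False, compare=False)
    next: Optional["Slot"] = field(default=None, repr=False, compare=False)


@dataclass
class Playlist:
    slot: List[Slot] = field(default_factory=list)

    @classmethod
    def from_tracks(cls, tracks):
        """Hypothetical helper: wrap each track in a Slot and link neighbours."""
        slots = [Slot(index=i, item=t) for i, t in enumerate(tracks, start=1)]
        for earlier, later in zip(slots, slots[1:]):
            earlier.next, later.previous = later, earlier
        return cls(slot=slots)

    @property
    def length(self):
        return len(self.slot)

    @property
    def track(self):
        # Ordered tracks, reached through slot indirection
        return [s.item for s in sorted(self.slot, key=lambda s: s.index)]
```

A playlist built with `Playlist.from_tracks([Track("Track 1", 1234), Track("Track 2", 5678)])` exposes the ordered tracks through `track` while the ordering itself lives on the slots.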

Considering the previous definitions, we might now have a *vocabulary* that associates CURIEs with interfaces/attributes:

    {
      "@vocab": {
        "Release":  "http://example.org/idl-def/release#",
        "playlist": "http://example.org/idl-def/release#playlist",
        "Playlist": "http://example.org/idl-def/release#Playlist",
        "track":    "http://example.org/idl-def/release#Playlist.track",
        "Track":    "http://example.org/idl-def/release#Track",
        "olo":      "http://purl.org/ontology/olo/core#",
        "Slot":     "http://purl.org/ontology/olo/core#Slot",
        "slot":     "http://purl.org/ontology/olo/core#slot",
        "length":   "http://purl.org/ontology/olo/core#length",
        "item":     "http://purl.org/ontology/olo/core#item",
        "@base":    "http://example.com/my_release",
        "@list": {
          "@through": "olo:Slot",
          "@index":   "olo:index",
          "@item":    "olo:item",
          "@property":  ["playlist"]
        },
        "@coerce":  {
          "Track":  "olo:item"
        }
      },
      "@":      "",
      "a":      "Release",
      "playlist":  [[
        {
          "@":              "track1",
         //  "a":              "Track",		// Can be implied from @vocab/IDL semantics that this is a Track
          "Track.title":    "Track 1",
          "Track.duration": 1234
        },
        {
          "@":              "track2",
         //  "a":              "Track",		// Can be implied from @vocab/IDL semantics that this is a Track
          "Track.title":    "Track 2",
          "Track.duration": 5678
        }
      ]]
    }
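
To make the intended @list handling a little more concrete, here is a minimal sketch of how a processor might expand a `[[...]]` value into slot statements. The function name, the tuple representation, and the `_:slotN` labels are all assumptions made for illustration; nothing here is defined by JSON-LD or by the Ordered List Ontology.

```python
def expand_list(subject, prop, items, list_def):
    """Expand a [[...]] list value into (subject, predicate, object) tuples.

    list_def stands for the "@list" entry of the context. How a real
    processor would interpret it is an assumption made for this sketch.
    """
    if prop not in list_def["@property"]:
        # Properties without an @list definition would be handled some
        # other way (for example as an RDF List); not covered here.
        raise ValueError(f"no @list definition applies to {prop!r}")

    triples = []
    for position, item in enumerate(items, start=1):
        slot = f"_:slot{position}"  # hypothetical label for the join node
        triples.append((subject, prop, slot))
        triples.append((slot, "a", list_def["@through"]))
        triples.append((slot, list_def["@index"], position))
        triples.append((slot, list_def["@item"], item["@"]))
    return triples


list_def = {
    "@through": "olo:Slot",
    "@index": "olo:index",
    "@item": "olo:item",
    "@property": ["playlist"],
}
tracks = [{"@": "track1"}, {"@": "track2"}]

for triple in expand_list("", "playlist", tracks, list_def):
    print(triple)
```

Each list element yields one join node carrying the type, the index, and the item, which is the same shape as the has_many :through pattern.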

A lot of hand waving here, but basically the idea would be to use WebIDL to define the interfaces of different classes. This then allows for a way to actually describe the semantics of the data using a standardized definition language that can allow for implementations in a variety of different languages. Thus, the implementation of the semantics could travel with the data itself by matching an implementation of the IDL description to the definition.
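
One way to picture "matching an implementation of the IDL description to the definition" is a lookup from the identifier of an interface to a local class. The sketch below is purely illustrative: the `implements` decorator, the `IMPLEMENTATIONS` registry, and the `instantiate` function are hypothetical names, not part of WebIDL or JSON-LD.

```python
# Hypothetical registry mapping interface identifiers to local classes
IMPLEMENTATIONS = {}


def implements(iri):
    """Register the decorated class as the implementation of `iri`."""
    def register(cls):
        IMPLEMENTATIONS[iri] = cls
        return cls
    return register


@implements("http://example.org/idl-def/release#Track")
class Track:
    def __init__(self, title, duration):
        self.title = title
        self.duration = duration


def instantiate(type_iri, **attributes):
    """Build an object for type_iri, falling back to a plain dict."""
    cls = IMPLEMENTATIONS.get(type_iri)
    if cls is None:
        # No known implementation: keep the data, lose the behaviour
        return dict(attributes)
    return cls(**attributes)


track = instantiate(
    "http://example.org/idl-def/release#Track",
    title="Track 1",
    duration=1234,
)
print(type(track).__name__, track.title)
```

Data whose type has no registered implementation still loads as a plain dictionary, so the semantics are an optional layer on top of the serialization.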

Just trying to be productive and consider the solution space that might make the principles of Linked Data more relevant to the larger web development community. It depends on whether you consider RDF to be currently serving those needs or not.

Gregg
Received on Friday, 20 May 2011 00:25:32 UTC