Re: Converting RDF to JSON-LD : shared lists between graphs

On 7/25/14 1:57 AM, David Booth wrote:
> Hi Pat,
>
> On 07/24/2014 11:48 PM, Pat Hayes wrote:
>>
>> On Jul 23, 2014, at 2:21 PM, David Booth <david@dbooth.org> wrote:
>>
>>> Hi Kingsley,
>>>
>>> On 07/23/2014 10:13 AM, Kingsley Idehen wrote:
>>>> On 7/23/14 6:46 AM, Dan Brickley wrote:
>>>>>> How so?  It seems to me that there is an inherent tension
>>>>>> between being nice to RDF consumers (by using URIs for things
>>>>>> that others might want to refer to, as AWWW recommends) and
>>>>>> author convenience, which leads to bnode use.
>>>>> Yes, that's a real tension, although bnodes are just one
>>>>> aspect. My point was to question the "clearly" in  "the use of
>>>>> blank nodes clearly violates the web architectural good
>>>>> practice that anything of importance should be given a URI".
>>>>> Using bnodes is consistent with the things the bnodes represent
>>>>> having URIs, so nothing is violated. The reason btw we renamed
>>>>> them "bnodes" instead of the earlier (1997-2000, e.g.
>>>>> http://www.w3.org/2000/03/rdf-tracking/#rdfms-identity-anon-resources)
>>>>> phrase "anonymous nodes" was this point: the things are not
>>>>> anonymous / nameless. Only particular descriptions of them.
>>>>>
>>>>
>>>> +1
>>>>
>>>> Using pronouns (from natural language) to explain the nature of
>>>> blank nodes helps a lot.
>>>
>>> Maybe, but pronouns are used *very* differently than blank nodes,
>>> so it really isn't an accurate comparison.  Normally when a pronoun
>>> is used, the corresponding noun is *also* used, so the reader can
>>> easily determine the intended noun.  ("When *Jack* got to the bank,
>>> *he* stopped.")  But that is not usually the case with blank nodes.
>>> Usually if a blank node is used in an RDF document, no equivalent
>>> URI is given for that node.  But still, I can see how the analogy
>>> could help sometimes.
>>
>> The correct analogy is with indefinite pronouns like "someone" or
>> "something". But many uses of blank nodes are in fact more like
>> indefinite noun phrases, eg a bnode with an rdf:type link to
>> http://dbpedia.org/ontology/tree is almost an exact rendering of the
>> English phrase "a tree".
>>
>> BTW, recent work crawling the actual existing semantic web shows that
>> about 40% of deployed RDF uses blank nodes. So apparently this
>> "confusing" aspect of RDF is not confusing to a fair number of users.
>> IMO all this noise about 'good' vs. 'bad' RDF is just that, noise.
>> Actual users of RDF, as opposed to writers of technical blogs, seem
>> to be able to handle RDF quite well.
>
> I don't think that's a very helpful claim.  First, the fact that RDF 
> users are handling blank nodes does *not* mean that blank nodes do not 
> add complexity to their work.  For example, I'm an actual user of RDF, 
> and I have many times been frustrated at the extra effort that was 
> required in dealing with RDF because of blank nodes.  Of course, blank 
> nodes can also make the job *easier* in some cases -- they have pros 
> and cons -- hence the desire to have the best of both worlds, which 
> was the motivation behind the idea of Well Behaved RDF.  Second, that 
> claim suffers from self-selection bias: obviously those who "couldn't 
> handle RDF quite well" abandoned it in favor of JSON or something else 
> they found easier, so they are no longer "actual users of RDF".
Above you state:

.. obviously those who "couldn't handle RDF quite well" abandoned it in 
favor of JSON or something else they found easier, so they are no longer 
"actual users of RDF" ..

Doesn't that play into the "RDF is a format" misconception?

>
> I would like to increase the pool of RDF users -- not limit it to some 
> elite who are "able to handle" it.  I've been coming to the conclusion 
> that RDF is harder than it needs to be to support the Semantic Web.

RDF isn't the problem. Narratives are the problem.

We just need to construct more narratives aimed at different audience 
profiles. In the past, the narratives have pretty much been of the 
"because the W3C says so..." variety and, when really desperate, "because 
TimBL said so...".

Those of us who understand and successfully use RDF simply need to go 
the extra mile (as and when required) with regard to narrative 
construction and delivery.

>   But there's a big vested interest among those in the existing RDF 
> community -- particularly among RDF tool developers -- that makes it 
> hard to look past the current specs and semantics, to what we *could* 
> have.  Like the Innovator's Dilemma,

The general problem is that they try to use RDF from the bottom up. The 
first thing that comes to mind is a parser for a specific RDF notation. 
They rarely ever get to the actual relationship and relation semantics.

Developers don't usually notice that RDF enables the construction of 
sentences, as exemplified in natural language modulo a lot of cruft. 
It's a powerful digital shorthand, so to speak.


> http://en.wikipedia.org/wiki/The_Innovator%27s_Dilemma
> I'm worried that RDF may be eclipsed by something else that is 
> simpler, but still does the job well enough to support the Semantic 
> Web. (Something JSON-based? or JSON-LD?)

Again, when you say "JSON- or JSON-LD based" to me that comes across as 
conflating RDF the language with notations for actually creating RDF 
document content.

Subject, Predicate, Object based sentence structure isn't going 
anywhere; it existed before the letters R-D-F.

We have to push back against those who still perceive RDF as a format by 
saying things like:

RDF (the Language) enables the creation of sentences or statements where 
subject and predicate MUST be denoted by an IRI and the object by an IRI 
or Literal.
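
As a rough illustration of that sentence structure, here is a minimal 
Python sketch. The IRI, Literal and Triple types, the to_ntriples helper 
and the example.org IRI are all hypothetical names chosen for 
illustration, and the sketch models only the IRI-and-literal case 
described above (RDF 1.1 also permits blank nodes in the subject and 
object positions).

```python
from typing import NamedTuple, Union


class IRI(str):
    """An IRI denoting a subject, predicate, or object (hypothetical type)."""


class Literal(NamedTuple):
    """A plain string literal (no datatype or language tag in this sketch)."""
    value: str


class Triple(NamedTuple):
    """One RDF statement: a subject-predicate-object sentence."""
    subject: IRI
    predicate: IRI
    object: Union[IRI, Literal]


def to_ntriples(t: Triple) -> str:
    """Render the triple as one N-Triples-style line (no escaping handled)."""
    if isinstance(t.object, IRI):
        obj = f"<{t.object}>"
    else:
        obj = f'"{t.object.value}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# "Jack has the name 'Jack'" expressed as a single statement.
jack = Triple(
    IRI("http://example.org/jack#this"),
    IRI("http://xmlns.com/foaf/0.1/name"),
    Literal("Jack"),
)
print(to_ntriples(jack))
```

The same Triple value could be written out in Turtle, JSON-LD or any 
other notation, which is the language-versus-notation distinction being 
drawn here.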

The connection between RDF and Natural Language is vital to its overall 
comprehension and inevitable appreciation [1] :-)

Links:

[1] http://slidesha.re/QEqLZN -- Natural Language & RDF .


Kingsley
>
>>
>>>>
>>>> Over the years there's been a tendency to tag vital aspects of
>>>> RDF as bad, for a variety of reasons that always boil down to
>>>> assuming that users (end-users and developers) can't figure this
>>>> stuff out.
>>>
>>> I think we have quite a lot of experience indicating that that is
>>> the case.  Even though the basic idea of triples representing
>>> simple assertions is easy -- and David Wood has a really nice Dr
>>> Seuss-inspired introduction
>>> http://www.slideshare.net/3roundstones/rdf-explained-by-suess-and-me
>>> -- RDF also has subtleties that cause complexity and grief in
>>> practice and IMO inhibit adoption.  Blank nodes are prime culprits
>>> here.
>>
>> Do you have any actual evidence for this claim, David? I have never
>> seen any.
>
> Uh, yes.  Try explaining to a newbie why cmp cannot be used to check 
> whether two graphs are the "same", even when those graphs are 
> serialized as ntriples and sorted.  Try explaining to a newbie why a 
> blank node returned from a SPARQL query cannot be referenced in a 
> subsequent query, because a blank node label is a name that isn't 
> really a name, it's just a temporary label for a "node", which isn't 
> really the same thing as a node that has a URI, because a node that 
> has a URI maps to a *particular* thing (in every interpretation), 
> whereas a blank node merely maps to the *existence* of a thing.  See 
> also the discussion of poll results in "Everything You Always Wanted 
> to Know About Blank Nodes":
> http://www.websemanticsjournal.org/index.php/ps/article/download/365/387
>
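
The cmp point quoted above can be sketched roughly as follows. This is an 
illustrative Python sketch, not code from the thread: the two N-Triples 
documents, the example.org IRIs and the helper names (parse, bnodes, 
isomorphic) are all hypothetical, the parser handles only simple 
one-line triples, and the brute-force relabelling is only practical for 
a handful of blank nodes.

```python
from itertools import permutations

# Two hypothetical serializations of the same graph; only the blank node
# label differs (_:b0 versus _:x).
doc_a = [
    '_:b0 <http://example.org/type> <http://example.org/Tree> .',
    '<http://example.org/garden> <http://example.org/contains> _:b0 .',
]
doc_b = [
    '_:x <http://example.org/type> <http://example.org/Tree> .',
    '<http://example.org/garden> <http://example.org/contains> _:x .',
]


def parse(lines):
    """Turn each line into a (subject, predicate, object) tuple."""
    return {tuple(line.rstrip(' .').split(' ', 2)) for line in lines}


def bnodes(triples):
    """Collect the blank node labels used anywhere in the triples."""
    return sorted({term for triple in triples for term in triple
                   if term.startswith('_:')})


def isomorphic(g1, g2):
    """Try every relabelling of g1's blank nodes onto g2's labels."""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(b1) != len(b2) or len(g1) != len(g2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        relabelled = {tuple(mapping.get(term, term) for term in triple)
                      for triple in g1}
        if relabelled == g2:
            return True
    return False


# Sorted text comparison (what cmp on sorted files amounts to) disagrees
# with the graph-level comparison.
same_text = sorted(doc_a) == sorted(doc_b)
same_graph = isomorphic(parse(doc_a), parse(doc_b))
print(same_text, same_graph)
```

The sorted text differs only because of the label, while the 
relabelling check treats the two documents as the same graph.
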
>> Most of the noise about blank nodes seems to have been
>> generated by people who dislike them and write a lot.
>
> Well of course.  Who else would be motivated to complain about them? 
> And there's a reason why they dislike them, and it has nothing to do 
> with their color or font: blank nodes have significant downsides.
>
>> People who
>> understand them and use them easily seem to just get on with the job
>> and don't write spleenish complaints about them.
>
> Of course.  Again, that's self-selection bias.
>
>>
>>> And my main argument in this paper
>>> http://dbooth.org/2013/well-behaved-rdf/Booth-well-behaved-rdf.pdf
>>> is that if we constrain how blank nodes are used, by eliminating
>>> explicit blank nodes while retaining implicit blank nodes, we can
>>> simplify RDF usage while retaining the main benefits of blank nodes
>>> -- getting the best of both.
>>>
>>>> In my experience, a little flexibility on the narrative and
>>>> anecdotes front can lead to clarity, appreciation, and
>>>> adoption.
>>>
>>> Yes, that definitely helps too.  And I appreciate all the work you
>>> have done over the years to ease that path.  But I still think RDF
>>> is harder than it should be, because of these complexities, and we
>>> would gain more adoption if we made it simpler.
>>
>> "We" are not going to change it at all in the foreseeable future,
>> after getting RDF 1.1 done. Writing papers recommending that people
>> stop using it properly according to its specs, the way that it is
>> already being used, does not achieve anything very positive, IMO.
>
> I beg your pardon.  That is a *gross* mischaracterization.  That paper 
> on Well Behaved RDF does not in any way advocate that RDF be used 
> improperly or in violation of the RDF specs, nor does it advocate that 
> RDF be used differently than it is already being used **in the vast 
> majority of cases**.  According to "Everything you always wanted to 
> know about blank nodes" by Aidan Hogan, Marcelo Arenas, Alejandro 
> Mallea, and Axel Polleres,
> http://www.websemanticsjournal.org/index.php/ps/article/download/365/387
> [[
> We conclude that the majority of documents surveyed contain
> acyclical blank node structures. Furthermore, with a low average
> in-degree of 1.07, we conclude that blank nodes mostly tend to form
> directed trees from subject to object. However, unlike observations
> for previous datasets [47], we see a significant number of
> blank-node components (37.7%) containing cycles. Of the 1,258,774
> with a treewidth of 2, we found that 1,257,229 of these (99.9%)
> originated from a single domain, data.gov.uk, which is in fact the
> largest producer of blank nodes in our data (cf. Table 2). Aside
> from this one domain, the vast majority of blank nodes form
> acyclical graph structures.
> ]]
>
> However, the paper on Well Behaved RDF *does* advocate that 
> *problematic* uses of blank nodes should be avoided, and it proposes a 
> very simple and practical rule for ensuring that they are.  You may 
> not see that as leading to anything positive, but I certainly do.
>
> David
>
>>
>> Pat
>>
>>
>>>
>>> David
>>>
>>>>
>>>> Here are two important aspects of RDF that typically end up
>>>> being labeled as problematic:
>>>>
>>>> [1] blank nodes -- pronouns
>>>> [2] statement reification -- statements are things too!
>>>>
>>>> The items above enable important bridging across:
>>>>
>>>> 1. Open Data -- basic structured data
>>>> 2. Linked Open Data -- structured data constructed using Linked
>>>> (Open) Data principles
>>>> 3. Semantically enhanced Linked Open Data -- structured data
>>>> constructed using RDFS, OWL, etc., in conjunction with Linked
>>>> Open Data principles.
>>>>
>>>>
>>>> In most cases, rather than reification, RDF user agents will
>>>> leverage blank nodes as objects of relations that associate
>>>> embedded structured data with their container documents.
>>>>
>>>> Example:
>>>>
>>>> [1]
>>>> http://linkeddata.uriburner.com/about/id/entity/http/thenextweb.com/apple/2014/07/23/apple-release-information-ios-response-claims-backdoor-data-collection/ 
>>>> -- Statements are reified en route to creating an RDF based Linked Data
>>>> document that's close to self-explanatory re. follow-your-nose
>>>> pattern
>>>>
>>>> [2] http://bit.ly/rdf-statement-reficiation-fyn  -- alternative
>>>> view for deeper follow-your-nose exploration
>>>>
>>>> [3] https://twitter.com/thalhamm/status/471994633573380096 --
>>>> tweet about blank node use that's also a segue to a usecase
>>>> demo.
>>
>> ------------------------------------------------------------
>> IHMC                     (850)434 8903   home
>> 40 South Alcaniz St.     (850)202 4416   office
>> Pensacola                (850)202 4440   fax
>> FL 32502                 (850)291 0667   mobile (preferred)
>> phayes@ihmc.us           http://www.ihmc.us/users/phayes
>>


-- 
Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Received on Friday, 25 July 2014 16:04:07 UTC