Re: Converting RDF to JSON-LD : shared lists between graphs

On 7/27/14 3:49 AM, Eric Prud'hommeaux wrote:
> * Kingsley Idehen<kidehen@openlinksw.com>  [2014-07-26 14:09-0400]
>> >On 7/25/14 10:22 PM, David Booth wrote:
>>> > >Hi Dan,
>>> > >
>>> > >On 07/25/2014 12:30 PM, Dan Brickley wrote:
>>>> > >>On 25 July 2014 04:48, Pat Hayes<phayes@ihmc.us>  wrote:
>>>>> > >>>
>>>>> > >>>On Jul 23, 2014, at 2:21 PM, David Booth<david@dbooth.org>  wrote:
>>>>> > >>>
>>>>>> > >>>>Hi Kingsley,
>>>>>> > >>>>
>>>>>> > >>>>On 07/23/2014 10:13 AM, Kingsley Idehen wrote:
>>>>>>> > >>>>>On 7/23/14 6:46 AM, Dan Brickley wrote:
>>>>>>>>> > >>>>>>>How so?  It seems to me that there is an inherent tension between
>>>>>>>>> > >>>>>>>being nice
>>>>>>>>>> > >>>>>>>>to RDF consumers (by using URIs for things that
>>>>>>>>>> > >>>>>>>>other might want to
>>>>>>>>> > >>>>>>>refer
>>>>>>>>>> > >>>>>>>>to, as AWWW recommends) and author convenience,
>>>>>>>>>> > >>>>>>>>which leads to bnode
>>>>>>>>> > >>>>>>>use.
>>>>>>>> > >>>>>>Yes, that's a real tension, although bnodes are just one aspect. My
>>>>>>>> > >>>>>>point was to question the "clearly" in  "the use of blank nodes
>>>>>>>> > >>>>>>clearly violates the web architectural good practice
>>>>>>>> > >>>>>>that anything of
>>>>>>>> > >>>>>>importance should be given a URI". Using bnodes is
>>>>>>>> > >>>>>>consistent with the
>>>>>>>> > >>>>>>things the bnodes represent having URIs, so nothing is violated. The
>>>>>>>> > >>>>>>reason btw we renamed them "bnodes" instead of the
>>>>>>>> > >>>>>>earlier (1997-2000
>>>>>>>> > >>>>>>e.g. http://www.w3.org/2000/03/rdf-tracking/#rdfms-identity-anon-resources)
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>
>>>>>>>> > >>>>>>phrase "anonymous nodes" was this point: the things are
>>>>>>>> > >>>>>>not anonymous
>>>>>>>> > >>>>>>/ nameless. Only particular descriptions of them.
>>>>>>>> > >>>>>>
>>>>>>> > >>>>>
>>>>>>> > >>>>>+1
>>>>>>> > >>>>>
>>>>>>> > >>>>>Using pronouns (from natural language) to explain the nature of blank
>>>>>>> > >>>>>nodes helps a lot.
>>>>>> > >>>>
>>>>>> > >>>>Maybe, but pronouns are used *very* differently than blank
>>>>>> > >>>>nodes, so it really isn't an accurate comparison.  Normally
>>>>>> > >>>>when a pronoun is used, the corresponding noun is *also*
>>>>>> > >>>>used, so the reader can easily determine the intended noun.
>>>>>> > >>>>("When *Jack* got to the bank, *he* stopped.")  But that is
>>>>>> > >>>>not usually the case with blank nodes.  Usually if a blank
>>>>>> > >>>>node is used in an RDF document, no equivalent URI is given
>>>>>> > >>>>for that node.  But still, I can see how the analogy could
>>>>>> > >>>>help sometimes.
>>>>> > >>>
>>>>> > >>>The correct analogy is with indefinite pronouns like "someone"
>>>>> > >>>or "something". But many uses of blank nodes are in fact more
>>>>> > >>>like indefinite noun phrases, eg a bnode with an rdf:type link
>>>>> > >>>to http://dbpedia.org/ontology/tree is almost an exact
>>>>> > >>>rendering of the English phrase "a tree".
>>>>> > >>>
>>>>> > >>>BTW, recent work crawling the actual existing semantic web
>>>>> > >>>shows that about 40% of deployed RDF uses blank nodes. So
>>>>> > >>>apparently this "confusing" aspect of RDF is not confusing to
>>>>> > >>>a fair number of users. IMO all this noise about 'good' vs.
>>>>> > >>>'bad' RDF is just that, noise. Actual users of RDF, as opposed
>>>>> > >>>to writers of technical blogs, seem to be able to handle RDF
>>>>> > >>>quite well.
>>>> > >>
>>>> > >>Beyond those findings in "Everything You Always Wanted to Know About
>>>> > >>Blank Nodes", we can add 6M+ internet domains publishing billions of
>>>> > >>entity descriptions using schema.org, an RDF vocabulary. The vast
>>>> > >>majority of this data shows up as bnodes in RDF. There are a few
>>>> > >>tricks for data merging such as a http://schema.org/sameAs property
>>>> > >>which points to indicative documents, e.g. see
>>>> > >>https://support.google.com/webmasters/answer/4620133?hl=en rather than
>>>> > >>trying to have the http-range-14 conversation with mainstream
>>>> > >>webmasters.
>>>> > >>
>>>> > >>The Linked Data thing began as TimBL expressing a concern that FOAF
>>>> > >>data was needlessly bnodey - hence
>>>> > >>http://www.w3.org/DesignIssues/LinkedData  - and I think we're finally
>>>> > >>settling into a kind of scruffy and pragmatic consensus that both
>>>> > >>styles of graph have a role. High quality professionally published
>>>> > >>Linked Data may lean more towards "well known URIs for all entities in
>>>> > >>the graph", whereas mainstream markup will often use a bnode
>>>> > >>formulation instead. Taking the Music Artists example from my google
>>>> > >>link above, the JSON-LD schema.org triples tell you something like
>>>> > >>this (subsetting for brevity):
>>>> > >>
>>>> > >>A "MusicEvent" with "name" "B.B. King with Jonathon 'Boogie' Long" has
>>>> > >>a "location" (which is a "Place" with "name" "Lupo's Heartbreak
>>>> > >>Hotel"). That "Place" has an "address" that is a "PostalAddress",
>>>> > >>which has such-and-so streetAddress, postalCode etc. The "MusicEvent"
>>>> > >>has a "performer" that is a "MusicGroup". The "name" of that
>>>> > >>"MusicGroup" is "Jonathon 'Boogie' Long". This same "MusicEvent" has
>>>> > >>an "eventStatus" of "EventRescheduled", and a "previousStartDate" of
>>>> > >>"2013-09-30T19:30".
>>> > >
>>> > >That's a good example of how the vast majority of RDF does not
>>> > >need explicit blank nodes.  The above can be written just fine as
>>> > >Well Behaved RDF using only implicit blank nodes:
>>> > >
>>> > >[] a :MusicEvent ;
>>> > >  :name "B.B. King with Jonathon 'Boogie' Long" ;
>>> > >  :location [
>>> > >    a :Place ;
>>> > >    :address [ a :PostalAddress ;
>>> > >      :streetAddress "..." ;
>>> > >      :postalCode "..." ]
>>> > >    ] ;
>>> > >  :performer [ a :MusicGroup ;
>>> > >    :name "Jonathon 'Boogie' Long" ] ;
>>> > >  :eventStatus :EventRescheduled ;
>>> > >  :previousStartDate "2013-09-30T19:30"^^xsd:dateTime .
>>> > >
>>>> > >>
>>>> > >>This seems to me (and to numerous publishers) to be reasonably
>>>> > >>actionable and interpretable information, particularly since it is
>>>> > >>also linked to well known Wikipedia and homepage URLs.
>>>> > >>
>>>> > >>My preference is that we all stop trying to tell publishers how
>>>> > >>exactly to manage their sites and databases, and deal with the fact
>>>> > >>that they'll often have partial information without nice well known
>>>> > >>URIs everywhere.
>>> > >
>>> > >RDF consumers certainly need to deal with whatever they get.  But
>>> > >in my experience developers generally appreciate good guidance
>>> > >about how to best use RDF.
>>> > >
>>> > >David
>>> > >
>> >
>> >+1
>> >
>> >As indicated in my earlier post, we don't solve much by encouraging
>> >the publication of data that overburdens the engines that have to
>> >process said data.
> I think I'm missing something here. How does
>
>    [] :location [ :address [ :postalCode "..." ] ] .
>
> burden a system more than it would be when reduced to NTriples
>
>    _:a :location _:b .
>    _:b :address _:c .
>    _:c :postalCode "..." .
>
> ? Perhaps your concern is not that bnodes can have labels (thus
> permitting multiple references) but instead the RDF 1.1 WG decision
> that bnodes can be shared between graphs. Note that revoking that
> would make some weird stuff happen in SPARQL systems where the default
> graph is some function (e.g. concatenation) of the named graphs.
>
> I appreciate David's goal of making more stuff addressable, but I
> think it really has to be phrased that way. Eliminating BNodes would
> result in people manufacturing lots of dead URIs. Instead, we need to
> gently coax people into implementing additional URIs by showing them
> that they get some nice features and that historically, those features
> pay off (e.g. the Web).
>
>

I am not asserting that Blank Nodes burden the system. My only point is
that we should not encourage publishers to adopt practices that ultimately
burden consumers (i.e., the applications built by RDF processor implementers).

You recommend:

## Turtle Start ##

   _:a :location _:b .
   _:b :address _:c .
   _:c :postalCode "..." .


## Turtle End ##

We have a tendency to drop in snippets that aren't actually processed to
show final output. Dan provided one of those examples, David responded
with a tweak, and I responded with an output demonstration.

Real output trumps speculative snippets, many of which contain minor
errors that ultimately throw those studying this material for a loop. I
am advocating for more show and less tell with regard to what some
perceive (rightly or wrongly) as RDF idiosyncrasies.
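
In that spirit, here is a minimal sketch in Python (standard library only)
of how a nested description such as Eric's
[] :location [ :address [ :postalCode "..." ] ] expands into three flat
statements. The flatten function, the dict-based input, and the _:bN label
scheme are all hypothetical, chosen purely for illustration; this is not
how any particular RDF processor is implemented.

```python
from itertools import count


def flatten(description):
    """Expand a nested description into flat (subject, predicate, object) triples.

    `description` is a dict mapping predicate -> object, where an object is
    either a string (kept as-is) or another dict (a nested blank node).
    This input shape is an assumption made for the sketch, not an RDF API.
    """
    labels = count()   # source of fresh blank node label numbers
    triples = []

    def fresh():
        # Each nested dict gets its own document-local blank node label.
        return f"_:b{next(labels)}"

    def walk(subject, node):
        for predicate, obj in node.items():
            if isinstance(obj, dict):
                child = fresh()
                triples.append((subject, predicate, child))
                walk(child, obj)   # describe the nested node itself
            else:
                triples.append((subject, predicate, obj))

    walk(fresh(), description)
    return triples


# The nested form from Eric's question, written as plain dicts.
nested = {":location": {":address": {":postalCode": '"..."'}}}

for s, p, o in flatten(nested):
    print(s, p, o, ".")
```

Running it prints three statements using the labels _:b0, _:b1 and _:b2.
The labels differ from _:a, _:b and _:c, but blank node labels are local to
a document, so both forms describe the same graph.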

As I said, in an earlier post:

Demonstrations using live examples enable publishers and consumers of RDF
data to reach common ground. At the end of the day, RDF is supposed to
make encoding and decoding information, via sentences or statements,
part of the Web.

Links:

[1] 
http://linkeddata.uriburner.com/about/html/http://lists.w3.org/Archives/Public/public-rdf-comments/2014Jul/0030.html 
-- denotes basic follow-your-nose friendly document

[2] 
http://bit.ly/rdf-statements-embedded-in-mailin-list-post-showcasing-blank-nodes-utility 
-- denotes deeper follow-your-nose friendly document

-- 
Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Received on Sunday, 27 July 2014 21:22:18 UTC