Re: Converting RDF to JSON-LD : shared lists between graphs from Eric Prud'hommeaux on 2014-07-27 (public-rdf-comments@w3.org from July 2014)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Sun, 27 Jul 2014 03:49:29 -0400
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: public-rdf-comments Comments <public-rdf-comments@w3.org>
Message-ID: <20140727074927.GC12728@w3.org>
* Kingsley Idehen <kidehen@openlinksw.com> [2014-07-26 14:09-0400]
> On 7/25/14 10:22 PM, David Booth wrote:
> >Hi Dan,
> >
> >On 07/25/2014 12:30 PM, Dan Brickley wrote:
> >>On 25 July 2014 04:48, Pat Hayes <phayes@ihmc.us> wrote:
> >>>
> >>>On Jul 23, 2014, at 2:21 PM, David Booth <david@dbooth.org> wrote:
> >>>
> >>>>Hi Kingsley,
> >>>>
> >>>>On 07/23/2014 10:13 AM, Kingsley Idehen wrote:
> >>>>>On 7/23/14 6:46 AM, Dan Brickley wrote:
> >>>>>>>How so?  It seems to me that there is an inherent tension between
> >>>>>>>being nice to RDF consumers (by using URIs for things that others
> >>>>>>>might want to refer to, as AWWW recommends) and author convenience,
> >>>>>>>which leads to bnode use.
> >>>>>>Yes, that's a real tension, although bnodes are just one aspect. My
> >>>>>>point was to question the "clearly" in "the use of blank nodes
> >>>>>>clearly violates the web architectural good practice that anything of
> >>>>>>importance should be given a URI". Using bnodes is consistent with the
> >>>>>>things the bnodes represent having URIs, so nothing is violated. The
> >>>>>>reason btw we renamed them "bnodes" instead of the earlier (1997-2000
> >>>>>>e.g. http://www.w3.org/2000/03/rdf-tracking/#rdfms-identity-anon-resources)
> >>>>>>phrase "anonymous nodes" was this point: the things are not anonymous
> >>>>>>/ nameless. Only particular descriptions of them.
> >>>>>>
> >>>>>
> >>>>>+1
> >>>>>
> >>>>>Using pronouns (from natural language) to explain the nature of blank
> >>>>>nodes helps a lot.
> >>>>
> >>>>Maybe, but pronouns are used *very* differently than blank
> >>>>nodes, so it really isn't an accurate comparison.  Normally
> >>>>when a pronoun is used, the corresponding noun is *also*
> >>>>used, so the reader can easily determine the intended noun.
> >>>>("When *Jack* got to the bank, *he* stopped.")  But that is
> >>>>not usually the case with blank nodes.  Usually if a blank
> >>>>node is used in an RDF document, no equivalent URI is given
> >>>>for that node.  But still, I can see how the analogy could
> >>>>help sometimes.
> >>>
> >>>The correct analogy is with indefinite pronouns like "someone"
> >>>or "something". But many uses of blank nodes are in fact more
> >>>like indefinite noun phrases, eg a bnode with an rdf:type link
> >>>to http://dbpedia.org/ontology/tree is almost an exact
> >>>rendering of the English phrase "a tree".
> >>>
> >>>BTW, recent work crawling the actual existing semantic web
> >>>shows that about 40% of deployed RDF uses blank nodes. So
> >>>apparently this "confusing" aspect of RDF is not confusing to
> >>>a fair number of users. IMO all this noise about 'good' vs.
> >>>'bad' RDF is just that, noise. Actual users of RDF, as opposed
> >>>to writers of technical blogs, seem to be able to handle RDF
> >>>quite well.
> >>
> >>Beyond those findings in "Everything You Always Wanted to Know About
> >>Blank Nodes", we can add 6M+ internet domains publishing billions of
> >>entity descriptions using schema.org, an RDF vocabulary. The vast
> >>majority of this data shows up as bnodes in RDF. There are a few
> >>tricks for data merging such as a http://schema.org/sameAs property
> >>which points to indicative documents, e.g. see
> >>https://support.google.com/webmasters/answer/4620133?hl=en rather than
> >>trying to have the http-range-14 conversation with mainstream
> >>webmasters.
> >>
> >>The Linked Data thing began as TimBL expressing a concern that FOAF
> >>data was needlessly bnodey - hence
> >>http://www.w3.org/DesignIssues/LinkedData - and I think we're finally
> >>settling into a kind of scruffy and pragmatic consensus that both
> >>styles of graph have a role. High quality professionally published
> >>Linked Data may lean more towards "well known URIs for all entities in
> >>the graph", whereas mainstream markup will often use a bnode
> >>formulation instead. Taking the Music Artists example from my google
> >>link above, the JSON-LD schema.org triples tell you something like
> >>this (subsetting for brevity):
> >>
> >>A "MusicEvent" with "name" "B.B. King with Jonathon 'Boogie' Long" has
> >>a "location" (which is a "Place" with "name" "Lupo's Heartbreak
> >>Hotel"). That "Place" has an "address" that is a "PostalAddress",
> >>which has such-and-so streetAddress, postalCode etc. The "MusicEvent"
> >>has a "performer" that is a "MusicGroup". The "name" of that
> >>"MusicGroup" is "Jonathon 'Boogie' Long". This same "MusicEvent" has
> >>an "eventStatus" of "EventRescheduled", and a "previousStartDate" of
> >>"2013-09-30T19:30".
> >
> >That's a good example of how the vast majority of RDF does not
> >need explicit blank nodes.  The above can be written just fine as
> >Well Behaved RDF using only implicit blank nodes:
> >
> >[] a :MusicEvent ;
> >  :name "B.B. King with Jonathon 'Boogie' Long" ;
> >  :location [
> >    a :Place ;
> >    :address [ a :PostalAddress ;
> >      :streetAddress "..." ;
> >      :postalCode "..." ]
> >      ] ;
> >  :performer [ a :MusicGroup ;
> >    :name "Jonathon 'Boogie' Long" ] ;
> >  :eventStatus :EventRescheduled ;
> >  :previousStartDate "2013-09-30T19:30"^^xsd:dateTime .
> >
> >>
> >>This seems to me (and to numerous publishers) to be reasonably
> >>actionable and interpretable information, particularly since it is
> >>also linked to well known Wikipedia and homepage URLs.
> >>
> >>My preference is that we all stop trying to tell publishers how
> >>exactly to manage their sites and databases, and deal with the fact
> >>that they'll often have partial information without nice well known
> >>URIs everywhere.
> >
> >RDF consumers certainly need to deal with whatever they get.  But
> >in my experience developers generally appreciate good guidance
> >about how to best use RDF.
> >
> >David
> >
> 
> +1
> 
> As indicated in my earlier post, we don't solve much by encouraging
> the publication of data that overburdens the engines that have to
> process said data.

I think I'm missing something here. How does 

  [] :location [ :address [ :postalCode "..." ] ] .

burden a system more than it would when reduced to N-Triples

  _:a :location _:b .
  _:b :address _:c .
  _:c :postalCode "..." .

? Perhaps your concern is not that bnodes can have labels (thus
permitting multiple references) but instead the RDF 1.1 WG decision
that bnodes can be shared between graphs. Note that revoking that
would make some weird stuff happen in SPARQL systems where the default
graph is some function (e.g. concatenation) of the named graphs.
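
To make the equivalence above concrete, here is a minimal Python sketch
of how a parser might expand nested [ ... ] syntax into labelled bnodes.
It does not use any RDF library: the `flatten` function, the dict-based
input and the `_:bN` label scheme are all invented for illustration, and
it keeps the prefixed names from the example rather than full IRIs. It
also does no literal escaping.

```python
from itertools import count


def flatten(node, labels=None, subject=None):
    """Expand a nested description into (subject, predicate, object) triples.

    `node` is a dict standing in for one Turtle [ ... ] block; a nested dict
    stands in for a nested [ ... ]. Every block gets a fresh bnode label.
    """
    labels = labels if labels is not None else count()
    subject = subject or f"_:b{next(labels)}"
    triples = []
    for predicate, value in node.items():
        if isinstance(value, dict):
            # A nested block: mint a label, link to it, then describe it.
            child = f"_:b{next(labels)}"
            triples.append((subject, predicate, child))
            triples.extend(flatten(value, labels, child))
        else:
            # A plain string becomes a quoted literal (no escaping here).
            triples.append((subject, predicate, f'"{value}"'))
    return triples


nested = {":location": {":address": {":postalCode": "..."}}}
for s, p, o in flatten(nested):
    print(s, p, o, ".")
```

The nested form and the three-triple form carry the same information;
the only thing the expansion adds is a label per bracket pair, which is
work the parser does once at load time.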

I appreciate David's goal of making more stuff addressable, but I
think it really has to be phrased that way. Eliminating BNodes would
result in people manufacturing lots of dead URIs. Instead, we need to
gently coax people into implementing additional URIs by showing them
that they get some nice features and that historically, those features
pay off (e.g. the Web).


>                    We can draw a balance by encouraging publishers
> (via tutorials and best practices guides) to publish consumable
> data. At the end of the day, why go through the pains of publishing
> data that's difficult to use by the various audience profiles that
> swirl around structured data?
> 
> I am also using the slightly modified snippet below to capture David's
> example, in regards to my live demos/tutorials collection.
> 
> ## Turtle Start ##
> 
> @prefix : <#> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
> 
> <> a :Document ;
>   :topic :BlankNodeUsage ;
>   :describes
>   [ a :MusicEvent ;
>     :name "B.B. King with Jonathon 'Boogie' Long" ;
>     :location [ a :Place ;
>                 :address [ a :PostalAddress ;
>                            :streetAddress "..." ;
>                            :postalCode "..." ]
>               ] ;
>     :performer [ a :MusicGroup ;
>                  :name "Jonathon 'Boogie' Long" ] ;
>     :eventStatus :EventRescheduled ;
>     :previousStartDate "2013-09-30T19:30"^^xsd:dateTime
>   ] .
> 
> ## Turtle End ##
> 
> -- 
> Regards,
> 
> Kingsley Idehen	
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog 1: http://kidehen.blogspot.com
> Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
> Twitter Profile: https://twitter.com/kidehen
> Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
> Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
> 
> 



-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Sunday, 27 July 2014 07:49:32 UTC