RE: Blank nodes as predicates [was Re: Input needed from RDF group on JSON-LD skolemization] from Markus Lanthaler on 2013-07-10 (public-linked-json@w3.org from July 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Wed, 10 Jul 2013 16:18:37 +0200
To: <public-linked-json@w3.org>
Message-ID: <009801ce7d78$5f8da930$1ea8fb90$@lanthaler@gmx.net>
On Wednesday, July 10, 2013 4:47 AM, David Booth wrote:
> Hold on, let's back up a moment and make sure that we are on the same 
> page about the overall objective.  Suppose I slightly extend Dave 
> Longley's example to add one more blank node property, such as:
> 
> {
>   ...
>    "_:website_status": {
>      "editor": {
>        "id": "1",
>        "changes": 4
>      },
>      "_:ad636ee3fb": true,
>      "_:ee3fbad636": false
>    }
> }
> 
> Surely, as a design goal, it should be possible for the client to 
> process this JSON-LD document either as JSON or as extended RDF.  So 
> suppose the document is interpreted as extended RDF by a client that is 
> *intended* to fully understand it.  And if you wish, you can even assume 
> that the client has out-of-band information to understand the 
> meanings of the "private" data properties _:ad636ee3fb and _:ee3fbad636. 
>   But how is an RDF client expected to obtain the values of the 
> _:ad636ee3fb and _:ee3fbad636 properties?   Those two boolean statements 
> would effectively become (in Turtle):
> 
>    _:website_status _:ad636ee3fb true ;
>    _:website_status _:ee3fbad636 false .
> 
> But in RDF, blank node labels are merely syntactic devices, so the above 
> is exactly the same as saying:
> 
>    _:b1 _:b2 true ;
>    _:b1 _:b3 false .

What if I had some (out-of-band) knowledge that tells me that

  _:b2 rdfs:subPropertyOf <http://example.com/someTheClientUnderstands1> .
  _:b3 rdfs:subPropertyOf <http://example.com/someTheClientUnderstands2> .

This would then entail

  _:b1 <http://example.com/someTheClientUnderstands1> true .
  _:b1 <http://example.com/someTheClientUnderstands2> false .

I would argue that this might be something very useful in a number of cases.
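
To make the entailment step concrete, here is a minimal Python sketch of the
RDFS subproperty rule (rdfs7) applied to David's two triples. It is only an
illustration under assumptions: triples are plain tuples rather than objects
from an RDF library, the function name apply_rdfs7 is made up for this
example, the out-of-band statements are assumed to share blank node scope
with the data, and _:b2 and _:b3 are assumed to map to the first and second
property respectively.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
SUB_PROPERTY_OF = "rdfs:subPropertyOf"


def apply_rdfs7(data, schema):
    """Rule rdfs7: (p subPropertyOf q) and (s p o) entail (s q o).

    Single pass only; transitive subproperty chains are not followed.
    """
    super_props = {}
    for s, p, o in schema:
        if p == SUB_PROPERTY_OF:
            super_props.setdefault(s, set()).add(o)
    return {(s, q, o) for s, p, o in data for q in super_props.get(p, ())}


# The two statements from the quoted message, with relabelled blank nodes.
data = {("_:b1", "_:b2", True), ("_:b1", "_:b3", False)}

# Hypothetical out-of-band knowledge held by the client.
schema = {
    ("_:b2", SUB_PROPERTY_OF, "http://example.com/someTheClientUnderstands1"),
    ("_:b3", SUB_PROPERTY_OF, "http://example.com/someTheClientUnderstands2"),
}

entailed = apply_rdfs7(data, schema)
for triple in sorted(entailed, key=str):
    print(triple)
```

Without the schema triples the function returns an empty set, which matches
the point that the client learns nothing useful without context.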


> So how on earth can the RDF client figure out which of those private 
> properties is supposed to be true and which is supposed to be false? 
> It can't.  All it can determine is that there exists a property with a 
> true value and there exists a property with a false value.

Right, without context it wouldn't be able to figure that out. Exactly the
same happens if a client encounters a URL that doesn't resolve to anything
useful, e.g., a skolem IRI.


> This use of blank nodes looks to me like a hack to intentionally make it 
> harder for an *RDF* downstream consumer -- even an *extended* *RDF* 
> downstream consumer that can handle blank node predicates -- to make use 
> of the data than for a pure JSON downstream consumer.  This seems to me 
> like an *anti*-design goal.  To my mind, the design goal should be the 
> opposite: to make it as easy for *both* JSON and RDF consumers to make 
> equivalent use of the document.

OK, a different example:

  {
    "some_data": "I don't care about",
    "maybe": {
      "I": {
        "just": {
          "care_about_deeply_nested_data": [
            {
              "id": "markus",
              "name": "Markus Lanthaler",
              "authorOf": "http://www.w3.org/TR/json-ld/"
            }
          ]
        }
      }
    }
  }

Now I can convert the pieces I'm interested in to some meaningful RDF with
the following context:

  {
    "@context": {
      "@vocab": "_:",
      "id": "@id",
      "name": "http://example.com/vocab#name",
      "authorOf": { "@id": "http://example.com/vocab#authorOf", "@type": "@id" }
    }
  }

This would result in the following triples:

  _:b0 _:b1 _:b2 .
  _:b0 _:b8 "I don't care about" .
  _:b2 _:b3 _:b4 .
  _:b4 _:b5 _:b6 .
  _:b6 _:b7 <markus> .
  <markus> <http://example.com/vocab#authorOf> <http://www.w3.org/TR/json-ld/> .
  <markus> <http://example.com/vocab#name> "Markus Lanthaler" .
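
As a rough illustration of how a consumer could pick out the interesting
triples, here is a small Python sketch. It is not the JSON-LD to RDF
algorithm from the spec, just a toy walker written for this example: it only
handles nested objects, arrays and string values, it does not resolve
relative IRIs such as "markus" against a base, and the blank node labels it
produces (_:n0, _:some_data, ...) differ from the relabelled ones above. The
names to_triples, DOC and CONTEXT are invented for the sketch.

```python
import itertools

DOC = {
    "some_data": "I don't care about",
    "maybe": {"I": {"just": {"care_about_deeply_nested_data": [
        {"id": "markus", "name": "Markus Lanthaler",
         "authorOf": "http://www.w3.org/TR/json-ld/"}
    ]}}},
}

CONTEXT = {
    "@vocab": "_:",
    "id": "@id",
    "name": "http://example.com/vocab#name",
    "authorOf": {"@id": "http://example.com/vocab#authorOf", "@type": "@id"},
}


def to_triples(doc, context):
    """Toy subset of JSON-LD to RDF: nested objects, arrays, string values."""
    counter = itertools.count()
    triples = []
    # Find the key aliased to @id ("id" in the context above).
    id_key = next((k for k, v in context.items() if v == "@id"), "@id")

    def visit(node):
        subject = f"<{node[id_key]}>" if id_key in node else f"_:n{next(counter)}"
        for key, value in node.items():
            if key == id_key:
                continue
            term = context.get(key)
            if isinstance(term, dict):
                predicate, as_iri = f"<{term['@id']}>", term.get("@type") == "@id"
            elif term is not None:
                predicate, as_iri = f"<{term}>", False
            else:
                # Unmapped keys fall back to @vocab, i.e. a blank node predicate.
                predicate, as_iri = context["@vocab"] + key, False
            for item in value if isinstance(value, list) else [value]:
                if isinstance(item, dict):
                    obj = visit(item)
                elif as_iri:
                    obj = f"<{item}>"
                else:
                    obj = f'"{item}"'
                triples.append((subject, predicate, obj))
        return subject

    visit(doc)
    return triples


triples = to_triples(DOC, CONTEXT)
# Keep only the statements whose predicate is an IRI the client may know.
useful = [t for t in triples if t[1].startswith("<")]
for t in useful:
    print(*t, ".")
```

The filter at the end drops every statement with a blank node predicate and
leaves just the two triples about <markus>.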


In this case, most triples are useless to me, but I do care about the last
two, and such use cases are very valuable. I'm sorry, but I can't see how it
would help anyone if those blank node predicates were URLs. What would you
gain? The danger is that other people start relying on them or start
complaining that you use a plethora of different URLs for which they can't
find any definition. Blank nodes, on the other hand, by their very nature
make it clear that there is some relationship whose details are unspecified.
None of that requires any out-of-band information or contract between the
publisher and a consumer.


--
Markus Lanthaler
@markuslanthaler
Received on Wednesday, 10 July 2013 14:19:14 UTC