Re: comments/questions on JSON-LD spec (but _not_ for the CG->WG transition!) from Ivan Herman on 2012-06-14 (public-rdf-wg@w3.org from June 2012)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 14 Jun 2012 18:39:48 +0200
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: Manu Sporny <msporny@digitalbazaar.com>, Linked JSON <public-linked-json@w3.org>, W3C RDF WG <public-rdf-wg@w3.org>
Message-Id: <CE6E4B4B-B044-495D-8131-246E5052F7F4@w3.org>

Hey Gregg,

On Jun 14, 2012, at 17:48 , Gregg Kellogg wrote:

> On Jun 14, 2012, at 3:29 AM, Ivan Herman wrote:
> 
>> To avoid any misunderstanding: my comment/question should not have a bearing on whether the document comes over to the RDF WG or not, ie, I do not consider that as a prerequisite for starting the whole procedure. Consider it as a comment that may result in active RDF WG issues on the draft either before or after the FPWD publication.
>> 
>> My questions are on the section on @graph, ie, section 4.9 (I also tried to look at the API doc). They may be clarification issues, or they may point to missing features in the spec. Obviously, I look at this with RDF, more exactly RDF Named Graph (put your preferred term here:-), goggles on. I also have some comments on the construct itself, see below.
>> 
>> Question 1: I do not understand the last example in the section, namely the lonely "http://www.markus-lanthaler.com/". There is some text there about additional meta data about Markus, but what would that mean in terms of TriG? Could you describe this? 
> 
> Yeah, I put this in here partly to provoke a reaction :),

Well, you were successful!

> and it should definitely be considered to be at risk. Given that the document will be an RDF WG spec, this can only be included if it is in compliance with the rest of RDF Concepts and Semantics.

Yep. I think that minor entry should then be removed before going to FPWD.

> 
> It's in there because of two basic reasons:
> 
> 1) JSON-LD is an expression of Linked Data. The use of a bare IRI was to indicate that a separate resource might be named as part of a dataset that comprises the aggregated linked resources. If it is a best practice to use IRIs to denote resources which describe themselves, then this use is in keeping with those principles.
> 
> 2) A problem that came up in WikiData is that you often want to describe a statement using separate provenance, so that a given statement (or collection of statements) could be considered to be in more than one layer/named graph at the same time. For example, a statement is given provenance as deriving from a particular source (e.g., Berlin population), came from some other source document, along with other statements and so forth. Basically, from my observation, this group has described a number of different, but seemingly mutually-exclusive, uses of named graphs. The notion of using an IRI to resolve the resource which is otherwise named was some attempt to hint at a way in which an external resource could be considered to be part of multiple layers/named graphs.
> 
> That said, it's not my intent to open the debate on how a set of statements might be considered to be in more than one named graph at the same time (other than by simply repeating the statements).

At the moment the discussion has not taken a direction that would require several named graphs to be mutually disjoint in terms of their triples. That is for the semantics. As for the syntax: I see what you mean and, indeed, neither TriG nor SPARQL has this type of syntax (ie, triples have to be repeated). My *feeling* right now is that we should not go out of our way to do this at this point; we have not seen major requests for something like that yet and, again in my view, we should define a syntax for the very common usages (which TriG/SPARQL seem to cover) before going one step further. And that would apply to JSON-LD, too.

But it may be worth discussing at some point. When we are otherwise ready:-)

> 
> It was really just a means of exploring how follow-your-nose linked data principles might intersect with named graphs.
> 
>> Question 2: My understanding is that
>> 
>> {
>> "@context" : ...
>> "@graph" : ...
>> }
>> 
>> (without @id) defines triples into the default graph. Which also means that if I have a nested situation of the sort:
>> 
>> {
>> "@context" : ...
>> "@id" : URI
>> "@graph": {
>>     "a" : "b",
>>     "@graph" : {
>>         "q" : "r"
>>     }
>> }
>> }
>> 
>> translates into
>> 
>> URI {
>> "a" : "b"
>> }
>> {
>> "q" : "r"
>> }
>> 
>> Is that correct?
> 
> This is certainly not intended usage, as @graph evolved from a desire to be able to define multiple top-level entities (subject definitions) where an array would otherwise be used.
> 
> However, this can be interpreted by the processor to generate quads. Basically, the [a,b] tuple belongs to an anonymous entity (blank node subject), which is in the graph denoted by URI. The [q, r] tuple also belongs to an anonymous entity (with a different BNode) in a named graph denoted by the BNode of the including entity. (JSON-LD syntax allows the use of BNodes for graph names, even though it might not result in valid RDF).
> 
> If you run a cleaned up example through the playground, you can see the Quads which are emitted:
> 
> {
> "@context" : {"a": "http://foo/a", "q": "http://foo/q", "URI": "http://URI"},
> "@id" : "URI",
> "@graph": {
>     "a" : "b",
>     "@graph" : {
>         "q" : "r"
>     }
> }
> }
> 
> _:t0 <http://foo/a> "b" <http://URI> .
> _:t1 <http://foo/q> "r" _:t0 .
> 
> 
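For readers who want to trace how those two quads arise, here is a minimal Python sketch of the interpretation Gregg describes above. It is not the algorithm from the API document: the function name to_quads, the _:tN blank node labels, and the restriction to plain string values are assumptions made for illustration only. A node without @id gets a fresh blank node, and the subject of the enclosing node becomes the graph name for anything under its @graph. The case of a top-level object without @id is not modelled.

```python
import itertools

KEYWORDS = {"@context", "@id", "@graph"}

def to_quads(doc):
    """Return N-Quads-like lines for a very small JSON-LD subset (sketch only)."""
    context = doc.get("@context", {})
    counter = itertools.count()
    quads = []

    def expand(term):
        # Map a term through the context; unknown terms pass through unchanged.
        return context.get(term, term)

    def fmt(node_id):
        # Blank node labels are written as-is, IRIs go in angle brackets.
        return node_id if node_id.startswith("_:") else f"<{node_id}>"

    def visit(node, graph):
        # Use @id when present, otherwise mint a fresh blank node label.
        if "@id" in node:
            subject = expand(node["@id"])
        else:
            subject = f"_:t{next(counter)}"
        for key, value in node.items():
            if key in KEYWORDS:
                continue
            line = f'{fmt(subject)} <{expand(key)}> "{value}"'
            if graph is not None:
                line += f" {fmt(graph)}"
            quads.append(line + " .")
        # Anything under @graph is placed in the graph named by this subject.
        children = node.get("@graph", [])
        if isinstance(children, dict):
            children = [children]
        for child in children:
            visit(child, subject)

    visit(doc, None)
    return quads

doc = {
    "@context": {"a": "http://foo/a", "q": "http://foo/q", "URI": "http://URI"},
    "@id": "URI",
    "@graph": {"a": "b", "@graph": {"q": "r"}},
}
for quad in to_quads(doc):
    print(quad)
```

Under these assumptions the sketch prints the same two lines as the playground output quoted above, with _:t0 used both as a subject and as the name of the inner graph.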

Well, to be honest, I am actually lost (and that shows either that I am stupid, or that the section seems to be half baked, or both...) and I am not 100% sure how the @graph syntax works. Just to make it very clear, how would you encode a simple TriG thing:

{  
  <a> <b> <c> .
}
<URI> {
   <p> <q> <r> .
}

I see where I made a mistake, but I still do not fully grasp, from the description, what the 'value' of "@graph" really is, and what "@id" applies to in that case.

Note that TriG, as it stands, does not allow nested graphs, ie, 

<URI> {
   <a> <b> <c> .
   <URI1> {
      <p> <q> <r> .
   }
}

is not accepted. (There was some discussion about nested named graphs in the group, and the agreement is that there is no strong enough use case to go there.) I wonder whether JSON-LD should take the simple approach and adopt the same principle for now, just to simplify matters (just raising this, not sure yet myself).
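To make the "same principle" concrete, a processor could simply refuse documents in which one @graph sits inside another. The following is only a sketch of such a check, not anything defined in the JSON-LD drafts: the function name has_nested_graph and the example.org IRIs are hypothetical.

```python
def has_nested_graph(node, inside_graph=False):
    """Return True if an @graph appears anywhere inside another @graph."""
    if isinstance(node, list):
        return any(has_nested_graph(item, inside_graph) for item in node)
    if not isinstance(node, dict):
        return False
    for key, value in node.items():
        if key == "@context":
            # Context definitions are not data, so they are skipped.
            continue
        if key == "@graph":
            if inside_graph:
                return True
            if has_nested_graph(value, True):
                return True
        elif has_nested_graph(value, inside_graph):
            return True
    return False

# One named graph with a single node: accepted, as in TriG.
flat = {
    "@id": "http://example.org/URI",
    "@graph": [
        {"@id": "http://example.org/p",
         "http://example.org/q": {"@id": "http://example.org/r"}},
    ],
}

# An @graph inside an @graph: the case TriG does not accept.
nested = {
    "@id": "http://example.org/URI",
    "@graph": {"http://example.org/a": "b",
               "@graph": {"http://example.org/q": "r"}},
}

print(has_nested_graph(flat), has_nested_graph(nested))
```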

Also, as far as I remember, there is currently an open issue (not necessarily a formal one) whether a Graph ID can be a blank node or not. But do not trust my memory, I am too old for that...


>> However, I tried to look at the RDF algorithm in the API document, and I did not see anything about the case when the @id is not set for a @graph. Did I miss something?
> 
> Yes, if there are other properties, the object is treated as an entity with a blank node subject.

And my questions above (and also my comments below) show the confusion. All the other '@' properties have a fairly similar analogy to property-value pairs, but this one does not... At least I have not grasped it yet.

> 
>> Comment 1: I also try to imagine a JSON user who knows nothing about RDF and, obviously, nothing about named graphs either: for that person this construct may be a bit confusing. First of all, such a person may not _really_ think in terms of a graph (and the rest of JSON, ie, also the JSON-LD document, cleverly hides this concept), so the keyword itself might be confusing. Also, the "@graph" : { ... } construct does not really fit, at least for me, the mental model of property-value pairs _on a common subject_, which is (rightfully so) the fundamental paradigm in JSON-LD. It is fundamentally different: we are not making statements on the @id value, we are changing the nature of what is happening. I know something is needed at least for the top level objects even if we do not talk about named graphs (after all, I raised this issue in the past), but I am not convinced about the direction this goes syntax-wise. (Again, this is a discussion on or before the FPWD, _not_ a prerequisite to get the document over to the RDF WG!)
> 
> The named-graph usage really fell out of needing a way to enclose a number of top-level objects as a value property, rather than a top-level array. We are currently examining an alternative mechanism for doing this (@id maps), so we could consider changing this, but it would be a fairly radical change at this point.
> 

I know the problem:-). 

Maybe worth repeating it for the RDF WG readers: If I take a simple graph of the form

<a> <x> <c> .

<p> <x> <r> .

Its bare-bones JSON-LD format would be something like

[
  { 
    "@id": "<a>", 
    "<x>" : { "@id": "<c>" }
  },
  { 
    "@id": "<p>", 
    "<x>" : { "@id": "<r>" }
  }
]

A way of simplifying that would be to use the @context to say, eg, that the value of <x> is always a URIRef and not a literal. However, the rest of the JSON-LD structure would then require repeating the @context in each element of the array: 

[
  { 
    "@context" : {
      "<x>" : { "@type" : "@id" }
    },
    "@id": "<a>", 
    "<x>" : "<c>"
  },
  { 
    "@context" : {
      "<x>" : { "@type" : "@id" }
    },
    "@id": "<p>", 
    "<x>" : "<r>" 
  }
]

That is how @graph came into the picture. Reading all this, I am less convinced that the current solution is really the right one, though:-( 
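As a rough illustration of what @graph buys in this top-level case, the Python sketch below builds a single object that carries the @context once and lists both subjects under @graph. The helper name wrap_in_graph is made up for this example, and "<a>", "<x>", etc. are the same placeholders as above rather than real IRIs.

```python
import json

def wrap_in_graph(context, nodes):
    """Put several top-level nodes under one shared @context (sketch only)."""
    return {"@context": context, "@graph": list(nodes)}

doc = wrap_in_graph(
    # Stated once: values of <x> are to be read as IRIs, not literals.
    {"<x>": {"@type": "@id"}},
    [
        {"@id": "<a>", "<x>": "<c>"},
        {"@id": "<p>", "<x>": "<r>"},
    ],
)
print(json.dumps(doc, indent=2))
```

The resulting object has no @id of its own, which is the case where, as I understand it, the triples are meant to end up in the default graph.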


>> Comment 2: There is, of course, the general question whether it is wise to publish a FPWD with a @graph feature as long as the discussion on named graphs is still raging in the group. Maybe that section should be stripped down, for the moment, to the bare minimum that is necessary to express a graph with several top level subjects... But that is just a thought. I know the API values are set in terms of quads but we can say, at this moment, that JSON-LD does not yet have a syntax to express the full quads, only those for a default graph...
> 
> Point taken, and Manu had raised this, except that we had specific use cases to address. I'd suggest we mark the section as at-risk instead of removing it entirely.

My current bias is:

- we solve the immediate and necessary use case of the top level @context by a dedicated and simple keyword, and we consider ourselves happy
- we mark the generic @graph thingy as at risk; if TriG is well defined then we do something along the same lines for JSON-LD. I am not sure it is worth creating two syntaxes that are wildly different in this respect. (I know. I am RDF biased:-)

But _again_: those can be done either before or even after the FPWD, this is not a reason to stop the transition to the RDF WG!

Thanks!

ivan




----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 14 June 2012 16:40:18 UTC