RE: RDF/JSON

Hi Arnaud,

Sorry for the late reply. I'm not sure I understand the scenario in all its details. For example, it is not clear to me who creates the data or who decides how it gets stored. Anyway, let me try to explain how I would solve such a use case with JSON-LD, which may show that JSON-LD is just as easy to use, even if that is not apparent at first sight.

JSON-LD is intended to be transformed into a form that is easy to work with. You would thus typically transform it first and then treat it as plain JSON. RDF/JSON doesn't allow such transformations at all, which is both an advantage (it reduces variability) and a severe limitation. In JSON-LD the same data can be represented in many ways. The "canonical" representation is the expanded flattened form which, as you say, is not indexed and thus not always the best representation.
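
If you do end up with the flattened form and want keyed access, one transformation would be to build the index yourself. This is just a minimal sketch in plain JavaScript: indexById and the sample data are made up for illustration and are not part of any JSON-LD API, and it assumes every node object carries an '@id'.

```javascript
// Build a lookup table keyed by '@id' from an array of node objects.
// Assumption: `nodes` is a flattened, expanded JSON-LD array and every
// node object in it has an '@id'.
function indexById(nodes) {
  var index = {};
  for (var i = 0; i < nodes.length; i++) {
    index[nodes[i]['@id']] = nodes[i];
  }
  return index;
}

// Hypothetical flattened, expanded input for the example in this thread.
var flattened = [
  {
    '@id': 'http://acme.com/collectionM',
    'http://www.w3.org/2000/01/rdf-schema#member': [
      { '@id': 'http://acme.com/resourceX' }
    ]
  },
  {
    '@id': 'http://acme.com/resourceX',
    'http://purl.org/dc/terms/title': [ { '@value': 'resource X' } ]
  }
];

var byId = indexById(flattened);

// From here on it is a plain dictionary access by subject.
var title = byId['http://acme.com/resourceX']['http://purl.org/dc/terms/title'][0]['@value'];
```

After that one pass, looking up a known subject is the same kind of hash access as in RDF/JSON.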


> Let's consider first an RDF-JSON representation of the above Turtle: 
>
> {"http://acme.com/resourceX" : 
>    {"http://purl.org/dc/terms/title":"resource X", 
>    "http://purl.org/dc/terms/description":"description of resource X", 
>    ... other predicates ... 
>    }, 
>  "http://acme.com/collectionM" : 
>    {"http://www.w3.org/2000/01/rdf-schema#member" :
>         {"@id" : "http://acme.com/resourceX"}} 
> } 

This is not valid RDF/JSON [1] AFAICT. It would have to look as follows:

{
  "http://acme.com/resourceX" : {
    "http://purl.org/dc/terms/title": [
      {
        "value": "resource X",
        "type": "literal"
      }
    ],
    "http://purl.org/dc/terms/description": [
      {
        "value": "description of resource X",
        "type": "literal"
      }
    ]
  },
  "http://acme.com/collectionM": {
    "http://www.w3.org/2000/01/rdf-schema#member": [
      {
        "value": "http://acme.com/resourceX",
        "type": "uri"
      }
    ]
  }
}


> Here is the Javascript code to perform my programming task: 
>
> result = {}; 
> if ('http://acme.com/collectionM' in representation) { 
>    subject = representation['http://acme.com/collectionM']
>       ['http://www.w3.org/2000/01/rdf-schema#member']['@id']; 
>    for (var predicate in representation[subject]) { 
>        result[predicate] = representation[subject][predicate]; 
>        } 
>    } 

It would be a bit more complex even if you assume that there are no literal members and that resourceX is the first member of collectionM:

var result = {};
if (('http://acme.com/collectionM' in data) &&
    ('http://www.w3.org/2000/01/rdf-schema#member' in data['http://acme.com/collectionM']) &&
    (data['http://acme.com/collectionM']['http://www.w3.org/2000/01/rdf-schema#member'][0]['value'] === 
     'http://acme.com/resourceX')) { 

    // OK, resourceX is a member of collectionM
    for (var predicate in data['http://acme.com/resourceX']) { 
      result[predicate] = data['http://acme.com/resourceX'][predicate]; 
    }
}

See http://jsfiddle.net/M22FD/ for more readable code and the result.

You should also note that the result is *not* valid RDF/JSON. I'm not sure whether you care about this or not.
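
If you do care, a possible fix is to keep the subject as the outer key, so that the result has the same subject/predicate/object-array shape as the input. A minimal sketch, under the same assumption as above (resourceX is the first member of collectionM); extractResource is a hypothetical helper name, not something from the RDF/JSON draft:

```javascript
// Hypothetical helper: copies all predicates of `subject` if it is the first
// member of `collection`, keeping the subject IRI as the outer key.
function extractResource(data, collection, subject) {
  var member = 'http://www.w3.org/2000/01/rdf-schema#member';
  var result = {};
  if ((collection in data) &&
      (member in data[collection]) &&
      (data[collection][member][0]['value'] === subject) &&
      (subject in data)) {
    result[subject] = {};
    for (var predicate in data[subject]) {
      result[subject][predicate] = data[subject][predicate];
    }
  }
  return result;
}

// The RDF/JSON document from above.
var data = {
  'http://acme.com/resourceX': {
    'http://purl.org/dc/terms/title': [
      { 'value': 'resource X', 'type': 'literal' }
    ],
    'http://purl.org/dc/terms/description': [
      { 'value': 'description of resource X', 'type': 'literal' }
    ]
  },
  'http://acme.com/collectionM': {
    'http://www.w3.org/2000/01/rdf-schema#member': [
      { 'value': 'http://acme.com/resourceX', 'type': 'uri' }
    ]
  }
};

var extracted = extractResource(data, 'http://acme.com/collectionM', 'http://acme.com/resourceX');
```

The price is one more level of nesting when you read the predicates back out.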


> Now here is the same resource in JSON-LD format: 
>
> [...]
>
> As you can see, this is much more complicated than what
> I have to write with RDF-JSON. 

Well, in JSON-LD (as in JSON in general) it depends a lot on how you serialize your data. You could optimize your representation for this task quite dramatically (and make it look much more natural to JSON developers at the same time). Keeping the full IRIs, the data would look as follows:

{
  "@context": {
    "-memberOf": { "@reverse": "http://www.w3.org/2000/01/rdf-schema#member" }
  },
  "@id": "http://acme.com/resourceX",
  "http://purl.org/dc/terms/title": "resource X",
  "http://purl.org/dc/terms/description": "description of resource X",
  "-memberOf": "http://acme.com/collectionM"
}

I've defined a reverse property "-memberOf" and started its name with a hyphen to make it simple to filter out that triple.

And the code to get the *tuples* back looks as follows:

var result = {};
if (('-memberOf' in data) && (data['-memberOf'] === 'http://acme.com/collectionM')) { 
  for (var predicate in data) { 
    if (predicate[0] !== '@' && predicate[0] !== '-') {
      result[predicate] = data[predicate]; 
    }
  }
}

See http://jsfiddle.net/kuxqS/

The result is valid JSON-LD (it describes a blank node, though, since the @id has been filtered out). If you remove the !== '@' check in the code above, you get a completely valid JSON-LD document that can be transformed to RDF triples by the standard algorithms.
See http://jsfiddle.net/kuxqS/1/

I would argue that both the data and the code are much easier to understand, but that's of course subjective and I'm clearly biased :-)


> This is just one example, but in our experience it is
> typical – I have not made up an atypical example to make
> a point – and it doesn't actually matter if the language
> is Javascript, Python or Ruby.

I would like to hear about your other experiences. There's still time to make changes. So this would be the right time to raise concerns, issues, ask for improvements etc.


> The essential difference derives from the following: 
>
> 1. You often know in the code what subject you are looking for,
> either as a constant or in the value of a variable. With RDF-JSON, you
> just index the structure with that key. With JSON-LD, you have to loop
> through the subjectNodes looking for the one whose '@id' matches your
> known subject. You could use fancier programming constructs, like
> select or reduce, to find the subjectNode you are looking for, but it
> still does not match the simplicity of a simple hash/dictionary access
> in RDF-JSON.

This is only true for the top-level resource, and in most cases there will be just one, so you won't have any problem. All other data can be indexed just as in RDF/JSON:

{
  "@context": {
    "predicate": { "@id": "someIRI", "@container": "@index" }
  },
  "@id": "top-level-node",
  "predicate": {
    "IRI-of-node1": {
        "@id": "IRI-of-node1",
        ... other predicates ...
    },
    "IRI-of-node2": {
        "@id": "IRI-of-node2",
        ... other predicates ...
    }
  }
}

See http://www.w3.org/TR/json-ld/#data-indexing for details.
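
With such a document, finding a known node is a plain key lookup again. A small sketch, using the placeholder names from the example above; the dc:title values are made up, simply to have something in place of the elided predicates:

```javascript
// Hypothetical instance of the indexed document above.
var doc = {
  '@context': {
    'predicate': { '@id': 'someIRI', '@container': '@index' }
  },
  '@id': 'top-level-node',
  'predicate': {
    'IRI-of-node1': {
      '@id': 'IRI-of-node1',
      'http://purl.org/dc/terms/title': 'node 1'
    },
    'IRI-of-node2': {
      '@id': 'IRI-of-node2',
      'http://purl.org/dc/terms/title': 'node 2'
    }
  }
};

// The subject we are looking for, e.g. held in a variable.
var wanted = 'IRI-of-node2';

// Direct dictionary access, no loop over node objects needed.
var node = (wanted in doc['predicate']) ? doc['predicate'][wanted] : null;
```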


> 2. If you are looking for predicates, you have to
> filter out the '@id' entries, which are artifacts of the format and
> don't correspond to triples.

That depends on what you are trying to achieve. The value of @id is certainly part of the triples, as is that of @type. If you want valid JSON-LD as the result (which you can then convert directly to some other RDF serialization format or an abstract representation), you don't have to filter them. You just extracted tuples and stored the subject of those tuples somewhere else.


> We like JSON-LD and it has a use in our applications. However, it is
> not good for everything, and we are finding that a mixture of JSON-LD
> with RDF-JSON is much more useful than either one of them alone. (We
> are also using RDFa, but that is a separate story.) We do not think
> W3C should try to decide whether RDF-JSON or JSON-LD is better, and
> picking one as a winner. That would be very similar to trying to
> decide whether dictionary/hash/associative-array (RDF-JSON) is better
> than list/array (JSON-LD) and forcing a programming language to have
> only one of them. We'd like to see W3C recognize both, perhaps with
> some tutorial material that shows when to use each.

I'm not sure I agree with this reasoning. In 99% of the cases you can bring JSON-LD into a shape which is very, very similar to RDF/JSON. Most of the time, however, that's not what you want. I see the value of RDF/JSON as an *internal* data structure. I'm a bit worried about endorsing its use in the wild. I think we are at a critical stage and would go as far as saying that if we don't get it right this time, developers will just keep ignoring Semantic Web technologies. Publishing several "competing" standards is not the right way forward in my opinion.

I would be fine with publishing RDF/JSON as a note if people want a stable reference. A tutorial showing "when to use each" would go way too far IMHO. It should be made clear that JSON-LD is the format to be used between systems. Internally, of course, every application is free to represent the data in any way it likes.


Cheers,
Markus


--
Markus Lanthaler
@markuslanthaler

Received on Tuesday, 16 April 2013 11:37:02 UTC