Re: comments on JSON-LD 1.0, A JSON-based Serialization for Linked Data from Peter F. Patel-Schneider on 2013-06-18 (public-rdf-wg@w3.org from June 2013)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Mon, 17 Jun 2013 21:54:40 -0700
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDF WG <public-rdf-wg@w3.org>
Message-ID: <51BFE810.8040404@gmail.com>
In response to Manu's comments, I have edited sections of the JSON-LD document
in an attempt to show where pointers to other W3C documents should be placed,
and what changes should be made to adequately align JSON-LD with W3C Linked Data
recommendations.  This leaves much of the document unchanged and only makes
changes to JSON-LD that should not be noticed by the vast majority of potential
users of JSON-LD and that require very little additional work on the part of
developers of JSON-LD implementations.

Enjoy,

Peter F. Patel-Schneider
Nuance Communications


     1. Introduction

Linked Data (http://www.w3.org/standards/semanticweb/data)
is a technique for creating a network of inter-connected
data across different documents and Web sites. Linked Data
has four properties (http://www.w3.org/DesignIssues/LinkedData.html):
1) it uses IRIs as names for things; 2) it uses HTTP IRIs so that the names
are links that can be looked up (dereferenced); 3) when dereferenced, the IRIs
return documents that provide useful information, using the standard Web
languages RDF [RDF-PRIMER?] and SPARQL [SPARQL]; and 4) the data so retrieved
includes yet other links. These properties allow
data published on the Web to work much like Web pages do today. One can
start at one piece of Linked Data, and follow the links to other pieces
of data that are hosted on different sites across the Web.

JSON-LD is a lightweight syntax to serialize in JSON [RFC4627
<#bib-RFC4627>] the RDF Graphs and Datasets [RDF-CONCEPTS]
that underlie Linked Data. Its design allows
existing JSON to be interpreted as Linked Data with minimal changes.
JSON-LD is primarily intended to be a way to use Linked Data in
Web-based programming environments, to build interoperable Web services,
and to store Linked Data in JSON-based storage engines. Since JSON-LD is
100% compatible with JSON, the large number of JSON parsers and
libraries available today can be reused. In addition to all the features
JSON provides, JSON-LD augments JSON with the notions from RDF and SPARQL
underlying Linked Data, notably:

   * a universal identifier mechanism for JSON objects <#dfn-json-object>
     via the use of IRIs <#dfn-iri>,
   * global uniqueness of local identifiers (keys) shared among different
     JSON documents by
     mapping them to IRIs <#dfn-iri> via a context <#dfn-context>,
   * linking between JSON objects in different JSON documents, even extending
     to JSON objects from different sites on the
     Web,
   * the ability to annotate strings <#dfn-string> with their language,
   * a way to associate datatypes with values such as dates and times,
   * and a facility to express one or more directed graphs, a.k.a., RDF
     datasets, such as a social network, in a single document.
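
As a rough illustration of several of the facilities listed above, here is a
minimal sketch written as a Python dictionary and serialized with the standard
json module. The example.org and example.com IRIs, the choice of FOAF and
schema.org properties, and the variable names are assumptions made for
illustration only; they are not taken from the specification text.

```python
import json

# Hypothetical JSON-LD document; all IRIs under example.org / example.com are made up.
person = {
    "@context": {
        # Local keys are mapped to IRIs via the context.
        "name": "http://xmlns.com/foaf/0.1/name",
        "knows": {"@id": "http://xmlns.com/foaf/0.1/knows", "@type": "@id"},
        # A datatype is associated with the values of "born".
        "born": {
            "@id": "http://schema.org/birthDate",
            "@type": "http://www.w3.org/2001/XMLSchema#date",
        },
    },
    # The JSON object itself is identified by an IRI.
    "@id": "http://example.org/people/alice",
    # A string annotated with its language.
    "name": {"@value": "Alice", "@language": "en"},
    "born": "1990-01-01",
    # A link to a JSON object described on a different site.
    "knows": "http://example.com/people/bob",
}

# Any ordinary JSON library can write and read the document.
document = json.dumps(person, indent=2)
round_tripped = json.loads(document)
```

Because the result is plain JSON, an existing JSON parser can read it without
knowing anything about JSON-LD.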

Developers that require any of the facilities listed above or need to
serialize an RDF graph or dataset [RDF11-CONCEPTS <#bib-RDF11-CONCEPTS>]
in a JSON-based syntax will find JSON-LD of interest. The syntax is
designed to not disturb already deployed systems running on JSON, but
provide a smooth upgrade path from JSON to JSON-LD. Since the shape of
such data varies wildly, JSON-LD features mechanisms to reshape
documents into a deterministic structure which simplifies their processing.


       1.1 How to Read this Document

/This section is non-normative./

This document is a detailed specification for a serialization of Linked
Data in JSON. The document is primarily intended for the following
audiences:

   * Software developers who want to encode Linked Data (RDF Graphs and
     Datasets) in a variety of programming languages that can use JSON
   * Software developers who want to convert existing JSON to JSON-LD
   * Software developers who want to understand the design decisions and
     language syntax for JSON-LD
   * Software developers who want to implement processors and APIs for
     JSON-LD
   * Software developers who want to generate or consume Linked Data in a
     JSON syntax

A companion document, the JSON-LD Processing Algorithms and API
specification [JSON-LD-API <#bib-JSON-LD-API>], specifies how to work
with JSON-LD at a higher level by providing a standard library interface
for common JSON-LD operations.

To understand the basics in this specification you must first be
familiar with JSON, which is detailed in [RFC4627 <#bib-RFC4627>].


     2. Design Goals and Rationale

/This section is non-normative./

JSON-LD satisfies the following design goals:

Simplicity
[...]
Compatibility
[...]
Expressiveness
[...]
Terseness
[...]
Zero Edits, most of the time
[...]
Coverage
     JSON-LD was designed to be usable by developers as idiomatic JSON,
     with no need to understand RDF [RDF11-CONCEPTS
     <#bib-RDF11-CONCEPTS>]. However, JSON-LD is also a serialization for RDF
     Graphs and Datasets, so people intending to use JSON-LD with RDF tools
     will find it can be used like any other RDF syntax.


[...]


       6.14 Identifying Blank Nodes

/This section is non-normative./


At times, it becomes necessary to express information using a node without
being able to provide an IRI for that node. This type of node is called a
blank node <#dfn-blank-node>.
JSON-LD does not require all nodes to be identified using |@id|.
However, some graph topologies may require identifiers to be
serializable. Graphs containing loops, e.g., cannot be serialized using
embedding alone; |@id| must be used to connect the nodes. In these
situations, one can use blank node identifiers
<#dfn-blank-node-identifier>, which look like IRIs <#dfn-iri> using an
underscore (|_|) as scheme. This allows one to reference the node
locally within the document, but makes it impossible to reference the
node from an external document. The blank node identifier
<#dfn-blank-node-identifier> is scoped to the document in which it is used.
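
To make the loop case concrete, here is a small sketch of a two-node cycle
that could not be written with embedding alone. The labels _:alice and _:bob,
the use of the FOAF knows property, and the helper blank_node_ids are
assumptions chosen for illustration.

```python
# Two nodes that reference each other; neither has an IRI, so each is
# given a document-scoped blank node identifier (hypothetical labels).
graph = {
    "@context": {"knows": "http://xmlns.com/foaf/0.1/knows"},
    "@graph": [
        {"@id": "_:alice", "knows": {"@id": "_:bob"}},
        {"@id": "_:bob", "knows": {"@id": "_:alice"}},
    ],
}


def blank_node_ids(value):
    """Collect every @id value that uses the underscore scheme."""
    found = set()
    if isinstance(value, dict):
        for key, item in value.items():
            if key == "@id" and isinstance(item, str) and item.startswith("_:"):
                found.add(item)
            else:
                found |= blank_node_ids(item)
    elif isinstance(value, list):
        for item in value:
            found |= blank_node_ids(item)
    return found


ids = blank_node_ids(graph)
```

The identifiers are only meaningful inside this one document; another
document using _:alice would be talking about a different node.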

[...]


     A. Data Model

[Provided previously, but repeated here, slightly modified.]

JSON-LD is a serialization format for Linked Data 
(http://www.w3.org/standards/semanticweb/data)
based on JSON.  It is therefore important to distinguish between the syntax
of JSON-LD, which is defined by JSON [...] and the underlying data model.

The data model underlying JSON-LD is RDF datasets as defined in RDF 1.1
Concepts and Abstract Syntax [RDF-CONCEPTS], with the following additions:
1/ JSON-LD allows blank nodes as the predicate of triples, and 2/ JSON-LD
allows blank nodes as names of graphs in the dataset.  The use of either of
these extensions can cause interoperability problems with other producers and
consumers of Linked Data and thus is not recommended when publishing Linked
Data using JSON-LD.

JSON-LD allows untyped literals for strings, numbers, booleans, and
language-tagged strings.  In each case, the untyped literal must be
transformed into an RDF literal as follows:  Strings are given the datatype
xsd:string, where xsd is a compact URI prefix for ...,
numbers without exponents are given the type xsd:decimal,  numbers with
exponents are given the type xsd:double, booleans are given the datatype
xsd:boolean, and language-tagged strings are given the datatype
rdf:langString, where rdf is a compact URI prefix for ....
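
The mapping just described can be sketched as follows. This is only an
illustration under stated assumptions: the function name literal_datatype is
hypothetical, it works on the raw JSON token (since the exponent is lost once
a number has been parsed), and it returns the compact names used above rather
than expanded IRIs.

```python
import json


def literal_datatype(token, language=None):
    """Return the compact datatype name for one JSON scalar token.

    `token` is the raw JSON text of the value, e.g. '"abc"', '1.5e3', 'true'.
    `language` is the language tag, if the string carries one.
    """
    value = json.loads(token)
    # bool must be tested before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "xsd:boolean"
    if isinstance(value, str):
        return "rdf:langString" if language else "xsd:string"
    if isinstance(value, (int, float)):
        # Only the lexical form tells us whether an exponent was written.
        has_exponent = "e" in token or "E" in token
        return "xsd:double" if has_exponent else "xsd:decimal"
    raise ValueError("not a string, number, or boolean: " + token)
```

For example, literal_datatype('42') and literal_datatype('4.2') both give
xsd:decimal, while literal_datatype('4.2e1') gives xsd:double.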

The datatypes above and their restrictions in XML Schema Datatypes [...]
are to be considered to be recognized datatypes [RDF 1.1 Concepts] in
JSON-LD and applications that produce or consume JSON-LD.  This means in
essence that JSON-LD applications have some notion of the underlying
datatype involved.  JSON-LD applications may use internal JSON values for
some or all of these datatypes with the understanding that they may not be able
to represent all literals in a datatype and thus may not be able to process
all JSON-LD documents, and that round-tripping may introduce some minor
compatibility issues.

JSON-LD includes syntax for lists, which are to be transformed by creating a
new blank node for each element of the list, creating links labelled with
rdf:first from each of these blank nodes to the corresponding element of the
list, and creating links labelled with rdf:rest between these blank nodes in
order and one from the last blank node to rdf:nil.  The first blank node so
created is used in place of the list.  A longer definition of this process is
given in the Turtle syntax definition [TURTLE].
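
A minimal sketch of this list transformation is given below. It assumes the
standard RDF collection vocabulary, in which the link between successive
cells is rdf:rest. The function name, the _:bN labelling scheme, and the
treatment of the empty list as rdf:nil (as Turtle does) are assumptions for
illustration.

```python
from itertools import count


def list_to_triples(items, fresh=None):
    """Turn a list into (head, triples) using rdf:first / rdf:rest / rdf:nil.

    `fresh` is an optional counter used to mint hypothetical blank node labels.
    """
    if fresh is None:
        fresh = count()
    if not items:
        # Assumption: an empty list is represented by rdf:nil itself.
        return "rdf:nil", []
    # One new blank node per list element.
    cells = ["_:b%d" % next(fresh) for _ in items]
    triples = []
    for index, (cell, item) in enumerate(zip(cells, items)):
        triples.append((cell, "rdf:first", item))
        is_last = index + 1 == len(cells)
        following = "rdf:nil" if is_last else cells[index + 1]
        triples.append((cell, "rdf:rest", following))
    # The first blank node stands in place of the list.
    return cells[0], triples


head, triples = list_to_triples(["a", "b"])
```

Here head is the blank node that would be used wherever the list appeared as
a value.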

In a JSON-LD document, graph names and predicates *should* be IRIs. In
keeping with the basis of Linked Data, IRIs in JSON-LD documents should be
dereferenceable and should dereference to a document that is in a Linked
Data format.

JSON-LD documents *may* contain data that cannot be represented in an RDF
dataset.  Such data is to be ignored when a JSON-LD document is being
processed, except so far as this data may modify the data that is being
represented in the RDF dataset.  This means, e.g.,
that properties which are not mapped to an IRI <#dfn-iri> or blank node
<#dfn-blank-node> will be ignored.
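
The effect on unmapped properties can be sketched roughly as follows. This is
a deliberate simplification, not the processing algorithm (which is specified
in [JSON-LD-API]): the function name mapped_properties is hypothetical, the
context is assumed to be a flat term-to-IRI dictionary, and any key
containing a colon is assumed to already be an IRI or blank node identifier.

```python
def mapped_properties(node, context):
    """Keep only the keys that would survive as RDF data (simplified)."""
    kept = {}
    for key, value in node.items():
        is_keyword = key.startswith("@")   # e.g. @id, @type
        is_term = key in context           # mapped to an IRI by the context
        looks_like_iri = ":" in key        # assumption: IRI or _:label
        if is_keyword or is_term or looks_like_iri:
            kept[key] = value
    return kept


context = {"name": "http://xmlns.com/foaf/0.1/name"}
node = {
    "@id": "http://example.org/people/alice",  # hypothetical IRI
    "name": "Alice",
    "nickname": "Al",  # not mapped by the context, so it is dropped
}
result = mapped_properties(node, context)
```

The nickname key stays in the JSON document but contributes nothing to the
RDF dataset.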

Figure 1: An illustration of JSON-LD's data model.


[...]



     C. Relationship to RDF

JSON-LD is a concrete RDF syntax
<http://www.w3.org/TR/rdf11-concepts/#dfn-concrete-rdf-syntax> as
described in [RDF11-CONCEPTS <#bib-RDF11-CONCEPTS>], except that the RDF
Datasets resulting from JSON-LD documents have the two additions described
in Appendix A.
Hence, most JSON-LD
documents are both an RDF document and a JSON document.

Summarized, these differences mean that JSON-LD is capable of
serializing any RDF graph or dataset and most, but not all, JSON-LD
documents can be directly interpreted as RDF. It is possible to work
around this restriction, when interpreting JSON-LD as RDF, by
transforming blank nodes <#dfn-blank-node> used as graph names
<#dfn-graph-name> or properties <#dfn-property> to IRIs <#dfn-iri>,
minting new "Skolem IRIs" as per Replacing Blank Nodes with IRIs
<http://www.w3.org/TR/rdf11-concepts/#section-skolemization> of
[RDF11-CONCEPTS <#bib-RDF11-CONCEPTS>]. The normative algorithms for
interpreting JSON-LD as RDF and serializing RDF as JSON-LD are specified
in the JSON-LD Processing Algorithms and API specification [JSON-LD-API
<#bib-JSON-LD-API>].
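
The workaround can be sketched as below. This is an illustration only, not
the normative algorithm: quads are assumed to be plain tuples of
(subject, predicate, object, graph name) with None for the default graph, the
function name skolemize is hypothetical, and the base IRI uses the
.well-known/genid path suggested in [RDF11-CONCEPTS] on a made-up host.

```python
import uuid


def skolemize(quads, base="http://example.com/.well-known/genid/"):
    """Replace blank nodes used as predicates or graph names with Skolem IRIs."""

    def is_blank(term):
        return isinstance(term, str) and term.startswith("_:")

    # Blank nodes that appear in a position plain RDF does not allow.
    offending = {
        term
        for (_s, predicate, _o, graph_name) in quads
        for term in (predicate, graph_name)
        if is_blank(term)
    }
    # Mint one fresh IRI per offending blank node.
    mapping = {label: base + uuid.uuid4().hex for label in offending}
    # Replace it everywhere, so other uses of the same node stay connected.
    return [tuple(mapping.get(term, term) for term in quad) for quad in quads]


quads = [
    ("_:s", "_:p", "o", "_:g"),
    ("_:p", "http://example.org/q", "_:s", None),
]
result = skolemize(quads)
```

Blank nodes that only occur as subjects or objects, such as _:s above, are
left untouched because they are already valid RDF.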

Even though JSON-LD serializes RDF Datasets
<http://www.w3.org/TR/rdf11-concepts/#dfn-rdf-dataset>, it can also be
used as an RDF graph source
<http://www.w3.org/TR/rdf11-concepts/#dfn-rdf-source>. In that case, a
consumer /MUST/ only use the default graph and ignore all named graphs.
This allows servers to expose data in, e.g., both Turtle and JSON-LD
using content negotiation.

Note

Publishers supporting both dataset and graph syntaxes have to ensure
that the primary data is stored in the default graph to enable consumers
that do not support datasets to process the information.
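
A consumer treating JSON-LD as a graph source might behave roughly like the
sketch below. The dataset representation (a dictionary from graph name to a
set of triples, with None as the key of the default graph) and the function
name as_graph_source are assumptions for illustration.

```python
def as_graph_source(dataset):
    """Return only the default graph; all named graphs are ignored."""
    return set(dataset.get(None, set()))


dataset = {
    # Default graph: where publishers should keep the primary data.
    None: {("http://example.org/a", "http://example.org/p", "x")},
    # Named graph: invisible to a consumer that only understands graphs.
    "http://example.org/g1": {("http://example.org/b", "http://example.org/p", "y")},
}
graph = as_graph_source(dataset)
```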



     D. Relationship to Other Syntaxes

/This section is non-normative./

The JSON-LD examples below demonstrate how JSON-LD can be used to
express semantic data marked up in other syntaxes for RDF and structured
data.

[...]


     G. References


       G.1 Normative references

[...]

[RDF11-CONCEPTS]
     Richard Cyganiak, David Wood, Editors. RDF 1.1 Concepts and Abstract
     Syntax. <http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/> 15
     January 2013. W3C Working Draft (work in progress). URL:
     http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/. The latest
     edition is available at http://www.w3.org/TR/rdf11-concepts/
[TURTLE]
     Eric Prud'hommeaux, Gavin Carothers, Editors. Turtle: Terse RDF
     Triple Language. <http://www.w3.org/TR/2013/CR-turtle-20130219/> 19
     February 2013. W3C Candidate Recommendation (work in progress). URL:
     http://www.w3.org/TR/2013/CR-turtle-20130219/. The latest edition is
     available at http://www.w3.org/TR/turtle/

       G.2 Informative references
Received on Tuesday, 18 June 2013 04:55:10 UTC