Re: comments on JSON-LD 1.0, A JSON-based Serialization for Linked Data

I am still very puzzled as to why there is this fear of saying "RDF" in this 
specification document.

On 06/18/2013 02:55 PM, Gregg Kellogg wrote:
> On Jun 17, 2013, at 9:54 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>
>> In response to Manu's comments, I have edited sections of the JSON-LD document in an attempt to show where pointers to other W3C documents should be placed, and what changes should be made to adequately align JSON-LD to W3C Linked Data recommendations.  This leaves much of the document unchanged and only makes changes to JSON-LD that should not be noticed by the vast majority of potential users of JSON-LD and that require very little more work on the part of developers of JSON-LD implementations.
>>
>> Enjoy,
>>
>> Peter F. Patel-Schneider
>> Nuance Communications
>>
>>
>>     1. Introduction
>>
>> Linked Data (http://www.w3.org/standards/semanticweb/data)
>> is a technique for creating a network of inter-connected
>> data across different documents and Web sites. Linked Data
>> has four properties (http://www.w3.org/DesignIssues/LinkedData.html):
>> 1) it uses IRIs as names for things; 2) it
>> uses HTTP IRIs so that the names are links that can be looked up (dereferenced); 3) when
>> dereferenced the IRIs return documents that provide useful information, using
>> the standard Web languages RDF [RDF-PRIMER?] and SPARQL [SPARQL]; and 4) the
>> data so retrieved includes yet other links. These properties allow
>> data published on the Web to work much like Web pages do today. One can
>> start at one piece of Linked Data, and follow the links to other pieces
>> of data that are hosted on different sites across the Web.
> The introduction was discussed on today's call, and we resolved to reword the first paragraph based on David Booth's proposed wording:
>
> [[[
>       Linked Data[LINKED_DATA] is a
>       way to create a network of standards-based machine interpretable
>       data across different documents and Web sites. It allows an
>       application to start at one piece of Linked Data, and follow
>       embedded links to other pieces of Linked Data that are hosted on
>       different sites across the Web.
> ]]]

This pushes parts of Linked Data that are important for JSON-LD out of the 
introduction.  Why not state upfront that RDF Graphs and Datasets are the 
foundation of JSON-LD?
>
> In the spirit of trying to make the introduction a bit lighter, I would suggest changing [[[RDF Graphs and Datasets]]] to just [[[Graphs and Datasets]]], but keep the reference to [RDF11-CONCEPTS], which makes it clear what types of Graphs and Datasets we mean. Another way to accomplish this would be to define /Graph/ and /Dataset/ as terms in the Data Model appendix, which would then normatively reference the definitions in RDF11-CONCEPTS, thus making it unambiguous.
Well, RDF11-Concepts doesn't define Graphs and Datasets, just RDF Graphs and 
RDF Datasets, so appealing to it for definitions of Graphs and Datasets 
doesn't make much sense.
>> JSON-LD is a lightweight syntax to serialize in JSON [RFC4627
>> <#bib-RFC4627>] the RDF Graphs and Datasets [RDF-CONCEPTS]
>> that underlie Linked Data. Its design allows
>> existing JSON to be interpreted as Linked Data with minimal changes.
>> JSON-LD is primarily intended to be a way to use Linked Data in
>> Web-based programming environments, to build interoperable Web services,
>> and to store Linked Data in JSON-based storage engines. Since JSON-LD is
>> 100% compatible with JSON, the large number of JSON parsers and
>> libraries available today can be reused. In addition to all the features
>> JSON provides, JSON-LD augments JSON with the notions from RDF and SPARQL
>> underlying Linked Data, notably:
> To avoid undue controversy, how about changing [[[JSON-LD augments JSON with the notions from RDF and SPARQL underlying Linked Data]]] to [[[JSON-LD augments JSON with the notions from RDF and SPARQL]]] by just removing the "underlying Linked Data" bit.
But then why would JSON-LD implement these notions?  Isn't JSON-LD supposed to 
be about Linked Data?

[...]

>       1.1 How to Read this Document
>
> /This section is non-normative./
>
> This document is a detailed specification for a serialization of Linked
> Data in JSON. The document is primarily intended for the following
> audiences:
>
>   * Software developers who want to encode Linked Data (RDF Graphs and Datasets) in a variety of
>     programming languages that can use JSON
>   * Software developers who want to convert existing JSON to JSON-LD
>   * Software developers who want to understand the design decisions and
>     language syntax for JSON-LD
>   * Software developers who want to implement processors and APIs for
>     JSON-LD
>   * Software developers who want to generate or consume Linked Data in a JSON syntax
>
> A companion document, the JSON-LD Processing Algorithms and API
> specification [JSON-LD-API <#bib-JSON-LD-API>], specifies how to work
> with JSON-LD at a higher level by providing a standard library interface
> for common JSON-LD operations.
>
> To understand the basics in this specification you must first be
> familiar with JSON, which is detailed in [RFC4627 <#bib-RFC4627>].
> +1 caveat using the Graph and Dataset references and leaving the RDF term out in this context.

See above.
>
>>     2. Design Goals and Rationale
>>
>> /This section is non-normative./
>>
>> JSON-LD satisfies the following design goals:
>>
>> Simplicity
>> [...]
>> Compatibility
>> [...]
>> Expressiveness
>> [...]
>> Terseness
>> [...]
>> Zero Edits, most of the time
>> [...]
>> Coverage
>>     JSON-LD was designed to be usable by developers as idiomatic JSON,
>>     with no need to understand RDF [RDF11-CONCEPTS
>>     <#bib-RDF11-CONCEPTS>]. However, JSON-LD is also a serialization for RDF
>>     Graphs and Datasets, so people intending to use JSON-LD with RDF tools
>>     will find it can be used like any other RDF syntax.
>>
>>
>> [...]
> Did you intend to strike the reference to Appendix C? I don't understand your rationale for this.
Yes, I certainly did intend to strike the reference to Appendix C. My 
rationale, which the technical change that I propose supports, is to make 
Appendix C largely vacuous.  Instead, the connection between JSON-LD and RDF is 
made in Appendix A, and it might be a good idea to point to Appendix A here.
>>       6.14 Identifying Blank Nodes
>>
[...]
> This seems to be exactly the paragraph which is in the editor's draft now. Did I miss something?

Probably not.  I copied the whole document, deleted most of it, and kept the 
bits where I made changes.  I may have left in certain paragraphs that I did 
not touch.
>>     A. Data Model
>>
>> [Provided previously, but repeated here, slightly modified.]
>>
>> JSON-LD is a serialization format for Linked Data (http://www.w3.org/standards/semanticweb/data)
>> based on JSON.  It is therefore important to distinguish between the syntax
>> of JSON-LD, which is defined by JSON [...] and the underlying data model.
>>
>> The data model underlying JSON-LD is RDF datasets as defined in RDF 1.1
>> Concepts and Abstract Syntax [RDF-CONCEPTS], with the following additions:
>> 1/ JSON-LD
>> allows blank nodes as the predicate of triples, and 2/ JSON-LD allows blank
>> nodes as names of graphs in the dataset.  The use of either of these
>> extensions can cause interoperability problems with other producers and
>> consumers of Linked Data and thus is not recommended when publishing Linked
>> Data using JSON-LD.
>>
>> JSON-LD allows untyped literals for strings, numbers, booleans, and
>> language-tagged strings.  In each case, the untyped literal must be
>> transformed into an RDF literal as follows:  Strings are given the datatype
>> xsd:string, where xsd is a compact URI prefix for ...,
>> numbers without exponents are given the type xsd:decimal,  numbers with
>> exponents are given the type xsd:double, booleans are given the datatype
>> xsd:boolean, and language-tagged strings are given the datatype
>> rdf:langString, where rdf is a compact URI prefix for ....
>>
>> The datatypes above and their restrictions in XML Schema Datatypes [...]
>> are to be considered to be recognized datatypes [RDF 1.1 Concepts] in
>> JSON-LD and applications that produce or consume JSON-LD.  This means in
>> essence that JSON-LD applications have some notion of the underlying
>> datatype involved.  JSON-LD applications may use internal JSON values for
>> some or all these datatypes with the understanding that they may not be able
>> to represent all literals in a datatype and thus may not be able to process
>> all JSON-LD documents, and that any issues with round-tripping may introduce
>> some minor compatibility issues.
>>
>> JSON-LD includes syntax for lists, which are to be transformed by creating a
>> new blank node for each element of the list, creating links labelled with
>> rdf:first from each of
>> these blank nodes to the corresponding element of the list, creating links
>> labelled with rdf:rest between these blank nodes in order and one from the
>> last blank node to rdf:nil.  The first blank node so created is used in
>> place of the list.   A longer definition of this process is given in
>> the Turtle syntax definition [TURTLE].
>>
>> In a JSON-LD document, graph names and predicates *should* be IRIs. In
>> keeping with the basis of Linked Data, IRIs in JSON-LD documents should be
>> dereferenceable and should dereference to a document that is in a Linked
>> Data format.
>>
>> JSON-LD documents *may* contain data that cannot be represented in an RDF
>> dataset.  Such data is to be ignored when a JSON-LD document is being
>> processed, except so far as this data may modify the data that is being
>> represented in the RDF dataset.  This means, e.g.,
>> that properties which are not mapped to an IRI <#dfn-iri> or blank node
>> <#dfn-blank-node> will be ignored.
>>
>> An illustration of JSON-LD's data model
>>
>> Figure 1: An illustration of JSON-LD's data model.
>>
>>
>> [...]
> +1 (mostly). I think we should be clear at this point that the JSON-LD data model is the RDF data model.

My proposal is to make the two be the same.
> Note that last week the RDF WG resolved to allow blank nodes identifiers to name graphs, so that exception isn't necessary. If we were to agree to allow blank nodes in the predicate position, we'd be pretty much 100% aligned (and I happen to think that that is a useful pattern).
I agree, and I was proposing text to say this.
>
> Regarding lists, we should be clear that in a graph representation, nodes are not ordered, so that the RDF model for lists is represented using intermediate nodes with rdf:first/rdf:rest/rdf:nil, but that the JSON representation makes use of the JSON object/array notation to represent this. That's a pretty important feature of JSON-LD IMO.

JSON-LD implementations are certainly welcome to implement well-formed lists 
in a native fashion.  In fact, I expect that they will.  There are already 
implementations that take well-formed RDF lists and turn them into list-like 
internal data structures. Similarly, JSON-LD implementations will likely 
implement numbers, strings, and booleans in a native fashion.  Many RDF 
implementations do so as well, and some (no I don't remember which ones) do so 
in a way that might not completely preserve round-tripping (although they may 
have a mode that does preserve the lexical form of literals).
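
To make the list point concrete, here is a minimal sketch of the 
rdf:first/rdf:rest/rdf:nil expansion and its inverse for well-formed lists.  
The function names, the tuple representation of triples, and the _:bN blank 
node labels are assumptions made for illustration only; none of this is 
taken from the JSON-LD API document.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_to_triples(items, fresh=None):
    """Expand a native list into (head, triples) using rdf:first/rdf:rest.

    `fresh` is a hypothetical counter used to mint blank node labels.
    """
    if fresh is None:
        fresh = count()
    if not items:
        # The empty list is just rdf:nil, with no triples at all.
        return RDF + "nil", []
    # One new blank node per list element.
    nodes = [f"_:b{next(fresh)}" for _ in items]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        triples.append((node, RDF + "first", item))
        # Link to the next blank node, or to rdf:nil from the last one.
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "rest", rest))
    return nodes[0], triples


def triples_to_list(head, triples):
    """Recover a native list from a well-formed RDF list starting at head."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    out = []
    while head != RDF + "nil":
        out.append(first[head])
        head = rest[head]
    return out
```

An implementation that keeps lists native would only need something like 
list_to_triples when it serializes to another RDF syntax, and something like 
triples_to_list when it reads well-formed lists back in.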

I do re-realize now that JSON-LD has an issue with numbers that is not shared 
with most RDF implementations in that it is built on JavaScript, which only 
has doubles.  This can cause problems if decimals are converted to double and 
back again.
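
A small sketch of the decimal/double problem, using Python's float as a 
stand-in for a JavaScript double (both are IEEE 754 binary64).  The helper 
name and the sample lexical forms are assumptions chosen for illustration.

```python
from decimal import Decimal


def survives_double_round_trip(lexical):
    """Check whether an xsd:decimal lexical form keeps its value after
    being converted to a double and back to a decimal."""
    original = Decimal(lexical)
    as_double = float(original)              # what a double-only runtime stores
    restored = Decimal(repr(as_double))      # shortest form that re-reads as the same double
    return restored == original


# Short decimals come back with the same value ...
short_ok = survives_double_round_trip("0.1")

# ... but a decimal with more significant digits than a double can carry does not.
long_ok = survives_double_round_trip("12345678901234567890.123")

# The stored double is also not exactly the decimal value that was written.
exactly_equal = Decimal(float("0.1")) == Decimal("0.1")
```

Here short_ok is true while long_ok and exactly_equal are false, so whether 
a given decimal is affected depends on how many digits it carries and on how 
the implementation serializes doubles on the way back out.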

>
>>     C. Relationship to RDF
>>
>> JSON-LD is a concrete RDF syntax
>> <http://www.w3.org/TR/rdf11-concepts/#dfn-concrete-rdf-syntax> as
>> described in [RDF11-CONCEPTS <#bib-RDF11-CONCEPTS>], except that the RDF
>> Datasets resulting from JSON-LD documents have the two additions described
>> in Appendix A.
>> Hence, most JSON-LD
>> documents are both an RDF document and a JSON document.
>>
>> Summarized, these differences mean that JSON-LD is capable of
>> serializing any RDF graph or dataset and most, but not all, JSON-LD
>> documents can be directly interpreted as RDF. It is possible to work
>> around this restriction, when interpreting JSON-LD as RDF, by
>> transforming blank nodes <#dfn-blank-node> used as graph names
>> <#dfn-graph-name> or properties <#dfn-property> to IRIs <#dfn-iri>,
>> minting new "Skolem IRIs" as per Replacing Blank Nodes with IRIs
>> <http://www.w3.org/TR/rdf11-concepts/#section-skolemization> of
>> [RDF11-CONCEPTS <#bib-RDF11-CONCEPTS>]. The normative algorithms for
>> interpreting JSON-LD as RDF and serializing RDF as JSON-LD are specified
>> in the JSON-LD Processing Algorithms and API specification [JSON-LD-API
>> <#bib-JSON-LD-API>].
>>
>> Even though JSON-LD serializes RDF Datasets
>> <http://www.w3.org/TR/rdf11-concepts/#dfn-rdf-dataset>, it can also be
>> used as an RDF graph source
>> <http://www.w3.org/TR/rdf11-concepts/#dfn-rdf-source>. In that case, a
>> consumer /MUST/ only use the default graph and ignore all named graphs.
>> This allows servers to expose data in, e.g., both Turtle and JSON-LD
>> using content negotiation.
>>
>> Note
>>
>> Publishers supporting both dataset and graph syntaxes have to ensure
>> that the primary data is stored in the default graph to enable consumers
>> that do not support datasets to process the information.
> +1
>
>>     D. Relationship to Other Syntaxes
>>
>> /This section is non-normative./
>>
>> The JSON-LD examples below demonstrate how JSON-LD can be used to
>> express semantic data marked up in other syntaxes for RDF and structured
>> data.
>>
>> [...]
> +1
>
>>     G. References
>>
>>
>>       G.1 Normative references
>>
>> [...]
>>
>> [RDF11-CONCEPTS]
>>     Richard Cyganiak, David Wood, Editors. RDF 1.1 Concepts and Abstract
>>     Syntax. <http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/> 15
>>     January 2013. W3C Working Draft (work in progress). URL:
>>     http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/. The latest
>>     edition is available at http://www.w3.org/TR/rdf11-concepts/
>> [TURTLE]
>>     Eric Prud'hommeaux, Gavin Carothers, Editors. Turtle: Terse RDF
>>     Triple Language. <http://www.w3.org/TR/2013/CR-turtle-20130219/> 19
>>     February 2013. W3C Candidate Recommendation (work in progress). URL:
>>     http://www.w3.org/TR/2013/CR-turtle-20130219/. The latest edition is
>>     available at http://www.w3.org/TR/turtle/
>>
>>       G.2 Informative references
> Thanks Peter, this is quite useful.
>
> Gregg
>
>

peter

Received on Wednesday, 19 June 2013 04:08:02 UTC