RE: Understanding of JSON-LD values from Markus Lanthaler on 2013-06-13 (public-rdf-comments@w3.org from June 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Thu, 13 Jun 2013 07:42:44 +0200
To: "'public-rdf-comments'" <public-rdf-comments@w3.org>
Message-ID: <019f01ce67f8$d47dfd20$7d79f760$@lanthaler@gmx.net>
On Thursday, June 13, 2013 2:29 AM, Peter Ansell wrote:
On 13 June 2013 07:49, Sven R.Kunze wrote:
>> Good evening everybody,
>>  
>> in a former discussion, I mentioned that the purpose of “native
>> literals” in the JSON-LD data model is not clear to me. And it still
>> is not.
>>  
>> Markus wrote: “JSON-LD has e.g. native numbers and (probably more
>> interesting) lists. In RDF everything is an opaque string that can only
>> be interpreted, i.e., converted to a number in your programming
>> language, if you understand the data type. So to speak, JSON-LD has a
>> built-in data type for numbers.”
>>  
>> So, what is the advantage of that? Shouldn’t every RDF graph lib
>> provide a way to parse the literals with a datatype native to the
>> programming language one uses?

Sven, I don't really understand what you are trying to achieve with your questions (and the way you frame and time them) but I'm nevertheless trying to answer them in the most objective way I can -- especially since you mention my name explicitly.

JSON has a "native" representation for numbers. If we were to prohibit the use of that feature it would make no sense at all to define a syntax based on JSON.

From reading your previous mails I understand that you don't care about serialization syntaxes at all because your libraries take care of everything. The fact however is that we never (ever) talk about programming languages in any of the RDF specifications. They just don't matter (in the sense you are framing your question). It's an implementation detail. Actually you are kind of contradicting yourself because you want native numbers in your programming environment but not in the serialization syntax. The biggest advantage of JSON - and thus the main reason for its success - is that there's no impedance mismatch between the serialization format and the native representation in your programming environment. I'm not aware of any language which can't parse JSON into a native representation.


>> Of one drawback, I could easily think of: it’s confusing as it mixes
>> up serialization and abstract model. What is so bad of having only
>> *ONE* value for the number 42 instead of two?

I do not follow. How does it mix them up? Because there are no quotes around the number? Because there's no explicit datatype? What about Turtle's "native" numbers?


>> The standard RDF data model only has *ONE* value for it whereas the
>> JSON-LD model suggests *TWO*, namely the ‘native value’ and the
>> datatyped-string value. Correct me, when I get something wrong.

I don't know what you mean by value but even in RDF there's a difference between the lexical representation and the "value"... and some systems may not even get to see the real "value" because they don't understand the datatype. That's what I meant by "opaque strings".


>> Another question that arises when having two different 42 (is that
>> even possible?) is the fact of how to work with them. Are they
>> considered equal (in the mathematical sense)? Can I add/subtract/...
>> “42”^^xsd:integer and 42? What are the results: 84 or
>> “84”^^xsd:integer?
>>  
>> In order to refer to Markus’ statement: “In RDF everything is an opaque
>> string that can ....” <<< that is not quite true as JSON data itself
>> is only an opaque string, too, that only a JSON parser is able to
>> understand.

No, JSON has, just like RDF, a data model. It happens to support numbers of infinite range and precision. RDF doesn't by itself. It relies on datatypes which define how such an "opaque string" can be interpreted. An RDF library has to know how to interpret the XSD types to be able to infer that "42"^^xsd:integer == 42; the same is true for other datatypes.
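
To illustrate what "knowing how to interpret the XSD types" means, here is a minimal Python sketch. The `PARSERS` table and the `literal_value` helper are hypothetical names made up for this example, not part of any RDF library, and the lexical forms are not validated strictly. A literal whose datatype is not in the table stays an opaque string as far as the library is concerned.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical lookup table: datatype IRI -> function parsing the lexical form.
PARSERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "double": float,
    XSD + "boolean": lambda s: {"true": True, "1": True,
                                "false": False, "0": False}[s],
}

def literal_value(lexical, datatype):
    """Return the native value, or None if the datatype is not understood."""
    parser = PARSERS.get(datatype)
    return parser(lexical) if parser else None

# The typed literal only equals the number once the datatype is interpreted.
print(literal_value("42", XSD + "integer") == 42)        # True
print(literal_value("42", "http://example.org/custom"))  # None
```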


>> Other example: in N3, you can write false as a shortcut for
>> “false”^^xsd:boolean.

So? You can do the same in Turtle and JSON-LD.


>> Having said this, I do not quite understand why there is a need for
>> such ‘native values’ in the data model when it’s just a serialization
>> issue which on its own is perfectly valid as it simplifies a lot. But
>> on the data model side, it’s more than questionable.

I do not understand this question at all. What is "more than questionable"? The fact that we allow developers to use JSON-native numbers and booleans? Or is it the fact that JSON numbers don't map 1:1 to XSD types? If that's your concern then the answer is actually quite trivial. Look at RFC 4627: numbers are of infinite precision and range, but off-the-shelf parsers do have limited precision and range. Unfortunately the exact range and precision are not specified, so the best we can do is to map them to the best matching types - and sometimes that means that rounding errors may occur. Have a look at
  http://json-ld.org/spec/latest/json-ld-api/#data-round-tripping
that should explain it.
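
The limited precision of an off-the-shelf parser can be seen with Python's standard `json` module, used here only as one example of such a parser. By default it maps a JSON number with a fraction to an IEEE double, so extra digits are rounded away; passing `parse_float=Decimal` keeps them.

```python
import json
from decimal import Decimal

# Valid JSON number with more digits than a double can hold.
text = "0.1000000000000000055511151231257827"

as_float = json.loads(text)                         # default: Python float
as_decimal = json.loads(text, parse_float=Decimal)  # keeps every digit

print(as_float == 0.1)          # True: the extra digits were rounded away
print(str(as_decimal) == text)  # True: nothing was lost
```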


>> In order to state it more clearly: 1.  When both a ‘native value’ and
>> a ‘typed-literal value’ refer to the very same entity, I do not see
>> the purpose of introducing ‘native values’ as syntactic sugar belongs
>> to the syntax part and not to the abstract model part.

See above. Unfortunately there's no 1:1 mapping.


>> 2.  When they
>> don’t, the above mentioned questions should be answered clearly within
>> the spec.

They are, I think
  http://json-ld.org/spec/latest/json-ld-api/#data-round-tripping


> From my understanding, the JSON-LD-API spec [1] (as they are
> intentionally not normatively referring to either RDF or XMLSchema in
> the JSON-LD spec to reduce the learning curve for JSON-only
> developers)

The API spec normatively references both RDF Semantics and XML Schema.


> provides RDF transformation algorithms that are controlled
> by the useNativeTypes setting [2] (which is not a field on
> JsonLdOptions?? [3])

Yeah, there's no API for that, just an algorithm. Defining an RDF API would probably end up in another perma-thread accusing us of being overzealous.


> to determine whether to migrate numeric and
> boolean datatypes between XMLSchema and JSON Native datatypes when
> converting to RDF.

No, when converting *from* RDF.



> There should be no issues with the basic integer datatype that has the
> same value space. The issues that I have been enquiring about recently
> were in the double datatype. The reason that they are not syntactic
> sugar, from my understanding, are that the value spaces are not
> equivalent. Ie, you cannot represent some XMLSchema double and decimal
> numbers in JSON Native.

You can, but some off-the-shelf parsers might not be able to parse them.


> The overarching goal of JSON-LD is to be completely compatible with
> idiomatic JSON, and not RDF, so they must offer the ability for users
> to use JSON Native types, even if that introduces round-tripping
> issues.

We support lossless round-tripping of JSON-LD to RDF and back, but in that case it won't be idiomatic JSON. All you have to do is set the use native types flag to false when serializing RDF as JSON-LD -- and that's the default value, by the way.
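
A rough Python sketch of what the flag changes when serializing an RDF literal as JSON-LD. This is a simplification loosely modelled on the conversion described in the API spec, not the normative algorithm; the function name `rdf_literal_to_jsonld` and the parameter name `use_native_types` are made up for this example.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def rdf_literal_to_jsonld(lexical, datatype, use_native_types=False):
    """Turn an RDF literal into a JSON-LD value object (simplified)."""
    if use_native_types:
        if datatype == XSD + "boolean" and lexical in ("true", "false"):
            return {"@value": lexical == "true"}
        if datatype in (XSD + "integer", XSD + "double"):
            convert = int if datatype == XSD + "integer" else float
            try:
                return {"@value": convert(lexical)}
            except ValueError:
                pass  # invalid lexical form: fall through, keep the string
    if datatype == XSD + "string":
        return {"@value": lexical}
    # Lossless form: the lexical string plus its datatype.
    return {"@value": lexical, "@type": datatype}

print(rdf_literal_to_jsonld("42", XSD + "integer"))
print(rdf_literal_to_jsonld("42", XSD + "integer", use_native_types=True))
```

With the flag left at false the lexical form and datatype survive unchanged, which is what makes the round trip lossless; with it set to true the output is idiomatic JSON.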


> Although all RDF libraries will offer full support for the
> commonly used XMLSchema datatypes, JSON-LD is focused on avoiding any
> dependencies on RDF or XMLSchema libraries due to a feared backlash by
> JSON developers if they do.

That's just wrong. IMO it wouldn't make any sense to provide a JSON-based syntax if you can't use it as idiomatic JSON.


> JSON developers are notorious for their
> hatred of anything XML, and (possibly by extension) RDF due to the
> historical link between RDF and RDF/XML.

I'm not going to comment on this one.


> The difference with N3 and Turtle are that their native valuespaces
> are based on XMLSchema datatypes, so there are no issues with
> conversion to RDF Abstract Model for N3/Turtle/other RDF users who
> virtually universally are using XMLSchema to represent numeric data
> internally and in their serialisations

No, the difference is that there already exist JSON parsers for virtually every programming language. That's not the case for N3 and Turtle. The parsers that are being/have been built for N3 and Turtle are being/have been built exactly for that purpose. I think the majority of the group just tries to ignore that fact. We are not starting from a clean slate. We have developed JSON-LD by considering the current JSON ecosystem. We started with implementations. We had a test suite from the very beginning. The specification was a result of our experiences.


> Would it be useful to add a note to RDF-2-JSON-LD transformers that
> they MAY leave xsd:double values as non-native if they can determine
> that the transformation would not be lossless, even if the
> useNativeTypes flag is set to true?

IMO no. If a user sets the use native types flag to true, she expresses her intentions quite clearly. Why should we ignore that?



--
Markus Lanthaler
@markuslanthaler
Received on Thursday, 13 June 2013 05:43:18 UTC