XSD datatypes in JSON-LD (was: Re: Use of XSD namespace in RDF recommendations) from Richard Cyganiak on 2012-09-05 (public-rdf-comments@w3.org from September 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 5 Sep 2012 10:09:22 +0100
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: public-rdf-comments Comments <public-rdf-comments@w3.org>
Message-Id: <81692EB9-1151-4C29-8875-16D8B3CA8E16@cyganiak.de>
Hi Gregg,

On 4 Sep 2012, at 23:54, Gregg Kellogg wrote:
>>> We considered this in JSON-LD; JSON numbers are translated to xsd:integer or xsd:double, and true/false to xsd:boolean when transforming to RDF.
>> 
>> Doing this differently from SPARQL in the case of xsd:decimal (that is, fraction but no exponent) is a bad idea. You'll get situations where 1.0 in a SPARQL query doesn't match 1.0 in a JSON-LD document because of different numeric datatypes, and where 1.0 written in Turtle and 1.0 written in JSON-LD produce different literals.
> 
> The problem is, in JSON, there's only a single number type, so you can't distinguish between decimal and double.

Oh, I see. Sending a JSON-LD document through a JSON parser turns both 1.5 and 15E-1 into the same native JavaScript double value, so you can't distinguish them unless you write a custom parser instead of using an off-the-shelf one.
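
A minimal sketch of this effect, using Python's standard json module as the off-the-shelf parser (the choice of Python is an assumption for illustration; the thread itself is about JavaScript). The parse_float hook is the closest thing the module offers to a custom parser, and here it is only used to keep the lexical form as a string:

```python
import json

# Both lexical forms denote the same number, so a standard parser
# hands back the same native float for each of them.
a = json.loads("1.5")
b = json.loads("15E-1")
same_value = (a == b)

# Keeping the original lexical form requires stepping outside the default
# behaviour, for example by asking the parser to return floats as strings.
raw_a = json.loads("1.5", parse_float=str)
raw_b = json.loads("15E-1", parse_float=str)

print(same_value, type(a).__name__)
print(raw_a, raw_b)
```

Once the value is a native float, nothing in it records whether the document said 1.5 or 15E-1, which is why decimal and double cannot be told apart after ordinary parsing.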

> You can distinguish between double and integer due to the presence or absence of a decimal point. JSON-LD supports all datatyped literals using the expanded format:
> 
> { "@value": "1.1", "@type": "xsd:decimal"}
> 
> To ensure fidelity of numeric types in JSON-LD, it's usually best to avoid using native JSON types.

Right.
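
The mapping Gregg describes could be sketched roughly as follows. This is an illustration only, not the JSON-LD algorithm: to_literal is a hypothetical helper, prefix expansion of "xsd:decimal" is left out, and repr() is used for doubles where a real processor would emit a canonical lexical form.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def to_literal(value):
    """Return (lexical form, datatype) for one JSON-LD value (hypothetical helper)."""
    if isinstance(value, dict) and "@value" in value:
        # Expanded form: lexical form and datatype are kept exactly as written.
        return str(value["@value"]), value.get("@type", XSD + "string")
    if isinstance(value, bool):
        # Checked before int, because bool is a subclass of int in Python.
        return ("true" if value else "false"), XSD + "boolean"
    if isinstance(value, int):
        return str(value), XSD + "integer"
    if isinstance(value, float):
        # Assumption: repr() stands in for the canonical xsd:double form.
        return repr(value), XSD + "double"
    return str(value), XSD + "string"

print(to_literal(1))
print(to_literal(1.5))
print(to_literal({"@value": "1.1", "@type": "xsd:decimal"}))
```

Only the expanded form can carry xsd:decimal here, since a native number has already been reduced to either an integer or a double.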

>>> When going from RDF, strings are used unless an option is specified to use xsd types.
>> 
>> That doesn't quite make sense to me.
>> 
>> xsd:string strings should always be JSON strings and never XSD types in JSON-LD.
> 
> Yes, xsd:strings are always presented as simple strings, or as an expanded value with only a @value key.

Good.

>> The xsd:integer, xsd:decimal, xsd:double and xsd:boolean types should always be represented with the native JSON number / boolean representation, and never as XSD types.
> 
> The problem is that this can be lossy, in the case of xsd:decimal and xsd:double.

Yeah, you're right, we don't want lossy. So, strike xsd:decimal from my list above.
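
To illustrate the lossiness for xsd:decimal, here is a small sketch in Python. The literal value is made up for the example, and Python's float stands in for a native JSON number:

```python
from decimal import Decimal

# A valid xsd:decimal lexical form with more digits than a double can hold.
lexical = "0.10000000000000000000000001"

as_decimal = Decimal(lexical)   # arbitrary precision, keeps every digit
as_double = float(lexical)      # what a native JSON number would hold

round_trips_as_decimal = (str(as_decimal) == lexical)
round_trips_as_double = (Decimal(repr(as_double)) == as_decimal)

print(round_trips_as_decimal, round_trips_as_double)
```

The decimal survives a round trip as a string but not as a native number, which is the reason for dropping xsd:decimal from the list of types that always use the native representation.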

> There is also some subtle interaction when expanding; native types are never expanded. When compacting, only string representations will match when there is a datatype coercion. This is so that, when working within JSON, the use of native types (numeric and boolean anyway) is lossless across the different algorithms.

That sounds sane to me.

> We did discuss always using the native representations for xsd:integer and xsd:double, but this was deemed to introduce too much chance of data corruption. See "Data Round Tripping" in the API [1] and the discussion in issues 98 [2] and 81 [3].

You lost me here. Where and how are non-native representations of xsd:integer and xsd:double used?

>> For the other types (rdf:XMLLiteral, rdf:HTML, rdf:langString, xsd:xxx, any custom data types) I would argue quite strongly that the default should be to retain all information (hence allowing round-trips from RDF to JSON-LD back to RDF). Perhaps there could be a switch that, if manually enabled, serializes all these literals as plain strings.
> 
> All other typed literals are expressed using the expanded notation, for example:
> 
> { "@value": "e = mc<sup>2</sup>", "@type": "rdf:HTML"}
> 
> If type coercion is specified in the context, these will be serialized as plain strings. For example, if the term "text" was defined to expand to "schema:text" and @type was set to "rdf:HTML", this would be rendered simply as follows:
> 
> {
>  "text": "e = mc<sup>2</sup>"
> }

This again sounds all good to me.
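
The coercion behaviour in Gregg's example might be sketched like this. compact_value is a hypothetical helper that handles a single value against a single term definition; it is not the JSON-LD compaction algorithm, and the string check reflects the earlier remark that only string representations match a datatype coercion.

```python
def compact_value(term_definition, value):
    """Compact one expanded value against one term definition (hypothetical helper)."""
    coerced_type = term_definition.get("@type")
    if (
        isinstance(value, dict)
        and set(value) == {"@value", "@type"}
        and value["@type"] == coerced_type
        and isinstance(value["@value"], str)
    ):
        # The context already states the datatype, so a plain string suffices.
        return value["@value"]
    # No matching coercion: keep the expanded form so no information is lost.
    return value

context = {"text": {"@id": "schema:text", "@type": "rdf:HTML"}}
expanded = {"@value": "e = mc<sup>2</sup>", "@type": "rdf:HTML"}

compacted = {"text": compact_value(context["text"], expanded)}
print(compacted)
```

A value whose datatype differs from the one in the term definition is returned unchanged, so the round trip back to RDF still has the type available.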

>> But my preferred option would still be that toRDF can be invoked with a context object, and if I want some property with a custom datatype to be serialized as a simple JSON string, then I can provide a term definition with type coercion for that property in the context object. I'm not sure if the JSON-LD APIs support something like this at the moment.
> 
> Yes, just specify the mapping in the context used to compact and this is the form which is used.

Ok, good.

> However, within the JSON-LD API methods, there is no way to convert from native types to expanded (or string compacted) values without going through RDF and using the "useNativeTypes" flag.

Lost me again.

Why would one want to have expanded values instead of JSON-native values?

What is a string-compacted value?

Is there a test case or example that shows the difference between enabled and disabled useNativeTypes?

useNativeTypes defaults to true, right?

Why would one want to set useNativeTypes to false?

Best,
Richard



> [1] http://json-ld.org/spec/latest/json-ld-api/#data-round-tripping
> [2] https://github.com/json-ld/json-ld.org/issues/98
> [3] https://github.com/json-ld/json-ld.org/issues/81
Received on Wednesday, 5 September 2012 09:12:10 UTC