Re: Resolutions for features at risk [JSON-LD]

On Jun 2, 2013, at 1:46 PM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> On Sunday, June 02, 2013 9:53 PM, Gregg Kellogg wrote:
>> On Jun 1, 2013, at 3:55 AM, Markus Lanthaler wrote:
>>> So here are the JSON-LD snippets to be expanded/compacted/converted
>>> to RDF again:
>>> 
>>> "prop1": { "@value": 5 }
>>> "prop2": { "@value": 5, "@type": "xsd:double" }
>>> "prop3": { "@value": "5.0", "@type": "xsd:double" }
>>> "prop4": { "@value": "5.0E0", "@type": "xsd:double" }
>>> "prop5": { "@value": "99999...1000s.of.9s", "@type": "xsd:integer" }
>> 
>> Okay, for expansion/compaction, I think there are really three states:
> 
> Thanks a lot Gregg!
> 
> 
>> useNativeTypes=true, useNativeTypes=false, and not specified. In the
>> case that the option is not specified, it should just be the existing
>> behavior. If useNativeTypes is true, it should convert all XSD types
>> with a native JSON representation (at least xsd:boolean, xsd:integer,
>> xsd:decimal, and xsd:float, and probably all their sub-types as well)
>> to native JSON values. If set to false, it should convert all native
>> types to expanded values using a string representation, probably
>> converting all numbers to xsd:double. Here are some possible test
>> results:
> 
> The main reason I don't particularly like this solution is that it
> tightly couples JSON-LD to XSD. RDF's use of XSD is due to historical
> reasons (RDF/XML), but should we really tightly couple JSON-LD to it as
> well? What about http://schema.org/Number, for instance? That would map
> 1:1 to a JSON number.

schema:Number is a big mistake, IMO. Processors should be free to add additional datatype maps.

> Currently the coupling to XSD (and all the complexity that stems from it) is
> in the RDF conversion algorithms - and that's the place where it belongs
> IMO. Shifting it to expansion/compaction makes *every* JSON-LD processor
> much more complex without reducing the complexity of JSON-LD-RDF converters.

A conforming processor needs the complexity, wherever it is used. This just moves the usage from RDF conversion to compaction/expansion.

>> useNativeTypes=false with expand:
>> 
>>  "http://example/prop1": [{"@value": "5.0E0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop2": [{"@value": "5.0E0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop3": [{"@value": "5.0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop4": [{"@value": "5.0E0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop5": [{"@value":  "99999...1000s.of.9s", "@type":
>> "http://www.w3.org/2001/XMLSchema#integer"}],
> 
> OK, so basically numbers are converted to the canonical lexical form of
> xsd:double and if the type is missing, xsd:double is added.

Yes.
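
For what it's worth, that canonicalization could look roughly like this in Ruby (just a sketch; canonical_double is a hypothetical helper, and special values like INF and NaN are ignored):

  def canonical_double(value)
    # "%.15E" formats e.g. 5 as "5.000000000000000E+00"
    mantissa, exponent = ("%.15E" % value.to_f).split("E")
    mantissa.sub!(/0+\z/, "")                   # strip trailing zeros
    mantissa << "0" if mantissa.end_with?(".")  # keep one fractional digit
    "#{mantissa}E#{exponent.to_i}"              # "+00" => 0, "-02" => -2
  end

  canonical_double(5)     # => "5.0E0"
  canonical_double(0.05)  # => "5.0E-2"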

>> useNativeTypes=true with expand:
>> 
>>  "http://example/prop1": [{"@value": 5}],
>>  "http://example/prop2": [{"@value": 5, "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop3": [{"@value": 5, "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop4": [{"@value": 5, "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop5": [{"@value":  4611686018427387903, "@type":
>> "http://www.w3.org/2001/XMLSchema#integer"}],
>> 
>> The last one really can't be specified, because JSON doesn't define
>> the precision of numbers, so the native value is
>> implementation-dependent. I used 2^62-1, but we probably shouldn't
>> even test it.
> 
> So literals which are not in the canonical lexical form are also
> converted ("5.0" -> 5), right?

The JSON parser I use does keep the distinction between 5 and 5.0, so if I use that parser to parse {"v": 5} and {"v": 5.0}, each value retains its original representation. This comes down to how the conversion is made; in Ruby, I presume I would do something like "5.0".to_f and use that as the native representation. In this case both "5".to_f and "5.0".to_f give me 5.0. I'm not sure how that plays out on other platforms, or whether there's anything we can do about it.
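
For example, with Ruby's standard JSON library:

  require 'json'

  JSON.parse('{"v": 5}')["v"]    # => 5 (Integer)
  JSON.parse('{"v": 5.0}')["v"]  # => 5.0 (Float)

  "5".to_f    # => 5.0
  "5.0".to_f  # => 5.0, the lexical distinction is gone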

>> For compaction, I presume that the intention of someone using this
>> would be to reduce such values to simple values, not expanded values,
>> if useNativeTypes is set to true. I also presume that none of the
>> properties has a type coercion applied.
>> 
>> useNativeTypes=false with compact:
>> 
>> "prop1": { "@value": "5.0E0", "@type": "xsd:double" }
>> "prop2": { "@value": "5.0E0", "@type": "xsd:double" }
>> "prop3": { "@value": "5.0", "@type": "xsd:double" }
>> "prop4": { "@value": "5.0E0", "@type": "xsd:double" }
>> "prop5": { "@value": "99999...1000s.of.9s", "@type": "xsd:integer" }
>> 
>> useNativeTypes=true with compact:
>> 
>> "prop1": 5
>> "prop2": 5
>> "prop3": 5
>> "prop4": 5
>> "prop5": 4611686018427387903
>> 
>> (same caveat on prop5)
> 
> So that would mean that a { "@value": 5, "@type": "xsd:integer" } wouldn't
> round-trip anymore, right? The type would be lost during compaction.

Once you convert to native types, you lose the ability to round-trip. But the point of this is to create a convenient representation for working with the data.
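
Concretely, with useNativeTypes=true a value like

  { "@value": 5, "@type": "http://www.w3.org/2001/XMLSchema#integer" }

compacts to plain 5, and expanding that again yields just { "@value": 5 } (or, with useNativeTypes=false, an xsd:double); either way the xsd:integer type is gone.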

>>> and how the corresponding RDF literals would be transformed to JSON-LD:
>>> 
>>> <> <prop2> "5"^^xsd:double .
>>> <> <prop3> "5.0"^^xsd:double .
>>> <> <prop4> "5.0E0"^^xsd:double .
>>> <> <prop5> "99999...1000s.of.9s"^^xsd:integer .
>> 
>>  "http://example/prop2": [{"@value": "5", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop3": [{"@value": "5.0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop4": [{"@value": "5.0E0", "@type":
>> "http://www.w3.org/2001/XMLSchema#double"}],
>>  "http://example/prop5": [{"@value":  "99999...1000s.of.9s", "@type":
>> "http://www.w3.org/2001/XMLSchema#integer"}],
> 
> OK, so the useNativeTypes flag is not used for RDF conversion at all
> anymore, or more precisely, it is hardcoded to false for RDF conversions.
> 
> I really ask myself what we gain by doing this. I think you and Sandro
> are assuming that everything that might be available in some other RDF
> serialization as well is going to be converted to JSON-LD using the
> algorithm in the spec with useNativeTypes set to true. I don't think
> that's true. I rather think that JSON-LD will be generated directly by
> serializing internal data structures where numbers are numbers and not
> strings. And even if the data is converted from another RDF
> serialization format, I believe publishers are perfectly capable of
> deciding whether the precision loss matters or not. If they are not
> sure, they can just set useNativeTypes to false and be done with it.

If you have an application that uses a triple store as a backend, then when you serve up JSON-LD, you're always going through a fromRdf conversion. There certainly may be services that use a different backend for storing data. Still, I think this will be a common pattern for people coming to JSON-LD from an RDF background.

Safer is to change the default of useNativeTypes to false, so that implementations can choose to lose fidelity by setting it to true if they know that data loss can be tolerated. Having the default be to lose data doesn't make sense to me.

> If a consumer really doesn't like the literal values, it can, e.g.,
> round-trip the data through the RDF conversion algorithms to convert them to
> native types. Processors might also offer helper methods to do this without
> having to go through RDF - I plan to add one to mine. This also has the
> advantage that the user can select what to transform (e.g.,
> http://schema.org/Number).

That's a pretty good idea, and I could support that.
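
Something along these lines, perhaps (a rough Ruby sketch; to_native and its default type list are made up here, not part of any spec):

  XSD = "http://www.w3.org/2001/XMLSchema#"

  # Recursively convert selected typed values in an expanded JSON-LD
  # structure to native numbers; the @type is dropped in the process.
  def to_native(node, types = ["#{XSD}integer", "#{XSD}double"])
    case node
    when Array
      node.map { |n| to_native(n, types) }
    when Hash
      if node["@value"].is_a?(String) && types.include?(node["@type"])
        value = node["@value"]
        native = node["@type"] == "#{XSD}integer" ? Integer(value) : Float(value)
        { "@value" => native }
      else
        node.each_with_object({}) { |(k, v), out| out[k] = to_native(v, types) }
      end
    else
      node
    end
  end

Run over the expanded output above, that would turn the prop5 literal into an arbitrary-precision Ruby Integer with no data loss, and a caller could add http://schema.org/Number to the types list.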

Gregg

> --
> Markus Lanthaler
> @markuslanthaler

Received on Sunday, 2 June 2013 22:15:55 UTC