RDF-ISSUE-129 (LC2 - Sandro Hawke): JSON-LD xsd:integer lossless conversion [JSON-LD Last Call 2]

http://www.w3.org/2011/rdf-wg/track/issues/129

Raised by: Sandro Hawke
On product: JSON-LD Last Call 2

JSON-LD-API allows certain RDF literals to be mapped to JSON numbers (if 
the "use native types" flag is set), but I don't think the mapping rules 
are as good as they could be.  Markus concurs, although I'm not sure we 
quite agree on what they should be.

The current spec says:

    If the /use native types/ flag is set to true
    <http://www.w3.org/TR/json-ld-api/#dfn-true>, RDF literals
    <http://www.w3.org/TR/rdf11-concepts/#dfn-literal> with a datatype
    IRI <http://www.w3.org/TR/rdf11-concepts/#dfn-datatype-iri> that
    equals |xsd:integer| or |xsd:double| are converted to a JSON numbers
    <http://www.w3.org/TR/json-ld-api/#dfn-number>

    ...

    if the datatype IRI
    <http://www.w3.org/TR/rdf11-concepts/#dfn-datatype-iri> of /value/
    equals |xsd:integer| or |xsd:double| and its lexical form
    <http://www.w3.org/TR/rdf11-concepts/#dfn-lexical-form> is a valid
    |xsd:integer| or |xsd:double| according [XMLSCHEMA11-2
    <http://www.w3.org/TR/json-ld-api/#bib-XMLSCHEMA11-2>], set
    /converted value/ to the result of converting the lexical form
    <http://www.w3.org/TR/rdf11-concepts/#dfn-lexical-form> to a JSON
    number <http://www.w3.org/TR/json-ld-api/#dfn-number>.

    ...

    It is important to highlight that in practice it might be impossible
    to losslessly convert an |xsd:integer| to a number
    <http://www.w3.org/TR/json-ld-api/#dfn-number> because its value
    space is not limited.
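
To make that loss concrete, here is a minimal sketch, assuming a
JavaScript environment where every number is a 64-bit IEEE 754 double.
The helper name roundTripsAsNumber is hypothetical and is not part of
any JSON-LD API; it only checks whether a lexical form survives a trip
through a native number.

```javascript
// Hypothetical helper: does an integer lexical form survive conversion
// to a JavaScript number and back to a string unchanged?
function roundTripsAsNumber(lexical) {
  return String(Number(lexical)) === lexical;
}

const exact = "9007199254740992";   // 2^53, exactly representable
const inexact = "9007199254740993"; // 2^53 + 1, not representable as a double

console.log(roundTripsAsNumber(exact));   // true
console.log(roundTripsAsNumber(inexact)); // false
console.log(String(Number(inexact)));     // "9007199254740992"
```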

There is a test case:

    https://github.com/json-ld/json-ld.org/blob/master/test-suite/tests/fromRdf-0002-in.nq
    https://github.com/json-ld/json-ld.org/blob/master/test-suite/tests/fromRdf-0002-out.jsonld


Now, that warning about lossless conversion can be rendered unnecessary,
I believe, by saying that in situations where there might be a loss, one
MUST NOT convert to a number.  It's true we don't know exactly when there
might be a loss, but after talking with Markus, I'm pretty confident
that using the range of 32-bit integers will work well.  JavaScript
requires 64-bit IEEE 754 doubles, so it can handle integers up to 53
bits, and other systems like PHP (on 32-bit systems) use 32-bit ints for
(integer) numbers.  I propose an additional test case showing:

"-2147483649"^^xs:integer  // not native since it's outside the 32-bit range
"-2147483648"^^xs:integer  // native json number
"2147483647"^^xs:integer   // native json number
"2147483648"^^xs:integer   // not native since it's outside the 32-bit range

I'd also add:

"1"^^xs:int              // not native since it's 'int' not 'integer'
"01"^^xs:integer     // not native since it's not in canonical form

These rules will make xs:integer data round-tripping through JSON-LD
perfectly lossless, I believe, on systems that can handle at least
32-bit integers.
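
The rules proposed above could be sketched roughly as follows.  This is
only an illustration under my own assumptions, not text from the spec:
the function name toNativeIntegerOrNull and the choice to return null
for "keep it as a typed literal" are hypothetical, and the regular
expression is my reading of the canonical xsd:integer form (no "+", no
leading zeros, no "-0").

```javascript
const XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";
const INT32_MIN = -2147483648;
const INT32_MAX = 2147483647;

// Canonical xsd:integer lexical form: optional "-", no "+",
// no leading zeros, and zero written only as "0".
const CANONICAL_INTEGER = /^(0|-?[1-9][0-9]*)$/;

// Hypothetical helper: returns a native number when the proposed rules
// allow conversion, or null when the literal should stay a typed literal.
function toNativeIntegerOrNull(lexical, datatype) {
  if (datatype !== XSD_INTEGER) return null;         // e.g. xsd:int stays typed
  if (!CANONICAL_INTEGER.test(lexical)) return null; // e.g. "01" stays typed
  const n = Number(lexical);
  if (n < INT32_MIN || n > INT32_MAX) return null;   // outside the 32-bit range
  return n;
}

console.log(toNativeIntegerOrNull("-2147483649", XSD_INTEGER)); // null
console.log(toNativeIntegerOrNull("-2147483648", XSD_INTEGER)); // -2147483648
console.log(toNativeIntegerOrNull("2147483647", XSD_INTEGER));  // 2147483647
console.log(toNativeIntegerOrNull("2147483648", XSD_INTEGER));  // null
console.log(toNativeIntegerOrNull("01", XSD_INTEGER));          // null
```

Because every value that passes the check has exactly one canonical
lexical form, converting the number back with String(n) reproduces the
original literal.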

I'm paying attention to this mostly because I'm building RDF data 
synchronization stuff in JavaScript.  I need the code to not corrupt 
data (even seemingly harmless corruption will screw up synchronization), 
and I'd like to align it with JSON-LD.

On a related topic, there's still the problem of xs:double.  I don't
have a good solution there.  I think the only way to prevent datatype
corruption there is to say don't use a native number when the value
happens to be an integer.  But that's pretty painful -- you don't want
different code paths just because the value happens to be an integer.
:-/  (Or have a custom JSON parser that notices the ".0"... but
that's kind of against the rules.  Maybe we could suggest it for cases
that need type fidelity?  I don't know.)
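
Here is a small sketch of the xs:double problem, assuming a standard
JavaScript JSON.stringify / JSON.parse round trip.  The helper
guessDatatype is hypothetical; it stands in for any consumer that has
to pick a datatype from the native number alone.

```javascript
const XSD_INTEGER_IRI = "http://www.w3.org/2001/XMLSchema#integer";
const XSD_DOUBLE_IRI = "http://www.w3.org/2001/XMLSchema#double";

// Hypothetical consumer: all it can see is the native number, so it
// treats integral values as xsd:integer and everything else as xsd:double.
function guessDatatype(n) {
  return Number.isInteger(n) ? XSD_INTEGER_IRI : XSD_DOUBLE_IRI;
}

// A literal such as "1.0E0"^^xsd:double converted to a native number...
const asNumber = Number("1.0E0");

// ...is serialized without any ".0", so the hint that it was a double is gone.
const serialized = JSON.stringify({ "@value": asNumber });
console.log(serialized); // {"@value":1}

const parsed = JSON.parse(serialized)["@value"];
console.log(guessDatatype(parsed) === XSD_INTEGER_IRI); // true: datatype changed
console.log(guessDatatype(1.5) === XSD_DOUBLE_IRI);     // true: non-integers are fine
```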

Original post: 

http://lists.w3.org/Archives/Public/public-rdf-wg/2013May/0121.html

Received on Monday, 13 May 2013 01:29:35 UTC