Re: datatype coercion issues

On Mar 27, 2012, at 10:59 AM, Dave Longley wrote:

> One idea is to treat datatypes as opaque values unless a special 
> "primitive flag" is provided to compaction or expansion. The flag has 
> three settings, which are: off, convert all natives to xsd type @values, 
> and convert all xsd type @values to natives. This would mean:

I'm not a big fan of such flags, but I'm open to considering it.

> When the primitive flag is off (this is the default behavior):
> *Note: Remember that expansion always occurs before compaction in the 
> compaction algorithm.
> 
> During expansion, if a value is already in expanded form (@value), it is 
> left alone. If a value is a native JSON string, number, or boolean, then 
> the value becomes a @value with a @type if there is a coercion rule. How 
> a native double is converted to a string could be left unspecified or we 
> could use %1.16e. If there is no coercion rule, the value is left alone; 
> it remains its native type.
> 
> During compaction any value with a coercion rule becomes a string. No 
> attempt is made to understand any xsd types at all; they are treated as 
> opaque just like any other datatype. If there is no coercion rule, the 
> value is left alone; it remains its native type.
> 
> When the primitive flag is set to convert natives to xsd type @values:
> 
> During expansion, any native JSON number or boolean with a coercion rule 
> is treated the same way as when the primitive flag is off. If the value 
> has no coercion rule, then a JSON number that contains a decimal point 
> will be treated as if it had an xsd double coercion rule, a JSON number 
> without a decimal point will be treated as if it had an xsd integer 
> coercion rule, and a JSON boolean will be treated as if it had an xsd 
> boolean coercion rule.
> 
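As an illustration of the "convert natives to xsd type @values" setting described above, here is a minimal Python sketch of the no-coercion-rule case. The function name native_to_typed_value is hypothetical, the %1.16e format is just the candidate mentioned earlier in the proposal, and Python's parsed type (int vs. float) is assumed to be an acceptable stand-in for "contains a decimal point" (a JSON number such as 1e3 would also parse as a float).

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"

def native_to_typed_value(value):
    """Expand a native JSON value that has no coercion rule (hypothetical helper)."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"@value": "true" if value else "false", "@type": XSD + "boolean"}
    if isinstance(value, int):
        return {"@value": str(value), "@type": XSD + "integer"}
    if isinstance(value, float):
        # Assumption: %1.16e is used as the lexical form for doubles.
        return {"@value": "%1.16e" % value, "@type": XSD + "double"}
    # Strings and anything else are left alone.
    return value

parsed = json.loads('[true, 5, 2.5, "5"]')
expanded = [native_to_typed_value(v) for v in parsed]
```

Here the string "5" stays a plain string, while the native 5 becomes an xsd:integer @value.
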
> During compaction, all values with coercion rules become strings. In 
> other words, the same as when the primitive flag is off.
> 
> When the primitive flag is set to convert xsd type @values to natives:
> 
> During both expansion and compaction, any value with a @type or coercion 
> rule that is an xsd type of integer, boolean, or double is converted to 
> its corresponding native JSON type. If the value is not within the 
> lexical space of the given xsd type, then some rules are used to convert 
> it. If the type is xsd integer or double, and the value is a boolean, 
> then true is 1 and false is 0; if it is a string, then the initial part 
> of the string that is an integer or a double will be the resulting value 
> where 0 is used if the initial string contains no digits. If the type is 
> xsd boolean, a value of "false", "0", or 0 will be considered false, and 
> anything else will be considered true.
> 
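The lenient conversion rules quoted above could be sketched roughly as follows in Python. This is only one reading of the proposal: the name to_native and the regular expressions are hypothetical, and truncating a native 2.5 to 2 for xsd:integer is an assumption, since the proposal does not say what happens in that case.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Leading part of a string that looks like an integer or a double.
_INT_PREFIX = re.compile(r"[+-]?[0-9]+")
_DOUBLE_PREFIX = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?")

def to_native(value, datatype):
    """Convert a value typed as xsd boolean/integer/double to a native (hypothetical helper)."""
    if datatype == XSD + "boolean":
        # "false", "0", and 0 are false; anything else is true.
        # A native False also lands here because False == 0 in Python.
        return value not in ("false", "0", 0)
    if datatype == XSD + "integer":
        if isinstance(value, (bool, int, float)):
            # true -> 1, false -> 0; truncating floats is an assumption.
            return int(value)
        m = _INT_PREFIX.match(value)
        return int(m.group()) if m else 0
    if datatype == XSD + "double":
        if isinstance(value, (bool, int, float)):
            return float(value)
        m = _DOUBLE_PREFIX.match(value)
        return float(m.group()) if m else 0.0
    # Any other datatype stays opaque.
    return value
```

Under this reading, "2011-03-27" typed as xsd:integer becomes 2011 and "false" typed as xsd:integer becomes 0, which shows how permissive the "initial part of the string" rule is.
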
> Any other values expand and compact the same way that they do when the 
> primitive flag is off.
> 
> This approach allows us to treat xsd types like any other type by 
> default (as opaque), but gives us the option to treat them differently 
> in order to convert to/from native types. Also, it treats native type 
> conversion as an orthogonal issue to expansion/compaction.
> 
> Another minor tweak to this idea would be to allow a more fine-grained 
> setting of the primitive flag for each type of native.
> 
> Thoughts?

Again, I'm not too keen on introducing a flag, which adds processor complexity and developer uncertainty. I think we should try to pick a rule and stick with it. I think we should treat this as if the flag is always set, in order to keep as much use of native JSON as possible, which is what developers will expect, IMO. However, I think that this should be limited to coercion of native types to other native types, as you suggest, and strings should only be coerced to native if there's a lexical match.
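
To make the "lexical match" idea concrete, here is a rough Python sketch. The name coerce_string and the LEXICAL table are hypothetical, and the patterns are simplified versions of the XSD lexical spaces (INF and NaN are deliberately left out, on the assumption that they have no native JSON form).

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical-space patterns (assumption: INF/NaN are excluded).
LEXICAL = {
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "double": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?"),
}

def coerce_string(value, datatype):
    """Return a native value only when the whole string matches the datatype."""
    pattern = LEXICAL.get(datatype)
    if pattern is None or not pattern.fullmatch(value):
        # No lexical match: the string is left as it is.
        return value
    if datatype == XSD + "boolean":
        return value in ("true", "1")
    if datatype == XSD + "integer":
        return int(value)
    return float(value)

strings = ["false", "0", "5", "2.5E0", "2011-03-27"]
as_boolean = [coerce_string(s, XSD + "boolean") for s in strings]
# as_boolean == [False, False, "5", "2.5E0", "2011-03-27"]
```

With this rule, "foo" coerced to xsd:boolean simply stays "foo" rather than becoming a native value.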

Gregg

> On 03/27/2012 11:52 AM, Gregg Kellogg wrote:
>> On the call today, we spent a lot of time discussing issues 87 [1] and 81 [2], relating to coercion and round tripping. There are basically several things that come out of this, for which we probably need separate issues:
>> 
>> What is the range of the coercion operator in JSON-LD? As indicated by issue 87, it is any value (not an object or an array). This would include boolean and numeric, in addition to string. One possibility is limiting this to string, or doing it on a case-by-case basis. (boolean could coerce numeric types based on 0 or not 0, integer or double could coerce boolean or other numeric).
>> 
>> When is coercion applied? If applied in expansion, this implies that every term having a coercion rule with an appropriate value is placed in @value form. We currently say that native types are not converted, but we contradict ourselves for xsd:double.
>> 
>> Do strings not having the lexical form of a coerced datatype have coercion applied? For example, does "foo" coerced to a boolean result in "foo"^^xsd:boolean, or just "foo"?
>> 
>> For some other examples, consider the following:
>> 
>> {
>>   "@context": {
>>     "xsd": "http://www.w3.org/2001/XMLSchema#",
>>     "boolean": {"@id": "xsd:boolean", "@type": "xsd:boolean"},
>>     "integer": {"@id": "xsd:integer", "@type": "xsd:integer"},
>>     "double": {"@id": "xsd:double", "@type": "double"},
>>     "date": {"@id": "xsd:date", "@type": "date"}
>>   },
>>   "boolean": [true, "false", 1, "0", 5, "5", 2.5, "2.5E0", "2011-03-27", "2011-03-27T01:23:45"],
>>   "integer": [true, "false", 1, "0", 5, "5", 2.5, "2.5E0", "2011-03-27", "2011-03-27T01:23:45"],
>>   "double": [true, "false", 1, "0", 5, "5", 2.5, "2.5E0", "2011-03-27", "2011-03-27T01:23:45"],
>>   "date": [true, "false", 1, "0", 5, "5", 2.5, "2.5E0", "2011-03-27", "2011-03-27T01:23:45"]
>> }
>> 
>> My implementation currently results in the following (although I don't necessarily agree with all of these conversions):
>> 
>> @prefix xsd:<http://www.w3.org/2001/XMLSchema#>  .
>> [ xsd:boolean "true"^^xsd:boolean,
>>               "false"^^xsd:boolean,
>>               "5"^^xsd:boolean,
>>               "2.5"^^xsd:boolean,
>>               "2.5E0"^^xsd:boolean,
>>               "2011-03-27"^^xsd:boolean,
>>               "2011-03-27T01:23:45"^^xsd:boolean;
>>   xsd:integer "true"^^xsd:boolean,
>>               "false"^^xsd:integer,
>>               "1"^^xsd:integer,
>>               "0"^^xsd:integer,
>>               "5"^^xsd:integer,
>>               "2"^^xsd:integer,
>>               "2.5E0"^^xsd:integer,
>>               "2011-03-27"^^xsd:integer,
>>               "2011-03-27T01:23:45"^^xsd:integer;
>>   xsd:double "true"^^xsd:double,
>>               "false"^^xsd:double,
>>               "1.0E0"^^xsd:double,
>>               "0"^^xsd:double,
>>               "5.0E0"^^xsd:double,
>>               "2.5E0"^^xsd:double;
>>   xsd:date    "true"^^xsd:boolean,
>>               "false"^^xsd:date,
>>               "1"^^xsd:date,
>>               "0"^^xsd:date,
>>               "5"^^xsd:date,
>>               "2.5"^^xsd:date,
>>               "2.5E0"^^xsd:date,
>>               "2011-03-27"^^xsd:date,
>>               "2011-03-27T01:23:45"^^xsd:date
>> ] .
>> 
>> We could also only perform coercion when the lexical form of the representation matches the XSD definition, although this would be at odds with use in RDF parsers, such as those for Turtle. We currently say that native representations, whether coerced or not, remain in their original form, although xsd:double currently contradicts that, always coercing any value to an @value representation using "%1.16E".
>> 
>> When compacting, when can data-typed @value representations be turned into native form? One possible solution would be to convert anything to native form where the lexical representation in @value matches that of the associated XSD definition.
>> 
>> Does %1.16E preserve 64-bit doubles? In my Ruby implementation it produces rounding errors:
>> 
>> "%1.16E" % 5.2 =>  "5.2000000000000002E+00"
>> "%1.16E" % 5.3 =>  "5.2999999999999998E+00"
>> 
>> %1.15E does not result in rounding errors. It would be useful for others to check their implementations.
>> 
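For anyone wanting to run the same check outside Ruby, here is a small Python sketch. The helper round_trips is hypothetical, and "preserving" is taken to mean that parsing the formatted string gives back the identical double. Note that %1.16E prints 17 significant digits, which is enough to round-trip any IEEE 754 double, whereas %1.15E prints 16, which is not always enough.

```python
def round_trips(value, fmt):
    """True if formatting with fmt and parsing back yields the same double."""
    return float(fmt % value) == value

print("%1.16E" % 5.2)   # 5.2000000000000002E+00
print("%1.15E" % 5.2)   # 5.200000000000000E+00

tricky = 0.1 + 0.2      # the double nearest 0.30000000000000004
print(round_trips(5.2, "%1.16E"))      # True
print(round_trips(tricky, "%1.16E"))   # True
print(round_trips(tricky, "%1.15E"))   # False
```

On this reading, the trailing digits in 5.2000000000000002E+00 look odd but do not lose information, while the shorter format reads more cleanly at the cost of failing to round-trip some values.
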
>> Gregg
>> 
>> [1] https://github.com/json-ld/json-ld.org/issues/87
>> [2] https://github.com/json-ld/json-ld.org/issues/81
> 
> 
> -- 
> Dave Longley
> CTO
> Digital Bazaar, Inc.
> 
> 

Received on Tuesday, 27 March 2012 23:47:33 UTC