Re: Datatype normalization

Nathan wrote:
> Hi All,
> 
> I'd suggest that a high percentage of the world's RDF data is being 
> published untyped, where plain literals are used rather than typed 
> literals ("12.2" vs "12.2"^^xsd:decimal), and also (to a lesser extent) 
> "strings as"^^xsd:string's.
> 
> Until today, I had assumed that it was pretty "safe" to, upon parsing, 
> turn xsd:strings into plain literals / pull the datatype from the range 
> of a property and turn the object into the correct type.
> 
> However, it's been suggested to me today that this probably isn't a good 
> thing / "the right thing" to do.
> 
> And thus, should I be avoiding implementing this feature, and 
> additionally what are the reasons *not* to do this.
> 
> An example:
> 
>  Ontology contains..
>    ex:prop rdfs:range xsd:decimal .
> 
>  "data" contains..
>    :foo ex:prop "12.2" .
> 
> What reason would there be not to just infer/pull the type and convert 
> to a typed literal?

Logical monotonicity.  That is, adding new facts to an RDF graph should not 
invalidate inferences already made.

Attractive as it is, the mechanism you propose for inferring datatypes from 
rdfs:range declarations falls foul of this: inferences you might make in the 
absence of rdfs:range statements may become incorrect when such statements are 
later added to the graph.
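
To make the problem concrete, here is a minimal Python sketch using your 
example.  It is an illustration only: the graph is modelled as a plain set of 
string triples rather than with any real RDF library, and conclusions() is a 
hypothetical stand-in for the range-driven coercion you describe.

```python
from decimal import Decimal

RDFS_RANGE = "rdfs:range"
XSD_DECIMAL = "xsd:decimal"


def conclusions(graph):
    """Hypothetical range-driven coercion (not a real RDF API).

    Returns the set of (subject, property, value) conclusions drawn from
    the graph, where value is tagged as either a plain literal or an
    xsd:decimal depending on whether a matching rdfs:range is present.
    """
    # Collect the declared range of each property, if any.
    ranges = {s: o for (s, p, o) in graph if p == RDFS_RANGE}
    out = set()
    for s, p, o in graph:
        if p == RDFS_RANGE:
            continue  # schema triple, not data
        if ranges.get(p) == XSD_DECIMAL:
            out.add((s, p, (XSD_DECIMAL, Decimal(o))))
        else:
            out.add((s, p, ("plain", o)))
    return out


data = {(":foo", "ex:prop", "12.2")}
ontology = {("ex:prop", RDFS_RANGE, XSD_DECIMAL)}

before = conclusions(data)            # drawn without the range statement
after = conclusions(data | ontology)  # drawn once the range is added

# Monotonic inference would require every earlier conclusion to survive.
is_monotonic = before <= after
print(is_monotonic)  # False
```

The conclusion drawn from the data alone (a plain literal "12.2") is no longer 
among the conclusions once the rdfs:range triple arrives, which is exactly the 
non-monotonic behaviour described above.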

I see this as part of the price we must pay for supporting an open-world, 
"missing-isn't-broken" [1] system for data on the web.

#g
--

[1] this phrase due to Dan Brickley - 
http://rdfweb.org/mt/foaflog/archives/2003/07/24/12.22.48/ - unfortunately that 
URI has gone 404 (Dan: is this uncoolness permanent or transient?)

Received on Monday, 15 November 2010 09:31:23 UTC