
Re: Datatype normalization

From: Axel Rauschmayer <axel@rauschma.de>
Date: Fri, 12 Nov 2010 13:42:07 +0100
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <B6FF2CB0-ECCF-4379-B45F-A97E0FA20934@rauschma.de>
To: nathan@webr3.org
Sounds like a separate tool/process for fixing ill-formed data. I don't think this step should be performed automatically.

On Nov 12, 2010, at 12:59 , Nathan wrote:

> Just to clarify, I'm specifically talking about when the property has a range specified, not just hitting "12.1" in a graph and saying "oh, that looks like a decimal, so I'll convert it to that". It's more along the lines of type inference in a compiler, with the next step being to validate against DatatypeRestrictions.
> 
> There are two contexts where I'm looking to implement this functionality: as part of an RDF library which converts typed literals to native types, and as part of an "RDF compiler".
> 
> Good catch re "100,123" btw, hadn't thought of that!
> 
> Cheers,
> 
> Nathan
> 
> Axel Rauschmayer wrote:
>> It completely depends on what your application is. What you are trying to do is similar to analyzing unstructured text. Sure, "12.2" looks like a number, but is it really? It could indicate a section in a book. Another example is "100,123" which is between 100 and 101 in many European countries. Why do you even need to infer a type?
>> On Nov 12, 2010, at 12:33 , Nathan wrote:
>>> Hi All,
>>> 
>>> I'd suggest that a high percentage of the world's RDF data is being published untyped, where plain literals are used rather than typed literals ("12.2" vs "12.2"^^xsd:decimal), and also (to a lesser extent) strings published as xsd:string-typed literals.
>>> 
>>> Until today, I had assumed that it was pretty "safe" to, upon parsing, turn xsd:strings into plain literals / pull the datatype from the range of a property and turn the object into the correct type.
>>> 
>>> However, it's been suggested to me today that this probably isn't a good thing / "the right thing" to do.
>>> 
>>> Thus, should I avoid implementing this feature, and what are the reasons *not* to do this?
>>> 
>>> An example:
>>> 
>>> Ontology contains..
>>>  ex:prop rdfs:range xsd:decimal .
>>> 
>>> "data" contains..
>>>  :foo ex:prop "12.2" .
>>> 
>>> What reason would there be not to just infer/pull the type and convert to a typed literal?
>>> 
>>> Best,
>>> 
>>> Nathan
>>> 
>>> seeAlso:
>>> http://www.w3.org/TR/rdf-plain-literal/
>>> http://www.w3.org/DesignIssues/InterpretationProperties.html
>>> 
>>> 
> 
> 
> 
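
As a rough illustration of the range-driven typing discussed in this thread, here is a minimal Python sketch. It is not taken from any RDF library: the function name `type_by_range` and the `ranges` dictionary (property to declared rdfs:range) are hypothetical, and only xsd:decimal is handled. A plain literal is converted only when its lexical form is valid for the declared range, so "100,123" is left untouched rather than guessed at.

```python
import re
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical space of xsd:decimal: optional sign, ASCII digits, "." as separator.
_DECIMAL_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def type_by_range(prop, lexical, ranges):
    """Return (value, datatype) for a plain literal.

    `ranges` is a hypothetical mapping from property name to the datatype
    IRI declared as its rdfs:range. If the property has no declared range,
    or the lexical form is not valid for it, the literal is returned
    unchanged with datatype None.
    """
    datatype = ranges.get(prop)
    if datatype == XSD + "decimal" and _DECIMAL_RE.fullmatch(lexical):
        return Decimal(lexical), datatype
    return lexical, None


ranges = {"ex:prop": XSD + "decimal"}
typed = type_by_range("ex:prop", "12.2", ranges)         # converted
ambiguous = type_by_range("ex:prop", "100,123", ranges)  # left as plain
untyped = type_by_range("ex:other", "12.2", ranges)      # no range declared
```

Whether such a conversion should run automatically at parse time, or only in a separate clean-up step, is exactly the open question here; the sketch only shows that the check itself can be strict.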

-- 
Dr. Axel Rauschmayer
Axel.Rauschmayer@ifi.lmu.de
http://hypergraphs.de/
### Hyena: organize your ideas, free at hypergraphs.de/hyena/
Received on Friday, 12 November 2010 12:42:38 UTC
