Re: Datatype normalization

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 15 Nov 2010 22:28:08 -0500
To: Pat Hayes <phayes@ihmc.us>
Cc: Graham Klyne <GK-lists@ninebynine.org>, nathan@webr3.org, Semantic Web <semantic-web@w3.org>
Message-ID: <1289878088.23722.318.camel@waldron>
I think the solution here is to have two different properties, tied
together with a "twin" arc.  Given suitable semantics:

        ns1:Aubrey ns2:age "8".
        ns1:age srdf:twin ns2:age.
        ns1:age rdfs:range xsd:decimal.
entails
        ns1:Aubrey ns1:age "8"^^xsd:decimal.
        
        
And the other direction:

        ns1:Aubrey ns1:age "8"^^xsd:decimal.
        ns1:age srdf:twin ns2:age.
        ns1:age rdfs:range xsd:decimal.
entails
        ns1:Aubrey ns2:age "8".
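
The two entailments above could be sketched as a small forward-chaining step in Python. This is only an illustration under stated assumptions, not a specification of srdf:twin: the Literal type and the twin_closure function are hypothetical names, triples are plain tuples of prefixed-name strings, the subject of the twin arc is taken to be the datatyped property, each property has at most one rdfs:range, and no check is made that the lexical form is valid for the datatype.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical literal type: datatype None means a plain literal."""
    lexical: str
    datatype: Optional[str] = None


TWIN = "srdf:twin"
RANGE = "rdfs:range"


def twin_closure(triples):
    """Apply both twin rules once and return the enlarged set of triples."""
    graph = set(triples)
    # Datatype declared for each property via rdfs:range
    # (assumption: at most one range per property).
    ranges = {s: o for s, p, o in graph if p == RANGE}
    inferred = set()
    for typed_prop, p, plain_prop in graph:
        # Assumption: the subject of the twin arc is the datatyped property.
        if p != TWIN or typed_prop not in ranges:
            continue
        datatype = ranges[typed_prop]
        for s, q, o in graph:
            if not isinstance(o, Literal):
                continue
            if q == plain_prop and o.datatype is None:
                # Plain literal on the untyped twin: add the typed statement.
                inferred.add((s, typed_prop, Literal(o.lexical, datatype)))
            elif q == typed_prop and o.datatype == datatype:
                # Typed literal on the typed twin: add the plain statement.
                inferred.add((s, plain_prop, Literal(o.lexical)))
    return graph | inferred


schema = {
    ("ns1:age", TWIN, "ns2:age"),
    ("ns1:age", RANGE, "xsd:decimal"),
}
plain_data = schema | {("ns1:Aubrey", "ns2:age", Literal("8"))}
typed_data = schema | {("ns1:Aubrey", "ns1:age", Literal("8", "xsd:decimal"))}
```

Because the function only ever adds triples, anything it infers stays in the result when more data is supplied later.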
        

This would allow systems to be designed to use either datatyped literals
or plain (untyped, implicitly string) literals, as they wish.  If all
consumers implemented srdf:twin, they would still be interoperable.

Actually, this is one of three things I suggest srdf:twin do.  The
others concern the domain of the property, and the range of
ObjectProperties.  See
http://decentralyze.com/2010/11/10/simplified-rdf/ for more background
and some more details.

   -- Sandro


On Mon, 2010-11-15 at 08:58 -0600, Pat Hayes wrote:
> Unfortunately correct :-) It might be worth pointing out that even plain literals in RDF effectively have a type already: they are character strings. So being 'untyped' should not be read as 'not yet having a type assigned' but more like 'known to have the type of a simple character string'. It is just like being typed with xsd:string, in fact. Thus, the example given by Nathan is already a type clash, and could give rise to an error (inconsistency) message from a type-savvy reasoner, since "12.2" is definitely not an xsd:decimal. 
> 
> Pat
> 
> On Nov 12, 2010, at 6:59 AM, Graham Klyne wrote:
> 
> > Nathan wrote:
> >> Hi All,
> >> I'd suggest that a high percentage of the world's RDF data is being published untyped, where plain literals are used rather than typed literals ("12.2" vs "12.2"^^xsd:decimal), and also (to a lesser extent) "strings as"^^xsd:string's.
> >> Until today, I had assumed that it was pretty "safe" to, upon parsing, turn xsd:strings into plain literals / pull the datatype from the range of a property and turn the object into the correct type.
> >> However, it's been suggested to me today that this probably isn't a good thing / "the right thing" to do.
> >> So, should I avoid implementing this feature, and what are the reasons *not* to do it?
> >> An example:
> >> Ontology contains..
> >>   ex:prop rdfs:range xsd:decimal .
> >> "data" contains..
> >>   :foo ex:prop "12.2" .
> >> What reason would there be not to just infer/pull the type and convert to a typed literal?
> > 
> > Logical monotonicity.  That is, adding new facts to an RDF graph should not invalidate inferences already made.
> > 
> > Attractive as it is, the mechanism you propose for inferring datatypes from rdfs:range declarations falls foul of this, as inferences you might make in the absence of rdfs:range statements may become incorrect when they are added to the graph.
> > 
> > I see this as part of the price we must pay for supporting an open-world, "missing-isn't-broken" [1] system for data on the web.
> > 
> > #g
> > --
> > 
> > [1] this phrase due to Dan Brickley - http://rdfweb.org/mt/foaflog/archives/2003/07/24/12.22.48/ - unfortunately that URI has gone 404 (Dan: is this uncoolness permanent or transient?)
> > 
> > 
> > 
> > 
> 
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973   
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> 
Received on Tuesday, 16 November 2010 03:28:26 UTC
