Re: around the table on datatypes [ was: Re: datatyping draft 3 (for telecon)]

>>   o Are there any concerns with the proposal
>>     you would like to raise? (What are they?)
>
>Yes.
>
>The use of subproperty declarations to "switch idioms"; e.g.
>
>   rdf:type rdfs:subPropertyOf rdfs:dtype .    (section 5, 10)
>   rdfs:range rdfs:subPropertyOf rdfs:drange .  (section 7, 10)
>
>   rdf:dtype rdfs:range rdfs:Datatype .         (section 10)
>   rdf:drange rdfs:range rdfs:Datatype .        (section 10)
>
>The problem I foresee is when one wishes to combine information from 
>different sources.

Then the above could be dangerous, since these assertions apply to 
any graph that contains them. Yes, and we should emphasize this 
point. Still, I'm all for letting people live dangerously if they 
want to take the risks upon themselves.

>
>Suppose some graph A contains some statements:
>
>   exa:decimalInteger rdf:type rdfs:Datatype .
>   exa:octalInteger rdf:type rdfs:Datatype .
>   exa:octalInteger rdfs:subClassOf exa:decimalInteger .
>   _:a rdf:value "10" .
>   _:a rdf:dtype exa:octalInteger .
>   _:b rdf:value "10" .
>   _:b rdf:dtype exa:decimalInteger .
>
>where the intent is to define a new datatype class with the same 
>value space as an existing class, but with a different lexical space 
>and lexical-to-value mapping.
>
>Also, consider graph B, constructed to avoid datatype clashes:
>
>   rdf:type rdfs:subPropertyOf rdfs:dtype .

I note that this is intrinsically a 'dangerous' option that should 
only be used when there is no possibility of clashing datatypes being 
used. So whoever wrote graph B was making some potentially dangerous 
assumptions, like when I remove the safety guard from my bench saw. I 
know I'm taking a risk, but the convenience is worth it.

>   exb:integer rdf:type rdfs:Class .
>   exb:decimalInteger rdf:type rdfs:Datatype .
>   exb:octalInteger rdf:type rdfs:Datatype .
>   exb:octalInteger rdfs:subClassOf exb:integer .
>   exb:decimalInteger rdfs:subClassOf exb:integer .
>   _:c rdf:value "10" .
>   _:c rdf:type exb:octalInteger .
>   _:d rdf:value "10" .
>   _:d rdf:type exb:decimalInteger .
>
>So far, so good.  But consider what happens if these two graphs are 
>combined (merged):

OK, B is about to lose a thumb.

>We end up with a graph that entails the following:
>
>[From B:]
>   rdf:type rdfs:subPropertyOf rdfs:dtype .
>[From A:]
>   exa:decimalInteger rdf:type rdfs:Datatype .
>   exa:octalInteger rdf:type rdfs:Datatype .
>   exa:octalInteger rdfs:subClassOf exa:decimalInteger .
>   _:a rdf:value "10" .
>   _:a rdf:dtype exa:octalInteger .
>   _:b rdf:value "10" .
>   _:b rdf:dtype exa:decimalInteger .
>
>Then, from the subclass relationship, we have:
>
>   _:a rdf:type exa:decimalInteger .
>
>And from the subproperty assertion:
>
>   _:a rdf:dtype exa:decimalInteger .
>
>which has introduced a datatyping conflict.

Right, and predictably so. B was betting that all datatypes were 
consistent, and A proved him wrong by creating one that wasn't.
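
To make the chain of inference concrete, here is a minimal Python sketch of
the merge. It is only an illustration under stated assumptions, not the
draft's actual semantics: triples are plain string tuples, rdf:dtype and
rdfs:dtype are treated as one property (the thread uses both spellings), and
a bridging rule "x dtype D implies x rdf:type D" is assumed so that the
subclass step in the quoted derivation can fire. The function names are
hypothetical.

```python
RDF_TYPE = "rdf:type"
DTYPE = "rdf:dtype"   # assumption: rdf:dtype and rdfs:dtype name one property
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

GRAPH_A = {
    ("exa:decimalInteger", RDF_TYPE, "rdfs:Datatype"),
    ("exa:octalInteger", RDF_TYPE, "rdfs:Datatype"),
    ("exa:octalInteger", SUBCLASS, "exa:decimalInteger"),
    ("_:a", "rdf:value", "10"),
    ("_:a", DTYPE, "exa:octalInteger"),
    ("_:b", "rdf:value", "10"),
    ("_:b", DTYPE, "exa:decimalInteger"),
}

# The single "idiom switch" statement contributed by graph B.
B_SWITCH = {(RDF_TYPE, SUBPROP, DTYPE)}


def closure(graph):
    """Apply the three rules until no new triples appear."""
    triples = set(graph)
    while True:
        new = set()
        for s, p, o in triples:
            if p == DTYPE:
                # Assumed bridging rule: a datatyped node is also typed.
                new.add((s, RDF_TYPE, o))
            if p == RDF_TYPE:
                # Subclass rule: instances of C are instances of its superclasses.
                for c, q, d in triples:
                    if q == SUBCLASS and c == o:
                        new.add((s, RDF_TYPE, d))
            # Subproperty rule: a statement with p also holds with any superproperty.
            for sub, q, sup in triples:
                if q == SUBPROP and sub == p:
                    new.add((s, sup, o))
        if new <= triples:
            return triples
        triples |= new


def datatype_clashes(graph):
    """Map each valued node to its datatypes when it has more than one."""
    closed = closure(graph)
    datatypes = {s for s, p, o in closed if p == RDF_TYPE and o == "rdfs:Datatype"}
    valued = {s for s, p, o in closed if p == "rdf:value"}
    clashes = {}
    for node in valued:
        found = {o for s, p, o in closed
                 if s == node and p == DTYPE and o in datatypes}
        if len(found) > 1:
            clashes[node] = found
    return clashes
```

Under these assumptions, graph A on its own reports no clash, while A plus
B's subproperty statement reports that _:a carries both exa:octalInteger and
exa:decimalInteger. The check lives outside the closure step, which mirrors
the point that the clash is a matter for a datatype checker.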

>...
>
>My position here is that it should NEVER be possible to merge two 
>valid RDF graphs and end up with an error or inconsistency of this 
>kind.

Two reactions to this position.
(1) Technically, this graph is a consistent RDF graph. It has a 
datatype clash in it, but that's a problem for the datatype checker, 
not for RDF itself.
(2) If you want to say that datatyping is intrinsic to RDF, then the 
position is untenable. Datatype clashes CAN occur; the world of 
datatypes has genuine differences of opinion in it, and it's 
impossible to freely combine datatyping information without running 
the risk of finding such conflicts. There's nothing we can do about 
that.

>  To my mind, the ability to freely merge RDF information from various 
>sources is the key to the power of RDF -- to combine information from 
>different RDF applications thereby obtaining a new knowledge base 
>that is potentially more than the sum of its parts.

Sure, but there isn't any way to guarantee that this free combination 
is always going to produce a single coherent story. Sometimes, people 
really do disagree. What A and B discovered is that they have a 
difference of opinion.

>
>I think that information will demonstrate a variation of Metcalfe's 
>law:  the value of information will be exponentially proportional to 
>the number of applications that can create and share it.  RDF is a 
>key to that sharing, IFF information can be freely combined.

But when there can be conflicts, the free combination of information 
can reveal those conflicts. There's nothing we can do about that: 
there's no point in trying to pass a law that everyone has to agree 
with everyone else.

>This last bit is all assertion-without-proof, but I'm guessing that 
>most people will agree that it's important to be able to merge RDF 
>information from different sources.
>
>...
>
>I think this may be a particular case of a wider danger:  if people 
>try to make additional assertions about the RDF core vocabulary, 
>beyond those specified by RDFcore, the result may be that valid 
>information is lost when graphs are combined -- interpretations that 
>should be models are no longer models.

That's not a loss of information, it's a gain. The fewer satisfying 
interpretations, the better. When you have none, then you know that 
something doesn't agree with something else.
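
A toy Python sketch of that point, under loud assumptions: an
"interpretation" here is just a choice of integer value for the node _:a,
the candidate range 0..15 is arbitrary, and the two lexical-to-value
mappings are hypothetical stand-ins for exa:octalInteger and
exa:decimalInteger. It is not RDF model theory, only a picture of how adding
assertions shrinks the set of satisfying interpretations.

```python
# Candidate interpretations: each assigns one integer value to _:a.
CANDIDATES = [{"_:a": v} for v in range(16)]

# Hypothetical lexical-to-value mappings for the two datatypes.
LEXICAL_TO_VALUE = {
    "exa:octalInteger": lambda lex: int(lex, 8),
    "exa:decimalInteger": lambda lex: int(lex, 10),
}


def satisfying(candidates, literal, datatypes):
    """Keep the interpretations that agree with every asserted datatype."""
    return [
        interp for interp in candidates
        if all(LEXICAL_TO_VALUE[d](literal) == interp["_:a"] for d in datatypes)
    ]


no_typing = satisfying(CANDIDATES, "10", [])
octal_only = satisfying(CANDIDATES, "10", ["exa:octalInteger"])
both = satisfying(CANDIDATES, "10", ["exa:octalInteger", "exa:decimalInteger"])
```

With no datatype asserted, every candidate survives; with the octal typing
alone, only the value 8 survives; with both typings, nothing survives, which
is exactly the signal that two sources disagree.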

>This leads me to think that any attempt to make additional 
>assertions about core RDF vocabulary terms should be regarded as an 
>error, or at least should carry a serious health warning.

I agree that we should post suitable warnings, but I don't see how we 
can legislate a perfect world.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Wednesday, 20 February 2002 00:40:24 UTC