Re: DATATYPING: second draft

Sergey,

At 12:57 10/12/2001 -0800, Sergey Melnik wrote:
>I restructured and extended the datatyping document to include the most
>recent contributions by Graham, DanC, PatrickS and Frank:


Good work, Sergey.  I'm finding it increasingly hard to distinguish between 
the different proposals, a fact I find encouraging in that I hope it means 
we are moving to a synthesis of ideas.

sans chapeau:

Let me see if I've got this straight.

In the P proposal, two nodes in a graph might be labeled "10".  An 
interpretation of one node could be the integer (10) (borrowing Jeremy's 
notation) and the other might be the string "10".

In the PL proposal, two nodes in a graph might be labeled "10" and the 
interpretation of both nodes must be the same string "10".  However, an 
application may use other information, possibly represented in the graph, 
possibly not, to 'interpret' that string as an integer, or not.

In the B idiom of the S proposal, two nodes in the graph might be labeled 
"10", and an interpretation of one node might be the string "10" which is a 
member of the lexical space of xsd:integer, and of the other the 
string "10" which is a member of the lexical space of xsd:string.  Given

   foo bar "10" .

we need more information to figure out which "10" is meant, e.g. a range 
constraint.

ok so far?


Dan has raised an issue, rephrased by Pat as how many triples result from 
merging:

   foo bar "10" .

and

   foo bar "10" .
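To make the question concrete, here is a rough Python sketch of two possible answers. Both function names are hypothetical, and neither reading is claimed to be what any of the proposals actually specifies: one compares literals by label alone, the other treats every literal occurrence as its own node.

```python
def merge_by_label(g1, g2):
    # Literals are identified by their label, so equal triples collapse.
    return set(g1) | set(g2)


def merge_by_occurrence(g1, g2):
    # Each literal occurrence is a distinct node; tag it with its source graph.
    return {
        (s, p, (o, source))
        for source, graph in enumerate((g1, g2))
        for (s, p, o) in graph
    }


g1 = [("foo", "bar", "10")]
g2 = [("foo", "bar", "10")]

print(len(merge_by_label(g1, g2)))       # 1
print(len(merge_by_occurrence(g1, g2)))  # 2
```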

Peter Patel-Schneider, echoed by Jeremy, has raised the issue of 
compatibility with XML Schema.  I personally feel there is an elegance to 
Jeremy's proposal, which has XML Schema dealing with syntax and lexical 
issues, and RDF Schema dealing with data model issues.

A while back, Jan suggested putting the integer in the graph rather than 
the numeral.  Hmm, I can't remember why that did not fly at the time.

Patrick has raised the issue of things getting rather messy if the lexical 
representation of a value becomes separated from the mapping which defines 
its interpretation.

And this led me to wondering about Patrick's suggestion that a literal is 
in fact a pair.

So what if  ...
                  we extended n-triples so that literals MAY have an 
associated datatype.

      <foo> <prop> xsd:integer:"10" .

If we don't know the datatype, we leave it blank:

     <foo> <prop> _:"10" .
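A minimal Python sketch of the literal-as-pair idea, assuming a hypothetical Literal class (not taken from any RDF library) whose datatype is None when unknown. The render method only mirrors the ad hoc notation used in this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical (datatype, lexical form) pair; datatype None means blank."""
    lexical: str
    datatype: Optional[str] = None

    def render(self) -> str:
        # "_" stands in for an unknown datatype, as in the notation above.
        prefix = self.datatype if self.datatype is not None else "_"
        return f'{prefix}:"{self.lexical}"'


typed = Literal("10", "xsd:integer")
untyped = Literal("10")

print(typed.render())    # xsd:integer:"10"
print(untyped.render())  # _:"10"
```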

We can do inference:

   <foo> <prop> _:"10" .
   <prop> <rdfs:rangeLex> <xsd:integer.lex> .

entails

   <foo> <prop> xsd:integer:"10" .

and also:

   <prop> <rdfs:rangeLex> <xsd:integer.lex> .

entails

   <prop> <rdfs:range> <xsd:integer.val> .
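The two inference rules can be sketched in Python as follows. This is an illustration under assumptions: triples are plain tuples, a literal is a (datatype, lexical) tuple with None for the blank datatype, and the ".lex"/".val" suffix handling is just string manipulation on the names used above.

```python
RANGE_LEX = "rdfs:rangeLex"
RANGE = "rdfs:range"


def infer(graph):
    """Apply both rules once and return the enlarged set of triples."""
    # property -> datatype name, e.g. "xsd:integer" from "xsd:integer.lex"
    lex_ranges = {
        s: o[: -len(".lex")]
        for (s, p, o) in graph
        if p == RANGE_LEX and isinstance(o, str) and o.endswith(".lex")
    }
    out = set(graph)
    # Rule 1: a blank-typed literal picks up the datatype of its property.
    for (s, p, o) in graph:
        if isinstance(o, tuple) and o[0] is None and p in lex_ranges:
            out.add((s, p, (lex_ranges[p], o[1])))
    # Rule 2: a lexical range entails the corresponding value range.
    for prop, datatype in lex_ranges.items():
        out.add((prop, RANGE, datatype + ".val"))
    return out


graph = {
    ("foo", "prop", (None, "10")),
    ("prop", RANGE_LEX, "xsd:integer.lex"),
}
closed = infer(graph)
print(("foo", "prop", ("xsd:integer", "10")) in closed)  # True
print(("prop", RANGE, "xsd:integer.val") in closed)      # True
```

Because these are entailments, the sketch adds the new triples and keeps the originals.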

We can do simplification:

   <foo> <prop> xsd:integer:"10" .

entails

   <foo> <prop> _:"10" .
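Simplification is just forgetting the datatype. A small Python sketch, again assuming literals are (datatype, lexical) tuples with None as the blank datatype (the function name is hypothetical):

```python
def simplify(triple):
    """Return the blank-typed triple entailed by a typed one."""
    s, p, o = triple
    if isinstance(o, tuple):
        _datatype, lexical = o
        return (s, p, (None, lexical))
    # Non-literal objects are left untouched.
    return triple


typed = ("foo", "prop", ("xsd:integer", "10"))
print(simplify(typed))  # ('foo', 'prop', (None, '10'))
```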

Some WG, to leave aside charter issues, could decide that the triple to 
output for:

   <rdf:Description>
     <prop xsd:type="xsd:integer">10</prop>
   </rdf:Description>

is

   _:bnode <prop> xsd:integer:"10" .

What happens if we merge

   <foo> <prop> _:"10" .

and

   <foo> <prop> _:"10" .

Icky bit here: by rule we get:

   <foo> <prop> _:"10" .

If we merge

   <foo> <prop> _:"10" .
   <prop> <rdfs:rangeLex> <xsd:string> .

and

   <foo> <prop> _:"10" .
   <prop> <rdfs:rangeLex> <xsd:int> .

the merging rule is: first run datatype inference to populate the blank 
datatypes, then merge.  That gives a merge of

   <foo> <prop> xsd:string:"10" .

and

   <foo> <prop> xsd:int:"10" .

which will give the right result.  Why do I have the feeling I didn't get 
away with that bit?
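The "infer first, then merge" rule might look like this in Python. It is a sketch under assumptions: tuple triples, literals as (datatype, lexical) tuples with None for blank, and the rangeLex object used directly as the datatype name, as in the two graphs above.

```python
def populate(graph):
    """Fill in blank datatypes from rdfs:rangeLex statements in the same graph."""
    ranges = {s: o for (s, p, o) in graph if p == "rdfs:rangeLex"}
    result = set()
    for (s, p, o) in graph:
        if isinstance(o, tuple) and o[0] is None and p in ranges:
            o = (ranges[p], o[1])
        result.add((s, p, o))
    return result


def merge(g1, g2):
    # Inference runs on each graph separately, before the union is taken.
    return populate(g1) | populate(g2)


g1 = {("foo", "prop", (None, "10")), ("prop", "rdfs:rangeLex", "xsd:string")}
g2 = {("foo", "prop", (None, "10")), ("prop", "rdfs:rangeLex", "xsd:int")}

merged = merge(g1, g2)
literal_triples = sorted(t for t in merged if isinstance(t[2], tuple))
print(literal_triples)
```

The two literal triples stay distinct, whereas a plain union of g1 and g2 would have collapsed them into one. Note that the merged graph still carries both rangeLex statements for the same property; the sketch does nothing about that.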

No matter, completing the job:

Merging

   <foo> <prop> _:"10" .
   <prop> <rdfs:rangeLex> <xsd:integer> .

and

   <foo> <prop> _:"12" .
   <prop> <rdfs:rangeLex> <eg:oct> .

gives

   <foo> <prop> xsd:integer:"10" .

or

   <foo> <prop> eg:oct:"12" .

to a processor that 'understands' the xsd:integer and eg:oct datatypes.
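A processor that understands both datatypes could merge on values rather than on lexical forms. The Python sketch below assumes that eg:oct reads its lexical form as an octal numeral, so that "12" in eg:oct and "10" in xsd:integer map to the same value; the MAPPINGS table and function names are hypothetical.

```python
# Assumed lexical-to-value mappings; eg:oct is taken to mean octal numerals.
MAPPINGS = {
    "xsd:integer": lambda lexical: int(lexical, 10),
    "eg:oct": lambda lexical: int(lexical, 8),
}


def value_of(literal):
    datatype, lexical = literal
    return MAPPINGS[datatype](lexical)


def merge_by_value(triples):
    """Keep the first triple seen for each (subject, predicate, value)."""
    seen = {}
    for (s, p, o) in triples:
        seen.setdefault((s, p, value_of(o)), (s, p, o))
    return list(seen.values())


triples = [
    ("foo", "prop", ("xsd:integer", "10")),
    ("foo", "prop", ("eg:oct", "12")),
]
merged = merge_by_value(triples)
print(merged)  # [('foo', 'prop', ('xsd:integer', '10'))]
```

Which of the two equivalent forms survives is arbitrary here; the sketch simply keeps whichever came first.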

Finally,

   <foo> <prop> xsd:integer:"10" .

if and only if

   <foo> <prop> _:v .
   _:v   <xsd:integer.map> "10" .
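The equivalence can be read as a pair of rewrites between the compact form and the bnode form. A Python sketch, assuming tuple triples, (datatype, lexical) literals, and a caller-supplied blank node label; both function names are hypothetical.

```python
def expand(triple, node="_:v"):
    """Rewrite a typed-literal triple into the two-triple bnode form."""
    s, p, (datatype, lexical) = triple
    return [(s, p, node), (node, datatype + ".map", lexical)]


def collapse(triples):
    """Inverse of expand: fold '<dt>.map' arcs back into typed literals."""
    maps = {
        s: (p[: -len(".map")], o)
        for (s, p, o) in triples
        if p.endswith(".map")
    }
    return [(s, p, maps[o]) for (s, p, o) in triples if o in maps]


compact = ("foo", "prop", ("xsd:integer", "10"))
expanded = expand(compact)
print(expanded)
# [('foo', 'prop', '_:v'), ('_:v', 'xsd:integer.map', '10')]
print(collapse(expanded) == [compact])  # True
```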


Brian

Received on Wednesday, 12 December 2001 13:23:22 UTC