Flower Power and Datatyping ~:-)

This week and in today's telecon four excellent ideas surfaced, all
aimed at bringing TDL and S closer together. I am referring to
Graham's and Brian's postings, Pat's idea to have different
inferencing mechanisms, and Frank's proposal to explain how some sort
of "flag" can be used to distinguish the S-like from the TDL-like
modeling style.

The scoop that I extracted is to use an indicator like parseType to
specify what kind of graph is created from the RDF/XML syntax.

EXAMPLE 1:

  <rdf:Description parseType="untidy">
    <age>5</age>
  </rdf:Description>

Generates two triples:

  _:1 age _:2
  _:2 rdf:value "5"

Of course, since there is exactly one triple that describes _:2,
applications may use an optimized internal representation,
e.g. something like:

  _:1 age <_:2:"5">
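
To make the expansion and the optimized form concrete, here is a
minimal Python sketch. The tuple layout, the function names `expand`
and `compact`, and the way bNode labels are minted are assumptions
made for illustration only; nothing here is part of a proposal or of
an existing parser.

```python
from collections import Counter
from itertools import count

RDF_VALUE = "rdf:value"

def expand(subject, prop, literal, fresh):
    """Expand <prop>literal</prop> into the two triples shown above."""
    node = f"_:{next(fresh)}"  # hypothetical bNode labelling scheme
    return [(subject, prop, node), (node, RDF_VALUE, literal)]

def compact(triples):
    """Fold a bNode that is described by exactly one triple, an
    rdf:value triple, into a (bnode, literal) pair on the triple
    that refers to it. Everything else is passed through unchanged."""
    described = Counter(s for s, _, _ in triples)
    referenced = {o for _, _, o in triples}
    values = {
        s: o
        for s, p, o in triples
        if p == RDF_VALUE
        and s.startswith("_:")
        and described[s] == 1
        and s in referenced
    }
    out = []
    for s, p, o in triples:
        if p == RDF_VALUE and s in values:
            continue  # folded into the referring triple
        out.append((s, p, (o, values[o])) if o in values else (s, p, o))
    return out

fresh = count(2)  # subject is _:1, so start minting at _:2
triples = expand("_:1", "age", "5", fresh)
print(triples)
print(compact(triples))
```

As soon as a second triple describes the bNode (for instance an
rdf:type triple), `compact` leaves the graph as it is, which matches
the "exactly one triple" condition stated above.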

Alternatively, the same information can be represented in a customary
"tidy" way as

EXAMPLE 2:

  <rdf:Description parseType="tidy">
    <age rdf:value="5"/>
  </rdf:Description>

which generates two equivalent triples

  _:1 age _:2
  _:2 rdf:value "5"

The above examples demonstrate the use of global typing. The
corresponding schema looks as usual, e.g.:

  <rdf:Property rdf:ID="age">
    <rdfs:range rdf:resource="&rdfdt;integer"/>
  </rdf:Property>

EXAMPLE 3:

If local typing is desired, we can write

  <rdf:Description parseType="tidy">
    <age>
      <rdfdt:integer rdf:value="5"/>
    </age>
  </rdf:Description>

and get

  _:1 age _:2
  _:2 rdf:value "5"
  _:2 rdf:type rdfdt:integer

That's it with respect to the syntax. The graphs generated from
RDF/XML can be interpreted as tidy graphs, per the S-P proposal (i.e.,
the node _:2 above represents a pair, just like in TDL, and the class
extension of rdfdt:integer is a datatype mapping).
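
One possible reading of "the node represents a pair" can be sketched
in Python as follows. The `DATATYPE_MAPPINGS` table, the function name
`typed_pair`, and the rule of taking the first datatype found are
assumptions for illustration; the sketch only shows that the same
lookup can serve both local typing (rdf:type on the bNode) and global
typing (rdfs:range on the property).

```python
RDF_VALUE, RDF_TYPE, RDFS_RANGE = "rdf:value", "rdf:type", "rdfs:range"

# Hypothetical datatype mappings: lexical form -> value.
DATATYPE_MAPPINGS = {"rdfdt:integer": int}

def typed_pair(graph, schema, node):
    """Return the (lexical form, value) pair denoted by a typed-value bNode.

    The value is None when no datatype can be found for the node.
    """
    lexical = next(o for s, p, o in graph if s == node and p == RDF_VALUE)
    # Local typing: an rdf:type triple on the node itself.
    datatypes = [o for s, p, o in graph if s == node and p == RDF_TYPE]
    # Global typing: the range of any property that points at the node.
    for s, p, o in graph:
        if o == node:
            datatypes += [r for prop, q, r in schema
                          if prop == p and q == RDFS_RANGE]
    if not datatypes:
        return lexical, None
    return lexical, DATATYPE_MAPPINGS[datatypes[0]](lexical)

schema = [("age", RDFS_RANGE, "rdfdt:integer")]
global_graph = [("_:1", "age", "_:2"), ("_:2", RDF_VALUE, "5")]
local_graph = global_graph + [("_:2", RDF_TYPE, "rdfdt:integer")]

print(typed_pair(global_graph, schema, "_:2"))
print(typed_pair(local_graph, [], "_:2"))
```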

Now let me try to summarize without using too many plugs.

The major advantage of the above twist is that all typed values are
always bNodes. With respect to uniformity of representation, this
feature beats both the existing S-* idioms and TDL. (In fact,
depending on global or local use, TDL sometimes deploys bNodes and
sometimes literals, so applications need to distinguish the two cases
and use two kinds of queries.)
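
As a rough illustration of the uniformity argument, the Python sketch
below runs one and the same query pattern against a globally typed and
a locally typed graph. The function name `subjects_with_value` and the
tuple encoding of triples are assumptions for illustration, not an
existing query API.

```python
def subjects_with_value(graph, prop, lexical):
    """Match  ?s prop ?n . ?n rdf:value lexical  and return the ?s bindings."""
    holders = {s for s, p, o in graph if p == "rdf:value" and o == lexical}
    return sorted(s for s, p, o in graph if p == prop and o in holders)

# Graph from the globally typed examples (the type comes from the schema).
global_graph = [("_:1", "age", "_:2"), ("_:2", "rdf:value", "5")]
# Graph from the locally typed example (the type sits on the bNode).
local_graph = global_graph + [("_:2", "rdf:type", "rdfdt:integer")]

print(subjects_with_value(global_graph, "age", "5"))
print(subjects_with_value(local_graph, "age", "5"))
```

Because the value always hangs off a bNode via rdf:value, the query
does not need a separate branch for the case where the object of
`age` is a literal.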

The disadvantage is that a change to the syntax is required, so Dave
is going to kick my butt ;)

The above use of typed values is consistent with the tidy graphs model
and supports the notation style suggested in TDL. Only one piece of
vocabulary per datatype is needed (a URI like rdfdt:integer). The
local idiom of TDL remains intact; the global idiom corresponds to
using a flag in the syntax and/or a compact representation in the
graph.

Have a nice weekend,
Sergey


E-Mail:      melnik@db.stanford.edu (Sergey Melnik)
WWW:         http://www-db.stanford.edu/~melnik
Tel:         OFFICE: 1-650-725-4312 (USA)
Address:     Room 438, Gates, Stanford University, CA 94305, USA

Received on Friday, 1 February 2002 13:21:52 UTC