Re: long-range datatyping and rdfa/microdata

Right, the fact that every single data value has to be explicitly typed
is a problem in RDF.

Often, one would like to write:

ex:prop rdfs:range xsd:decimal .
ex:sub ex:prop "42" .

and infer that "42" is a decimal number. However, what one gets from
these two triples is that "42" is a sequence of 2 characters AND a
decimal, which is inconsistent because no value is both a string and a
number.
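
To make the clash concrete, here is a minimal Python sketch. It is not
real RDF tooling: literals are modelled as (lexical form, datatype)
pairs, the names Literal, denotation, VALUE_SPACES and satisfies_range
are hypothetical, and I simplify by treating the value spaces of
xsd:string and xsd:decimal as Python's str and Decimal.

```python
from decimal import Decimal
from typing import NamedTuple, Optional

XSD_STRING = "xsd:string"
XSD_DECIMAL = "xsd:decimal"

class Literal(NamedTuple):
    lexical: str
    datatype: Optional[str] = None  # None models a plain (untyped) literal

def denotation(lit):
    """Value a literal denotes; it does not depend on where the literal appears."""
    if lit.datatype is None or lit.datatype == XSD_STRING:
        return lit.lexical  # a plain literal denotes the character string itself
    if lit.datatype == XSD_DECIMAL:
        return Decimal(lit.lexical)
    raise ValueError(f"unsupported datatype: {lit.datatype}")

# Assumption: value spaces approximated by Python types.
VALUE_SPACES = {XSD_STRING: str, XSD_DECIMAL: Decimal}

def satisfies_range(lit, range_datatype):
    """Check whether the literal's value lies in the range's value space."""
    return isinstance(denotation(lit), VALUE_SPACES[range_datatype])

plain = Literal("42")               # ex:sub ex:prop "42" .
typed = Literal("42", XSD_DECIMAL)  # ex:sub ex:prop "42"^^xsd:decimal .

print(satisfies_range(plain, XSD_DECIMAL))  # False: a string is not a decimal
print(satisfies_range(typed, XSD_DECIMAL))  # True
```

Under these assumptions, the range declaration does not retype the plain
literal; it only makes the graph unsatisfiable.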

Overcoming this in the RDF data model is really hard. The problem is
that literals are universal identifiers, just like URIs. So "42", no
matter where it appears, always identifies the same thing, namely the
sequence of characters '4' and '2'.
If "42" could be interpreted as a decimal, then it would be a decimal
everywhere, for everybody.

So the following would not be possible:

ex:prop rdfs:range xsd:decimal .
ex:sub ex:prop "42" .
ex:password rdfs:range xsd:string .
ex:sub2 ex:password "42" .

Here, some people would certainly expect the first "42" to denote the
number, while the second is just two characters. But this implicitly
assumes that the denotation of literals is contextual: it would depend
on which predicate is used in the triple. While it would be possible, in
principle, to define a language where this makes sense, it does not fit
at all with the RDF data model.
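
For illustration only, a language with contextual literals could be
sketched in Python as below. The point is that the interpretation
function needs the predicate as an extra argument, which RDF's
interpretation of literals does not have. The tables RANGES and PARSERS
and the function contextual_denotation are hypothetical names of mine,
and falling back to xsd:string for unknown predicates is an assumption.

```python
from decimal import Decimal

# Hypothetical schema: predicate -> declared rdfs:range
RANGES = {"ex:prop": "xsd:decimal", "ex:password": "xsd:string"}

# Assumption: each datatype is approximated by a Python constructor.
PARSERS = {"xsd:decimal": Decimal, "xsd:string": str}

def contextual_denotation(predicate, lexical):
    """Interpret a plain literal according to the range of its predicate."""
    datatype = RANGES.get(predicate, "xsd:string")  # assumed default
    return PARSERS[datatype](lexical)

number = contextual_denotation("ex:prop", "42")        # ex:sub ex:prop "42" .
password = contextual_denotation("ex:password", "42")  # ex:sub2 ex:password "42" .

print(type(number).__name__, type(password).__name__)  # Decimal str
print(number == password)  # False: same lexical form, different denotations
```

The same token "42" ends up denoting two different things, which is
exactly what a universal identifier cannot do.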

One way of addressing this issue would be to consider "42" as syntactic 
sugar for a typed literal with an "undefined type", which I could 
represent like this:

ex:sub ex:prop "42"^^[] .

But this would mean that the following graph serialisation:

ex:sub ex:prop "42" .
ex:sub ex:prop "42" .

effectively contains 2 distinct triples, not 1, because each occurrence
of "42" would get its own anonymous datatype.
I doubt this is the direction we want to take.
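
The doubling can be seen in a small Python sketch, where a graph is a
set of triples. I assume here that each [] introduces a fresh blank
node, as it does in Turtle; fresh_bnode and parse_statement are
hypothetical helpers, not part of any RDF library.

```python
import itertools

_counter = itertools.count(1)

def fresh_bnode():
    """Return a new blank node label on every call."""
    return f"_:b{next(_counter)}"

def parse_statement(subject, predicate, lexical):
    """Desugar a plain literal into (lexical form, fresh blank-node datatype)."""
    return (subject, predicate, (lexical, fresh_bnode()))

# Today: both lines are the same triple, so the set keeps only one.
graph_plain = {
    ("ex:sub", "ex:prop", "42"),
    ("ex:sub", "ex:prop", "42"),
}

# With "42" read as "42"^^[], each line gets its own anonymous datatype.
graph_desugared = {
    parse_statement("ex:sub", "ex:prop", "42"),
    parse_statement("ex:sub", "ex:prop", "42"),
}

print(len(graph_plain))      # 1
print(len(graph_desugared))  # 2
```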


On 08/06/2011 18:02, Dan Brickley wrote:
> Hi folks
> Firstly, apologies I couldn't make today's call. I've spent my RDF'ing
> time this week talking to a lot of people about,
> rdfa/microdata etc.
> I want to bring something up related to that: back in RDFCore WG we
> called it "long range" data-typing, but didn't figure out a way to
> make it work. I'd appreciate it if someone could articulate the
> connection to the current discussion on literals, and suggest if there
> are ways we could make it work in 2011.
> The idea is that many properties are deployed as if their values take
> string form, but we know from the schema that the values can be
> interpreted e.g. as integers or dates.
> RDF's datatyping mechanism puts a lot of burden on instance data, and
> in some contexts (eg. Website markup) this can be problematic. So for
> example chooses Microdata over
> RDFa and lists 'datatypes' as one of the complexity burdens of RDFa
> markup.
> In practice I don't think a lot of sites will enjoy marking up each
> property value occurrence with a datatype, ... and so vocabulary
> designers are tending not to make datatyping explicit.
> So for example in FOAF we have foaf:age, which Peter Mika originally asked for.
> "The age property is a
> relationship between a Agent and an integer string representing their
> age in years. "
> This can be used in RDFa like so: <p>blah blah<span
> property="foaf:age">39</span> blah</p>.
> If we try to persuade publishers to put datatype="xsd:integer"
> alongside each age, ... we'll have a hard time. So is there anything
> we can do at the schema level?  Mumble mumble range mumble...
> Pat - can you remember why we couldn't make this work in the semantics
> last time?
> cheers,
> Dan
> (another possibility is to do something in RDFa's profile mechanism,
> )

Antoine Zimmermann
Researcher at:
Laboratoire d'InfoRmatique en Image et Systèmes d'information
Database Group
7 Avenue Jean Capelle
69621 Villeurbanne Cedex
Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
Lecturer at:
Institut National des Sciences Appliquées de Lyon
20 Avenue Albert Einstein
69621 Villeurbanne Cedex

Received on Thursday, 9 June 2011 09:02:15 UTC