Re: TDL conflicts with the "duh!" requirement from Patrick Stickler on 2002-01-28 (w3c-rdfcore-wg@w3.org from January 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 28 Jan 2002 15:24:15 +0200
To: Dan Connolly <connolly@w3.org>
CC: Jeremy Carroll <jjc@hplb.hpl.hp.com>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B87B1D9F.C75E%patrick.stickler@nokia.com>
On 2002-01-28 14:53, "ext Dan Connolly" <connolly@w3.org> wrote:


>>> Who's "we"? I always want "W3C" = "W3C". ;-)
>> 
>> But this presumes an implicit, global datatype for all values
>> of a given predicate.
> 
> For all literals period. (except parseType="Literal" ones,
> i.e. structured XML Literals)
> 
> Just like in perl, tcl, and that sort of language.

But perl, tcl, and that sort of language has built-in
native datatypes where the lexical spaces for all
datatypes are (usually) disjunct so that simple
examination of the lexical form allows a processor
to deduce datatype.

RDF has no native, builtin datatypes. You can't *know*
what some literal is, because there are no predefined
datatypes for a literal to be a lexical form of.


>> Surely you are not saying that any
>> arbitrary string has a single interpretation?
> 
> Yes, I am.

I don't think we're in Kansas anymore, Toto!  ;-)

>> As many examples have show, the same literal can -- in the
>> context of a given datatype -- mean very different things.
> 
> You seem to regard that as a fact. I see it as one
> of the design choices; a design choise with very
> bad consequences.
> 
> As experience with perl, tcl, etc. shows, it's quite workable
> to have just one literal datatype and handle a broad range
> of use cases.

Only when the datatyping expectations/requirements are
known by the application itself. You cannot achieve
truly portable expression of typed data literals with
such a model. 

Just because you have not defined the typing explicitly
in RDF does not mean you are not using datatyping!

E.g. your redhead example shows that. You presume
an implicit enumeration of acceptable values for
hairColor, which constitutes a datatype and your
axiom is based on an interpretation of the literal
value in terms of that datatype.
 
>> Two literals may be string equal, but not denote the same
>> value.
> 
> 
> That's one design choice. It's not the one I prefer,
> and it's not the way the S proposal works.

 
>> Literals are local names and we need the "namespace"
>> context provided by a datatype to differentiate between
>> different meanings.
> 
> No, we don't.

There have been sufficient examples on this list
demonstrating that you do.

As I said, whether or not that datatype context is defined
within the RDF graph or external to RDF in the application
space is beside the point -- a literal is a local value
that must have context in order to be interpreted. That
context can simply be the expectations of a given function
or method.
 
But I garuntee you, if I makes the statement in
RDF

   Bob ex:age "11" .

and you try to interpret that as an integer, you're
gonna have problems, since in fact, it's in base 2
notation -- but you assert that "11" *always" means
eleven, right?!

And if you tell me that I can't put a base 2 integer
as a value for that property, then I will say, why
not? And you're likely to answer, because that's  not
what my application expects. And I'll anwser, then
your application expects values of a specific datatype,
no?!
 
>>>> For instance, if we allow literals as subjects,
>>> 
>>> I want that. (eventually; I don't mind syntactic limitations
>>> in 1.0, but I agree we shouldn't do anything today to prevent
>>> doing this later)
>> 
>> Tidy literals will prevent any adoption of literals as
>> subjects in the future.
> 
> No, it won't.

Of course it will!

If you are going to assign rdf:type properties directly to
literal nodes, based on the context of a given statement,
and then merge those literal nodes simply based on string
equality of their labels, then you will end up with type
assertions that were *never* made and in fact will likely
be conflicting and invalid. E.g.

Widget ---foo:offset---> "30" ---rdf:type---> foo:octalInt

Ann ---ex:age---> "30" ---rdf:type---> xsd:integer


now merge them to get a tidy graph:


Widget ---foo:offset------
                         |     --rdf:type---> foo:octalInt
                         V    /
                        "30" <
                         ^    \
                         |     --rdf:type---> xsd:integer
Ann ---ex:age-------------


OOPS! Now which context goes with which interpretation?!
Is the value of foo:offset an xsd:integer or a foo:octalIint?!
What about the value of ex:age for Ann?!

If you allow literals as subjects *and* tidy merging of
literals, you will loose all contextual distinction needed
for interpretation.

The anonymous node (D) idiom avoids that because even if
the literal object of the rdf:value property is merged with
all other occurrences of that string, the anonymous node
remains unique and non-merged.

 
>>>> and say use xml:lang to
>>>> generate triples (which I think some members of the group would like) then
>>>> in general a string in one lang is not the same as the same string in
>>>> another lang.
>>> 
>>> I very much dislike that sort of design; i.e. I consider the
>>> use of xml:lang in the RDF schema for RDF schema broken.
>> 
>> I agree. RDF rather needs a consistent, generic mechanism
>> for statement qualification, which includes such things
>> as language scoping.
> 
> On that we are agreed...

;-)

...
 
>>> This idiom works with completely vanilla triple handling
>>> (apis, query languages, etc.)
>> 
>> I'm not sure how this is "vanilla" if an application
>> must be aware of the particular ontology being used for
>> scoping. I think a more generic mechanism is called for.
> 
> "scoping"? I don't follow you here.

Where statements are qualified by a generic scopingn property,
e.g. rdfx:scope which takes a URI that denotes the scope within
which the statement is "valid", or "in force", or "asserted".

This is akin to scoping in XTM/Topic Maps.

>>>> I will work on it next week.
>>> 
>>> Please consider the related test case I gave a while back
>>> while you're at it:
>>> 
>>> _:somebody ex:leftShoeSize "10".
>>> 
>>> ex:leftShoeSize s:subPropertyOf ex:shoeSize.
>>> 
>>> RDFS-entail this?
>>> 
>>> _:somebody ex:shoeSize "10".
>> 
>> This of course depends on the semantics of s:subPropertyOf.
> 
> I don't mean to change the semantics of subPropertyOf at
> all; they're given by the rule:
> 
> ?s ?p1 ?o.
> ?p1 subProprtyOf ?p2
> ===>
> ?s ?p2 ?o.

I take you you are talking about rdfs:subPropertyOf where
the prefix 'rdfs:' defines the RDF Schema namespace.

I didn't know what 's:' implied.

> 
>> Does it define a subset relation between value spaces,
>> and/or lexical spaces, and/or canonical lexical spaces?
> 
> One of those relationships may be a corrolary of the
> rule above; I'm not sure. But regardless, the
> rule above suffices to deffine subPropertyOf.

Not for lexical datatypes. Sorry. Nope. That was the subject
of *significiant* discussion back in the Fall.

The only backwards compatable and IMO reasonable interpretation
of rdfs:subPropertyOf with regards to lexical datatypes is
that it relates *only* to value spaces, not to lexical spaces
nor canonical lexical spaces nor mappings from lexical to
value spaces.

Patrick
 

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Monday, 28 January 2002 08:23:19 UTC