Re: Datatyping literals: question and test cases

[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com]


----- Original Message ----- 
From: "ext pat hayes" <phayes@ai.uwf.edu>
To: "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: <w3c-rdfcore-wg@w3.org>; "Henry S. Thompson" <ht@cogsci.ed.ac.uk>
Sent: 01 November, 2002 00:58
Subject: Re: Datatyping literals: question and test cases


> >[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, 
> >patrick.stickler@nokia.com]
> >
> >
> >----- Original Message -----
> >From: "ext pat hayes" <phayes@ai.uwf.edu>
> >To: "Patrick Stickler" <patrick.stickler@nokia.com>
> >Cc: <w3c-rdfcore-wg@w3.org>
> >Sent: 31 October, 2002 21:32
> >Subject: Re: Datatyping literals: question and test cases
> >
> >
> >>  >[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690,
> >>  >patrick.stickler@nokia.com]
> >>  >
> >>  >
> >>  >>  >Inlined literals and rdfs:range will *never* work together, except
> >>  >>  >in the single case of rdfs:StringLiteral. I wonder if folks appreciate
> >>  >>  >that oddity.
> >>  >>
> >>  >>  You seem to be assuming that it is impossible for two different
> >>  >>  datatypes to have the same value space.
> >>  >
> >>  >Not at all. But see below.
> >>  >
> >>  >>  I wasn't aware that this was
> >>  >>  a general rule. I would have no problem for example saying that
> >>  >>  rdfs:StringLiteral and xsd:String had the same value space. (NOt the
> >>  >>  same lexical space, but the same value space.)
> >>  >
> >>  >I am presuming, perhaps incorrectly, that for one value space
> >>  >to intersect with another value space that for any two values
> >>  >X and Y which occur in the intersection of those value spaces
> >>  >the same relations hold for them in terms of either datatype.
> >>  >
> >>  >I.e., if X < Y in datatype 1 then X < Y in datatype 2.
> >>  >
> >>  >If one datatype has an ordered value space and the other does
> >>  >not, then can they really intersect?
> >>
> >>  Well, what does it mean to say that the space doesn't have an
> >>  ordering? I mean, its not *impossible* to define an ordering on
> >>  URIrefs.
> >
> >No, but it's a matter of authority. If the "owner" of the datatype
> >(the agency that has the authority to define it) says there is no
> >ordering for the members of its value space, then it doesn't have
> >an ordering.
> 
> I can't make sense of this. It sounds to me like saying that because 
> Im not interested in the colors of the bindings of my books, that 
> therefore they have no colors. Look, I can take one of these 
> unordered value spaces and *I* can define an ordering on it. Of 
> course it *has* an ordering. In fact, if its finite with cardinality 
> N, it has N-factorial orderings. Authority is fine, but its unwise to 
> claim authority over Platonic abstractions.

Sorry, Pat. No.

If we want the SW to be monotonic, then folks are not licensed
to change the semantics of resources they don't "own"; otherwise
interoperability goes right down the toilet.

Of course, applications are free to do whatever they like, even
assert value-based semantics on inlined literals ;-) but there needs
to be the full realization that diverging from the authoritatively
specified semantics means not playing by the rules and that the
conclusions of your system may very well differ from everyone else's.

If you don't care about that, fine. But in the context of a standard,
and interoperability based on that standard, we need to be clear
about this.

Thus, adding order to a non-ordered datatype is not licensed, is
bad practice, and will be detrimental to the SW (which IMO is all
about consistent semantics and interoperability).

If the anemically defined datatype not having order doesn't do it
for you, then feel free to define your own. But don't presume that
anyone else is going to respect the ordering you assert for someone
else's datatype.

> >
> >>  I think you have a picture here where a 'space' is something
> >>  like an algebra, ie a set together with some operations or relations
> >>  on the set, rather than simply a set or class of things.
> >
> >That is my understanding of how XML Schema defines datatypes as
> >well. As sets with relations on the sets, and subsets share the
> >relations of their supersets.
> 
> But that doesn't jibe with the RDF picture.  RDF class extensions are 
> just sets . They aren't OO inheritance taxonomies: they don't come 
> with anything to get inherited.

Perhaps you misunderstand me. 

Yes, RDF class extensions are just sets. Therefore relations between
members of those sets are based on inherent characteristics of the
things in those sets, and if those things also belong to other sets,
then they are the same things and will exhibit the same relations
to any other thing which also occurs in the same sets.

So, if we have set A and the members X and Y and X < Y and we also
have set B and X and Y are also members of set B then X < Y in B
as well, not because B specifies it but because of what X and Y are
and those relationships hold between X and Y no matter where X and
Y occur together.

So this is why "foo:bar"^^xsd:string != "foo:bar"^^xsd:anyURI, because
those two different things behave differently, they have different
inherent characteristics.
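
As a rough illustration of that point (a sketch only; the TypedLiteral
class and the variable names are hypothetical and not part of RDF or
XML Schema), a typed literal can be modeled as a lexical form paired
with a datatype, so that the same lexical form under two datatypes
yields two distinct values:

```python
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class TypedLiteral:
    """Hypothetical model: a lexical form plus the datatype that interprets it."""
    lexical: str   # the character sequence as written
    datatype: str  # URI of the datatype

a_string = TypedLiteral("foo:bar", XSD + "string")
a_uri = TypedLiteral("foo:bar", XSD + "anyURI")

# The lexical forms are identical character sequences...
same_lexical = a_string.lexical == a_uri.lexical
# ...but under the strong-typing assumption the typed values differ,
# because equality compares the datatype as well as the lexical form.
same_value = a_string == a_uri

print(same_lexical, same_value)
```

Whether datatype should take part in equality is exactly the
assumption under debate here; the sketch only shows what the strong
typing view amounts to.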

> >
> >>  Two
> >>  different algebras can have the same underlying set. (I think its
> >>  called the 'carrier' of the algebra, but it was years ago :-)
> >>
> >>  >If X = Y in one value space yet X != Y in the other value space
> >>  >can they really intersect?
> >>
> >>  Well, not if that really means identity, but then if it meant that,
> >>  this would be impossible.
> >
> >Exactly. And that is my point. xsd:string defines a different equality
> >than xsd:anyURI and therefore they cannot intersect.
> 
> No, there is no such thing as 'different equality' in classes. 
> Equality is equality: it means, the same thing. It doesn't come in 
> flavors.

You misunderstand me, and I think we agree. If we have A{X, Y} and 
B{X, Y} and in A, X = Y and in B, X != Y then it is fair to conclude
that in fact X and/or Y are ambiguous and that we are talking about
different things.
   
> >And in fact, the recent feedback from the XML Schema WG indicates
> >that their value spaces are in fact disjunct.
> 
> Well, yes, I wrote back to Henry about that. I don't think what he 
> said makes sense, given the wording in the XSD spec.

I look forward to his reply. If he doesn't CC me or the list, please
pass it on. Thanks.

> >
> >>  >
> >>  >I think not, in both cases.
> >>  >
> >>  >Since I do not consider the value space of rdfs:StringLiteral
> >>  >to be ordered, then I do not see that it can intersect with
> >>  >that of xsd:string.
> >>
> >>  HOw about saying that xsd:string has an ordering defined on it which
> >>  isnt relevant to rdfs:StringLiteral?
> >
> >Well, I may be viewing this wrongly, and certainly this is not my
> >strongest area, but I'm thinking along the lines that relations
> >between members of a value space are characteristics of the values
> >themselves and not contextual for the datatype.
> 
> Well, OK, we could go there. But then xsd:integer wouldn't contain 
> integers, for example. They would be integers-with-a-particular 
> ordering, to be distinguished carefully from 
> integers-with-a-different-ordering. I really don't think this would 
> work in RDF: in effect, it forces all class extensions to be 
> disjoint, since the 'things' in the class inherit their class-ness. 
> People-as-family-members are different *things* from 
> people-as-mammals or people-as-employees. Yuk.

I don't think so.

If you have two people (things) that have a given relationship (e.g. married)
then that relationship holds whether those two people are considered as
members of the set mammals, employees, etc. It may be that that relationship
is not relevant to the particular set, but it still holds. The two people
do not cease to be married just because marriage is not relevant to consideration
as mammals. Eh?

These are characteristics/properties of things in the universe, not of the
sets in which those things are placed.

Yet it is the set by which we define which relations and characteristics
are interesting from a particular point of view. Things are in sets because
they *have* certain characteristics, but it is not membership in the set
that gives them those characteristics, nor do they have those
characteristics only when considered from the perspective of a particular
set.

I'm still "male" even when considered as a member of the set "employee"
even though the perspective of that set is gender neutral.

In essence, I view RDF classes akin to Java interfaces. They allow me
to interact with things from a particular perspective, and knowing that
that thing conforms to the interface (is a member of that class) I know
that it embodies the characteristics that are interesting with regards
to that interface (class) -- but those characteristics are inherent in
the thing regardless of the interface.
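
The interface analogy can be sketched roughly as follows. This uses
Python's typing.Protocol as a stand-in for a Java interface; the
Employee protocol, the Person class, and payroll_label are all
hypothetical names made up for illustration:

```python
from dataclasses import dataclass
from typing import Protocol


class Employee(Protocol):
    """The 'employee' perspective: only the employer is of interest."""
    employer: str


@dataclass
class Person:
    name: str
    gender: str
    employer: str


def payroll_label(e: Employee) -> str:
    # This code sees the thing only through the Employee perspective.
    return f"employee of {e.employer}"


p = Person(name="Patrick", gender="male", employer="Nokia")
label = payroll_label(p)

# The gender characteristic is irrelevant to Employee, yet it is still
# a property of the thing itself and has not gone anywhere.
still_has_gender = p.gender == "male"
print(label, still_has_gender)
```

The perspective selects which characteristics are interesting; it does
not grant or remove them.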

> >I.e. they are members
> >of that value space because they exhibit those characteristics, and
> >they will exibit those characteristics in whatever value space or
> >subset thereof in which they occur. If there is some other "thing"
> >which does not exibit the same characteristics, no matter how similar,
> >it is not the same thing. Thus even though one may think that the
> >string "foo:bar" is just like the URI "foo:bar" we can test that
> >they are differen
> 
> Well, they sure *look* the same. How do you tell the difference, when 
> you see them in isolation? The URI documents say explicitly that URIs 
> are character strings in several places, in fact: they even tell you 
> which characters you can use in them. Dave's syntax document has a 
> BNF for them.

The URI documents unfortunately blur the lexical and value distinction.
That is a shortcoming of those specifications which we need not repeat.
Where they speak of serialization, they talk in terms of lexical 
forms. Where they speak of equivalence, they talk in terms of
values. The ambiguity is unfortunate.

> >, that they are different things, because they
> >exhibit different characteristics in relation to other things in
> >the universe.
> 
> That begs the question, because if we take your view then there are 
> more things in the universe.

Well, some folks were thinking that the value spaces of xsd:anyURI
and xsd:string intersected, but it appears that they do not, so
there are now more things in the universe than those folks thought.

Is that necessarily a bad thing, that we have removed some ambiguity?

Before we had X == Y and X != Y which was a problem, but now we see
that actually we have X1 == Y1 and X2 != Y2 and now we see that all
is well.

Where's the problem?

> >
> >>  The reason for being so careful about this terminology is that the
> >>  operations are defined on the whole space, sure; but the things IN
> >>  the space are just what they happen to be, which ever category you
> >>  put them into. So with the operations-over-the-carrier-set picture,
> >>  any particular rdfs:StringLiteral is indeed an xsd:string and vice
> >>  versa, even if it makes sense to distinguish the two classes for some
> >>  'global' reason.
> >
> >I may be wrong, but I'm not viewing them as the same thing.
> 
> Well, can you tell me how to tell them apart? When my email editor 
> recognizes a URI and highlights it in blue, does it stop being a 
> character sequence? It still seems to *act* like a character sequence 
> as far as the editor is concerned.
> 
> Or is the 'real' URI in an abstract space somewhere, and the 
> character sequences just surface lexical forms for rendering it, or 
> something? 

That, I think, is the reality, though RDF at present does not reflect
it. If the surface lexical forms were the actual values, then why would
the URI specs speak of equivalence, such that "foo:bar" and "FOO:BAR"
denote the *same* resource?!

RDF has punted on this issue from the start. I tried to bring it up
a few times, but got slapped back into my corner. Well, it's still
an issue that needs to be addressed...

> Then we have a lot more classes to consider, and RDF/XML 
> denotation is at least a two-step matter (xml syntax -to- uri -to- 
> denotation) instead of simple denotation. Im not even sure if two are 
> enough. We would have to rewrite the entire spec if we take this 
> seriously.

I guess it's something that has to be fixed in 2.0, if ever.

> >>  This is the 'weak typing' view Im giving you here, of ocurse.
> >
> >Ahhh, right. I'm definitely taking a strong typing view.
> 
> The problem is, seems to me that the 'weak' view is kind of built 
> into RDF (and all the rest of them: DAML, OIL, OWL,...) These are 
> logics for reasoning about categories, not OO modelling languages. 
> There is a fundamental clash between thinking of classes as Venn 
> diagrams and thinking of them like an OO method-inheritance taxonomy. 
> Strong typing only makes sense in the second way of thinking.

Yet when it comes to datatyping and reliability/precision, strong
typing is IMO the only acceptable approach.

Perhaps this is the real crux of the datatyping debate. Perhaps RDF
is not and will never be acceptable for eCommerce and security and
trust, because it takes too weak a view for such things.

Perhaps I'm asking RDF to do something that it just cannot do.

Patrick

Received on Friday, 1 November 2002 02:49:33 UTC