- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Thu, 26 Sep 2002 11:49:24 +0200
- To: Patrick Stickler <patrick.stickler@nokia.com>
- CC: RDF Core <w3c-rdfcore-wg@w3.org>
Patrick Stickler wrote: > [...] >>Recall the motivating example from the RDF 1.0 Spec: >> >>foo dc:Creator "John Smith" >> >>Is "John Smith" supposed to represent a person or a string? >> > >>From a data markup perspective, we can say that the form of > the expression is an ambiguous name, since it is not a URIref. There is nothing ambiguous about strings that populate databases. >>From a knowledge representation perspective, I think it's pretty > intuitive that we're talking about a person in the real world, > and not a string. That's the assumption you are making, and the one I'm questioning! Of course, there might be a person in a real world who is the creator of foo. But the object of the triple must not be the identifier of that person to make sense. >>The key >>argument behind untidiness is that "John Smith" (or "10") cannot >>possibly be meant to be a string, so it has to be something else, whose >>meaning can be deduced using a right bit of logic and AI. >> > > Right, but the basis for that argument is not really that the literal > cannot possibly mean a string, but that given the role and purpose > of RDF as a language for making statements about the world, it is > rather bizarre for it to mean a string. From your perspective is may look bizarre, so you might want to choose a different modeling style. However, the above reflects a common modeling practice. Modelers and developers conventionally use integers and strings for modelings all sorts of things, such as ages, weights, income, etc. >>Ok, we have datatyping now, so let's do it right: >> >>:x age int"10" >>:y shoeSize int"10" >> >>Now we got it! int"10" is not a string now; it's what we want it to >>mean: an integer. Damn. The entailment >> >>:x age :z >>:y shoeSize :z >> >>still holds... >> > > Er, why is that a problem that it holds. It should hold. The above entailment has been *the* focal point of argument. Once it is abandoned, tidy interpretation does not pose any "semantic" problems for the existing apps. > Also, what will the impact be to applications expecting tidy semantics > when > > :x age "10" > :y shoeSize "010" > > does *not* entail > > :x age :z > :y shoeSize :z > > ??? IMO, zero. Are you aware of any existing apps whose behavior depends on whether the above entailment holds or not? >>Just as well, shoeSize could be defined as a >>property that holds between shoes and strings/reals/etc. >> > > Well, one problem with saying that the interpretation is based on > a property that holds between resources and lexical representations > is that there is not, and IMO can never be, any restriction against > non-canonical lexical forms. Therefore, even if you were to take a > tidy approach where inline literals denote themselves, you would > *still* have to evaluate cases such as > > :x :p "10" > :x :p "10.0" > :x :p "010" > :x :p "010.0" > etc. > > in terms of a lexical to value mapping to determine actual equality > of the objects. We don't need to prohibit non-canonical lexical forms. In the above example, the equality of "10", "10.0" etc. does not matter. What matters, however, is how these values impact the interpretation of :x. This knowledge is captured by the semantics of the property p, which is typically built-in into apps. Thus, it may well be the case that all of the four statements above imply an identical interpretation for :x. If a developer wants to make this knowledge explicit, and support interoperation with other apps, he/she can a posteriori provide a rule such as (x1 :p s1) & (s1 :isLexicalTokenOf xsd:int) & (v1 xsd:int s1) & (x2 :p s2) & (s2 :isLexicalTokenOf xsd:int) & (v2 xsd:int s1) & (v1 == v2) ==> x1 = x2 in a schema describing p. If we choose untidy semantics, the developer would simply have to write two different rules, to achieve the same effect: (1) p :range xsd:int (2) (x1 :p v1) & (x2 :p v2) & (v1 == v2) ==> x1 = x2 Both approaches are practically equivalent (of course, you might say the first one is more "bizarre" that the second). > I believe that the primary purpose of RDF is as a language for > interchange of knowledge (not just structured markup), and as such, > the more explicitly that meaning can be expressed in that language > the better. Your are preaching to the converted. I think you and me have a quite similar view of the world in this respect. The way I'd model, is illustrated in http://www-db.stanford.edu/~melnik/rdf/datatyping/fig/rich_types.gif I'd model age using durations, and weight in terms of masses, not integers. Recall that our tidy/untidy discussion refers to *existing* applications that already made their choices about how they model the world. We don't want those Adobe, CC/PP etc. folks and API developers, who bought into RDF, do a whole lot of work required to recode their data and reprogram their apps. We agreed on that. Now we are working on making sure that their apps are forward-compatible to future Semantic Web standards, specifically, ontology and rule languages. I'm convinced that both tidy and untidy semantics work equally well. Staying with tidy requires certain developers to change their perception of how properties and values they used refer to the real world. Going for untidy requires APIs and apps to be adjusted. The choice is ours. Sergey
Received on Thursday, 26 September 2002 05:51:11 UTC