- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Mon, 05 Nov 2001 16:05:10 -0800
- To: Pat Hayes <phayes@ai.uwf.edu>
- CC: w3c-rdfcore-wg@w3.org
Pat Hayes wrote:
>
> >Pat Hayes wrote:
> >> ...
> >> >>Now, this seems to me to have a fatal flaw, which arises from the
> >> >>fact that the value spaces of two different datatypes might
> >> >>overlap. For example, suppose that there are datatypes xxd:octal
> >> >>and xxd:decimal, then the following would seem to be perfectly true:
> >> >>
> >> >>_:1 rdf:type xxd:octal
> >> >>_:1 rdf:type xxd:decimal
> >> >>_:1 rdf:value "32"
> >> >>_:1 rdf:value "26"
> >> >
> >> >
> >> >But that is not how Sergey would write it. He is proposing:
> >> >
> >> > _:1 xxd:octal "32" .
> >> > _:1 xxd:decimal "26" .
> >>
> >> Oh, I see. That does indeed avoid this problem, but it also throws
> >> away the advantages of the bnode way of doing things, since now it is
> >> impossible to be neutral about datatypes.
> >
> >Why? If you want to be neutral, use a bNode w/o any arcs attached to it
> >(if I understand what you mean by neutral).
>
> No, what I mean by 'neutral' is writing, say, that my shoe size is 10
> without giving the datatyping of the literal. That is what is
> impossible in this scheme: to use a literal before, or independently
> of, giving that literal a datatype (because in this scheme, as I
> understand it, the ONLY places that a literal label are allowed are
> the object ends of triples whose predicate is a datatype name). That
> is why I say it is only a notational variation on the simple idea of
> incorporating datatyping information in to the literal label itself.
> (BTW, I agree that this simple idea has its merits; but I think that
> if we are going to insist that literals *must* be explicitly
> datatyped, then we should impose this as an explicit syntactic
> constraint in the very syntax of the language.)

In principle, I agree.
However, if we stick a single type to each literal we won't be able to
deal with the cases where multiple literals are required to determine
the data value unambiguously

  _x rdf:type ComplexNumber
  _x realDecimal "1.0"
  _x imaginaryDecimal "2.0"

as indicated in
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0103.html

>
> > > This forces the datatyping
> >> information to be attached directly to the literal;
> >
> >Right.
> >
> >> the only place
> >> literals can occur in an RDF graph is at the object end of links
> >> labelled with a datatype.
> >
> >True, although I personally do not see any problem with allowing
> >
> >"cat" base64 "Y2F0"
> >
> >either.
> >
> >> This seems to me to be simply a variation
> >> on the idea of incorporating the datatype label into the literal
> >> itself, eg by having literals be pairs of a datatype and a string.
> >
> >Not quite. Notice that we can refer both to the literal "typed" in such
> >a way, and its type by means of arcs in RDF graphs (and, for example,
> >provide additional information about the datatype or describe how the
> >literal/string is represented using the base64 encoding, if that's all
> >we know).
>
> Suppose I know that some property is represented by a literal "Y2F0",
> but have (as yet) no information about the appropriate datatyping to
> be used to interpret that literal. How would you represent that state
> of information?

Oh, I think this leads us into inferencing, which seems to work just
fine in both approaches. In the above case, you might assert:

  _n propertyIDontKnowAnythingAbout "Y2F0"

> And now suppose that I discover, perhaps from a
> different source, that the property in question is stated in terms of
> a base-64 integer encoding: how would you encode that information?

  (X propertyIDontKnowAnythingAbout Y) -> (X base64 Y)

> And then how would I be able to put these two pieces of information
> together into a single graph, and be able to draw the obvious
> conclusion?
> Remember that it is not valid to merge bnodes from two
> different RDF graphs.

I think using the above rule we'd be able to derive

  _n base64 "Y2F0"

right?

> > > Like that proposal, it forces datatype information to be given
> >> explictly and locally,
> >
> >Yes, and I believe that's a strength rather than a weakness. I think it
> >is essential for the snippets of information dispersed on the Semantic
> >Web to be as precise as possible and as self-contained as possible.
>
> I think that is completely wrong-headed. The whole point of using an
> assertional language is to be able to put together pieces of
> information from various sources and draw reasonable conclusions from
> them.

I guess we can argue about it, but this discourse is more of a
philosophical nature. There are many reasons to reduce the impact of
schemas on instance data. Evolution of schemas (that break instance
data) and archival purposes are probably the most obvious ones. Another
crucial issue is that the developers often do not properly understand
the semantics of the (graph) encodings that they design. Part of the
problem is, of course, that they do not even bother to develop a schema
for their data, let alone to capture it in a machine-readable form...

> If we start declaring that certain kinds of information
> *must* be linked to others, or *can only* be inferred in a certain
> sort of way, we might as well use Java.

I don't think that's a pro argument, Pat. (_x size "10") *can only* be
processed correctly if the tool got the schema that defines the property
"size" and understands the schema language, right?

> I would agree that IF the information is available locally then of
> course it should be possible to use it locally. But the reasoners
> should also be able to function even when all the information is not
> available locally; they should not barf just because the information
> provided is incomplete.

It looks to me that our discussion is probably even more metaphysical
than I originally thought. Let me try to put a different spin on your
approach (of course, this is not the way you see it, I understand),
which I believe allows all inferencing to work just the way you'd expect
it.

Assume that

  _x size "10"

asserts there is a certain relationship between I(_x) and the literal
value I("10"). (This is the `straightforward' interpretation.) In other
words, the property `size' with a literal "10" hanging off it restricts
the number of valid interpretations of _x.

Now, imagine that there is a rule somewhere, in some schema, that
"breathes life" into the above statement:

  (X size L) -> exists N: (X shoeSize N), (N rdf:type Integer), (N xsd:int L) .

In this light, the original statement (_x size "10") can be viewed as a
syntactic construction, which is interpreted using an inference rule
into something that has an adequate semantic interpretation. In
particular, the property `shoeSize' would connect a shoe with an
integer, rather than with a literal value.

I think that both approaches on the table are equivalent in the sense
that they can ultimately provide a very similar high-level
interpretation to any given piece of RDF instance data, although using
quite different schemas and a different perspective. My feeling is,
however, that by giving a reasonable (not straightforward)
model-theoretic interpretation to (_x size "10") you finesse the fact
that this statement is "syntactic matter" that needs further
explanation, i.e., by means of rules.

The two reasons I am hesitant to buy your suggestion are that a) it
reminds me of taking a random piece of XML and "interpreting" it as RDF
using rule-based transformations, and b) it transfigures the model
theory in such a way that I (and maybe others with similarly limited
mental abilities) have a hard time understanding it, contrary to my
belief that MT is there to help clarify things.
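
To make the rule (X size L) -> exists N: (X shoeSize N), (N rdf:type
Integer), (N xsd:int L) a bit more concrete, here is a minimal Python
sketch of how a tool might apply it. It is only an illustration under
stated assumptions: triples are modelled as plain tuples of strings, the
function name apply_size_rule is hypothetical, and the existential N is
realized by minting a fresh blank node with a made-up "_:n<k>" label.

```python
from itertools import count


def apply_size_rule(triples):
    """Expand every (X size L) triple according to the rule
    (X size L) -> exists N: (X shoeSize N), (N rdf:type Integer), (N xsd:int L).
    The original triples are kept; the derived ones are added to them.
    """
    derived = set(triples)
    fresh = count(1)
    # Sort so that the numbering of fresh blank nodes is deterministic.
    for subject, predicate, literal in sorted(triples):
        if predicate != "size":
            continue
        # Fresh blank node standing for the existential N. The "_:n<k>"
        # naming is an assumption and is not checked against labels that
        # may already occur in the graph.
        node = f"_:n{next(fresh)}"
        derived.add((subject, "shoeSize", node))
        derived.add((node, "rdf:type", "Integer"))
        derived.add((node, "xsd:int", literal))
    return derived


graph = {("_:x", "size", '"10"')}
expanded = apply_size_rule(graph)
for triple in sorted(expanded):
    print(" ".join(triple))
```

Note that the literal "10" now appears only at the object end of the
datatype property xsd:int, while `shoeSize' points at the node N rather
than at the literal.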

BTW, the above rule-based approach addresses your concern that local
typing information needs to be provided, doesn't it?

Sergey
Received on Monday, 5 November 2001 18:38:21 UTC