- From: Pat Hayes <phayes@ai.uwf.edu>
- Date: Tue, 6 Nov 2001 18:35:54 -0600
- To: Sergey Melnik <melnik@db.stanford.edu>
- Cc: w3c-rdfcore-wg@w3.org
>Pat Hayes wrote: >..... > > No, what I mean by 'neutral' is writing, say, that my shoe size is 10 >> without giving the datatyping of the literal. That is what is >> impossible in this scheme: to use a literal before, or independently >> of, giving that literal a datatype (because in this scheme, as I >> understand it, the ONLY places that a literal label are allowed are >> the object ends of triples whose predicate is a datatype name). That >> is why I say it is only a notational variation on the simple idea of >> incorporating datatyping information in to the literal label itself. >> (BTW, I agree that this simple idea has its merits; but I think that >> if we are going to insist that literals *must* be explicitly >> datatyped, then we should impose this as an explicit syntactic >> constraint in the very syntax of the language.) > >In principle, I agree. However, if we stick a single type to each >literal we won't be able to deal with the cases where multiple literals >are required to determine the data value unambiguously > >_x rdf:type ComplexNumber >_x realDecimal "1.0" >_x imaginaryDecimal "2.0" > >as indicated in >http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0103.html OK, though this kind of example seems to have only just surfaced and I am not sure I like it. Would one call realDecimal a datatype?? I don't think I would. It seems to be a kind of product of a datatype mapping and a selector. (Are there any examples like this in XML schema?) If we are going to use bnodes, I would rather write this as: _x rdf:type ComplexNumber _x realPart _:y1 _x imaginaryPart _:y2 _:y1 xsd:number "1.0" _:y2 xsd:number "2.0" >....> >> Suppose I know that some property is represented by a literal "Y2F0", >> but have (as yet) no information about the appropriate datatyping to >> be used to interpret that literal. How would you represent that state >> of information? > >Oh, I think this leads us into inferencing, which seems to work just >fine in both approaches. In the above case, you might assert: > >_n propertyIDontKnowAnythingAbout "Y2F0" ??But now you seem to have broken your own rules. The proposal, as I understood it, was that this would be syntactically illegal: the only place a literal label can occur is at the sharp end of an arc labelled with a datatype name. Look, suppose you DID know that the base64 information, then this would be written as: _n propertyIDoKnowSomethingAbout _:1 _:1 unSticklishDatatype:base64 "Y2F0" Right? Now, how would you 'remove' the base-64 part of that graph and still write the information you want? There is no place to put the literal except at the sharp end of an arc that you now don't have a label for. > > And now suppose that I discover, perhaps from a >> different source, that the property in question is stated in terms of >> a base-64 integer encoding: how would you encode that information? > >(X propertyIDontKnowAnythingAbout Y) > -> (X base64 Y) You have to write it in RDF. (If we could write Prolog then all our problems would be solved. :-) >..... >> > > Like that proposal, it forces datatype information to be given >> >> explictly and locally, >> > >> >Yes, and I believe that's a strength rather than a weakness. I think it >> >is essential for the snippets of information dispersed on the Semantic >> >Web to be as precise as possible and as self-contained as possible. >> >> I think that is completely wrong-headed. The whole point of using an >> assertional language is to be able to put together pieces of >> information from various sources and draw reasonable conclusions from >> them. > >I guess we can argue about it, but this discourse is more of a >philosophical nature. There are many reasons to reduce the impact of >schemas on instance data. Evolution of schemas (that break instance >data) and archival purposes are probably the most obvious ones. Another >crucial issue is that the developers often do not properly understand >the semantics of the (graph) encodings that they design. Part of the >problem is, of course, that they do not even bother to develop a schema >for their data, let alone to capture it in a machine-readable form... > >> If we start declaring that certain kinds of information >> *must* be linked to others, or *can only* be inferred in a certain >> sort of way, we might as well use Java. > >I don't think that's a pro argument, Pat. (_x size "10") *can only* be >processed correctly if the tool got the schema that defines the property >"size" and understands the schema language, right? Yes, of course, but that's not the issue. There are two pieces of information needed to interpret a literal: the literal label itself, and the datatype mapping that is supposed to be used to interpret it. All this discussion has been about how these two different pieces of information can be encoded in some form that allows them to be brought together reliably. One way is to just insist that they always occur together in some sense, either by making a kind of composite label out of them, or by providing a bnode to which the two kinds of information must be attached in a certain way. The MT extension, in contrast, allows them to be separated and treated as separate assertions, and uses RDFS inference machinery to make the connection between them. It seems to me that this flexibility is more desirable, and more in the RDF spirit, than any scheme which imposes strict syntactic constraints on the form of RDF graphs, and which breaks if those constraints are violated. That is what I meant by the above remark. > >> I would agree that IF the information is available locally then of >> course it should be possible to use it locally. But the reasoners >> should also be able to function even when all the information is not >> available locally; they should not barf just because the information >> provided is incomplete. > >I looks to me that our discussion is probably even more metaphysical >than I originally thought. Let me try to put a different spin on your >approach (of course, this is not the way you see it, I understand), >which I believe allows all intefencing to work just the way you'd expect >it. Assume that > >_x size "10" > >asserts there is a certain relationship between I(_x) and the literal >value I("10"). (This is the `straightforward' interpretation). In other >words, the property `size' with a literal "10" hanging off it restricts >the number of valid interpretations of _x. It does if the interpretation is a datatype interpretation, right. >Now, imagine that there is a >rule somewhere, in some schema that "breathes in life" into the above >statement: > >(X size L) > -> exists N: (X shoeSize N), > (N rdf:type Integer), > (N xsd:int L) . ? What are you talking about? Such rules aren't expressible in RDF. >In this light, the original statement (_x size "10") can be viewed as a >syntactic construction, It IS a syntactic construction. In fact it is an RDF graph. >which is interpreted using a inference rule into >something that has an adequate semantic interpretation. But it is an adequate semantic representation already. It is perfectly meaningful (if ambiguous), and when suitably conjoined with, or extended by, appropriate information about the datatype, and when interpreted in way that conforms to the datatype semantics , then it says exactly what one would expect it to say, viz. that x's shoe size is ten. Even if you don't like the semantic account, that first triple is perfectly well-formed RDF; so we should, in all conscience, either undertake to say what it means, or make it illegal. I am still not quite clear what your attitude to this point is; do you intend that 'rule' to be a kind of syntactic transformation or constraint on RDF graphs? What if I write a graph and refuse to apply the rule: have I made an error of some kind? What kind? >In particular, >the property `shoeSize' would connect a shoe with an integer, rather >than with a literal value. If the datatype is chosen appropriately (eg xsd:integer), the literal value of the literal "10" IS an integer. In fact it is ten. > >I think that both approaches on the table are equivalent in the sense >that they can ultimately provide a very similar high-level >interpretation to any given piece of RDF instance data, although using >quite different schemas and a different perspective. My feeling is, >however, that by giving a reasonable (not straightforward) >model-theoretic interpretation to (_x size "10") you finesse the fact >that this statement is "syntactic matter" that needs further >explanation, i.e., my means of rules. I don't finesse it, I deny it outright. It is not syntactic in any way: it says that _x's shoe size is a literal value. Until we know what datatyping scheme to use, we do not know which literal value, so it is of course ambiguous taken in isolation; but then so is much of RDF, by its very nature as an assertional language. But I don't see that triple as being more "syntactic" than a triple consisting entirely of urirefs. > >The two reasons I am hesitant to buy your suggestion are that a) it >reminds me of taking a random piece of XML and "interpreting" it as RDF >using rule-based transformations, I fail to follow this point. >and b) it transfigures the model >theory in such a way that I (and maybe others with similarly limited >mental abilities) have hard times understanding it - to the contrary of >my belief that MT is there to help clarify things. Well, I confess that I have not done a very good job of explaining the idea intuitively, but it really is not very complicated once you get used to it. It doesn't change any of the rest of the model theory, by the way: the extra machinery only comes into play when literals are around. And I would say that while the MT is there to help clarify things, it does that by being precise, rather than by being simple. In any case, this debate isn't really about rival model theories, but about rival treatments of literals. The real point of the MT extension is that it shows that the 'straightforward' interpretation of a triple with a literal in it can indeed be made to work; we don't need to transform or rewrite such triples; we can just leave them as they are, and they work just fine, and have their 'straightforward' meanings. > >BTW, the above rule-based approach addresses your concern that local >typing information needs to be provided, does it? I'm really not sure what this rule-based approach means. There aren't any such rules in RDF, right? Pat -- --------------------------------------------------------------------- IHMC (850)434 8903 home 40 South Alcaniz St. (850)202 4416 office Pensacola, FL 32501 (850)202 4440 fax phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes
Received on Tuesday, 6 November 2001 19:36:07 UTC