- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 21 Feb 2002 11:07:08 +0200
- To: RDF Core <w3c-rdfcore-wg@w3.org>
--
Patrick Stickler                Phone: +358 50 483 9453
Senior Research Scientist       Fax:   +358 7180 35409
Nokia Research Center           Email: patrick.stickler@nokia.com

------ Forwarded Message
From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 21 Feb 2002 10:17:37 +0200
To: Pat Hayes <phayes@ai.uwf.edu>
Cc: Jos De_Roo <jos.deroo.jd@belgium.agfa.com>,
    "Brian McBride <brian_mcbride" <brian_mcbride@hp.com>,
    ext Graham Klyne <Graham.Klyne@mimesweeper.com>
Subject: Re: Even more simplified datatyping proposal

On 2002-02-21 2:55, "ext Pat Hayes" <phayes@ai.uwf.edu> wrote:

> (MOST of this is argumentation. There is one good idea in here,
> though, so to cut to the chase find ***** at the end.)
>
>> On 2002-02-20 19:01, "ext Pat Hayes" <phayes@ai.uwf.edu> wrote:
>>
>>
>>> The problem here is making sure that its the literal that denotes one
>>> thing and the bnode that denotes the other. What stops an
>>> interpretation having the literal denote the value and the bnode
>>> denote the lexical form? They are all just semantic objects in the
>>> set at this point, not segregated by some kind of intrinsic type.
>>
>> The answer is quite simple, RDF provides no *representation* of
>> values. Period. Ever. It may provide a bNode denotation of some
>> value, but never, ever a representation.
>>
>> If it is a literal, it is *always* a lexical form.
>
> Im not following you. Sure it IS a lexical form. The question is,
> what (if anything) is it required to denote in an interpretation?
>
> Through yesterday Ive been assuming that it always denotes itself, in
> effect. The version I sent today says that it can denote anything
> (unless there is datatyping information which fixes it somehow),
> which makes a literal exactly like a bnode with an attached (but
> meaningless) label.

I think this is where we may be having a disconnect.

Datatyping doesn't "fix" the meaning of a literal. It provides
a context for interpreting the literal as a lexical form.
If the meaning were "fixed", we'd need untidy literals, since
different range constraints could fix the same literal to
lexical forms of different datatypes.

It's very very important, I think, to keep clear that datatyping
*interpretation* does not happen in the graph -- is not expressed
in the graph. The graph provides the pieces of information that
allow for a consistent, unambiguous interpretation, but that
interpretation happens above/outside/beyond the graph.

Thus, the literal always denotes a literal. It may, in the context
of a datatyping interpretation, be taken to represent the lexical
form of a value, from which the mapping to the value is clear, but
it never ever *is* the datatype-specific lexical form.

That may seem like a paradox at first reading, but it's not, really.

All that the graph syntax is denoting are literals, datatype URIrefs,
and value bNodes for the datatype idiom. The interpretation that
treats a literal as a lexical form and resolves the actual value
denoted by a value bNode (or implicitly if no bNode) is part of the
interpretation, and that interpretation is not reflected explicitly
in the graph.

>> Outside the context of datatyping, a literal is just a literal.
>
> And it denotes.....what?

A literal. With no defined datatype interpretation. Any interpretation
is application specific and not specified by any official
RDF/RDFS/RDFDT rec.

> (Anything, like a bnode?

It's not quite the same as a bNode, since it does have lexical
distinction. One cannot differentiate bNodes by any quality of
the node itself, since it's blank. Literal nodes are distinguishable
from one another, even if they do not have any fixed interpretation.

> Or nothing at all?

Insofar as RDF is concerned. Yes. Nothing at all.

> If the latter, how could any triple containing a literal ever be
> true?)

How could it be false? It would only be true or false according to
extra-RDF interpretation.

>> The RDFS spec does not include rdfs:Literal as a subclass of rdfs:Resource,
>> and I thought it was agreed that literals do not denote resources, only
>> bNodes and URIrefs do.
>
> Thats not my understanding. The MT leaves the issue open. It
> certainly seems reasonable to be able to say that literals denote
> *something*, and it seems to me that everything is a resource.

Well, if (and I'm not saying it should be) rdfs:Literal is a subclass
of rdfs:Resource, then that means that we only lose the ability
to restrict property values to the bNode idioms, but can still
restrict them to the inline idiom.

I.e. the following still works as before

   ppp rdfs:range ddd .
   ppp rdfs:range rdfs:Literal .

Perhaps there is some other way to achieve the same results as
the previously suggested

   ppp rdfs:range ddd .
   ppp rdfs:range rdfs:Resource .

Perhaps also giving bNodes and URIref nodes each a class of their
own, in the same vein as rdfs:Literal?

>> Those datatype schemes don't have literals as values. They have
>> strings as values. That is not the same.
>
> Sure its the same. What is a literal if not a string? (Just look at
> it, there's the string right there, between the quote marks.) But see
> below.

A literal may have a string representation, and a literal may be
string equal to some string value of a datatype, but that does
not mean that any member of a value space of a datatype has a
literal representation in the graph. They are *not* the same.

>> Literals are constructs of the RDF graph, not of a given
>> datatype. Any intersection in nature between literals and
>> strings is insignificant.
>
> Well, I think that all that Dan C really wants is a licence to use
> literal-string matching to test identity in simple in-line cases.
> That requires only that values and literal forms are 1:1,

Right. It requires that they be string-equal, but not the same 'thing'.
And this is an exceptional case, not the norm, so it shouldn't drive
the datatyping model (even if it is accommodated by datatyping).

>> Thus, as in the case of xsd:string, the fact that every
>> lexical form is also string-equal to the value it denotes
>> is a characteristic of the datatype, not a feature of
>> RDF or the expression of RDF datatyping in the graph.
>
> I agree with that.
>
>> A literal is either a literal or a lexical form. Never an
>> actual datatype value. RDF provides no representation whatsoever
>> for datatype values (even if for some datatypes, that would
>> be technically feasible).
>
> And with that.

Cool. This is good, because the above two points are crucial.

>> The RDFS spec makes no such subclass
>> assertion that I can find.
>
> I think that it follows from the MT. I need to check that.

Please do (and also, please don't make that assertion ;-)

>> You don't equate a literal node with an instance of rdfs:Literal?
>
> No, never have. That interpretation doesn't make sense, because
> literal NODES are in the graph, not in the semantic domain. See
> http://www.w3.org/TR/rdf-mt/#literalnote

Then what the heck is the purpose/significance of rdfs:Literal???

>> I thought that we agreed that literals denote literals, always.
>
> Well, I thought so too, and I wrote the datatype thing up on that
> basis; but now I have you, Graham, Brian and uncle Tom Cobbley and
> all screaming that they cannot live with that decision.

Not me. See above ;-)

> You want the
> denotation of a literal to be influenced by datatyping information.

Nope. Not at all. A literal always denotes a literal.

> I wish y'all would make up your minds :-)

I did when we agreed a literal denotes a literal, and haven't
changed it since.

>> Just because a literal may be interpreted in a datatype context
>> as a lexical form does not mean it's not a literal in the graph.
>
> Being a literal in the graph is a matter of syntax. Of course it IS a
> literal in the graph. But what does such a literal denote in an
> interpretation??

It depends on the context. A literal itself denotes a literal.
A literal in a datatyping context represents a lexical form -- but
that representation is part of the interpretation, not part of
the literal.

>> Now, two datatypes may have the same lexical form, even if they
>> do not give it the same interpretation, so having tidy literals
>> is no big deal, because all the literal node denotes is the
>> lexical representation.
>
> But look, if you say that, then you have already said that
>
> jenny ex:age "35" .
>
> asserts that Jenny's age IS the lexical representation. You are now
> STUCK with that, and you can't influence it by talking about ranges
> or datatypes.

No. See above. An interpretation does not fix the meaning of a literal.

If there is no datatype range asserted for ex:age (or it is ignored in
the interpretation) then Jenny's age is *interpreted* to be "35".

If there is a range defined as xsd:integer, then in that context,
Jenny's age is *interpreted* to be 35.

In either case, the denotation of the literal node "35" is the
literal "35".

I don't see why this is difficult. It's consistent. It's unambiguous.
Applications always know how to interpret literals in a datatype
context.

No, you don't know *in the graph* which value a given literal denotes
and a given literal *in the graph* does not explicitly denote a value --
and that's why a literal is not a resource, it is just a syntactic
construct that contributes to some interpretation which is meaningful
to the application, but that interpretation is not explicit *in the
graph*.

The convergence proposal (your summary3) is one way to have such
interpretations reasonably explicit in the graph, but it's a
camel (for the typical user).

The datatype-as-union proposal says don't bother trying to capture
the interpretation in the graph, leave it "up there" above the
graph, and gives us a lean and mean arabian (for the typical user).
*Both* of those proposals provide for consistent, unambiguous, and
functional interpretation of datatyping for applications. They
simply differ in how much of that is reflected in the graph itself.

Eh?

> That's what the damn literal denotes, end of story. So
> (I presume) you DONT want to say that, right?

Right.

> What you want to say
> is, that what the literal denotes varies from interpretation to
> interpretation, ie it has flexibility, so that you can then use other
> assertions (eg about drange) to rule out the cases you want to
> exclude and rule in the ones you want to have as a legal
> interpretation that satisfies the datatyping. If you fix the
> denotation rigidly up front, then you haven't got any flexibility left.

Right. Exactly. (though we don't need drange anymore)

>> It needs the context of the datatype
>> to provide the actual interpretation to a value.
>
> Context is irrelevant if you have already fixed the denotation.

True, but don't assert that denotation of literals is fixed
to a given datatype interpretation. I propose we maintain flexibility
in the denotation and leave denotation up to the datatype-context
specific interpretation.

>>
>> The literal "5" may be a lexical form of countless datatypes,
>> but in isolation, it's just a lexical representation, just a
>> literal, just a string. It does not in and of itself denote
>> the value 5.
>
> OK, fine. So what DOES it denote? Its got to denote *something*, or
> else every triple including it is false (see the basic RDF MT).

Insofar as RDF MT truth is concerned, it denotes a literal, always.

But MT truth and datatyping truth are not the same thing, I think.

Like I said above, the graph denotes literals, datatype URIrefs,
and value bNodes. Not lexical forms. So any occurrence of a literal
is true, insofar as the MT is concerned -- i.e. it *is* a literal,
even if its significance in datatyping interpretation may differ
from context to context.

>> You need the datatype context to achieve that
>> mapping and that happens "above" the graph, not in it.
>
> That sounds like what Peter P-S and I tried to do long ago, with an
> 'external' datatyping interpretation system that kind of glommed onto
> the literals and got added to a conventional RDF interpretation. But
> that was woefully complicated, and as things turned out it didn't
> work in any case when we got down to the details. I got so
> discouraged I gave up, you may recall.

It's going to happen above the graph anyway, because all real
validation of datatyping knowledge must be performed by some
application using datatype specific understanding and real
system-internal representations of values.

So, even if it's a bitch to make work from the viewpoint of KR,
it will work just fine for what most folks need RDF datatyping
for, namely: what the !@#&*(!*$ is this value.

The lighter, union-based, contextualized extra-graph interpretation
approach is easy for users to approach/use, is 100% clear to
application developers what is meant, and provides for some degree
of MT support, insofar as what is actually denoted in the graph
(literals, datatype URIrefs, and value bNodes).

The value equality problem, as with all resource equality, is
something that has to be worked on, and likely will require
functional layers above the basic *declaration* of datatyping
knowledge provided by the RDF datatyping idioms. We're not going
to solve that this go-round. What we have to capture now is simply
which value we are talking about, and either of the proposals does
that.

Since we're going to have those higher layers later anyway, I think
the lighter contextualized approach gives more leeway to those later
efforts to provide machinery for the context specific interpretations
and may actually make later solutions to the value equality problem
easier to achieve. The more the interpretation is explicit in the
graph now, the fewer options remain for later.
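
As a rough illustration of interpretation happening "above" the graph, the Jenny example can be sketched in Python. This is only a sketch under stated assumptions: the triple-list representation, the `LEXICAL_TO_VALUE` table, and the `interpret` function are hypothetical names invented for the example and are not part of any RDF, RDFS, or datatyping proposal.

```python
# Illustrative sketch only. GRAPH-as-list-of-tuples, LEXICAL_TO_VALUE and
# interpret() are hypothetical; no RDF specification defines them.

XSD_INTEGER = "xsd:integer"
XSD_STRING = "xsd:string"
RDFS_RANGE = "rdfs:range"

# Datatype-specific lexical-to-value mappings live in the application,
# not in the graph.
LEXICAL_TO_VALUE = {
    XSD_INTEGER: int,
    XSD_STRING: str,
}


def interpret(graph, subject, prop):
    """Return the application-level value for (subject, prop).

    The literal stored in the graph is never changed; only the
    interpretation varies with the range context found in the graph.
    """
    literal = next(o for s, p, o in graph if s == subject and p == prop)
    ranges = [o for s, p, o in graph if s == prop and p == RDFS_RANGE]
    for datatype in ranges:
        if datatype in LEXICAL_TO_VALUE:
            return LEXICAL_TO_VALUE[datatype](literal)
    # No known datatype context: the literal is just a literal.
    return literal


untyped = [("jenny", "ex:age", "35")]
typed = untyped + [("ex:age", RDFS_RANGE, XSD_INTEGER)]

print(repr(interpret(untyped, "jenny", "ex:age")))  # '35'
print(repr(interpret(typed, "jenny", "ex:age")))    # 35
```

In both graphs the literal node is the same string "35"; only the result of the application-side lookup differs once a range assertion is present.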

>> For the datatype triple idiom, we have a nice bNode to
>> uniquely denote that value. For the inline idiom, there
>> is no denotation of the value in the graph. But the
>> actual value is a product of extra-RDF interpretation, not
>> intra-RDF inference (if that makes any sense whatsoever).
>
> But its got to be connected to RDF inference in some ways. It has RDF
> consequences, for example. So its not enough just to invoke a kind of
> external magic.

See above. We can't get away from that magic just now. Either the
magic is needed to do value equality merging in the graph or
query by value, etc.

We just won't at this time be able to capture the totality of
typed data values in the RDF graph -- nor do I think we should.

Datatyping and full interaction with datatype values will always
be tied to the application space. That is unavoidable.

>> So, taking the case of Dan's wish to use inline literals to
>> denote just the literal, just the string representation, is fine,
>> and such literals will have globally consistent meaning, but
>> only as long as some range constraint doesn't assert a datatyping
>> context that gives it some other interpretation (not denotation,
>> just interpretation) *and* an application heeds that datatyping
>> interpretation.
>
> Well, the sticking point is going to be that 'other', because that's
> where the whole thing goes nonmonotonic.

Yes. Exactly. RDFS range and domain constraints *are* non-monotonic.

Yup. Because I can make long range assertions about your knowledge
that you did not make. Once we merge our graphs, we get different
interpretations than for each graph in isolation.

C'est la vie.

> *****
> I think it is better to hold a gun to Dan's head (or maybe its the
> Dublin Core's head) and insist that if he wants to say literals
> denote themselves (or strings, if you like), then that is a
> datatyping decision, and he should be explicit about it. All he has
> to do is to add
>
> rdfs:dlex rdfs:subPropertyOf xsd:string .

Pat, you're still in the convergence proposal, not the "Even more
simplified" proposal. There is no rdfs:dlex.

But I agree. If Dan, or DC or anyone wants to say that the range
of their properties are strings, then they should say so explicitly,

   dc:title rdfs:range xsd:string .

Though some folks want to say that the range of their properties
are strings that are string-equal with lexical forms from a given
datatype, and to do that they can use a range intersection:

   dc:title rdfs:range xsd:date .
   dc:title rdfs:range rdfs:Literal .

which is, I think, really what Dan wants to do.

> to his graph, and he's got it locked down tight: every time he uses a
> literal anywhere in that graph, it's got to be interpreted using
> xsd:string. Its not a default or anything else underhand or
> 'magical': if he tries to add any other datatyping information to his
> graph with this in it, he's going to get an explicit clash. Now
> everyone is wearing their datatyping assumptions on their sleeves.

Agreed.

Patrick

--
Patrick Stickler                Phone: +358 50 483 9453
Senior Research Scientist       Fax:   +358 7180 35409
Nokia Research Center           Email: patrick.stickler@nokia.com

------ End of Forwarded Message
Received on Thursday, 21 February 2002 04:05:37 UTC