- From: <Patrick.Stickler@nokia.com>
- Date: Wed, 14 Nov 2001 13:49:01 +0200
- To: phayes@ai.uwf.edu, w3c-rdfcore-wg@w3.org
> -----Original Message-----
> From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
> Sent: 14 November, 2001 03:20
> To: Stickler Patrick (NRC/Tampere)
> Cc: w3c-rdfcore-wg@w3.org
> Subject: RE: datatypes and MT
>
> ...
> >We need to keep in focus the fact that "10" is a lexical
> >representation of the value, not the value.
>
> Well, that is one of the issues being discussed. In the S proposal,
> the value of the literal "10" would be the string "10".

Then that would be incorrect, as the value is 'ten'. But my point is
that we should not be trying to address the mapping from "10" to 'ten'
in RDF beyond the association of the literal "10" with a class which
denotes a data type that defines a mapping from "10" to some value in
the value space identified by that data type.

> >Yes, folks are not saying that the shoe size is a string. They
> >are expecting that lexical form to be mapped to a value in a
> >particular value space.
>
> Right, which is what the P(++) datatyping proposals try to do.

As is the X proposal.

> >The same is true of the lexical representation of literals in a
> >programming language.
> >
> >   protected Integer shoeSize = 10;
> >
> >is not saying the shoeSize is the character sequence '10' (even
> >though there are no quotes), but the value ten.
>
> Right, but that lack of quotes is significant.

It's an issue entirely within the lexical space of the data type.
RDF has its own lexical space for its "primitives" and literals are
a primitive, and we enclose literals in quotes so that they are
known to be RDF Literals, but that doesn't mean that the use or
meaning of quotes in any other lexical space to delimit lexical
forms is relevant. It's not.

> In LISP for example,
> supplying the character sequence '10' as an argument indicates the
> value ten, while supplying the character sequence ''10' indicates
> that the value is the character sequence '10'.

It's irrelevant how LISP delimits its lexical forms.
Lexical forms (for any data type) are delimited in RDF by quotes.
That same lexical form in some other encoding (LISP code, Perl code,
C code, etc.) may have other delimiters.

> >The difference between e.g. Java and RDF is that Java actually
> >interprets the lexical forms before it uses them, but RDF just
> >holds on to them as-is.
> >
> >The mistake here is to somehow think that RDF will interpret
> >them in any way.
>
> Right. But I believe that nobody is making that particular mistake.
> The discussion is about whether a literal label should be taken to
> *denote* the string or the value it has under a datatyping scheme.

The literal denotes a value in a value space, just as any lexical
form in a lexical space of a data type denotes a value in the value
space of the data type. I thought that was crystal clear.

Whether the actual mapping from lexical form to value is defined in
terms of RDF constructs, within RDF Space, is a separate issue, and
one that I don't think RDF should address. IMO, all that RDF should
address is the association of RDF Literal to the RDF Class denoting
the data type. And in fact, this is identical to the means of
associating type with any resource whatsoever, Literal or otherwise.

> ....
> >> I just meant to avoid the implication that they were to be
> >> interpreted as strings, since that interpretation begs the
> >> question. If we can agree that XML syntax in general should not
> >> be interpreted using logical canons of notational rigor, then we
> >> can leave the quote marks there and not call them quotes.
> >
> >Exactly. No interpretation is going to happen in RDF.
>
> We are at cross purposes. Interpretation, in the sense I was using
> it, is not something that HAPPENS.

Defining in RDF in any way that "10" maps to 'ten' within the scope
of the data type xsd:integer is interpretation of the literal, and
should not be done by RDF; at least not as part of the core model.
All that RDF should do is allow one to say that "10" is a lexical
form corresponding to some value in the value space of the data
type, not how that mapping occurs or what the mapping is.

> >They *are* strings.
> >Leave the quote marks to indicate they are strings.
>
> They don't need quote marks to indicate that they are strings.

They might, if there is to be a lexical distinction in the notation
between literals and other terms. E.g. in NTriples, we can (I think)
write a local ID just as the value, with no quotes, so if we have
an ID(foo) and a Literal(foo) then we use quotes to differentiate
them insofar as the notation grammar is concerned:

   _:X foo "foo" .

The lexical form for the literal "foo" is only the three characters
'f', 'o', and 'o'. The quotes are a mechanism of the notation.

> The
> quote marks, if interpreted as genuine quotations, would indicate
> that those strings denoted other strings, eg the string of four
> characters on the next line:
> "10"
> is often understood as denoting the string of two characters
> on the next line:
> 10
> which in turn is usually taken to denote the number ten.

Why are we going in circles about stuff that computer science has
solved eons ago? If you need to include a significant character of
the notation as a literal character, you escape it, and the
application which knows how to parse that notation unescapes it
during parsing. So

   "foo"      ->  ( f o o )
   "\"foo\""  ->  ( " f o o " )

etc.

I don't see that this is even an issue... This is just about the
notation used, not about RDF Literals.

> It is this 'quotation' interpretation that is under discussion, and
> that is accepted by the S and DC conventions but not by the P(++)
> ones. The question doesn't arise in the X proposal, since literals in
> this sense are not used.

BUT THIS HAS NOTHING TO DO WITH SOLVING THE PROBLEM OF DATA TYPING!

Are we defining how resources are typed or designing a new notation?!
This is why I gave that verbose graph abstraction in my proposal, to
illustrate the data model, *not* some convenient notation which has
to define a lexical grammar to be interpreted in terms of that
abstraction.

Deciding whether to write

   X ---foo---> "bar" ---type---> "bas"

or

   X ---foo------> _:1:bar
   _:1 ---type---> "bas"

and arguing whether "bar" and the suffix 'bar' in _:1:bar are the
same literal is of course important, but is not defining the actual
model, it's just playing around with notations.

OK, to be fair, maybe I'm missing all that is being expressed in the
mathematics going back and forth, but I'd like to see us get past
the notation issues, choose one notation, and define the models in
terms of it.

The fact that folks are trying to define new notations with complex
terms such as _:1:bar and so forth suggests that we *all* are
talking about a layer underneath the current resource-centric graph
model, and that we should just define the meta-structures of that
lower level in terms of NTriples, such as I'm doing with my X
proposal.

> >> Ah. So this would be OK, would it?
> >>
> >> aa eg:prop _:x .
> >> _:x xsd:integer "10"
> >> _:x xsd:integer "0010"
> >>
> >> That does make sense, I agree.
> >
> >But, just to clarify here, RDF is not determining that
> >these two lexical forms map to the same value in the
> >xsd:integer value space.
>
> It certainly is making this claim!

NO! NO! NO!

It only is *asserting* that both literals "10" and "1010" map to the
same value in the value space of xsd:integer (though how could they!
One denotes the value 'ten' and the other denotes the value 'one
thousand and ten'!)

The key word above is "determine". RDF does not determine the
equality of lexical forms, even if an RDF statement or construct
might assert it. Just as rdfs:range does not determine the type of
a value, it just asserts that the value must be of a certain type.
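To make the determine/assert distinction above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not anything defined by RDF or by any of the proposals in this thread: `DATATYPES`, `TypedLiteral`, `same_lexical_form` and `same_value` are hypothetical names, and Python's built-in `int` merely stands in for the xsd:integer lexical-to-value mapping. The literal itself only carries a lexical form and the name of a data type; mapping to a value is left to a separate, datatype-aware function.

```python
from typing import Callable, Dict

# Hypothetical registry: data type name -> lexical-to-value mapping.
# The keys mirror the xsd: names used in this thread; the mappings are
# plain Python stand-ins, not a real XML Schema implementation.
DATATYPES: Dict[str, Callable[[str], object]] = {
    "xsd:integer": int,
    "xsd:string": str,
}


class TypedLiteral:
    """A lexical form paired with the name of its data type.

    The lexical form is held as-is; nothing here interprets it.
    """

    def __init__(self, lexical_form: str, datatype: str) -> None:
        self.lexical_form = lexical_form
        self.datatype = datatype


def same_lexical_form(a: TypedLiteral, b: TypedLiteral) -> bool:
    """Comparison that needs no knowledge of any data type scheme."""
    return a.datatype == b.datatype and a.lexical_form == b.lexical_form


def same_value(a: TypedLiteral, b: TypedLiteral) -> bool:
    """Comparison done by an application that knows the data type mapping."""
    if a.datatype != b.datatype:
        return False
    to_value = DATATYPES[a.datatype]
    return to_value(a.lexical_form) == to_value(b.lexical_form)


ten = TypedLiteral("10", "xsd:integer")
padded_ten = TypedLiteral("0010", "xsd:integer")
thousand_ten = TypedLiteral("1010", "xsd:integer")

print(same_lexical_form(ten, padded_ten))  # different lexical forms
print(same_value(ten, padded_ten))         # same value under the integer mapping
print(same_value(ten, thousand_ten))       # different values
```

In this sketch only `same_value` depends on the data type registry; everything else can hold and compare literals without knowing what any lexical form maps to.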
> The use of the common bNode _:x
> asserts that there is one thing that is related to "10" and to "0010"
> by the same xsd:integer property.

*What* shared bNode?! The node _:x denotes the property value, the
object of the statement. That node itself has two properties which
associate two lexical forms (literals) with that property value, but
the literal nodes are not the same.

I guess I find the above model (where data types are properties)
just a bit too weird to follow consistently.

One cannot achieve a merge of variant lexical forms which map to the
same value (as suggested by the first representation) without
knowledge about that data type, therefore such an approach is
unacceptable as it does not permit RDF to remain neutral with regard
to data type schemes.

Let's re-express it as follows:

   aa eg:prop _:1:"10" .
   aa eg:prop _:2:"1010" .
   _:1 rdf:type xsd:integer .
   _:2 rdf:type xsd:integer .

Now, we have in fact two property values for the eg:prop property
associated with aa, and each value has its own type and lexical
form. And in this case, the values denoted by the two literals may
or may not be the same value in the xsd:integer value space.

Patrick
Received on Wednesday, 14 November 2001 06:49:37 UTC