- From: pat hayes <phayes@ihmc.us>
- Date: Mon, 6 Oct 2003 13:24:18 -0500
- To: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
- Cc: www-rdf-comments@w3.org
>> xmlsch-01 1.1. Design question, complexity (substantive)
>> ++++++++++++++++++++++++++++++++++++++++++++++
>>
>> you said:
>> [[
>> 1.1. Design question, complexity (substantive)
>> The introduction of pairs consisting of a lexical form and a type (or,
>> strictly speaking, a lexical form and a type label) seems at first glance to
>> complicate the RDF model somewhat. We have had the impression that in other
>> parts of RDF, typing is handled by adding further arcs and nodes. If the type
>> of a resource is identified by having an arc labeled rdf:type from it to (the
>> URI of) its (RDF) type, and if the type of an arc is similarly identified by
>> an arc, then surely a reason ought to be given for shifting to a different
>> method for typing literal strings. It seems like a dramatic shift in the
>> infrastructure of RDF, from "everything is a node, an arc, or a literal
>> value" to "everything is a node, an arc, or a typed literal value". Perhaps
>> not quite so dramatic, after all. But the question of design consistency
>> remains: why not "everything is a typed node, a typed arc, or a typed
>> literal"?
>> ]]
>>
>> Our resolution is:
>> xmlsch-01 as in 0252 with amendment.
>> i.e.
>> [[
>> The RDF Core WG interprets this comment as two questions and a comment:
>>
>> 1) Why is the type of a literal not described using a property arc, as
>> is done for other literals?
>>
>> 2) Having introduced typed literal nodes, why not introduce typed
>> resource nodes and typed property arcs as well
>>
>> 3) The WG should provide a rationale for this design in the
>> specifications
>>
>> Regarding question 1:
>>
>> This would require that literals be allowed as subjects of RDF
>> statements. This is not possible in current RDF/XML and would require
>> considerable change, beyond the scope of the WG, to support it. Further
>> it introduces problems of non-monotonicity in the semantics. A property
>> whose value is plain literal is currently taken to denote a sequence
>> characters. Adding a further statement could change that value to, say an
>> integer, invalidating previous inferences and breaking a fundamental tenet
>> of RDF.
>
>On question 1: Thank you; that helps clarify the design.
>
>> Regarding question 2:
>>
>> No requirement justified a change to the notion of a URIREF node or an RDF
>> arc.
>
>On question 2: In the final analysis this is your call and we don't
>plan to lie down in the road over it. For the record, though, we
>should record that we find your analysis unconvincing. The
>introduction of typed literals introduces a new idea into RDF, and it
>is obvious that this new idea has possible applications elsewhere in
>the design space. Your response amounts to saying that you chose not
>to work through the design implications of introducing this kind of
>type labeling, because it seemed possible to get by without such
>re-thinking. The result is that the new idea will continue to feel
>incompletely integrated into RDF; it will feel like a patch added as
>an afterthought rather than an integral part of the design.

Allow me to offer some more exposition which may help clarify this issue. I should first say that I am speaking here with my own voice, not that of the WG.

I am not sure what you mean by typed literals being a 'new idea'. It sounds from your response above as though you see the new idea here being the syntactic association of a type (ie a datatype) with a literal string, and that you therefore see a natural generalization of this associating-of-a-type as a kind of generic syntactic option. I do not see the situation in this way, and find the proposed generalization unmotivated and close to incoherent. (Our inability to understand the motivation for this proposal may be one reason for the brevity of our response.)

Although our response did not go into this in detail, it may be appropriate to point out that the original comment seems to embody a conceptual error, by conflating 'type' as in rdf:type with 'type' as in datatype. These are quite distinct ideas: the similarity in fact is little better than a pun. The property rdf:type indicates membership in a class, or application of a property. It is what is conventionally called 'member' or, in formal set theory, written using an infix epsilon, or, in conventional logical notations, written as the application of the rdf:type property value (the class or predicate) to the subject (the individual in the class). In other words, it has to do with membership in a class. Datatyping, in contrast, has to do with ways of interpreting lexical forms. These are different topics.

RDF blurs this distinction to some extent by allowing a datatype name to be used to refer to the value class, so that a well-formed typed literal denotes a value which is in - bears the rdf:type property to - that type considered as a class. This is purely a convention in RDF, however, and in fact itself has been the subject of controversy since the 'primary' interpretation of any datatype name has to be the lexical-to-value mapping rather than the value class.

The role of datatyping seems to be to provide for alternative ways of interpreting lexical items which have conventional interpretations in widespread use, such as numerals used to indicate numbers, calendar conventions used to indicate days and times, and so on. In conventional (pre-Web) formal languages these are often thought of as fixed, so that numerals always denote numbers using decimal conventions, strings are always indicated by enclosing quotations, etc. On the Web we need both to allow for a wider range of alternatives and to give a specific indication of the lexical-to-value mapping intended: hence the utility of the XSD structure.

Thus, the combination of a lexical string and a datatype can be seen as a kind of 'fixed name' which is required to denote its conventional meaning in any interpretation, and this is moreover a meaning which can be determined by any processor which has access to the datatyping conventions indicated by the type name, which embody the conventions being used.

The syntactic association of a literal string (a lexical form) and a datatype name seems like a natural way to encode the use of the convention named by the latter to interpret the lexical form which is the former: the Ntriples ^^ convention can be read as ", understood as a"; e.g. "123"^^xsd:number means '123', understood as a number, in contrast with, say, '123', understood as a character string. Note that none of this has got anything particularly to do with class membership.

But all this applies only to those parts of RDF (or indeed any other such formalism) which are intended to be understood 'conventionally', i.e. relative to a widely used convention (such as dates and numerals). Most names are not conventional in this way: in a programming language, typically, identifiers are not; in RDF, URIrefs are not. They are simply general-purpose denoting expressions, which follow no particular lexical conventions and for which no generic rules can be given which relate their lexical form to their intended interpretations. Thus, to associate a lexical-to-value mapping with a URIref would be meaningless and would provide no useful information about the referent, or functionality to a reasoner: to say "<ex:aaa>, understood as a number" is otiose: if <ex:aaa> is a number then it is vacuous, and if not, meaningless: either way, the datatype contributes nothing towards the interpretation of the URIref.
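
The ", understood as a" reading of ^^ described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of any RDF specification or library: the DATATYPES table, the TypedLiteral class and its value() method are hypothetical names, and xsd:integer stands in for the informal "xsd:number" used in the text. Python's int() and str() are only rough approximations of the real XSD lexical-to-value mappings.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical table: datatype name -> lexical-to-value mapping.
# int() and str() only approximate the XSD mappings (for instance,
# int() accepts some strings outside the xsd:integer lexical space).
DATATYPES: Dict[str, Callable[[str], object]] = {
    "xsd:integer": int,
    "xsd:string": str,
}


@dataclass(frozen=True)
class TypedLiteral:
    """A lexical form paired with a datatype name, as in "123"^^xsd:integer."""

    lexical_form: str
    datatype: str

    def value(self) -> object:
        # The denotation is fixed by the convention the datatype names:
        # apply that datatype's mapping to the lexical form.
        return DATATYPES[self.datatype](self.lexical_form)


# The same lexical form, understood under two different conventions.
as_number = TypedLiteral("123", "xsd:integer")
as_string = TypedLiteral("123", "xsd:string")

print(repr(as_number.value()))  # the integer one hundred twenty-three
print(repr(as_string.value()))  # the three-character string
```

In this sketch the datatype does all the interpretive work: the two literals share a lexical form and differ only in which mapping is applied. Nothing comparable could be written for a URIref such as <ex:aaa>, since no mapping takes its spelling to its referent.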

For this reason, the generalization you suggest to typed nodes and arcs seems unmotivated and semantically meaningless; contrary to your claim above, it does not have 'obvious' possible applications elsewhere in the design space.

Seen in this light, therefore, the 'new' idea to which you refer is only new in the sense that it provides a syntactic association between a datatype and a lexical form. Any way of using datatypes in RDF must somehow provide for such an association - that is not new - and, as we discovered during a very long and arduous process of exploration, a direct syntactic association is one of the very few such techniques which does not break either the underlying semantic model of RDF or else the underlying graph syntax conventions. We therefore did not see this as a particularly new idea, but rather as a workable solution to a pressing, but old, problem.

You claim that typed literals are a 'patch', incompletely integrated into the RDF design. I disagree; if anything, plain literals with language tags are a patch, not properly integrated but required for legacy reasons. In fact, with hindsight, I think it would be fair to say that in a sense *all* RDF literals can be viewed as typed: we retained the 'plain' style for essentially historical reasons (and to satisfy the i18n requirements for a syntactic placeholder for XML language tags) but in fact, both semantically and in the central syntactic model, those could be considered to be typed with a 'trivial type' (which is in fact extremely similar to xsd:string, though not quite exactly the same).

You suppose in your answer that we "chose not to work through" the design implications. I rather resent this supposition, and reject the implication of laziness. If the design implications you refer to are the possibility of datatyping URI references, it would be more accurate to say that we thought about them and decided that there were none.
If you are referring to the use of explicit properties to describe datatypes, without introducing the 'new' typed-literal syntax, then rest assured that we considered many design options in detail, as could be discovered from a perusal of the WG email archive.

If I have missed your point entirely and you (or y'all) feel that there is some other obvious opportunity we have missed here, I would welcome a more detailed correction; particularly if it could be in some way related to the RDF model theory.

Pat Hayes

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes

Received on Monday, 6 October 2003 14:24:48 UTC