- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 10 Jan 2002 17:56:21 +0200
- To: RDF Core <w3c-rdfcore-wg@w3.org>
- CC: Sergey Melnik <melnik@db.stanford.edu>, ext Graham Klyne <GK@NineByNine.org>
On 2002-01-10 13:48, "ext Graham Klyne" <GK@NineByNine.org> wrote: > At 10:27 AM 1/10/02 +0200, Patrick Stickler wrote: >> On 2002-01-07 20:54, "ext Graham Klyne" <GK@NineByNine.org> wrote: > Pat did sketch a model theory for the > P/P+ > schemes, and it (a) certainly seems to add a significant degree of > complexity to the model theory, and (b) it seems possible that it has > some > fatal logical flaws (i.e. logical consequences that are different from > what > is desired or expected). From my recollection of those earlier discussions, there were two issues that were brought up as problemmatic: (1) that rdfs:subClassOf relations, if based on lexical space rather than value space could be problemmatic in the context of certain inference operations, but that is an issue for *any* data typing solution, not just for P/P++; and (2) there was originally the expectation that values themselves (not just lexical forms) would have some representation in the graph which presumed native data types. I don't recall any other flaw discussed -- though I may certainly be wrong about that, and Pat or others are free to refresh our memories. >> If the present idioms/practices did not do the job, then of course >> there is justification for a new approach such as S, but since the >> current idioms work fine there is no need or reason to take on yet >> another approach. > > I'd say that it's pretty well impossible to say that the present > idioms/practices "do the job" unless there's an accompanying theory that > says what entailments are licensed; i.e. what conclusions can validly be > drawn from some given collection of RDF. (Back to the model theory.) > >> The D (rdf:value/rdf:type, local) and P (rdfs:range, global) idioms do >> the job just fine. > > Point me to the model theory, and I may be convinced. Though your point is taken, I'd say its stretching things a bit. Just because the MT doesn't exist for P and D, doesn't mean that they don't do the job. Since both idioms are in use, they apparently do "work". I might not be able myself to provide you with the math to *prove* that they work as well (or better than) S, but that doesn't mean they don't. Still, it's a fair request that the options be considered at the same level of precision, and I agree that we need P and D defined in mathematical terms. In fact, it appears that most of the math for P and D is defined, insofar as they are based on the fundamental concept of a pairing of lexical form to data type identity, and that is also the foundation of S, and therefore much/most of Sergey's document appears to significantly intersect the needed definition of P and D. I don't think that talking about P and D is just so much hand waving simply due to the absence of a complete MT specification for them. >> The S idioms, while also doing the job, do so with more machinery and >> most significantly are contrary to the intuitions of current RDF users >> (data typing by predicate rather than by rdf:type). > > I don't recognize that description of S: > - I don't see "more machinery" here, whatever that means, That means having to use multiple URIrefs and global schema definitions to pair a lexical form with its data type. I've given more than enough examples to bear that out. No need to repeat them here. S imposes separate URIrefs for lexical and value spaces and the mapping relation, and requires one to use data typing properties relating to those data type URIs rather than the generic rdf:type. It also requires either parsing the URIref to determine the type of component, or further definitions for type for each URIref for each component. I.e. xsd:integer.lex rdf:type s:LexicalSpace . xsd:integer.val rdf:type s:ValueSpace . xsd:integer.map rdf:type s:Mapping . xsd:integer.cmap rdf:type s:CanonicalMapping . and then likely further statements to relate the data type to its component predicates xsd:integer s:lexicalSpace xsd:integer.lex . xsd:integer s:valueSpace xsd:integer.val . xsd:integer s:mapping xsd:integer.map . xsd:integer s:canonicalMapping xsd:integer.cmap . Now, while all that information is likely to be useful, it is not actually *necessary* to know what value the pairing ("30",xsd:integer) actually maps to and therefore to make it required simply to know what value we're talking about is wasteful. If you don't consider that more machinery (and just more headache) all just to pair a lexical form to a data type for interpretation by some ex-RDF application, then I'm not surprised if our conclusions and opinions differ so greatly. > - "contrary to the intuitions of current RDF users" is precisely one > area > where I think S scores very strongly, based on my intuitions from work > with > CC/PP (modulo small issues raised in my comments to Sergey's paper). > - I don't know what "data typing by predicate" means Er, from Sergey's document, section 4: A datatype mapping can be "named" using RDF properties. In this document, such properties are referred to as datatype properties. Forgive me for using the term predicate rather than property. > (though I observe > that > rdf:type *is* commonly used as the name of a predicate). That in order to associate a literal (lexical form) with a data type, I must use a data type specific property rather than the generic property rdf:type which is intended for that purpose and which is used for associating URIref labeled resources with data types. I must then relate that data type specific property to the actual data type, and from that relation deduce the pairing I need. It's extra work that is unnecessary and forces an obese sub-vocabulary for every data type which otherwise serves no useful purpose. See above. All we need is a single URIref for each data type and a standardized, generic means to pair literals with that URIref to achieve all the knowledge needed to interpret those pairings to actual values. Thus, the P and D idioms are far more economical than S in that regard, and thus have less machinery -- and also don't require folks to learn new machinery and expend extra effort in making their data typing schemes RDF-usable. No, I don't have the math to throw at you, but that doesn't mean I'm not right ;-) >> There is also the drawback that the S idioms represent a method of > typing >> literal labeled resources that differs from the typing of URIref labled >> resources, introducing an inconsistency into RDF with regards to > resource >> typing in general. > > Er, how is it different? By not using rdf:type and rdfs:range in the same manner as for URIref labeled resources. And if P+ were adopted, then we could have P/P+ for global/local data typing of literals where the treatment would be even more the same as URIref labeled resources, including analogous graph constructs with literals serving as subjects. But, since we are (presumably) bound by the charter against the changes needed for P+, we can still use P/D now and then later migrate more easily to P/P+ with much less trouble than migrating from S to P/P+. > (The *syntax* of RDF doesn't allow us to apply rdf:type directly to a > literal, so more indirect means are required to indicate the intended > type > of literal values; Right, such as the D idiom already in use, and seeming to work just fine. > I (personally) don't think that constraint needs to > apply in the RDF graph, and I'd favour relaxing the syntax in a future > version of RDF. If you mean allowing literal nodes to be subjects, per P+, then I fully agree. But certainly the D idiom is far easier to contract to an eventual P+ representation when that time comes. I.e. going to P+ Bob ex:age _:1:"30" . _:1 rdf:type xsd:integer . from D Bob ex:age _:1 . _:1 rdf:value "30" . (simnply move to anon-node label) _:1 rdf:type xsd:integer . is far more straightforward than from S Bob ex:age _:1 . _:1 xsd:integer.map "30" . xsd:integer.map rdfs:range xsd:integer.lex . xsd:integer.map rdfs:domain xsd:integer.val . and presumably xsd:integer.lex rdf:type s:LexicalSpace . xsd:integer.val rdf:type s:ValueSpace . xsd:integer.map rdf:type s:Mapping . xsd:integer.cmap rdf:type s:CanonicalMapping . xsd:integer s:lexicalSpace xsd:integer.lex . xsd:integer s:valueSpace xsd:integer.val . xsd:integer s:mapping xsd:integer.map . xsd:integer s:canonicalMapping xsd:integer.cmap . and we still haven't actually said anything about the data type xsd:integer itself, but have to "know" about how to parse the special names and remove the '.lex', '.val', and '.map' suffixes. And again, with S we get into the fun stuff with the ability to say Bob ex:age "30" . ex:age rdfs:subPropertyOf xsd:integer.map . which given the above range and domain statements thereby declares Bob to be an instance of xsd:integer.val, etc. etc. Yes, alot more machinery... and all unnecessary, and with unknown conflict/interaction with usage/semantics of rdfs:subPropertyOf, etc. etc. > I don't see any other proposal addressing this point, > other than by adding whole new layers of structure to RDF, or making > literals a truly different kind of thing in the RDF graph and associated > semantics.) From what I have understood, none of P, P+, D, or U make literals anything other than they are as defined in the Dec14 draft of the MT. The only change to the current MT is suggested by P+ which allows literal nodes to act as subjects. >> Finally, if we later (e.g. in RDF 2.0) want to take a P+ approach, >> (which I think is the most ideal approach once the necessary extensions >> to the graph model can be made) the P and D idioms are far more >> compatible with P+ than are the S idioms and therefore with P/D we >> have a much easier evolution path to P+ than with S. > > P+ allows literals-as-subjects, right? > > Again, I don't agree that P and D are more compatible with this > approach. See examples above, where D is easily contracted into P+ by simply being able to attach the literal as a label to the object node rather than having to attach it via rdf:value. I'd say that's much more compatible than the S idiom. It probably should be stressed that I am focusing on the efficacy of the representation of data typing knowledge in the RDF graph more so than the details of the semantic interpretation of that representation -- which may in the eyes of some disqualify me from this discussion entirely-- but I feel that my intuitions about those alternate representations will be borne out in the math (once someone more capable than I provides it ;-) > I think S would also benefit from such a relaxation. I would think that literals as subjects would break S, as it depends on an anonymous node to represent the .val(ue) and the literal to represent the .lex(ical form). How would you use a .map predicate to define the mapping?! You'd have to resort to rdf:type, just as with P+. > I > suspect > that much of your resistance to S is not to the basic proposal per se, > but > to the indirections that are used to apply S in the face of the > syntactic > restriction noted. My objection certainly has to do with those indirections, yes, as I see them as entirely unnecessary and just that much more work for folks to have to deal with in order to work with typed data literals in RDF. Indirection is a good thing in general, but only if it is providing something needed or useful. I don't see the S indirection do that. All of the key distinctions between lexical and value spaces and the mapping between them is clear simply from the pairing of lexical form (literal) to data type identity (URIref). Explicit identities for those spaces and mappings is unecessary in the RDF graph. All that information about spaces and mappings, and in particular the relationships between data types with regards to subset spaces and mappings, is very important and useful, but that is knowledge about the data types, not mechanisms for defining the data type of literals. > This is an important debate that we need to have, which I hoped to spark > by > my comments to which you responded. I believe the group needs to come > to > consensus on this issue, with some urgency. I agree. > Well, I certainly disagree that description of S should be removed from > a > discussion document. Point taken. I was taking your recommendation as meaning a promotion or change of the role of that document to something other than for discussion. Apologies if I misunderstood. > I would certainly accept the addition of corresponding descriptions of P > and D, if such descriptions are forthcoming -- that way we have an > even-handed basis from which to make our choice. I agree, and wish I could provide them, but it is beyond my present skill. Regards, Patrick -- Patrick Stickler Phone: +358 50 483 9453 Senior Research Scientist Fax: +358 7180 35409 Nokia Research Center Email: patrick.stickler@nokia.com
Received on Thursday, 10 January 2002 10:55:42 UTC