- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Tue, 05 Feb 2002 12:59:17 +0200
- To: RDF Core <w3c-rdfcore-wg@w3.org>
After a few comments, I outline below a possible scenario by which we may reach convergence and closure for the datatyping challenge. I've tried to state questions, actions, and decisions explicitly so that we can assess to some degree whether there would be consensus about such a scenario.

[Please read it slowly, and count to ten before posting any replies... you all should know by now how first readings of my posts can be misleading... ;-) and please read all of it before firing off comments on individual points which may be clarified further on or may become clear on a second reading... Thanks]

> ... I could live with S-P only. Again, CC/PP and DAML folks
> might oppose trashing S-B. I wouldn't like disposing of S-A completely
> just for the reason that S-A supports datatyping decoupled from lexical
> representation, and S-P does not. However, I could live without S-A and
> without S-B.

That's good news, as I think it means we are reaching that much craved convergence.

I agree that the S-P/TDL-local idiom is the least problematic of the lot. We do, however, have to have a global/implicit idiom, or else we can't express constraints/expectations on datatypes for property values. So we need either TDL-global or S-B, or something like that. Jeremy's revised global idiom based on S-P/TDL-local, making rdf:type optional, seems the most promising.

Per Pat's latest comments about the interpretation of literals in the S-B/TDL-global idiom, perhaps we can get round them by just dropping those idioms entirely.

That said.....

--

It seems that, from the past couple of weeks' discussions, the following characteristics are on folks' wish lists (this is a partial recap of some of the desiderata):

1. A working MT (duh ;-)
2. Tidy literals
3. A global/implicit idiom
4. A local/explicit idiom
5. Same vocabulary valid for both local and global idioms
6. Free combination of local and global idioms without conversion
7. The ability to conduct queries by value
8. The ability to conduct queries by literal
9. Datatype URIs denote the entire datatype, as defined by the datatype "owner", not only one of its components

It seems to me (backwards compatibility issues aside for the moment) that the following may make everyone happy:

- Take TDL (sans present MT) with its present local idiom, which is also the S-P idiom (apart from the designation of the datatype URI)

- Replace the global idiom with Jeremy's proposed bNode global idiom, which is a derivative of the local idiom with rdf:type omitted

- Make literals tidy (untidiness is borne by the bNodes)

- Extend the RDF vocabulary to include the property rdf:dtype, which is an rdfs:subPropertyOf rdf:type, and which is to be used by both local and global idioms (I think this is a more conservative choice than adopting a completely separate vocabulary per Pat's recommendation)

- State for the benefit of the XML Schema community that datatype URIs in this solution denote the whole datatype as defined by the datatype owner with no extension or modification. The datatype simply serves as the context of interpretation for a typed literal.

- Fix/extend/refine the TDL MT to take these changes into account and make it all work ;-)

Thus, we have global and local datatyping idioms that look like the following:

    Bob ex:age _:1 .
    _:1 rdf:value "30" .
    ex:age rdfs:range xsd:integer .

    Mary ex:age _:2 .
    _:2 rdf:value "30" .
    _:2 rdf:dtype xsd:integer .

where the literal "30" is a tidy literal shared by both rdf:value statements. The two idioms live happily in the same knowledge base with the same vocabulary with no problems, and have a consistent and symmetrical graph representation.

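To make wish-list items 7 and 8 concrete, here is a rough Python sketch of how one graph holding both idioms could answer queries by literal and by value. It is only an illustration under stated assumptions: the helper names (`objects`, `typed_literals`, `by_literal`, `by_value`), the `L2V` mapping table, the rule that an explicit rdf:dtype wins over rdfs:range, and the extra "Jane" triples with lexical form "030" are all hypothetical and not part of the proposal.

```python
# Illustrative sketch only; helper names and the Jane triples are hypothetical.
RDF_VALUE, RDF_DTYPE, RDFS_RANGE = "rdf:value", "rdf:dtype", "rdfs:range"

# Literal nodes are written with their quotes, bNodes with a "_:" prefix.
# A Python set of tuples keeps literals "tidy": equal strings are one node.
GRAPH = {
    ("ex:age", RDFS_RANGE, "xsd:integer"),
    ("Bob", "ex:age", "_:1"), ("_:1", RDF_VALUE, '"30"'),     # global idiom
    ("Mary", "ex:age", "_:2"), ("_:2", RDF_VALUE, '"30"'),    # local idiom
    ("_:2", RDF_DTYPE, "xsd:integer"),
    ("Jane", "ex:age", "_:3"), ("_:3", RDF_VALUE, '"030"'),   # hypothetical
    ("_:3", RDF_DTYPE, "xsd:integer"),
}

# Assumed lexical-to-value mappings; only xsd:integer is sketched here.
L2V = {"xsd:integer": int}


def objects(graph, s, p):
    """Return all objects of triples with the given subject and predicate."""
    return {o for (s2, p2, o) in graph if s2 == s and p2 == p}


def typed_literals(graph):
    """Yield (subject, property, lexical form, datatype) per bNode idiom."""
    for s, p, o in graph:
        if not o.startswith("_:"):
            continue
        # Assumption: an explicit rdf:dtype takes precedence over rdfs:range.
        datatypes = objects(graph, o, RDF_DTYPE) or objects(graph, p, RDFS_RANGE)
        for literal in objects(graph, o, RDF_VALUE):
            for datatype in datatypes:
                yield s, p, literal.strip('"'), datatype


def by_literal(graph, prop, lexical_form):
    """Query by literal: compare lexical forms and ignore the datatype."""
    return {s for s, p, lex, _ in typed_literals(graph)
            if p == prop and lex == lexical_form}


def by_value(graph, prop, value):
    """Query by value: map each lexical form through its datatype first."""
    return {s for s, p, lex, dt in typed_literals(graph)
            if p == prop and dt in L2V and L2V[dt](lex) == value}
```

With these definitions, `by_literal(GRAPH, "ex:age", "30")` finds Bob and Mary, while `by_value(GRAPH, "ex:age", 30)` also finds the hypothetical Jane, whose literal "030" differs as a string but maps to the same integer.
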
This proposal, I believe, meets all the items in the wish list above (presuming the working MT of course ;-)

--

BACKWARDS COMPATIBILITY:

Issue 1: Intuitive use of old-style global idiom

The old-style global idiom (Bob ex:age "30") would be considered a contracted form of the new-style global idiom, one which is more convenient for users to manually edit and view. The expansion of the old-style contracted idiom to the new bNode global idiom would be performed by the parser, just as all contracted forms in RDF/XML are (or by an external transformation for legacy parsers). Thus

    <rdf:Description rdf:ID="Bob">
       <ex:age>30</ex:age>
    </rdf:Description>

will produce the two triples

    Bob ex:age _:1 .
    _:1 rdf:value "30" .

rather than the single triple

    Bob ex:age "30" .

It would be acceptable for a parser to have an option for generating the old-style single triples in order to support legacy systems (see immediately below), though such behavior would be deprecated and not the default.

--

Issue 2: Queries on non-datatyped literal values

It has been clarified, I feel, that one can make both literal-based and value-based queries on TDL datatyped graphs, depending simply on whether the query ignores or takes into account the datatyping. Thus, with minor tweaks to existing query APIs, legacy systems based on literal equality tests will continue to work fine with this proposed convergence solution.

Nevertheless, if that is not acceptable to all, then suppose we can still have in the graph, in addition to the bNode global and local idioms, statements such as

    Fred ex:age "30" .

In that case the literal "30" is the same literal as in the two rdf:value statements of the bNode datatyping idioms, since literal nodes are tidy, but it does *not* denote an integer insofar as the RDF defined interpretation is concerned, as the statement does not conform to either of the datatyping idioms (this is a crucial distinction, think about it and keep reading).

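The parser behaviour described under Issue 1 can be sketched roughly as follows, in Python. This is a sketch under stated assumptions, not a description of any existing parser: the function names `is_literal` and `expand`, the `legacy` flag, and the generated bNode labels (`_:g1`, `_:g2`, ...) are hypothetical, and collisions with bNode labels already present in the input are not handled.

```python
from itertools import count

RDF_VALUE = "rdf:value"


def is_literal(node):
    """Literal nodes are written with their quotes, as in N-Triples."""
    return node.startswith('"')


def expand(triples, legacy=False):
    """Expand old-style contracted triples into the bNode global idiom.

    With legacy=True the input is returned unchanged, which stands in for
    the deprecated, non-default parser option mentioned under Issue 1.
    """
    if legacy:
        return list(triples)
    fresh = count(1)  # hypothetical source of new bNode labels
    expanded = []
    for s, p, o in triples:
        if is_literal(o) and p != RDF_VALUE:
            bnode = f"_:g{next(fresh)}"
            expanded.append((s, p, bnode))
            expanded.append((bnode, RDF_VALUE, o))
        else:
            # Already in bNode form, or not a literal-valued statement.
            expanded.append((s, p, o))
    return expanded
```

For example, `expand([("Bob", "ex:age", '"30"')])` gives the two triples `("Bob", "ex:age", "_:g1")` and `("_:g1", "rdf:value", '"30"')`, while `expand(..., legacy=True)` leaves a statement like Fred ex:age "30" as a single triple that conforms to neither datatyping idiom.
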
As pointed out in issue 1 above, a parser can provide backwards compatible ntriples generation (or one can use a legacy parser ;-) to continue using RDF without datatyping. And both the old-style global idiom and the bNode datatyping idioms can coexist in the same graph with no problems, as queries based on datatyped values would simply disregard the non-datatyped literal values, and likewise queries based on literals would disregard the bNode isolated literal values.

(warning, MT rapids ahead... life jackets on... ;-)

This coexistence of course requires the MT to exclude literals from datatyping interpretation by rdfs:range, such that rdfs:range only asserts datatyping for non-literal property values: either bNodes or URIrefs.

QUESTION: Can the MT exclude literals in the datatyping interpretation of rdfs:range?

Allowing the old-style global, or basic, idiom lets folks who are treating literals as having globally consistent meaning continue doing so, regardless of any range defined datatyping, and conduct their queries in terms of literal string equality, etc. Thus, current practices and systems are not impacted in any way by the datatyping solution at all. Those that want datatyping must use the bNode idioms, and datatyped values expressed by those idioms are not misinterpreted by literal-comparison queries.

--

Issue 3: Old-style global idioms with rdfs:range datatyping

Per the treatment of the old-style global idiom as a contracted form of the bNode global idiom, there is no problem with supporting legacy RDF instances which employ both the old-style global and local idioms (e.g. DAML), since both receive a consistent representation in the graph with a consistent interpretation from the MT, and query APIs supporting this datatyping solution will have a consistent foundation to conduct queries.

--

OK, that's pretty much it.
I guess it's time to duck and cover ;-)

Patrick

--
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Tuesday, 5 February 2002 06:43:17 UTC