- From: <Patrick.Stickler@nokia.com>
- Date: Tue, 13 Nov 2001 23:01:30 +0200
- To: w3c-rdfcore-wg@w3.org
Below is a brief outline of an ontology for "Typed Data Literals" (TDLs), which is meant to provide the basis for the definition of URV schemes, and an RDF schema (partially incomplete) which defines an XML Schema URV scheme in terms of that ontology.

It addresses the distinction between value space and lexical space, as well as lexical space versus canonical lexical space, and allows for defining mappings between data type classes in terms of either value space, lexical space, or both.

---

Here's the ontology "schema" (only has comments so far ;-)

<?xml version="1.0"?>
<!DOCTYPE uridef [
  <!ENTITY rdf  "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY lit  "voc://nokia.com/lit-1.0/">
]>
<rdf:RDF
  xmlns:rdf  ="&rdf;"
  xmlns:rdfs ="&rdfs;"
  xmlns:lit  ="&lit;"
>

<!--

Ontology for Typed Data Literals:

lit:mapsTo

  An RDF Class (other than a TDL type) to which every TDL for this TDL type corresponds, for both the value space and lexical space. Is subPropertyOf lit:correspondsTo.

lit:correspondsTo

  An RDF Class representing a value space of which every TDL for this TDL type is a member, but which may not conform to its lexical space. Is subPropertyOf lit:approximateTo.

lit:approximateTo

  An RDF Class representing a value space of which every TDL for this TDL type is a member, but which may not conform to its lexical space and may differ in precision such that conversion may result in a loss of information.

lit:conformsTo

  An informative string identifying a standard to which the TDL type conforms, in both the value space and lexical space, if and as specified.

lit:subTypeOf

  A TDL type which is superordinate to this TDL type such that every TDL for this type is a valid TDL of the superordinate type, both in value space and lexical space.

lit:pattern

  A pattern which matches a lexical form for a TDL of this type.

lit:xpattern

  A pattern which matches a lexical form for a TDL not of this type.
- - -

The patterns defined for a TDL type constitute an OR'd set of options: any single pattern may match. The xpatterns defined for a TDL type constitute an AND'd set of exclusions: none of them may match. Patterns are applied prior to xpatterns for a given TDL type.

If a given TDL type is a subTypeOf one or more other TDL types, its patterns and xpatterns are complementary to those defined for the supertypes, such that validation of a lexical form must proceed from the furthest ancestor to the locally defined TDL type; if failure occurs at any stage, the value is invalid. This permits subtypes to define their lexical spaces in terms of supertypes by restriction (adding only xpatterns) and also ensures that all lexical forms of a TDL type conform to the lexical space defined for all superordinate TDL types.

If there is multiple inheritance, then validation must be done for every path from each furthest ancestor to the local TDL type.

All lit:mapsTo, lit:correspondsTo, lit:approximateTo, and lit:conformsTo relations defined for a superordinate TDL type are inherited by a subordinate TDL type.
-->

</rdf:RDF>

=================

And here's the definition for the XML Schema simple types:

<?xml version="1.0"?>

<!--

RDF Schema defining Typed Data Literal (TDL) Encodings and Mappings
to XML Schema Simple Types using Canonical Representations

Author: Patrick Stickler
        Nokia Research Center
        patrick.stickler@nokia.com

-->

<!DOCTYPE uridef [
  <!ENTITY rdf  "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY xsd  "http://www.w3.org/2001/XMLSchema#">
  <!ENTITY lit  "voc://nokia.com/lit-1.0/">
]>
<rdf:RDF
  xmlns:rdf  ="&rdf;"
  xmlns:rdfs ="&rdfs;"
  xmlns:xsd  ="&xsd;"
  xmlns:lit  ="&lit;"
>

<!-- need to distill patterns to utilize superType definitions -->

<!-- Primitive Data Types -->

<rdf:Description rdf:about="xsd:anySimpleType">
  <lit:mapsTo rdf:resource="&xsd;anySimpleType"/>
</rdf:Description>

<rdf:Description rdf:about="xsd:string">
  <lit:mapsTo rdf:resource="&xsd;string"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
</rdf:Description>

<rdf:Description rdf:about="xsd:boolean">
  <lit:mapsTo rdf:resource="&xsd;boolean"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>0</lit:pattern>
  <lit:pattern>1</lit:pattern>
  <lit:pattern>true</lit:pattern>
  <lit:pattern>false</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:decimal">
  <lit:mapsTo rdf:resource="&xsd;decimal"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>0\.0</lit:pattern>
  <lit:pattern>-?0\.[0-9]*[1-9]</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*\.[0-9]*[1-9]</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:float">
  <lit:mapsTo rdf:resource="&xsd;float"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- check for completeness -->
  <!-- canonical form should use fixed point notation!
  -->
  <lit:pattern>-?0\.[0-9]*[1-9]E-?[1-9][0-9]*</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*\.[0-9]*[1-9]E-?[1-9][0-9]*</lit:pattern>
  <lit:pattern>INF</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:double">
  <lit:mapsTo rdf:resource="&xsd;double"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- check for completeness -->
  <!-- canonical form should use fixed point notation! -->
  <lit:pattern>-?0\.[0-9]*[1-9]E-?[1-9][0-9]*</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*\.[0-9]*[1-9]E-?[1-9][0-9]*</lit:pattern>
  <lit:pattern>INF</lit:pattern>
  <lit:pattern>-INF</lit:pattern>
  <lit:pattern>NaN</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:duration">
  <lit:mapsTo rdf:resource="&xsd;duration"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- need to constrain digits? -->
  <lit:pattern>-?P([0-9]+Y)?([0-9]+M)?([0-9]+D)?(T[0-9]+H)?([0-9]+M)?([0-9]+S)?</lit:pattern>
  <lit:xpattern>-?P</lit:xpattern>
</rdf:Description>

<!-- Add lit:xpattern's to trap bogus date elements?
-->

<rdf:Description rdf:about="xsd:dateTime">
  <lit:mapsTo rdf:resource="&xsd;dateTime"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>-?[0-9]{4,}-[0-9]{2}-[0-9]{2}T(([01][0-9])|(2[0-3])):[0-5][0-9]:[0-5][0-9]Z?</lit:pattern>
  <lit:xpattern>^0000-.*</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:time">
  <lit:mapsTo rdf:resource="&xsd;time"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>(([01][0-9])|(2[0-3])):[0-5][0-9]:[0-5][0-9]Z?</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:date">
  <lit:mapsTo rdf:resource="&xsd;date"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>-?[0-9]{4,}-[0-9]{2}-[0-9]{2}</lit:pattern>
  <lit:xpattern>^0000-.*</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:gYearMonth">
  <lit:mapsTo rdf:resource="&xsd;gYearMonth"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>-?[0-9]{4,}-[0-9]{2}</lit:pattern>
  <lit:xpattern>^0000-.*</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:gYear">
  <lit:mapsTo rdf:resource="&xsd;gYear"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>-?[0-9]{4,}</lit:pattern>
  <lit:xpattern>0000</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:gMonthDay">
  <lit:mapsTo rdf:resource="&xsd;gMonthDay"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>--[0-9]{2}-[0-9]{2}</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:gDay">
  <lit:mapsTo rdf:resource="&xsd;gDay"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>---[0-9]{2}</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:gMonth">
  <lit:mapsTo rdf:resource="&xsd;gMonth"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>--[0-9]{2}--</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:hexBinary">
  <lit:mapsTo rdf:resource="&xsd;hexBinary"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>([0-9A-F]{2})+</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:base64Binary">
  <lit:mapsTo rdf:resource="&xsd;base64Binary"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <lit:pattern>[\+/=0-9A-Za-z]+</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:anyURI">
  <lit:mapsTo rdf:resource="&xsd;anyURI"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:QName">
  <lit:mapsTo rdf:resource="&xsd;QName"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:NOTATION">
  <lit:mapsTo rdf:resource="&xsd;NOTATION"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<!-- Derived Data Types -->

<rdf:Description rdf:about="xsd:normalizedString">
  <lit:mapsTo rdf:resource="&xsd;normalizedString"/>
  <lit:subTypeOf rdf:resource="xsd:string"/>
  <lit:xpattern>.*#xD.*</lit:xpattern>
  <lit:xpattern>.*#x9.*</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:token">
  <lit:mapsTo rdf:resource="&xsd;token"/>
  <lit:subTypeOf rdf:resource="xsd:normalizedString"/>
  <!-- should be type xsd:tokenizedString with subtype xsd:token -->
  <lit:xpattern>.*#xD.*</lit:xpattern>
  <lit:xpattern>.*#x9.*</lit:xpattern>
  <lit:xpattern>^#x20.*</lit:xpattern>
  <lit:xpattern>.*#x20$</lit:xpattern>
  <lit:xpattern>.*(#x20){2,}.*</lit:xpattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:language">
  <lit:mapsTo rdf:resource="&xsd;language"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <!-- should be subTypeOf xsd:name?
  -->
  <!-- this needs to be more constrained -->
  <lit:pattern>[a-z]{2}</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:NMTOKEN">
  <lit:mapsTo rdf:resource="&xsd;NMTOKEN"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <lit:subTypeOf rdf:resource="xsd:NMTOKENS"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:NMTOKENS">
  <lit:mapsTo rdf:resource="&xsd;NMTOKENS"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:Name">
  <lit:mapsTo rdf:resource="&xsd;Name"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:NCName">
  <lit:mapsTo rdf:resource="&xsd;NCName"/>
  <lit:subTypeOf rdf:resource="xsd:Name"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:ID">
  <lit:mapsTo rdf:resource="&xsd;ID"/>
  <lit:subTypeOf rdf:resource="xsd:NCName"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:IDREF">
  <lit:mapsTo rdf:resource="&xsd;IDREF"/>
  <lit:subTypeOf rdf:resource="xsd:NCName"/>
  <lit:subTypeOf rdf:resource="xsd:IDREFS"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:IDREFS">
  <lit:mapsTo rdf:resource="&xsd;IDREFS"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:ENTITY">
  <lit:mapsTo rdf:resource="&xsd;ENTITY"/>
  <lit:subTypeOf rdf:resource="xsd:NCName"/>
  <lit:subTypeOf rdf:resource="xsd:ENTITIES"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:ENTITIES">
  <lit:mapsTo rdf:resource="&xsd;ENTITIES"/>
  <lit:subTypeOf rdf:resource="xsd:token"/>
  <!-- lit:pattern TBD -->
</rdf:Description>

<rdf:Description rdf:about="xsd:integer">
  <lit:mapsTo rdf:resource="&xsd;integer"/>
  <lit:subTypeOf rdf:resource="xsd:anySimpleType"/>
  <!-- canonical representation not valid for xsd:decimal!
  -->
  <lit:correspondsTo rdf:resource="&xsd;decimal"/>
  <lit:pattern>0</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:nonPositiveInteger">
  <lit:mapsTo rdf:resource="&xsd;nonPositiveInteger"/>
  <lit:subTypeOf rdf:resource="xsd:integer"/>
  <lit:pattern>-0</lit:pattern>
  <lit:pattern>-[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:negativeInteger">
  <lit:mapsTo rdf:resource="&xsd;negativeInteger"/>
  <lit:subTypeOf rdf:resource="xsd:nonPositiveInteger"/>
  <lit:pattern>-[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:long">
  <lit:mapsTo rdf:resource="&xsd;long"/>
  <lit:subTypeOf rdf:resource="xsd:integer"/>
  <lit:pattern>0</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:int">
  <lit:mapsTo rdf:resource="&xsd;int"/>
  <lit:subTypeOf rdf:resource="xsd:long"/>
  <!-- need to constrain between 2147483647 and -2147483648 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:short">
  <lit:mapsTo rdf:resource="&xsd;short"/>
  <lit:subTypeOf rdf:resource="xsd:int"/>
  <!-- need to constrain between 32767 and -32768 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:byte">
  <lit:mapsTo rdf:resource="&xsd;byte"/>
  <lit:subTypeOf rdf:resource="xsd:short"/>
  <!-- need to constrain between 127 and -128 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>-?[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:nonNegativeInteger">
  <lit:mapsTo rdf:resource="&xsd;nonNegativeInteger"/>
  <lit:subTypeOf rdf:resource="xsd:integer"/>
  <lit:pattern>0</lit:pattern>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:unsignedLong">
  <lit:mapsTo rdf:resource="&xsd;unsignedLong"/>
  <lit:subTypeOf rdf:resource="xsd:nonNegativeInteger"/>
  <!-- need to
  constrain below 18446744073709551615 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:unsignedInt">
  <lit:mapsTo rdf:resource="&xsd;unsignedInt"/>
  <lit:subTypeOf rdf:resource="xsd:unsignedLong"/>
  <!-- need to constrain below 4294967295 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:unsignedShort">
  <lit:mapsTo rdf:resource="&xsd;unsignedShort"/>
  <lit:subTypeOf rdf:resource="xsd:unsignedInt"/>
  <!-- need to constrain below 65535 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:unsignedByte">
  <lit:mapsTo rdf:resource="&xsd;unsignedByte"/>
  <lit:subTypeOf rdf:resource="xsd:unsignedShort"/>
  <!-- need to constrain below 255 -->
  <lit:pattern>0</lit:pattern>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

<rdf:Description rdf:about="xsd:positiveInteger">
  <lit:mapsTo rdf:resource="&xsd;positiveInteger"/>
  <lit:subTypeOf rdf:resource="xsd:nonNegativeInteger"/>
  <lit:pattern>[1-9][0-9]*</lit:pattern>
</rdf:Description>

</rdf:RDF>

=====

A few misc. comments...

I've noted that the XML Schema simple type hierarchy is not quite perfect. There are a few comments to that end in the above schema. In particular, the list versions of the token subtypes seem "upside down" insofar as lexical space is concerned, and integer shouldn't be a subtype of decimal, etc. So we may end up with two distinct hierarchies, one for value space, defined via rdfs:subClassOf and one for lexical space, defined via lit:subTypeOf.

Note that I have not defined the rdfs:subClassOf relations between the XML Schema simple type classes above, only their realization as types of the 'xsd:' URV scheme.
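
The validation rules stated in the ontology comment (patterns OR'd, xpatterns AND'd, ancestors checked before the local type, every inheritance path checked) could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the schema: the TYPES table and the is_valid function are hypothetical names, Python's regular expression dialect stands in for whatever pattern language is finally chosen, each pattern and xpattern is assumed to have to match the whole lexical form, and a type with no patterns is assumed to impose no positive constraint of its own.

```python
import re

# Hypothetical in-memory form of a few 'xsd:' URV scheme definitions taken
# from the schema: each type lists its supertypes, patterns, and xpatterns.
TYPES = {
    "xsd:anySimpleType": {"super": [], "patterns": [], "xpatterns": []},
    "xsd:integer": {
        "super": ["xsd:anySimpleType"],
        "patterns": ["0", "-?[1-9][0-9]*"],
        "xpatterns": [],
    },
    "xsd:nonNegativeInteger": {
        "super": ["xsd:integer"],
        "patterns": ["0", "[1-9][0-9]*"],
        "xpatterns": [],
    },
    "xsd:gYear": {
        "super": ["xsd:anySimpleType"],
        "patterns": ["-?[0-9]{4,}"],
        "xpatterns": ["0000"],
    },
}


def is_valid(lexical_form, type_name, types=TYPES):
    """Return True if lexical_form passes type_name and all of its ancestors."""
    tdl = types[type_name]
    # Ancestors first: the recursion reaches the furthest ancestor before any
    # local check, and every supertype path must succeed.
    # Assumption: the subTypeOf graph has no cycles.
    for parent in tdl["super"]:
        if not is_valid(lexical_form, parent, types):
            return False
    # Patterns are OR'd: one full match is enough.
    # Assumption: an empty pattern list means "no positive constraint".
    patterns = tdl["patterns"]
    if patterns and not any(re.fullmatch(p, lexical_form) for p in patterns):
        return False
    # xpatterns are AND'd exclusions: none of them may match.
    return not any(re.fullmatch(x, lexical_form) for x in tdl["xpatterns"])
```

With this table, is_valid("42", "xsd:nonNegativeInteger") holds, while "-42" passes xsd:integer but fails at the local xsd:nonNegativeInteger stage, and "0000" is rejected for xsd:gYear by its xpattern.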

Secondly, note that with a URV definition such as that for the 'xsd:' scheme above, one need not use an XML Schema engine for validation of lexical forms, but simply test whether the value conforms to the specified patterns and xpatterns. Thus, a single function that provides regular expression matching does the trick. This is also a good thing because I am presuming that we will be allowing statements to be asserted via other interfaces than XML serialization, and thus, this provides an XML-independent means for testing lexical forms by type.

Finally, I am presuming that a URI scheme prefix is itself a valid URI, which may be wrong (or undefined). Thus, <xsd:integer> is the URI representing the URV scheme for XML Schema integers, and is not the qname for the actual XML Schema class. Hopefully this distinction is clear in the schema above.

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 50 483 9453
Senior Research Scientist             Fax:    +358 7180 35409
Nokia Research Center                 Email:  patrick.stickler@nokia.com

Received on Tuesday, 13 November 2001 16:01:34 UTC