- From: <Patrick.Stickler@nokia.com>
- Date: Wed, 14 Nov 2001 09:36:34 +0200
- To: melnik@db.stanford.edu, w3c-rdfcore-wg@w3.org
> -----Original Message-----
> From: ext Sergey Melnik [mailto:melnik@db.stanford.edu]
> Sent: 14 November, 2001 00:18
> To: Stickler Patrick (NRC/Tampere)
> Cc: phayes@ai.uwf.edu; w3c-rdfcore-wg@w3.org
> Subject: Re: DATATYPES: mental dump.
>
>
> Patrick.Stickler@nokia.com wrote:
> > ...
>
> Patrick,
>
> for some reason your most recent X-scheme related posting still did not
> propagate to my mailbox (I found it in the archive of rdf-core).

Don't know what the problem there might be, if there is one...

> Question: do you assume that namespaces are part of the model? For
> example, given a URV like xsd:integer:10, is it possible to extract the
> namespace xsd:integer unambiguously?

Namespaces are not part of URVs. The similarity between the URV xsd:integer:10 and the qname xsd:integer (which isn't even an authoritative identifier) is coincidental, though understandable, given the mnemonic space.

Have a look at the 'xsd:' URI scheme specification to see these relations explicitly:

http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0358.html

> (My guess is that the URI encoding
> alone is not sufficient, i.e., by looking at
> http://www.w3.org/2001/XMLSchema#integer:10 it is not obvious whether
> the datatype would be http://www.w3.org/2001/XMLSchema#integer or
> http://www.w3.org/2001/XMLSchema#).

Again, it is explicit, per the URV definition.

I had sent those materials to you last Friday, not via the list (just to the DATATYPE subgroup) so you may need to check our email service. It appears to be losing messages.

> Related question: how do you refer to a datatype? Would you use a
> resource URI like http://www.w3.org/2001/XMLSchema#integer to refer to
> integers? What is the nature of the relationship between a resource like
> that and other URV-resources that have the above URI as a prefix?
The URI http://www.w3.org/2001/XMLSchema#integer denotes an XML Schema simple data type for which a value space, a lexical space, and a canonical lexical space are defined (these three things are defined for all XML Schema simple data types).

The URI xsd:integer denotes a URV scheme which defines a URI encoding of a typed data literal. It has a lit:mapsTo relation to the XML Schema data type http://www.w3.org/2001/XMLSchema#integer. Every URV scheme based on the 'lit:' ontology defines a canonical lexical space (not just a basic lexical space) which lit:mapsTo one or more data types (both their lexical and value spaces) and/or lit:correspondsTo one or more data types (only their value spaces) or is lit:approximateTo one or more data types (only their value spaces, and with possible loss of precision).

The URI xsd:integer:10 denotes a URV which defines a literal which conforms to the lexical space for the URV scheme xsd:integer and, based on the defined relations between xsd:integer and XML Schema http://www.w3.org/2001/XMLSchema#integer (lit:mapsTo), the lexical form "10" is a member of both the lexical and value space of http://www.w3.org/2001/XMLSchema#integer.

Is that clearer now?

> Another thing: could you highlight some of the options of encoding the
> interpretation shown in
>
> http://WWW-DB.Stanford.EDU/~melnik/rdf/datatyping/fig/motivating_example.gif
>
> in the X scheme (just the 'compact' version)?

I had planned to. Will do it today. I know that there need to be some good and disparate data type examples. The examples I sent out yesterday were mostly intended to illustrate the notation and the fact that the same graph "interpretation" (whether or not it displaces the present graph model) provides solutions both to the data typing issues and to several other critical (IMO) issues such as reification and qualification.
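To make the xsd:integer:10 explanation above concrete, here is a minimal Python sketch of how an application might split a URV into its URV scheme and lexical form and then look up the data type the scheme maps to. It assumes a URV has three colon-separated parts, which is inferred from the examples in this message and not taken from the URV specification. The `MAPS_TO` table, `parse_urv`, and `datatype_of` are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional

# Hypothetical stand-in for the lit:mapsTo relations described in the message.
MAPS_TO = {
    "xsd:integer": "http://www.w3.org/2001/XMLSchema#integer",
}


class URV(NamedTuple):
    scheme: str        # the URV scheme, e.g. "xsd:integer"
    lexical_form: str  # the encoded literal, e.g. "10"


def parse_urv(urv: str) -> URV:
    """Split a URV of the assumed form prefix:type:lexical-form."""
    # Split on the first two colons only, so a lexical form may itself
    # contain colons (e.g. a time value).
    parts = urv.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part URV: {urv!r}")
    prefix, type_name, lexical_form = parts
    return URV(f"{prefix}:{type_name}", lexical_form)


def datatype_of(urv: str) -> Optional[str]:
    """Return the data type URI the URV scheme maps to, if known."""
    return MAPS_TO.get(parse_urv(urv).scheme)
```

Under these assumptions the data type is found through the scheme's declared relation and not by trimming characters off a longer URI, which is the sense in which the relation is explicit.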
> Would it be something like
>
> _Jenny x:weight urv:kg:decimal:12
>
> where urv:kg:decimal:12 denotes m1, or
>
> _Jenny x:weight _w
> _w x:kg urv:decimal:12
>
> where urv:decimal:12 denotes n2? ;-)

Here's where it gets fun...

OK, I've not yet published this, I'm working on the I-D this week (along with I-D's for several other URI schemes based on the URV concept) and it likely will be based on the Unified Code for Units of Measure (http://aurora.rg.iupui.edu/~schadow/units/UCUM/) or similar (I expect IEEE offers a few options ;-)

Here's how I would encode the information in your example:

1. There is a URV scheme 'unit:' (with accompanying data type RDF classes) which is a data typing scheme for encoding magnitudes with units of measure. The basic form is

   'unit:' {unit of measure} ':' {value}

   E.g.

   unit:kg:12
   unit:cm:28818
   unit:oz:11
   unit:

   Now, a unit of measure is just another data type. In fact, we can define the data type represented by the URV scheme unit:kg to be a subclass of xsd:integer. The semantics of 'unit of measure' is defined for the unit:kg type, but being a subClassOf xsd:integer, we know that all unit:kg forms map to values in the xsd:integer value space. Having a separate URV scheme for units of measure is just a convenience. Thus,

   <rdf:Description rdf:about="unit:kg">
      <lit:mapsTo rdf:resource="urn:unit:kg"/>
   </rdf:Description>

2. Define the RDF data type classes for units of measure and relate them to any other data type classes which they are subordinate to (e.g. ground them in the XML Schema data type taxonomy). I.e.

   <rdf:Description rdf:about="urn:unit:kg">
      <rdfs:subClassOf rdf:resource="http://www.w3.org/2001/XMLSchema#integer"/>
   </rdf:Description>

3. Define your knowledge with literals encoded as 'unit:' URVs. E.g.
(using level 1 compression, all UNodes with the same urirefs merged)

   [1,U,#Jenny]
   [2,U,#Robby]
   [3,U,#weight]
   [4,U,#age]
   [5,S,1,3,unit:kg:12]
   [6,S,2,3,unit:kg:14]
   [7,S,1,4,xsd:duration:1Y]
   [8,S,1,4,xsd:duration:12M]
   [9,S,2,4,xsd:duration:1Y2M]
   [10,S,1,4,xsd:duration:14M]

Now, the actual age values for Jenny and Robby weren't clear from your graphic, so I had to guess a little, but I think the gist should be clear.

The fact that the duration values can be parsed into individual subvalue components and that those subvalues correspond to integers are details of the xsd:duration data type, and any application that understands that data type will be able to interpret those values accordingly.

It is not RDF's responsibility IMO to provide an explicit modelling of all complex data types which are expressed in single lexical forms. Folks may wish to do that, but it can't be RDF's responsibility because RDF should be neutral to all data types (IMO).

The proper interpretation of lexical forms for a given data type should not be part of the MT. Only the relation of lexical form to data type (i.e. the relation of RDF literal to RDF class) should be.

If RDF were to truly model the mapping from lexical form into value space, it would have to have a canonical representation for values in a given value space. That canonical representation may or may not correlate to the internal representation of any given system, so all that is really being done is defining a mapping from lexical space to canonical lexical space, since the RDF canonical representation space is not an actual layer of any specific system.

RDF cannot escape lexical forms because it cannot offer a universal set of value spaces which are shared by all RDF applications. Individual applications are still going to have to map from the RDF encoded lexical forms into their own value space representations.
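As an illustration of that last point, the following Python sketch shows one way an application, and not RDF itself, might map the URV lexical forms from the statements above into its own value representations. Everything here is an assumption made for the example: the choice of months as the internal value for durations, the simplified "1Y2M" duration syntax used in this message (the lexical form in the XML Schema specification differs), and the function names `duration_to_months` and `interpret`.

```python
import re
from decimal import Decimal

# Simplified duration syntax as written in this message: optional years,
# then optional months, e.g. "1Y", "12M", "1Y2M".
_DURATION = re.compile(r"(?:(\d+)Y)?(?:(\d+)M)?")


def duration_to_months(lexical: str) -> int:
    """Map a simplified duration lexical form to a month count."""
    match = _DURATION.fullmatch(lexical)
    if not lexical or match is None:
        raise ValueError(f"unsupported duration form: {lexical!r}")
    years, months = match.groups()
    return int(years or 0) * 12 + int(months or 0)


def interpret(urv: str):
    """Map a URV to this application's own value representation."""
    # Assumed three-part form prefix:type:lexical-form.
    prefix, type_name, lexical = urv.split(":", 2)
    if prefix == "unit":
        # Keep the magnitude and the unit together, e.g. (Decimal("12"), "kg").
        return (Decimal(lexical), type_name)
    if (prefix, type_name) == ("xsd", "duration"):
        return duration_to_months(lexical)
    raise ValueError(f"no application mapping for {urv!r}")


if __name__ == "__main__":
    for urv in ("unit:kg:12", "xsd:duration:1Y", "xsd:duration:12M",
                "xsd:duration:1Y2M", "xsd:duration:14M"):
        print(urv, "->", interpret(urv))
```

With this particular mapping, xsd:duration:1Y and xsd:duration:12M land on the same application value, as do xsd:duration:1Y2M and xsd:duration:14M. A different application could choose a different internal representation without any change to the RDF statements.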
Thus, trying to model the mapping of the literal "10" into "the set of real integers" in the MT is not going to buy us anything, as that mapping is the domain of the application and thus we still remain in the realm of lexical forms, even if they are canonical.

We may define upper-level type taxonomies which define only value spaces, ground our data types that define lexical spaces in those abstract classes via rdfs:subClassOf, and use that as the basis of a common interpretation of values. But to that end, RDF already provides the interpretation in the semantics of class relations. E.g.

   xsd:integer rdfs:subClassOf ieee:integer .

The best we can do IMO is to model the relation of the literal "10" to one or more data type classes (via rdf:type and rdfs:range), the mechanisms by which the lexical space and value space are defined for the class (e.g. the 'lit:' ontology or similar, or left implicit such as for XML Schema classes), and the mechanisms for defining relationships between classes (rdfs:subClassOf, rdfs:subPropertyOf), but not mechanisms for "interpreting" them (defining the mapping from lexical space to value space) in RDF.

If the above was not clear, please see the glossary definitions of lexical space and canonical lexical space and look at how those terms are used in the XML Schema spec for defining simple types.

Cheers,

Patrick
Received on Wednesday, 14 November 2001 02:36:48 UTC