RE: incomplete datatyping (was: Re: datatypes and MT) from Patrick.Stickler@nokia.com on 2001-11-13 (w3c-rdfcore-wg@w3.org from November 2001)

From: <Patrick.Stickler@nokia.com>
Date: Tue, 13 Nov 2001 09:42:01 +0200
To: w3c-rdfcore-wg@w3.org
Message-ID: <2BF0AD29BC31FE46B78877321144043114C07F@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Jeremy Carroll [mailto:jjc@hplb.hpl.hp.com]
> Sent: 07 November, 2001 14:59
> To: w3c-rdfcore-wg@w3.org
> Subject: RE: incomplete datatyping (was: Re: datatypes and MT)
> 
> 
> Patrick:
> > Ahhh... here's where it gets really interesting...
> >
> > Do we mirror this derived type definition in the RDFS defined
> > class hierarchy? I.e., do we need to define xsd:integer and
> > xsd:string as a subClassOf xxx:size, so that folks can
> > define values such as [ rdf:value "1"; rdf:type xsd:integer ]
> > for properties with a range of xxx:size?
> >
> > Or should an RDF/RDFS engine testing range constraints also
> > be an XML Schema data type engine able to parse and understand
> > native XML Schema derived type definitions?
> >
> 
> My proposed XML Schema/RDF Schema/RDF integration is showing in the
> examples I sent an hour ago:
> 
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0198.html
> 
> Basically the XML Schema sits in one file, and the RDF or RDFS refers
> to it in some way (e.g. using its URL or using xsi:schemaLocation).
> 
> In this framework RDFS *does not* duplicate any of the mechanisms of
> XML Schema but merely uses them (in external XML Schema files).
> 
> This is me trying to play by the charter; we might want to conclude
> that this is sufficiently messy that the charter should be interpreted
> more liberally.

I think that an approach such as you suggest could be workable
in an environment which is interpreting literals in terms of
XML Schema simple data types, but I also think it is imperative
that the interpretation of typed literals happen at a completely
different level from both the association of types with literals
and logical inference about the suitability of typed values in
a given context, based on subClassOf relations between type
classes.

There are three reasons for this separation:

1. XML Schema is likely to evolve, and we wouldn't want RDF
coupled so tightly that it has to track XML Schema evolution
step by step (not that your proposal above really suggests
too tight a coupling, per se).

2. XML Schema offers only one data type scheme, and in many
cases it is "overkill": one may not wish to employ an XML
Schema parser just to validate lexical forms, and one may in
fact need to intern those values in a non-XML system, where an
XML Schema interpretation layer would simply be "in the way".
An ontology that defines lexical spaces in terms of basic
regular expressions (such as the 'lit:' URV scheme) would serve
all the needs of validating lexical forms and relating a
lexical space to (possibly multiple) value spaces.

XML Schema is intended for defining both complex and simple
types, and the mechanisms for defining simple types are mostly
the same as for the complex types, which makes sense in an
environment where both must be addressed. RDF, however, would
use only the subset of XML Schema functionality relating to
simple types, and that subset by itself represents a very heavy
notation and layer of machinery for a job which could be done
equally well using simple regular expressions and a basic
ontology for relating lexical and value spaces.

Also, there already exist many data type schemes in use in the
world, and RDF systems should be able to support knowledge ported
from those systems without requiring conversion to XML Schema
data types (which may not even suffice). Nor should the mechanisms
for data typing exhibit any characteristics which could be seen
as prejudiced against data type schemes that are not based on or
grounded in XML Schema.

3. Folks are going to want and need to apply generic operations
and interpretations to typed values in general (not just literals)
based on subClassOf relations between type classes, rdfs:range
constraints, daml:equivalentTo/ont:equivalentTo relations, etc.
Interpretation of lexical forms, i.e. mapping value representations
into internal canonical representations in a value space within
a given system, does not, and should not, come into play for such
generic layers. Thus we need to focus on a solution that provides
consistent, explicit representation of typing for all values (not
just literals) and which immutably preserves the contextual
knowledge about literals (local type or predicate of original
statement) so that, when the time for interpretation does come,
all necessary information is available.
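
To make the separation in point 3 a bit more concrete, here is a
rough sketch in Python of a range check that consults only rdf:type
and subClassOf relations and never parses a lexical form. Everything
in it is hypothetical: the dictionaries, the function names and the
property ex:shoeSize are invented for illustration, and the
subClassOf statements are just the assumption raised in the quoted
question above (xsd:integer and xsd:string declared as subclasses of
xxx:size).

```python
# Hypothetical sketch: range checking driven only by rdf:type and
# rdfs:subClassOf. The lexical form is carried along but never parsed.

# Assumed subClassOf statements (taken from the quoted question).
SUBCLASS_OF = {
    "xsd:integer": {"xxx:size"},
    "xsd:string": {"xxx:size"},
}

# Assumed rdfs:range statement for an invented property.
RANGE = {"ex:shoeSize": "xxx:size"}


def superclasses(cls):
    """Return cls plus every class reachable through subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def satisfies_range(prop, value):
    """Check a typed value against the range of prop.

    value is a dict with 'lexical' and 'type' keys; only 'type' is
    consulted, so the literal's original context is left untouched.
    """
    required = RANGE.get(prop)
    if required is None:
        return True  # no range constraint declared for this property
    return required in superclasses(value["type"])


# Corresponds to [ rdf:value "1"; rdf:type xsd:integer ]
typed_value = {"lexical": "1", "type": "xsd:integer"}

ok = satisfies_range("ex:shoeSize", typed_value)
bad = satisfies_range("ex:shoeSize", {"lexical": "1", "type": "xsd:date"})
```

Note that the check succeeds or fails without ever deciding what "1"
denotes, and the lexical form and its type are still there, unchanged,
for whatever layer eventually does the interpretation.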

--

So, my recommendations/wishes are that the solution we choose
be fully generic, based solely on rdf:type and rdfs:range
constructs, focus on the consistent representation of and 
preservation of context for literal typing, and leave 
interpretation of values (both literals and otherwise) outside 
the scope of RDF proper. 

This means no "adoption" of any XML Schema value spaces or 
definitions of mappings from lexical forms to value spaces in 
the core MT, though the issues of data types, lexical spaces,
value spaces, relations between types and spaces, and mappings
from lexical forms embodied in literals to values in particular
value spaces can and *should* be addressed in a non-normative
appendix to the RDF spec (or MT) so that implementors are
provided with a general overview of the issues and suggested
practices, etc.

But interpretation of values such as mapping from lexical form
to system internal value is IMO clearly outside the scope of the 
RDF core model and strictly within the realm of individual 
applications which utilize the knowledge encoded in RDF.
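
As a rough illustration of what such an application-level layer
might look like, here is a minimal Python sketch that validates
lexical forms with plain regular expressions and maps them into a
value space. The type names ex:integer and ex:decimal, the patterns
and the interpret() function are all assumptions made up for this
example; they are not taken from XML Schema, from the 'lit:' scheme
or from any RDF spec.

```python
import re
from decimal import Decimal

# Hypothetical application-side registry: for each type name, a
# regular expression defining the lexical space and a function
# mapping a valid lexical form into a value space.
DATATYPES = {
    "ex:integer": (re.compile(r"[+-]?[0-9]+"), int),
    "ex:decimal": (re.compile(r"[+-]?[0-9]+(\.[0-9]+)?"), Decimal),
}


def interpret(lexical, type_name):
    """Validate a lexical form against its type and return the value."""
    pattern, to_value = DATATYPES[type_name]
    if pattern.fullmatch(lexical) is None:
        raise ValueError(
            "%r is not in the lexical space of %s" % (lexical, type_name)
        )
    return to_value(lexical)


# The same lexical form can be mapped into more than one value space,
# depending on the type the application was handed along with it.
as_integer = interpret("1", "ex:integer")
as_decimal = interpret("1", "ex:decimal")
```

Nothing here touches the graph: the application is handed a lexical
form and a type, both preserved by the RDF layer, and decides for
itself how, or whether, to turn them into an internal value.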

RDF simply needs to ensure that all knowledge is represented
and maintained reliably, and that operations based on type
relations do not lose the context of original typing of values.

Cheers,

Patrick

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Tuesday, 13 November 2001 02:42:37 UTC