Sets, Nodes and Types was: Re: DAML ObjectProp vs DatatypeProp

Let me expand on my previous post regarding a few of these important issues
(at length).

>
> Other specifications, e.g. RDF, XML Schema etc. define their own type
> mechanisms and in different ways. In the current situation, for the reasons
> you mention as well as others, there is no coherent notion of types.

There has been a long debate about the nature of an XML namespace not being
defined traditionally as a set of names. The rationale for this is that an
XML type name (as defined in a DTD) can refer to both an attribute and
element e.g.

<!ELEMENT ex:foo (bar|baz|bop)+>
<!ATTLIST ex:foo ex:foo NMTOKEN #REQUIRED>

So the Name "ex:foo" serves as both an element _and_ attribute type name in
XML 1.0.

This relationship of QNames to type definitions is carried over to XML
Schema, which uses a namespace qualified name (QName) to denote a type
definition, _yet_ such a QName may identify a set of type definitions, one
each for {attribute, element and complexType}.

One 'problem' with assigning a URI reference to an XML Schema type in
general is that the QName _is not mapped to a single definition_, rather a
_set_ of definitions.

The XML Schema Formal Specification WD (formerly MSL) introduces a new
fragment identifier syntax for XML Schema datatypes:

http://www.w3.org/TR/xmlschema-formal/#section-overview-normalization

(since DAML contemplates leveraging XML Schema datatypes this document
should be understood -- yet does _not_ apply to XML Schema 1.0 ... hmmrph!)

>
> I have suggested that we might best use predicate calculus and set
> membership to define a schema independent (i.e. XML Schema and RDF and TREX
> and DTD independent) notion of types -- I am not the first to suggest
> datatyping in this manner.

And on the other hand I am not sure that I really care about the nitty
gritty details of the differences between "lexical" and "value" spaces ...
since I (firmly) consider XML to be character based, I consider _every_
datatype ultimately character based (i.e. grounded in lexical space). I
consider the "value" space a convenient shorthand for describing limits on
the range of values for an (e.g.) integer derived datatype. I.e. the fact
that a particular type is defined as 'between "1" and "10"' is merely
shorthand for a lexical pattern/regular expression/EBNF describing this
limitation.
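
As a rough illustration of that last point, the sketch below checks the
'between "1" and "10"' constraint twice: once purely lexically, with a regular
expression, and once by parsing to an integer and comparing. The function
names are hypothetical, and the pattern assumes a canonical form with no sign
and no leading zeros (a simplification, not the full integer lexical space of
XML Schema).

```python
import re

# Lexical-space definition: exactly the strings "1" .. "10".
# Assumption: canonical form only (no "+1", no "01").
ONE_TO_TEN = re.compile(r"[1-9]|10")


def in_lexical_space(text: str) -> bool:
    """Decide membership by pattern matching on characters alone."""
    return ONE_TO_TEN.fullmatch(text) is not None


def in_value_space(text: str) -> bool:
    """Decide membership by parsing to an integer and comparing numerically."""
    if not (text.isascii() and text.isdigit()):
        return False
    if len(text) > 1 and text[0] == "0":
        return False  # reject leading zeros to stay in the canonical form
    return 1 <= int(text) <= 10


samples = ["0", "1", "5", "10", "11", "01", "+3", "", "ten", "1.0"]
agree = all(in_lexical_space(s) == in_value_space(s) for s in samples)
```

Under these assumptions both checks accept and reject the same strings, which
is the sense in which the range is shorthand for a lexical pattern.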

I accept that one can and should be able to assign semantics to a datatype
value (e.g. that "1" represents the integer 1); it's just that I consider this
operation _exactly_ akin to assigning semantics to a triple as represented
in the RDF abstract syntax.

What this means is that one can offload this typing mechanism to the schema
specification, or schema validation engine. The notion of types that I
support defines _typing_ in terms of set membership.

The assumption is that each entity resolved from a URI is associated with an
abstract syntax and the format of the abstract syntax may be determined by
the media type of the HTTP MIME transaction. In any case, the entity is
represented as a set of "nodes" the structure of which is determined by the
abstract syntax.

The fragment identifier (part of a URI reference) maps to a set of nodes
(this set is termed an "Abstract Node"). In considering a URI reference as a
QName, the URI part defines a root node (abstract syntax of the resolved
entity in RFC 2396 terms) and the fragment identifier an abstract node.
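
A minimal sketch of this mapping, with loud assumptions: the example.org URI,
the ENTITY_NODES table and the resolve callback are all hypothetical, and the
callback merely stands in for fetching the entity and parsing it into its
abstract syntax. Only urldefrag is a real library call (Python's
urllib.parse).

```python
from urllib.parse import urldefrag

# Hypothetical resolved entity: the nodes it contains, grouped by name.
ENTITY_NODES = {
    "foo": {"element:foo", "attribute:foo"},
    "bar": {"element:bar"},
}


def abstract_node(uri_ref, resolve):
    """Split a URI reference and return (root URI, set of selected nodes).

    `resolve` stands in for retrieving the entity named by the URI and
    parsing it according to its media type.
    """
    uri, fragment = urldefrag(uri_ref)
    nodes_by_name = resolve(uri)
    return uri, frozenset(nodes_by_name.get(fragment, ()))


root, nodes = abstract_node(
    "http://example.org/schema#foo",
    lambda uri: ENTITY_NODES,  # pretend every URI resolves to this entity
)
```

Here the fragment "foo" selects two nodes at once, so the abstract node is a
set rather than a single node.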

Assume a QName as a type specifier. The Namespace URI identifies a schema of
which the entity is an instance, and the "localname" part of the QName
specifies a fragment identifier that specifies a _set_ of definitions
constraining the type of the node.

For example if the node is an attribute node, it is constrained to be an
instance of the attribute member of the type set. If the node is an element
node, it is constrained to be an instance of the element member.

If the node is constrained to be a complexType, it is constrained to be a
member of the instance set of the complexType class definition.
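
One way to picture the rules above is the sketch below, in which a QName maps
to a set of definitions keyed by node kind and the kind of the node picks the
member that constrains it. Everything here is hypothetical: the QName, Node
and TYPE_SETS names, the example.org namespace, and the two checkers, which
are crude stand-ins for what a real schema processor would do (the attribute
check only approximates NMTOKEN).

```python
from typing import Callable, Dict, NamedTuple


class QName(NamedTuple):
    namespace_uri: str
    local_name: str


class Node(NamedTuple):
    kind: str        # "attribute", "element" or "complexType"
    content: object  # attribute value, or tuple of child element names


Checker = Callable[[object], bool]
FOO = QName("http://example.org/ns", "foo")

# One QName, a *set* of definitions: one member per node kind.
TYPE_SETS: Dict[QName, Dict[str, Checker]] = {
    FOO: {
        # Stand-in for NMTOKEN: a non-empty string without whitespace.
        "attribute": lambda v: isinstance(v, str)
        and v != ""
        and not any(c.isspace() for c in v),
        # Stand-in for the content model (bar|baz|bop)+.
        "element": lambda kids: len(kids) > 0
        and all(k in {"bar", "baz", "bop"} for k in kids),
    }
}


def conforms(node: Node, qname: QName) -> bool:
    """Pick the definition matching the node's kind and apply it."""
    return TYPE_SETS[qname][node.kind](node.content)
```

For instance, conforms(Node("element", ("bar", "bop")), FOO) and
conforms(Node("attribute", "x1"), FOO) consult different members of the same
type set.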

How is this determined? I don't necessarily care; one looks to the XML
Schema specification (in reality one asks a piece of software that
implements XML Schema whether the particular node 'conforms' to the type
definition).

So we can define types using DTDs, XML Schema, DAML+OIL ... and I consider
DAML+OIL _exactly_ the same as XML Schema in its ability to provide a (data
or whatever kind of) type. We simply 'ask' the predicate

for all expr isTypeOf(expr, class) => expr in Instances(class)

and this magical predicate tells us yes or no. So really whether it
determines this by lexical or value space or asking an oracle or whatever
other mechanism doesn't really matter (of course the mechanism _does_ matter
but for other reasons).
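
To make the "ask the predicate" idea concrete, here is a small sketch in which
each class is known only through a membership test for Instances(class). The
registry, the class names and the ex: individuals are hypothetical; the
boolean strings follow the XML Schema boolean lexical forms, and the ex:Person
entry merely imitates what an ontology-backed class might answer.

```python
from typing import Callable, Dict

# Each class is represented only by a membership test for Instances(class).
INSTANCES: Dict[str, Callable[[object], bool]] = {}


def register(class_name: str, membership_test: Callable[[object], bool]) -> None:
    """Associate a class name with whatever mechanism decides membership."""
    INSTANCES[class_name] = membership_test


def is_type_of(expr: object, class_name: str) -> bool:
    """isTypeOf(expr, class): true exactly when expr is in Instances(class)."""
    return INSTANCES[class_name](expr)


# A datatype decided by enumerating lexical forms.
register("xsd:boolean", lambda e: e in {"true", "false", "1", "0"})

# A class decided by lookup in a set of known individuals (hypothetical data).
KNOWN_PERSONS = {"ex:alice", "ex:bob"}
register("ex:Person", lambda e: e in KNOWN_PERSONS)
```

The caller of is_type_of sees no difference between the two registrations,
which is the point: the mechanism behind the answer stays inside the box.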

So at the end of the day, and for the reasons that I've outlined, I don't
support the distinction between ObjectProp and DatatypeProp ... it forces
you to look into boxes probably best left undisturbed.

Jonathan Borden
The Open Healthcare Group
http://www.openhealth.org

Received on Thursday, 17 May 2001 15:05:54 UTC