let's subset XML Data

The current protocol draft includes language copied from the XML Data
draft.  (We copy it, rather than refer to it externally, to avoid
dependencies.)

I would like to propose that, for the purposes of DASL, we subset XML Data
in two ways:

1) Drastically shorten the list of supported datatypes.
2) Prohibit (or rather, simply fail to document) certain syntactic
features of XML Data.

These changes will make DASL datatyping much easier to understand, and the
result remains fully compatible with the full XML Data superset.

In XML Data, the datatype is indicated by an attribute "dt" which is
defined in the namespace urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/.
(In all extant examples of XML Data, the prefix dt is used for this
namespace, but this is simply a custom, and is not required.)  Thus one
might express the datatype of the dav:creationdate property (which holds an
ISO 8601 string) as follows:

 <?xml:namespace ns="urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/"
     prefix="t"?>
 <D:prop>
   <D:creationdate t:dt="t:dateTime.iso8601tz"/>
 </D:prop>

A UI that received such a prop (as part of the query grammar schema
reply) could use the datatype attribute to ensure that the user entered
only valid dates in a type-in box, for example.
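
To make the client side concrete, here is a minimal Python sketch of how a
UI might pull the declared datatype out of such a prop.  It is only an
illustration under stated assumptions: it uses xmlns: declarations in place
of the <?xml:namespace?> processing instruction (which current parsers do
not understand), it assumes D is bound to "DAV:", and declared_datatypes is
a hypothetical helper name, not something from the draft.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the message above.
DT_NS = "urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/"

# Assumption: xmlns: declarations stand in for the <?xml:namespace?> PI,
# and the D prefix is bound to "DAV:".
SAMPLE = (
    '<D:prop xmlns:D="DAV:" xmlns:t="' + DT_NS + '">'
    '<D:creationdate t:dt="t:dateTime.iso8601tz"/>'
    '</D:prop>'
)

def declared_datatypes(prop_xml):
    """Map each child element's expanded name to its raw dt attribute value."""
    root = ET.fromstring(prop_xml)
    dt_attr = "{" + DT_NS + "}dt"  # ElementTree's {namespace}local form
    result = {}
    for child in root:
        value = child.get(dt_attr)
        if value is not None:
            result[child.tag] = value
    return result

print(declared_datatypes(SAMPLE))
```

Note that the parser expands the attribute *name* but hands back the
attribute *value* ("t:dateTime.iso8601tz") untouched.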

The XML Data note defines 28 datatypes.  I believe we should dispense with
nearly all of them.  My main reason is to keep the protocol draft small.
Secondly, I don't think we can possibly define datatypes for *every* type
that a UI might want to implement.  For example, it is often useful to
enter a pathname of a local file, e.g. for uploading.  But XML Data does
not specify a "pathname" datatype.  The use of XML namespaces allows
servers and clients to make private agreements about additional syntactic
datatypes, if they wish.

I propose that we keep only as many datatypes as are needed to define
property values defined by DAV itself, and avoid the slippery road of
trying to define every useful type.  (Note that there is no XML Data type
more specific than string for the HTTP-date format used in the
DAV:getlastmodified property.  I think we should ignore this.)

The syntactic feature I wish to remove concerns namespace processing of
attribute values.  The XML Data proposal requires namespace processing not
only of the attribute name (e.g., t:dt) but also of the attribute *value*.
This is because the attribute value is supposed to be a URI.

Having read the XML Namespace note, it's clear to me that this is not
sanctioned by that note, and I infer that it won't be supported in XML
parsers, so any application using XML Data will have to provide namespace
expansion of attribute values itself, in a post-processing step.
(I have written to C Frankston, the Microsoft representative to XML Data,
to check this.)

I think this is ugly, but I can tolerate it.  What I can't tolerate is that
XML Data mandates that if the datatype attribute value has no colon (e.g.
it is "int" as opposed to "t:int") then a default prefix corresponding to
the UUID URN above is to be used.  This is where I draw the line.  This
feature requires every XML Data application to have the URN of the XML
Data default schema wired into it, which puts future extensibility at
risk, and all it saves is two characters in the attribute value.

We should not use this convention.  All DASL datatype attribute values
should be fully qualified.  This is a fully compatible restriction, and
simplifies DASL clients and servers.
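
As a rough illustration of the post-processing step under this restriction,
the Python sketch below expands a prefixed datatype value against the
prefixes in scope and refuses an unqualified one, so no default URN has to
be wired in.  The function name expand_datatype and the dict-based prefix
map are my own assumptions, and the sketch handles only the prefixed form,
not a value that is already written out as a full URI.

```python
# Namespace URI taken from the message above.
DT_NS = "urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/"

def expand_datatype(value, prefixes):
    """Expand a 'prefix:name' dt value to a full URI.

    `prefixes` maps each namespace prefix in scope to its namespace URI.
    Unqualified values such as "int" are rejected rather than defaulted.
    """
    prefix, sep, local = value.partition(":")
    if not sep or not prefix or not local:
        raise ValueError("datatype value must be prefix-qualified: %r" % value)
    if prefix not in prefixes:
        raise ValueError("undeclared namespace prefix: %r" % prefix)
    return prefixes[prefix] + local

print(expand_datatype("t:int", {"t": DT_NS}))
```

With this restriction, the only table the application needs is the set of
prefixes the document itself declares.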

Here is my proposed replacement for the current section 12, in plain text:

12. DASL Data-typing

A datatype indicates that the contents of an element can be parsed or
interpreted to yield a type more specific than a string.

We expose the datatype of an element instance by use of an attribute whose
value is a URI giving the datatype. (The URI might be explicitly in URI
format or might rely on the XML namespace facility for resolution.) For
example, we might find a document containing something like: 

 <?xml:namespace ns="urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/"
     prefix="t"?>
 <D:prop>
   <D:creationdate t:dt="t:dateTime.iso8601tz"/>
 </D:prop>

12.1.1 The Datatype attribute and namespace 

The datatype attribute "dt" is defined in the namespace named
"urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/".  The full URN of the
attribute is "urn:uuid:C2F41010-65B3-11d1-A29F-00AA00C14882/dt". 

Datatypes are identified by URIs.  The URI is simply a reference to a
section of a document that defines the appropriate parser and storage
format of the element.  To make this broadly useful, this document defines
a set of common datatypes sufficient for WebDAV data.

12.1.2 Specific Datatypes 

Name:        string
Parse type:  pcdata
Examples:    Omwnuma legatai wn onoma monon koinon, o de kata tounoma
             logos thV ousiaV eteros, oion zuon o te anqropoV kai to
             gegrammenon.

Name:        number
Parse type:  A number with no limit on digits; it may have a leading
             sign, fractional digits, and optionally an exponent.
             Punctuation as in US English.
Examples:    15, 3.14, -123.456E+10

Name:        int
Parse type:  A number, with optional sign, no fractions, no exponent.
Examples:    1, 58502, -13

Name:        float
Parse type:  Same as for "number".
Examples:    .314159265358979E+1

Name:        boolean
Parse type:  "1" or "0"
Examples:    0, 1 (1=="true")

Name:        dateTime.iso8601tz
Parse type:  A date in ISO 8601 format, with optional time and optional
             zone.  Fractional seconds may be as precise as nanoseconds.
Examples:    1994-11-05T08:15:5Z
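
For what it is worth, the parse types above are simple enough that a client
could check most of them with a few regular expressions.  The Python sketch
below is one possible reading of the table, not a normative grammar: the
patterns and the is_valid helper are my own assumptions, type names are
matched case-insensitively, and dateTime.iso8601tz is deliberately left out
because the table does not pin down its exact syntax.

```python
import re

# One reading of "number": optional sign, digits with an optional fraction
# (or a bare fraction), and an optional exponent.
_NUMBER = r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?"

# Assumed patterns, keyed by lowercase datatype name.
CHECKS = {
    "string": re.compile(r".*", re.DOTALL),  # any character data
    "number": re.compile(_NUMBER),
    "int": re.compile(r"[+-]?\d+"),
    "float": re.compile(_NUMBER),            # same as for "number"
    "boolean": re.compile(r"[01]"),
}

def is_valid(type_name, text):
    """Return True if `text` fits the assumed pattern for `type_name`."""
    pattern = CHECKS.get(type_name.lower())
    if pattern is None:
        raise KeyError("no check defined for datatype %r" % type_name)
    return pattern.fullmatch(text) is not None

for name, sample in [("number", "-123.456E+10"), ("int", "58502"),
                     ("float", ".314159265358979E+1"), ("boolean", "1")]:
    print(name, sample, is_valid(name, sample))
```

A type-in box could run such a check before the query is ever sent.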

Received on Friday, 19 June 1998 21:28:08 UTC