Re: Datatypes questions

Good comments!

1) We could add an optional facet to the URI datatype restricting the types
    of elements it refers to.  Paul Biron, what do you think?
2) I am for allowing both pictures and regexes because, as you say, each
    has its virtues.  I'm less keen on allowing both on a single datatype
    specification because of the complex errors that can cause and the
    difficulty of tracking them down.
3) Others have argued that we need to keep ID, IDREF, NMTOKENS etc.
    because they appear in XML 1.0.  We can certainly downplay them.

Regards, Ashok

Paul Prescod <> on 05/12/99 05:58 PM

To:, Ashok Malhotra/Watson/IBM@IBMUS,
Subject:  Datatypes questions

> "Issue (uri-scheme-facet): should we have a facet to allow a limitation
> to a specific scheme? It might be useful to be able to say that something
> was not only a URI, but that it was a "mailto" and not a "http://...".

No. I think it would be in bad form to restrict by protocol. If I invent
httpplus next week my schema should not restrict me from using it. The
much more interesting sort of restriction is by target -- i.e. "this link
must go to an XML element with GI foo." But that might be out of scope.
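
To make the objection concrete, here is a minimal sketch in Python of what a scheme-restricting facet would amount to. The function name scheme_allowed, the allowed set, and the example URIs are all hypothetical; nothing here comes from the draft itself.

```python
from urllib.parse import urlsplit


def scheme_allowed(uri: str, allowed_schemes: set[str]) -> bool:
    """Hypothetical scheme facet: accept a URI only if its scheme is listed."""
    return urlsplit(uri).scheme.lower() in allowed_schemes


# A schema author fixes the list today ...
allowed = {"http", "mailto"}

print(scheme_allowed("mailto:someone@example.org", allowed))
print(scheme_allowed("http://example.org/doc.xml", allowed))
# ... and a scheme invented next week is rejected until the schema is revised.
print(scheme_allowed("httpplus://example.org/doc.xml", allowed))
```

The last call returns False even though the link may be perfectly usable, which is the restriction being argued against.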

> Issue (picture-or-regex): Should the values of the
> [Lexical representation] facet be pictures, regexs, both or some
> other mechanism?

Not only do we need both, I'm going to argue that we should be allowed to
specify both for the same user-defined data type. Pictures are nice and
simple. Regexps are powerful. One feature that pictures support that
regexps do not is nice, guided editing. ###-####-#### can be easily
rendered into a GUI. A regexp cannot. In the (admittedly rare) case that a
type had both I would expect the picture to be used for guided editing and
the regular expression for more complicated constraints. Of course the
input would have to match both.
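
A rough sketch of the "must match both" rule, in Python. It assumes a picture syntax in which '#' stands for exactly one decimal digit and every other character is a literal; the draft does not define the syntax here, so that reading, the helper names, and the sample regex are assumptions for illustration only.

```python
import re


def picture_to_regex(picture: str) -> str:
    """Translate a picture into a regex: '#' is one digit, the rest is literal."""
    return "".join("[0-9]" if ch == "#" else re.escape(ch) for ch in picture)


def matches_both(value: str, picture: str, pattern: str) -> bool:
    """A value is valid only if it fits the picture AND the regular expression."""
    fits_picture = re.fullmatch(picture_to_regex(picture), value) is not None
    fits_pattern = re.fullmatch(pattern, value) is not None
    return fits_picture and fits_pattern


PICTURE = "###-####-####"   # simple enough to drive a guided-entry form
PATTERN = r"[1-9].*"        # extra constraint: first digit must not be zero

print(matches_both("555-1234-5678", PICTURE, PATTERN))
print(matches_both("055-1234-5678", PICTURE, PATTERN))
print(matches_both("555-1234-567", PICTURE, PATTERN))
```

The picture stays readable enough to lay out as three input boxes, while the regex carries the constraint the picture cannot express.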

> Issue (nmtoken-primitive-or-generated): should NMTOKEN be defined as
> a primitive (as above) or as a subtype of [string] with a
> regular expression facet such as "[a-zA-Z0-9_-]+" (or whatever
> the regular expression actually should be to match the
> Nmtoken production)? A similar issue also applies to all of the
> XML attribute types, [ID], [IDREF], [IDREFS], [ENTITY], [ENTITIES]
> and [NOTATION].

First you have to ask yourself whether you want this stuff just for XML
compatibility. If not, get rid of it. Otherwise, I would encourage you to
stick it into a "for backwards compatibility" gutter in the "generated
types" section of the document.
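
For reference, the "generated" alternative from the quoted issue might look like the following Python sketch: NMTOKEN as a string restricted by a regular expression, and NMTOKENS as a whitespace-separated list of such strings. The pattern is an ASCII-only approximation chosen for illustration. The Nmtoken production in XML 1.0 also admits '.', ':' and many non-ASCII name characters, so the '.' and ':' are added to the expression quoted above, and the non-ASCII characters are simply left out.

```python
import re

# ASCII-only approximation of the XML 1.0 Nmtoken production (an assumption;
# the real production also allows non-ASCII letters, digits, and extenders).
NMTOKEN_PATTERN = re.compile(r"[A-Za-z0-9._:-]+")


def is_nmtoken(value: str) -> bool:
    """NMTOKEN as a subtype of string with a regular-expression facet."""
    return NMTOKEN_PATTERN.fullmatch(value) is not None


def is_nmtokens(value: str) -> bool:
    """NMTOKENS as a non-empty, whitespace-separated list of NMTOKEN values."""
    tokens = value.split()
    return bool(tokens) and all(is_nmtoken(token) for token in tokens)


print(is_nmtoken("chapter-1.draft"))
print(is_nmtoken("two words"))
print(is_nmtokens("red green blue"))
```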

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself

Diplomatic term: "Regret"
Translation: To care, but not enough to condemn. ("We regret the loss of
             life in Sierra Leone. We have no intention to do anything
             to stop it, mind you, but we regret that it happened.")
(Brill's Content, Apr. 1999)

Received on Thursday, 13 May 1999 10:03:21 UTC