Re: On constraining/validating datatypes

"Steven J. DeRose" <sjd@eps.inso.com> wrote:

> ----------------------------------
> Along the first axis, there are several obvious points we can choose:
>
> a) Do nothing: [...]
> b) Define a small, fixed number of atomic types.
> c) Define a language for defining datatypes: regex (say, per POSIX), or
> perhaps HyLex.
> d) Define a way to access *any* programming, scripting, or other language at
> all.


I vote for (c) and (d), using HyLex for (c) and NOTATIONs
to declare (but not necessarily define) support for (d).

HyLex is less familiar to most people than Perl or POSIX regexps,
but it's equally powerful.  It's more verbose, but also
less cryptic to the uninitiated.

> Along the second axis, possible approaches are shown below. [...]
>
> a) Associate datatypes with data via attributes. [...]
> b) State the relationships between datatypes and attributes or content right
> with the definitions, for example in header elements [...]
> c) In the DTD itself, via an amendment.


Or (d), in the DTD itself via (GASP!) processing instructions.
Something like:


    <!-- <?LEXMODEL ... > PI defines a lexical model: -->

    <?LEXMODEL scheme	("http" | "ftp" | "gopher")	>
    <?LEXMODEL host   	(([a-zA-Z][a-zA-Z0-9]*), ".")+	>
	<!-- or   	((NMSTART, NMCHAR*), ".")+    -->
    <?LEXMODEL port 	(DIGIT+) 			>
    <?LEXMODEL path 	((URLCHAR*, "/")+)		>
    <?LEXMODEL fragment ([a-zA-Z0-9]+)			>
    <?LEXMODEL query 	(URLCHAR*)			>
    <?LEXMODEL url	( (scheme, "://", host, (":", port)?, "/")?,
			  path,
			  ("#", fragment)?,
			  ("?", query)? )		>

    <!-- <?LEXTYPE ... > PI associates a lexical model
	 with an attribute or content:  -->

    <?LEXTYPE url (IMG/SRC | WEBADDRESS/#CONTENT | #ANY/HREF)>

    <!-- Possibility: defining a NOTATION with the same name
	 as a lexical model declares semantics for that type: -->

    <!NOTATION url PUBLIC
	"-//IETF//NOTATION RFC 1738 Uniform Resource Locators">


All of this can go in the DTD.  The PIs could possibly go at the
front of the document instead; the <!NOTATION...> decls cannot.
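
To make the idea concrete, here is a rough sketch of how a processor
might act on the PIs above: each LEXMODEL is translated by hand into
a regular expression built from the models defined before it, and the
LEXTYPE PI becomes a lookup table.  None of this is a real HyLex
implementation.  The names MODELS, LEXTYPES, model_for and is_valid
are hypothetical, and the definition of URLCHAR is an assumption,
since the sketch above uses that name without defining it.

```python
import re

# Character classes.  URLCHAR is an assumption: the PIs above name it
# but never define it.  It deliberately excludes "/", "#", "?" and ":".
ALPHA = "[a-zA-Z]"
ALNUM = "[a-zA-Z0-9]"
DIGIT = "[0-9]"
URLCHAR = r"[a-zA-Z0-9$_.+!*'(),%-]"

# One entry per <?LEXMODEL ...> PI, translated literally.
MODELS = {
    "scheme": "(?:http|ftp|gopher)",
    "host": rf"(?:{ALPHA}{ALNUM}*\.)+",
    "port": f"{DIGIT}+",
    "path": f"(?:{URLCHAR}*/)+",
    "fragment": f"{ALNUM}+",
    "query": f"{URLCHAR}*",
}
# "url" refers to the earlier models by name, as the PI does.
# Fragment precedes query only because the PI lists them that way.
MODELS["url"] = (
    "(?:{scheme}://{host}(?::{port})?/)?"
    "{path}"
    "(?:#{fragment})?"
    r"(?:\?{query})?"
).format(**MODELS)

# <?LEXTYPE url (IMG/SRC | WEBADDRESS/#CONTENT | #ANY/HREF)>
LEXTYPES = {
    "url": [("IMG", "SRC"), ("WEBADDRESS", "#CONTENT"), ("#ANY", "HREF")],
}


def model_for(element, slot):
    """Return the lexical model name bound to element/slot, or None."""
    for name, targets in LEXTYPES.items():
        for elem, target in targets:
            if target == slot and elem in (element, "#ANY"):
                return name
    return None


def is_valid(element, slot, value):
    """Check value against its lexical model; unbound slots always pass."""
    name = model_for(element, slot)
    if name is None:
        return True
    return re.fullmatch(MODELS[name], value) is not None


print(is_valid("IMG", "SRC", "http://www.example.com./icons/"))
print(is_valid("A", "HREF", "pub/docs/#top"))
print(is_valid("A", "HREF", "mailto:someone"))
```

Read literally, the host model wants a "." after every label, so
the first call writes the host with a trailing dot; the same URL
without it is rejected by this translation.  A real processor would
presumably parse the model syntax itself rather than rely on a
hand translation like this one.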


--Joe English

  jenglish@crl.com

Received on Thursday, 22 May 1997 16:17:16 UTC