Re: questions about attribute declarations from Gavin Nicol on 1996-09-25 (w3c-sgml-wg@w3.org from September 1996)

From: Gavin Nicol <gtn@ebt.com>
Date: Wed, 25 Sep 1996 13:06:28 GMT
To: lee@sq.com
CC: ht@cogsci.ed.ac.uk, w3c-sgml-wg@w3.org
Message-Id: <199609251306.NAA12405@wiley.EBT.COM>

>I'd rather see NUMBER left as it is in SGML, dropped from XML, and
>replaced with something like a simplified HyLex:
>    <!AttList Person
>	Salary /^[$£][0-9]+(\.[0-9][0-9])?/ #REQUIRED
>
>
>
>(the 8-bit character is a pound-sterling, perhaps I should've used an entity!)
>
>This doesn't cover multilingual documents, but at least it doesn't wire a
>single number representation into a standard either.
>
>This could easily be extended to content:
>
>    <!Element postCode - -
>	(/^[A-Z][A-Z]+[0-9]?( [0-9]+[A-Z]+)?$/)
>    >
>
>    <!Element CAPITALS - -
>	(/[:upper:]/|smalls)*
>    >
>
>    <!Element smalls - -
>	(/[:lower:]/|CAPITALS)*
>    >
>
>where /.../ is like PCDATA except matching the given regexp.

This is something I've also wanted for a long time. Having this would
make it easy to validate not only the structure of a document, but also
the lexical form of its data: great for database applications.
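
As a rough illustration of the kind of check this would enable, here is a
minimal sketch in Python. The table layout and the function name are my
own assumptions, not part of any proposal; the two patterns are the ones
quoted above, applied here as whole-value matches.

```python
import re

# Hypothetical table of lexical types, keyed by (element, attribute).
# An attribute of None means the pattern constrains element content.
LEXICAL_TYPES = {
    ("Person", "Salary"): re.compile(r"[$£][0-9]+(\.[0-9][0-9])?"),
    ("postCode", None): re.compile(r"[A-Z][A-Z]+[0-9]?( [0-9]+[A-Z]+)?"),
}


def is_valid(element, attribute, value):
    """Return True if value satisfies the declared lexical type, if any."""
    pattern = LEXICAL_TYPES.get((element, attribute))
    if pattern is None:
        return True  # no lexical constraint was declared
    # fullmatch requires the whole value to match, not just a prefix.
    return pattern.fullmatch(value) is not None
```

A parser could run such a check at the same point where it validates
declared values today, so a value like "100" for Salary would be rejected
before the application ever sees it.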

The typical response to this is that "this is an application issue".

I should note that the "doesn't cover multilingual" limitation can be
removed easily by fixing the coded character set, and then defining a set
of standard character classes based on that (like DIGIT, NAMESTART,
etc.). Differences in numeric formats and the like would make
it somewhat complicated in the general case, but in cases where there
would be a mixture (for example, HTML) it would be better to require
a normalised form, as I proposed to HTML-WG a long time ago. Note that
this is still just a DTD issue and an application convention.
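
To make the character-class idea concrete, here is a small Python sketch.
It assumes Unicode as the fixed coded character set and uses Python's
unicodedata module as a stand-in for a standard DIGIT class; the function
names and the AMOUNT pattern are hypothetical, chosen only for
illustration.

```python
import re
import unicodedata


def is_digit(ch):
    """Hypothetical DIGIT class: any decimal digit in the character set."""
    return unicodedata.decimal(ch, None) is not None


def normalise_digits(text):
    """Map every decimal digit to its ASCII equivalent; keep the rest."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


# The pattern is written once, against the normalised form only.
AMOUNT = re.compile(r"[0-9]+(\.[0-9][0-9])?")


def is_valid_amount(text):
    """Check a value after reducing its digits to the normalised form."""
    return AMOUNT.fullmatch(normalise_digits(text)) is not None
```

This only handles digits. Differences such as the decimal separator are
exactly the numeric-format complications mentioned above, which is why
requiring a single normalised form in the document is the simpler
convention.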

Received on Wednesday, 25 September 1996 09:08:11 UTC