- From: Gavin Nicol <gtn@ebt.com>
- Date: Wed, 25 Sep 1996 13:06:28 GMT
- To: lee@sq.com
- CC: ht@cogsci.ed.ac.uk, w3c-sgml-wg@w3.org
>I'd rather see NUMBER left as it is in SGML, dropped from XML, and
>replaced with something like a simplified HyLex:
>    <!AttList Person
>        Salary /^[$£][0-9]+(\.[0-9][0-9])?/ #REQUIRED
>    >
>
>(the 8-bit character is a pound-sterling, perhaps I should've used an entity!)
>
>This doesn't cover multilingual documents, but at least it doesn't wire a
>single number representation into a standard either.
>
>This could easily be extended to content:
>
>    <!Element postCode - -
>        (/^[A-Z][A-Z]+[0-9]?( [0-9]+[A-Z]+)?$/)
>    >
>
>    <!Element CAPITALS - -
>        (/[:upper:]/|smalls)*
>    >
>
>    <Element smalls >
>        (/[:lower:]/|CAPITALS)*
>    >
>
>where /.../ is like PCDATA except matching the given regexp.

This is something I've also wanted for a long time. Having this would make it easy to validate not only the structure, but also the lexical structure of the data: great for database applications. The typical response to this is that "this is an application issue".

I should note that the "doesn't cover multilingual" limitation can be removed easily by fixing the coded character set, and then defining a set of standard character classes based on that (like DIGIT, NAMESTART etc). Differences in numeric formats and whatnot would make it somewhat complicated in the general case, but in cases where there would be a mixture (for example, HTML) it would be better to require a normalised form, as I proposed to HTML-WG a long time ago. Note, this is still just a DTD issue and application convention.
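To make the idea of lexical validation concrete, here is a minimal sketch in Python of what an application (or a validating parser, if the DTD carried the patterns) might do with the two regexps quoted above. The table `LEXICAL_TYPES` and the function `is_valid` are hypothetical names invented for this illustration, not part of SGML, XML, or HyLex. The sketch also assumes the whole value must match, which is stricter than the quoted Salary pattern, since that one has no trailing `$`.

```python
import re

# Hypothetical table: declared name -> regexp, copied from the quoted
# declarations (without the surrounding /.../ delimiters).
LEXICAL_TYPES = {
    "Salary": re.compile(r"^[$£][0-9]+(\.[0-9][0-9])?"),
    "postCode": re.compile(r"^[A-Z][A-Z]+[0-9]?( [0-9]+[A-Z]+)?$"),
}


def is_valid(name: str, value: str) -> bool:
    """Return True if value matches the pattern declared for name.

    Assumption: the entire value must match (fullmatch), even where
    the quoted pattern is only anchored at the start.
    """
    return LEXICAL_TYPES[name].fullmatch(value) is not None


if __name__ == "__main__":
    for name, value in [
        ("Salary", "$1200.50"),
        ("Salary", "£300"),
        ("Salary", "1200"),      # no currency sign
        ("postCode", "EH8 9LW"),
        ("postCode", "eh8 9lw"),  # lower case is rejected
    ]:
        print(name, repr(value), is_valid(name, value))
```

Note that `[0-9]` and `[A-Z]` are ASCII-only here, which is exactly the multilingual limitation discussed above; a set of standard character classes defined over a fixed coded character set would replace them.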
Received on Wednesday, 25 September 1996 09:08:11 UTC