Re: C.12 types of declared values for attributes? from lee@sq.com on 1996-10-18 (w3c-sgml-wg@w3.org from October 1996)

From: <lee@sq.com>
Date: Fri, 18 Oct 96 10:26:26 EDT
To: dgd@cs.bu.edu, w3c-sgml-wg@w3.org
Message-Id: <9610181426.AA05677@sqrex.sq.com>

David G. Durand <dgd@cs.bu.edu> wrote:
> We should either expand the space as originally suggested; I like regexp
> myself, and it's a drop-in in any language I can imagine people using for
> an XML parser (C, C++, Java).
> 
>    Otherwise, we should just go for CDATA, IDREF, and enumerated attributes
> and bag the rest (ID should always be legal, and always be an attribute
> called "id", I think). [...]

I agree with you about regexp.  Few people today would design a language
with anything other than POSIX-style internationalised regular expressions
(e.g. they have [:lower:], which refers to all lower case letters where
[a-z] doesn't, as it omits AE-ligature, edh, e-acute, etc., used even
in English outside certain countries...).
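A rough illustration of the [a-z] problem, as a sketch only. Python's re module does not implement POSIX bracket classes such as [:lower:], so str.islower() is used here as a stand-in for a locale-aware lower-case test; the helper names are made up for this example.

```python
import re

# The ASCII-only range that the message says is too narrow.
ASCII_LOWER = re.compile(r"[a-z]")


def is_lower_ascii_range(ch: str) -> bool:
    """True if ch is a single character matched by [a-z]."""
    return ASCII_LOWER.fullmatch(ch) is not None


def is_lower_unicode(ch: str) -> bool:
    """Stand-in for [:lower:]: Unicode-aware lower-case test (an assumption,
    not a POSIX regex implementation)."""
    return len(ch) == 1 and ch.islower()


# a, z, ae-ligature, edh, e-acute
samples = ["a", "z", "\u00e6", "\u00f0", "\u00e9"]

# Lower-case letters that the [a-z] range fails to recognise.
missed = [ch for ch in samples
          if is_lower_unicode(ch) and not is_lower_ascii_range(ch)]
```

Here missed ends up holding the ae-ligature, edh and e-acute, which are exactly the letters named above.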

You're also correct that there are regexp drop-ins for C, C++, Java and
most other programming languages.  Heck, even for FORTRAN probably :-)

Non-standard regexps -- the de facto standard is significantly more
influential than ISO, and has been for over 15 years, to the extent that
even applications such as Microsoft Word use "standard" regular expression
semantics in the Advanced Search -- non-standard regexps are a curse
and cannot be justified.

But this applies to content models as well as recognition of attributes.

*

>    Otherwise, we should just go for CDATA, IDREF, and enumerated attributes
> and bag the rest (ID should always be legal, and always be an attribute
> called "id", I think). [...]

I would rather see every attribute called ID or beginning with ID. be an ID,
and allow multiple ones -- the SGML restriction makes no sense to me at all,
and in any case can't be enforced without a DTD.  But you can always use
CDATA for all except one (or all) of the ID attributes you want, and
check them with application-level code, which is what people do today
with SGML, whilst they are wondering why they have to do that :-)
So I could live with just ID.
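The kind of application-level check mentioned above might look roughly like this. It is a sketch under assumptions: an attribute counts as an ID if its name is "id" or starts with "id." (case-insensitively), which is one reading of the naming rule; the function names are hypothetical, and the input must already be well-formed XML for Python's ElementTree to parse it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def is_id_attribute(name: str) -> bool:
    """Assumed convention: 'id', or any name beginning with 'id.'."""
    lowered = name.lower()
    return lowered == "id" or lowered.startswith("id.")


def duplicate_ids(xml_text: str) -> list:
    """Return the ID values that occur more than once in the document."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        value
        for element in root.iter()
        for name, value in element.attrib.items()
        if is_id_attribute(name)
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Two ID-like attributes on one element are allowed; "b" is used twice.
doc = '<doc id="a"><p id="b" id.alt="c"/><p id="b"/><q idref="a"/></doc>'
clashes = duplicate_ids(doc)
```

Note that idref is deliberately not treated as an ID here, since it does not match "id" or the "id." prefix.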

If people can't accept the use of fixed attribute names, a prefix would
probably work: xml.id for example.

The same sort of convention is needed for IDREF, and xml.idref is more
appealing to me, as it's easy to explain that attribute names starting
with xml. are reserved.
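A sketch of how a reserved xml. prefix could be used in practice, assuming the attribute names xml.id and xml.idref suggested above (these are proposals in this thread, not an existing standard). The function names are hypothetical, and ElementTree is used only as a convenient parser.

```python
import xml.etree.ElementTree as ET

RESERVED_PREFIX = "xml."  # assumed reserved prefix from the proposal


def is_reserved(name: str) -> bool:
    """True if an attribute name falls in the reserved xml. space."""
    return name.startswith(RESERVED_PREFIX)


def dangling_idrefs(xml_text: str) -> list:
    """Return xml.idref values that do not match any xml.id in the document."""
    root = ET.fromstring(xml_text)
    ids = {el.attrib["xml.id"] for el in root.iter() if "xml.id" in el.attrib}
    refs = {el.attrib["xml.idref"] for el in root.iter()
            if "xml.idref" in el.attrib}
    return sorted(refs - ids)


doc = ('<doc><sec xml.id="intro"/>'
       '<ref xml.idref="intro"/><ref xml.idref="missing"/></doc>')
unresolved = dangling_idrefs(doc)
```

Because the names are fixed, this check needs no DTD: the reference to "missing" is reported and the one to "intro" is not.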

Lee

Received on Friday, 18 October 1996 10:26:47 UTC