RE: XHTML/XML comment from Arjun Ray on 2000-01-31 (www-html@w3.org from January 2000)

From: Arjun Ray <aray@q2.net>
Date: Mon, 31 Jan 2000 13:15:26 -0500 (EST)
To: www-html@w3.org
Message-ID: <Pine.LNX.4.10.10001311254440.15214-100000@mail.q2.net>

On Mon, 31 Jan 2000, Christopher Luebcke wrote:

> my question is why XML (and thus XHTML) was created as
> case-sensitive in the first place (especially if neither SGML nor
> HTML share this characteristic).

Short answer:  Unicode.

There are scripts which don't have any case distinction at all; even
among those that do, case equivalence is not a given (i.e. one form
may exist but not the other); and the Unicode standard recommends that
case substitution where needed should be to lowercase.  Interestingly
enough, this recommendation is in direct contrast to SGML, where case
substitution is to uppercase.
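
To make that concrete, here is a minimal Python sketch.  It is not taken
from any spec: the sample strings and the helper name round_trips are
illustrative assumptions.  It relies only on Python's built-in
str.upper() and str.lower(), which follow the Unicode case mappings, to
show a script with no case at all and two letters whose case mapping
does not survive a round trip.

```python
def round_trips(name: str) -> bool:
    """Return True if upper-casing and then lower-casing gives back `name`."""
    return name.upper().lower() == name


# Hypothetical sample names, chosen only to illustrate the point.
samples = [
    "html",     # plain ASCII: case mapping is reversible
    "日本語",    # a script with no case distinction: mapping is a no-op
    "straße",   # 'ß' uppercases to 'SS', which lowercases to 'ss'
    "ı",        # dotless i uppercases to 'I', which lowercases to 'i'
]

for name in samples:
    print(repr(name), "->", repr(name.upper()), "round trip:", round_trips(name))
```

The last two samples are the interesting ones: once substituted, the
original spelling cannot be recovered.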

(It's actually a mistake to characterize HTML as case-insensitive.
The standard requires element and attribute names to be folded to
uppercase.)

The XML spec, having taken a hard line on internationalization from
the beginning, couldn't keep SGML's case substitution rules in the
face of Unicode's superior recommendation, and since case substitution
*is* problematic in general, decided to eliminate it altogether.
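
A rough Python sketch of the difference, under stated assumptions:
folded_match and exact_match are hypothetical helpers, and Python's
str.upper() merely stands in for case substitution (real SGML folding
is governed by the SGML declaration, not by Unicode tables).  It shows
that comparing names after folding can make two distinct names collide,
whereas comparing them exactly cannot.

```python
def folded_match(a: str, b: str) -> bool:
    """Compare two names after folding both to uppercase (SGML-like)."""
    return a.upper() == b.upper()


def exact_match(a: str, b: str) -> bool:
    """Compare two names exactly, with no case substitution (XML-like)."""
    return a == b


# Hypothetical name pairs, for illustration only.
pairs = [("title", "TITLE"), ("straße", "strasse")]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: folded={folded_match(a, b)}, exact={exact_match(a, b)}")
```

The first pair is the familiar ASCII case; the second shows two
different spellings becoming the same name once folding is applied.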

Given that, XHTML's specific choice of lowercase wasn't necessarily
arbitrary.  The weight of technical evidence points to that being the
best choice.

One thing to note, though.  If the W3C's protestations about SGML and
XML are worth anything, then you're not obligated to switch, let alone
stampede, to XHTML.  SGML is a lot about the continued viability of
document and data formats.  Use the "older" standards if you must:
they haven't gone away.

Arjun

Received on Monday, 31 January 2000 13:03:05 UTC