Re: Why is "html" forced to lower case in DOCTYPE?

From: Dave Raggett <dsr@w3.org>
Date: Fri, 24 Mar 2000 11:47:14 -0600
To: "J. David Bryan" <jdbryan@acm.org>
Cc: HTML Tidy List <html-tidy@w3.org>
Message-ID: <OFD5ADA497.5911A502-ON86256894.00443AA5@rfdinc.com>

On Wed, 23 Feb 2000, J. David Bryan wrote:

>     Whenever Tidy supplies a corrected DOCTYPE, it produces one such as
> "!DOCTYPE html PUBLIC...", i.e., with the "html" in lower case.  In
> lexer.c, the "FindGivenVersion" routine, which is responsible for parsing
> the DOCTYPE statement, has this comment at line 769:
>
>   /* but at least ensure the case is correct */
>
> FindGivenVersion then replaces the DOCTYPE string supplied in the source
> HTML file with the identical string but containing "html" in lower case.
>
>     Can someone please explain why changing this to lower case is
> "correct?"  Thanks.

In SGML, as it is declared for HTML, the case of element names doesn't
matter. In XML it does. For XHTML, which is a reformulation of HTML in
XML, the W3C HTML working group had to choose a single case for HTML
element names and decided on lower case. The root element name "html"
in the DOCTYPE therefore needs to be in lower case for XHTML, but it
can be in either case when the document is SGML-based HTML.
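
As a rough illustration of that normalization (this is not the actual
FindGivenVersion code from lexer.c), the sketch below lower-cases the
root element name of a DOCTYPE declaration in place. The helper name
lowercase_doctype_root is hypothetical, and matching the DOCTYPE keyword
case-insensitively is an assumption made for the example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not Tidy's FindGivenVersion: lower-cases the
   root element name that follows "<!DOCTYPE" in buf, in place.
   Returns 1 if a name was found and processed, 0 otherwise. */
static int lowercase_doctype_root(char *buf)
{
    static const char keyword[] = "<!DOCTYPE";
    const size_t klen = sizeof keyword - 1;
    size_t i;
    char *p;

    /* The buffer must start with the keyword (compared ignoring case). */
    if (strlen(buf) < klen)
        return 0;
    for (i = 0; i < klen; ++i)
        if (toupper((unsigned char)buf[i]) != keyword[i])
            return 0;

    /* At least one whitespace character must separate keyword and name. */
    p = buf + klen;
    if (!isspace((unsigned char)*p))
        return 0;
    while (isspace((unsigned char)*p))
        ++p;
    if (*p == '\0' || *p == '>')
        return 0;

    /* Lower-case the name up to the next whitespace or '>'. */
    while (*p != '\0' && !isspace((unsigned char)*p) && *p != '>') {
        *p = (char)tolower((unsigned char)*p);
        ++p;
    }
    return 1;
}
```

Called on a writable buffer holding
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">, it rewrites only the
root name, so the public identifier keeps its original case. The result
is acceptable to both SGML and XML parsers, which is why lower case is
the safe choice for the root name.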

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 385 320 444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)
Received on Friday, 24 March 2000 13:13:06 GMT
