On Mon, 8 Nov 1999, Dan Connolly wrote:
> Arjun Ray wrote:
> > Yet another place where the 4.0 spec's "friendly prose" fails to state
> > the exact requirements.
> What's not exact about it? Comments are markup[1]. 

Yes, and the content model of TITLE is (#PCDATA).  Clause 4 "Definitions"
of ISO 8879 (see p.277 in the Handbook):

: 4.228 parsed character data: Zero or more characters that occur in a
: context in which text is parsed and markup is recognized.  They are
: classified as data characters because they were not recognized as 
: markup during parsing.
: 4.229 PCDATA: Parsed character data.

The issue is "a context in which text is parsed and markup is recognized".
The operative concept here is *recognition* of markup.  Simply because
something looks like markup doesn't make it so.  In some ways, this is a
problem with SGML itself, but either the spec's normative reference to ISO
8879 counts for something, or it doesn't.

> Perhaps we should have added a NOTE about why this restriction is there:
> it's there because older HTML implementations treated <!--...---> as
> character data, and I think some versions of the HTML spec declared
> the TITLE element as CDATA. 

It might have been better to specify RCDATA declared content.
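
The difference between the three choices can be made concrete with a toy
model.  What follows is a minimal sketch in Python, not anything taken
from ISO 8879 or the HTML DTD: the function name title_data, the
simplified comment pattern, and the use of html.unescape to stand in for
entity-reference resolution are all assumptions made for illustration.
It looks only at comments and entity references, and real SGML comment
declarations are more involved than the regular expression suggests.

```python
import html
import re

# Simplified comment declaration (an assumption, not the full SGML rule):
# "<!--", any text, "--", optional whitespace, ">".
COMMENT = re.compile(r"<!--.*?--\s*>", re.DOTALL)


def title_data(raw, mode):
    """Return the character data a toy parser would report for TITLE content.

    mode is one of "CDATA", "RCDATA" or "PCDATA" (hypothetical labels for
    the three ways the element could be declared).
    """
    if mode == "CDATA":
        # Nothing is recognized: the text is passed through untouched.
        return raw
    if mode == "RCDATA":
        # Only entity references are recognized; "<!--" is plain data.
        return html.unescape(raw)
    if mode == "PCDATA":
        # Comments are recognized as markup and dropped, then entity
        # references are resolved in what remains.
        return html.unescape(COMMENT.sub("", raw))
    raise ValueError("unknown mode: %r" % (mode,))
```

Under these assumptions, the string "A &amp;amp; B &lt;!-- hidden --&gt;" written
literally in a TITLE (that is, A, an ampersand reference, B, and a
comment) comes out unchanged as CDATA, with only the reference resolved
as RCDATA, and with the comment removed as well as PCDATA.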

> Let's see if I can find the original IETF html-wg discussion of CDATA
> vs. PCDATA for TITLE... nope; but 

> [...] reviewing the changes to of html.dtd[2], I see that TITLE was
> RCDATA for a while, 

AFAIK, the original spec had (#PCDATA).

So when did it change?

> then changed to %title-content which could be either CDATA or PCDATA
> in v1.8, date: 1994/04/09 01:02:10.
> [2]
> (hm... the ,v file isn't available via HTTP. bummer. see:
> )

The Changelog goes back to only v., dated 1994/04/01.  The v1.8
entry just says:

| * Revamped HTML, HEAD, elements in light of feature test entities

> So if you can find html-wg archives from around there (we have them
> somewhere at W3C, I think) you'll probably find it discussed.

I have my own copy of the html-wg list.  I suppose I'll have to slog
through megabytes of it...

But 1994/04/01 is too early for the html-wg anyway.  The welcoming letter
(from Stu Weibel) is dated 1994/07/29:

