Re: [VE][79] New Error Message Suggestion from Liam Quinn on 2004-06-06 (www-validator@w3.org from June 2004)

From: Liam Quinn <liam@htmlhelp.com>
Date: Sun, 6 Jun 2004 18:21:38 -0400 (EDT)
To: TROJjER - Marc Kirkwood <marcpetkirkwood@hotmail.com>
Cc: www-validator@w3.org
Message-ID: <Pine.LNX.4.44.0406061806180.25220-100000@localhost.localdomain>

On Sun, 6 Jun 2004, TROJjER - Marc Kirkwood wrote:

> I have allocated a page as HTML 4.01 Transitional, and I am curious as to 
> why the only errors The Validator filters from the source code are 
> pertaining to the nature of the <p> tags, which I felt as though were 
> dutifully closed with </p>, in keeping with my thoughts regarding the 
> Standard. However, the SGML parser dictates that the tags are "unopened". I 
> thought this was compliant--in fact, necessary--future-proof practice; was I 
> mistaken in this case?
> 
> I feel that I perhaps needn't bother to omit the closing tags, as they are 
> of course a good idea to include... So I will keep them regardless, I think. 
> I would up the DOCTYPE to the Strict DTD; but that results in numerous error 
> reports and as of yet, I am rather unwilling to tackle the horde of "illegal 
> attributes" and unrecognised entities (well, okay, not so much of the 
> latter) which prevail.
> 
> Or, perhaps throwing open the gauntlet here, I don't know, but does the 
> presence of a <hr /> entity within such a <p> container effectively negate 
> the usage of a closing </p> tag? I somewhat doubt this; but it was the only 
> alternative I could think of...

Yes.  "p" elements cannot contain "hr" elements.  Since </p> is optional, 
the <hr> tag implies the end of the open "p" element.  So when the parser 
finds your </p> later, there's no open "p" element to match it.
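
As a rough illustration of that rule, here is a minimal Python sketch. It 
is not how the validator's SGML parser works internally; the class name, 
the check() helper, and the short CLOSES_P list are all assumptions made 
up for this example, and the list is deliberately incomplete.

```python
from html.parser import HTMLParser

# Assumption: a small, incomplete sample of elements that may not appear
# inside "p" in HTML 4.01, so their start tags imply the end of an open "p".
CLOSES_P = {"hr", "p", "div", "ul", "ol", "table", "pre", "blockquote"}


class ImpliedEndChecker(HTMLParser):
    """Toy checker that only tracks whether a "p" element is open."""

    def __init__(self):
        super().__init__()
        self.p_open = False
        self.messages = []

    def handle_starttag(self, tag, attrs):
        # A block-level start tag ends any open "p" (its end tag is optional).
        if tag in CLOSES_P and self.p_open:
            self.messages.append(f"<{tag}> implies </p>")
            self.p_open = False
        if tag == "p":
            self.p_open = True

    def handle_endtag(self, tag):
        if tag != "p":
            return
        if self.p_open:
            self.p_open = False
        else:
            # Same situation the validator reports: nothing left to close.
            self.messages.append('end tag for "p" which is not open')


def check(markup):
    """Return the list of messages produced for a fragment of markup."""
    checker = ImpliedEndChecker()
    checker.feed(markup)
    checker.close()
    return checker.messages


print(check("<p>Some text<hr>more text</p>"))
print(check("<p>Some text</p><hr><p>more text</p>"))
```

The first call reports the implied </p> followed by the unmatched end 
tag; the second, with the <hr> moved between two paragraphs, reports 
nothing.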

(BTW, you should use <hr> in HTML, <hr /> in XHTML, and not mix the two.  
For why, see <http://www.cs.tut.fi/~jkorpela/html/empty.html>.)

> Another of my grievances, but with regards to the XHTML hosts this time: 
> *Why* does the engine parse URLs???

Because it's required by the definition of HTML and XHTML.

> It is really annoying and I find it 
> rather trivial that characters such as ampersands have to be escaped in the 
> *source code*--since when did HTML special entity characters become involved 
> with the content of href attributes

It's always been that way: attribute values, href included, are parsed for 
entity references.

> It makes it an annoying factor when dealing with serverside 
> scripts etc

It doesn't matter to server-side scripts.  The browser resolves the entity 
references before it requests the URL, so the script receives a plain "&".
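
A small Python sketch of that sequence, using only the standard library. 
The URL "search.cgi?q=validator&amp;lang=en" is a made-up example, and 
html.unescape stands in for the decoding step a browser performs on the 
attribute value.

```python
from html import unescape
from urllib.parse import parse_qs, urlsplit

# Hypothetical href value exactly as it is written in the HTML source.
href_in_source = "search.cgi?q=validator&amp;lang=en"

# Step 1: the attribute value is decoded, turning "&amp;" into "&".
url_requested = unescape(href_in_source)

# Step 2: the server-side script parses the query string of the decoded URL.
params = parse_qs(urlsplit(url_requested).query)

print(url_requested)  # search.cgi?q=validator&lang=en
print(params)         # {'q': ['validator'], 'lang': ['en']}
```

The script never sees "&amp;"; the escaping exists only in the markup.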

> I am curious as to why the nature of the parser, or its 
> maintainers and developers, does not allow for such href URL attributes to 
> be exempt from such entity parsing

Because the parser, its maintainers, and its developers are following the 
definition of HTML and XHTML.

> Furthermore, even if I *was* to use such special entities, I always adhere 
> to the convention of terminating them with a semicolon (e.g. &amp;). Am I 
> mistaken here? Is it not, in fact, a *dictation* of the Standard by now, 
> that this must be done?

It's required in XML and XHTML, but there are some cases where the 
semicolon can be omitted in SGML and HTML.
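
To see why an unescaped ampersand in a URL is risky, consider a query 
parameter whose name happens to match an entity name. The sketch below 
uses Python's html.unescape, which follows the HTML5 rules for character 
references rather than the SGML rules the validator applies, and which 
knows nothing about attribute context, so treat it as an approximation. 
The URLs are invented for the example.

```python
from html import unescape

# Hypothetical href values: one with a raw "&", one correctly escaped.
raw_href = "order.cgi?item=5&copy=2"
escaped_href = "order.cgi?item=5&amp;copy=2"

# "&copy" is accepted as a reference to the copyright sign even without
# a semicolon, so the parameter name is lost.
decoded_raw = unescape(raw_href)

# "&amp;" decodes to a literal "&"; the text "copy=2" is left alone.
decoded_escaped = unescape(escaped_href)

print(decoded_raw)      # order.cgi?item=5©=2
print(decoded_escaped)  # order.cgi?item=5&copy=2
```

Only the escaped form reliably yields the URL the author intended.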

> Then why is there an annoying measure which 
> backwardly dictates that "anything immediately following an ampersand in 
> HTML has the potential to be confused with a special character 
> entity"--when, in fact, it surely should only be true if, and when, there is 
> a semicolon at the end? I do not understand this... Could it be due to the 
> read-order of the engine itself? Even so, I do not know why it should 
> produce an error.

Wouldn't you want an error reported if you mistyped the entity, perhaps by
omitting the semicolon?

> Is it that, once an ampersand is found, and the characters 
> to the right of it are not acknowledged to be such character references, it 
> results in a "hiccup" even though a semicolon may not be present at all?

Yes, it's an error.

-- 
Liam Quinn
Received on Sunday, 6 June 2004 18:19:55 UTC