Re: W3C validating broken code

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Sat, 05 Mar 2005 18:43:30 +1100
Message-ID: <42296322.8090904@lachy.id.au>
To: Lori Eldridge - Vaszary <info@loriswebs.com>
CC: www-validator@w3.org

It would have helped if you had provided a URI for the web page, or at 
least a piece of sample code, though a URI is always preferred.

Lori Eldridge - Vaszary wrote:
> I keep using W3C to validate my pages and it passes them and later I
> find out some of the pages had broken code and I have to use another
>  validator to find them,

If the validator doesn't report an error, then your code is technically 
valid according to the formal rules specified in the DTD.  However, that 
does not mean that the document is strictly *conformant* with the HTML 
recommendation in all cases, nor that modern user agents will parse it 
correctly.  This is particularly true of the SHORTTAG features, which no 
popular user agent has implemented correctly.

See appendix B of the HTML recommendation for some of the widely 
unsupported features.

> things such as missing carets in table tags,

Carets?  There are no carets in a tag, but I'm going to take a wild 
guess and assume you mean something like missing the ">" character off 
the end of the start tag.  The tag close delimiter ">" may be omitted in 
certain circumstances, such as when the next non-whitespace character is 
the tag open delimiter "<".

>  table tags missing altogether,

Several tags in HTML are declared to be optional.  Some elements require 
both start and end tags, some allow the omission of the end tag and 
others allow the omission of both.

For example, from the DTD:
<!ELEMENT TBODY    O O (TR)+           -- table body -->
<!ELEMENT (TH|TD)  - O (%flow;)*       -- table header cell, table data cell -->
<!ELEMENT UL - - (LI)+                 -- unordered list -->

The "-" and "O" following the element name indicate the requirements for 
the start and end tags, the first within each element declaration being 
for the start-tag and the second being the end-tag.  "-" means the tag 
is required, "O" means optional, so that should be interpreted as:

TBODY:    Start-tag: Optional, End-tag: Optional
TH or TD: Start-tag: Required, End-tag: Optional
UL:       Start-tag: Required, End-tag: Required
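
The same reading can be done mechanically.  What follows is only a 
minimal Python sketch of that interpretation, not anything the validator 
itself does.  The function name tag_omission and the regular expression 
are my own assumptions, and it only handles the simple declaration shape 
shown above.

```python
import re

# Hypothetical pattern: matches "<!ELEMENT name(s) start-flag end-flag" and
# ignores the content model and comment that follow the two flags.
ELEMENT_RE = re.compile(
    r"<!ELEMENT\s+(\([^)]+\)|[A-Za-z][A-Za-z0-9]*)\s+([-O])\s+([-O])\s",
    re.IGNORECASE,
)

MEANING = {"-": "Required", "O": "Optional"}


def tag_omission(declaration):
    """Return {element: (start_tag_rule, end_tag_rule)} for one declaration."""
    match = ELEMENT_RE.search(declaration)
    if match is None:
        raise ValueError("not a recognised ELEMENT declaration")
    names, start, end = match.groups()
    rules = (MEANING[start.upper()], MEANING[end.upper()])
    # A name group such as (TH|TD) shares one pair of flags.
    return {name.strip().upper(): rules for name in names.strip("()").split("|")}


if __name__ == "__main__":
    for decl in (
        "<!ELEMENT TBODY    O O (TR)+           -- table body -->",
        "<!ELEMENT (TH|TD)  - O (%flow;)*",
        "<!ELEMENT UL - - (LI)+                 -- unordered list -->",
    ):
        print(tag_omission(decl))
```

Run on the three declarations quoted from the DTD, it reports the same 
start-tag and end-tag requirements as the table above.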

The following example contains many omissions, such as omitting ">", tag 
names (such as <> and </> where you would expect to find <td> and </td> 
respectively), end tags (</td>) and even both start and end tags 
(<tbody> and </tbody>).  This includes examples of all the SHORTTAG 
features discussed in section B.3.7 of HTML 4.01.

     <td>row 1 col 1
     <>row 1 col 2
     <td/row 2 col 1
     <>row 2 col 2</>

Although it may appear to be invalid, it is in fact perfectly valid 
according to the rules of SGML and the structure defined within the DTD. 
The above is equivalent to the following markup (ignoring whitespace):

       <td>row 1 col 1</td>
       <td>row 1 col 2</td>
       <td>row 2 col 1</td>
       <td>row 2 col 2</td>

Both examples will produce identical document object models (DOM).  Use 
the "Show Parse Tree" option in the extended interface of the validator 
to confirm how both are parsed.

However, few browsers actually have conforming SGML parsers, and they may 
choke on such constructs.  It is not the validator's job to report errors 
based on limitations in other user agents, only to report violations of 
the formal specification.  Other validators, such as the WDG's 
validator, will however issue *warnings* (not errors) about the use of 
such widely unsupported features.
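
To give a rough idea of what such a warning pass might look for, here is 
a small Python sketch.  It is purely illustrative and is not how the WDG 
validator or any real tool works.  The pattern names and the function 
shorttag_warnings are my own assumptions, and the regular expressions 
are heuristics that ignore comments, CDATA and quoted attribute values.

```python
import re

# Hypothetical, regex-based heuristics. A real SGML parser is needed for
# accurate results; these patterns only spot the most obvious cases.
SHORTTAG_PATTERNS = {
    "empty start-tag": re.compile(r"<>"),
    "empty end-tag": re.compile(r"</>"),
    # A tag that runs straight into the next "<" without a closing ">".
    "unclosed tag": re.compile(r"</?[A-Za-z][^<>/]*(?=<)"),
    # "<name/" not immediately followed by ">" (null end-tag form).
    "null end-tag (NET)": re.compile(r"<[A-Za-z][A-Za-z0-9]*/(?!>)"),
}


def shorttag_warnings(markup):
    """Return a sorted list of (line_number, label) for suspicious constructs."""
    warnings = []
    for label, pattern in SHORTTAG_PATTERNS.items():
        for match in pattern.finditer(markup):
            line = markup.count("\n", 0, match.start()) + 1
            warnings.append((line, label))
    return sorted(warnings)


if __name__ == "__main__":
    sample = "<td>row 1 col 1\n<>row 1 col 2\n<td/row 2 col 1\n<>row 2 col 2</>\n"
    for line, label in shorttag_warnings(sample):
        print(f"line {line}: warning: {label}")
```

The point of the sketch is the design choice, not the patterns: these 
are reported as warnings about interoperability, separate from the 
validity result, since the markup is still valid.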

> and the # sign missing in body tag for color and link colors.

It is impossible to express such requirements within the DTD.  The value 
for those attributes is declared as CDATA, which basically means any 
ordinary character data.  The DTD says nothing about the internal 
structure of the value, though such requirements may be expressed by the 
prose within the HTML recommendation.
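
A check like that therefore has to live outside the DTD.  As a rough 
illustration, the following Python sketch tests a value against the 
colour rule given in the prose of HTML 4.01 (section 6.5): either a "#" 
followed by six hexadecimal digits, or one of sixteen colour names.  The 
function name is_html4_color is my own invention, and this is not 
something the W3C validator does.

```python
import re

# The sixteen colour names listed in the prose of HTML 4.01, section 6.5.
COLOR_NAMES = {
    "black", "green", "silver", "lime", "gray", "olive", "white", "yellow",
    "maroon", "navy", "red", "blue", "purple", "teal", "fuchsia", "aqua",
}

# A hash mark followed by exactly six hexadecimal digits, e.g. "#FF0000".
HEX_COLOR_RE = re.compile(r"#[0-9A-Fa-f]{6}")


def is_html4_color(value):
    """Return True if value is a colour name or a "#RRGGBB" value."""
    value = value.strip()
    return value.lower() in COLOR_NAMES or HEX_COLOR_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("#FFFFFF", "FFFFFF", "Navy", "#FFF"):
        print(candidate, is_html4_color(candidate))
```

A value such as "FFFFFF" passes validation because it is ordinary 
character data, yet it fails this check because the "#" is missing.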

Note, however, that the use of such presentational attributes is not 
recommended; CSS should be used instead.  Most presentational attributes 
in HTML, including those that control colour, sizes, padding, margins, 
etc., are very easily expressed using CSS.  Please consult a CSS tutorial 
for more information.

Lachlan Hunt
http://GetFirefox.com/     Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Received on Saturday, 5 March 2005 07:49:57 UTC
