
[whatwg] A call for tighter validation standards

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 23 Oct 2009 04:36:17 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0910230420510.9145@hixie.dreamhostps.com>
On Thu, 22 Oct 2009, Curtiss Grymala wrote:
> 
>      1. Unquoted attributes
>      2. Implied tags (such as leaving out a closing paragraph tag)
>      3. Inconsistent use of the closing slash on empty elements
> 
> My concern with all three of these points is that they are relying on
> the browser to interpret the coder's intent when rendering the elements.
> The unquoted attributes and implied tags are subject to wide
> interpretation by the browsers.

HTML5 defines exactly how they are to be treated, removing any room for 
interpretation.


> Regarding the unquoted attributes, I fear that new coders might not 
> understand when and why attributes should be quoted and will spend a lot 
> of time wondering why their pages are not rendering properly.

I'm skeptical that when people forget to quote their attributes, it's 
because they don't understand that they need quoting. It doesn't take much 
to understand that

   <p title=Hello World! class=blue>

...needs quotes.


> For instance, imagine a new coder trying to declare inline style 
> definitions without quoting the style attributes. How will a browser 
> interpret something like:
> 
>   <p style=width: 500px; height: 100px;>

What the browsers must do with this is fully defined; it's the same as:

   <p style="width:" 500px;="" height:="" 100px;="">

I think authors would figure out that something was wrong pretty soon, 
though, when their element didn't change size.
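
The attribute split described above can be reproduced with a minimal sketch using Python's standard-library html.parser. This is an assumption-laden illustration: html.parser is a tolerant tokenizer, not a conforming HTML5 parser, and it reports valueless attributes as None, which the sketch normalises to the empty string. The names AttrDump and tokenize_attrs are hypothetical helpers, not part of any spec or library.

```python
from html.parser import HTMLParser


class AttrDump(HTMLParser):
    """Collects (name, value) pairs from every start tag fed to it."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # html.parser gives None for an attribute with no value; normalise
        # it to "" so the result matches the empty-string form shown above.
        self.attrs.extend((name, value or "") for name, value in attrs)


def tokenize_attrs(markup):
    """Return the attributes html.parser sees in the given markup."""
    parser = AttrDump()
    parser.feed(markup)
    parser.close()
    return parser.attrs


# Each whitespace-separated chunk becomes its own attribute.
print(tokenize_attrs("<p style=width: 500px; height: 100px;>"))
print(tokenize_attrs("<p title=Hello World! class=blue>"))
```

In both cases only the text up to the first space ends up as the value of the intended attribute; everything after it is read as further attribute names.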


> It becomes even more dangerous in that height and width are actually 
> attributes of most HTML elements, so the browser will have to do quite a 
> bit of work to figure out what to do with these types of definitions.

Not really. The spec defines it precisely, and it's actually no more work 
than dealing with correct attributes. (It's the same work, in fact.)


> The implied tags can be just as frightening. For instance (and I realize 
> this is not the best example, but it's the one I can think of at the 
> moment), what happens with the following:
> 
> HTML Version:
> <p>This is some paragraph text.
> <span>This is some more text</span>
> 
> XHTML Version:
> <p>This is some paragraph text.</p>
> <span>This is some more text.</span>
> 
> In the XHTML version above, it's obvious that the span will be separated
> from the paragraph text. However, in the HTML version, browsers will
> include the span in the paragraph (unless appropriate CSS is applied to
> the span).

I think it's pretty obvious that the span is part of the paragraph in the 
HTML version, personally. :-)
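
A rough sketch of why the span stays inside the paragraph: in HTML, an open <p> is closed implicitly only by certain start tags (another <p>, <div>, <table> and so on), and <span> is not one of them. The code below assumes a deliberately abbreviated CLOSES_P set and a simplified element stack; the real tree-construction rules in the spec are more involved. ParentTracker and parent_of_span are hypothetical names used for illustration only.

```python
from html.parser import HTMLParser

# Assumption: an abbreviated subset of the start tags that implicitly
# close an open <p>; the full list in the spec is longer.
CLOSES_P = {"p", "div", "ul", "ol", "table", "pre", "blockquote",
            "h1", "h2", "h3", "h4", "h5", "h6"}


class ParentTracker(HTMLParser):
    """Tracks a simplified stack of open elements and records parents."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.parents = {}

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and "p" in self.stack:
            # Implied </p>: pop everything up to and including the open <p>.
            while self.stack.pop() != "p":
                pass
        self.parents[tag] = self.stack[-1] if self.stack else None
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Explicit end tag: pop up to and including the matching element.
            while self.stack.pop() != tag:
                pass


def parent_of_span(markup):
    """Return the tag name of the span's parent, or None if it has none."""
    tracker = ParentTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.parents.get("span")


html_version = "<p>This is some paragraph text.\n<span>This is some more text</span>"
xhtml_version = "<p>This is some paragraph text.</p>\n<span>This is some more text.</span>"
print(parent_of_span(html_version))
print(parent_of_span(xhtml_version))
```

Under these assumptions the first snippet reports the span as a child of the paragraph, and the second reports it as having no parent in the fragment.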


> I would like to reiterate that I am not asking the WHATWG to recommend 
> browsers dropping support for any older HTML specs (in fact, I am very 
> much in support of the browsers continuing to support all older HTML 
> specs and, to the best of their ability, supporting HTML from before 
> there were specs and recommendations).
> 
> However, what I am asking is that the WHATWG consider writing the specs 
> so that those older, less rigid styles of coding do not validate 
> according to the standard.

I think less rigid styles are good, and are what made the Web the success 
that it is. Authors are welcome to use validators that complain about this 
kind of markup, but we shouldn't enforce this style on everyone. Some people 
(e.g. me) like being able to omit tags.

It's a whole heck of a lot easier to maintain this:

   <table>
    <thead>
     <tr>
      <th>Song
      <th>Singer
    <tbody>
     <tr>
      <td>Original Prankster
      <td>The Offspring
   </table>

...than:

   <table>
    <thead>
     <tr>
      <th>Song</th>
      <th>Singer</th>
     </tr>
    </thead>
    <tbody>
     <tr>
      <td>Original Prankster<td>
      <td>The Offspring</td>
     </tr>
    </tbody>
   </table>

(Did you notice the error in the second one?)
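
The slip in the verbose table can also be found mechanically. The following is a minimal sketch, assuming that a plain count of start tags against end tags per element is enough to flag it; it is not a validator, and the names TagCounter and unbalanced_tags are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags and end tags separately for each element name."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unbalanced_tags(markup):
    """Map each tag whose start/end counts differ to (opened, closed)."""
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    names = set(counter.opened) | set(counter.closed)
    return {name: (counter.opened[name], counter.closed[name])
            for name in names
            if counter.opened[name] != counter.closed[name]}


verbose_table = """
<table>
 <thead>
  <tr>
   <th>Song</th>
   <th>Singer</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td>Original Prankster<td>
   <td>The Offspring</td>
  </tr>
 </tbody>
</table>
"""
print(unbalanced_tags(verbose_table))
```

The count singles out the td element: the first cell is followed by a second start tag where an end tag was intended. In the terse version there are no end tags to mistype in the first place.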


> Coders will still be free to write the code with implied closing tags, 
> unquoted attributes and inconsistent use of the closing slash, but I 
> don't believe that type of code should validate, as it does not conform 
> to an actual standard, rather it conforms to exceptions to standards.

The point of making things invalid is to say that authors aren't free to 
do it. "You can write invalid markup" is not a valid argument. If we're ok 
with people writing it, then it should be valid.


> In my blog post, I likened the looser standards of HTML5 to removing
> laws against driving while intoxicated. Sure, abolishing those laws
> would not force anyone to drive drunk, but without any legal
> ramifications for doing so, it would be much more prevalent.

IMHO that's a very dubious analogy. First, there are mental health issues 
relating to driving while intoxicated (alcohol addiction, a neurological 
disease, can lead to such behaviour, for instance). Second, omitting tags 
doesn't lead to people dying.


> As with that example, if the standards for HTML are loosened as compared 
> to XHTML 1 (served as text/html), coders will not feel the need to code 
> neatly or consistently, and we will begin to slip back into the 
> spaghetti code we experienced throughout the 1990s.

Valid HTML4 is not spaghetti code.

Invalid XHTML1 can be as much spaghetti code as invalid HTML4 (indeed it 
can be worse, since people throw in namespace mistakes!).

HTML5 includes XHTML5, which you can use if you prefer using XML rather 
than HTML.

Cheers,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 22 October 2009 21:36:17 UTC
