Re: Comparing conformance requirements against real-world docs

On Sat, 30 Aug 2008, Henri Sivonen wrote:
> 
> To get data for assessing how well the drafted conformance requirements 
> fit existing practice, I validated 516875 Web documents with the 
> Validator.nu engine. [...]
> 
> 0.5932	Attribute “border” not allowed on element “img” at this point.

I've allowed validators to downplay this error if the value is "0".
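
As a rough illustration of what "downplaying" could look like inside a 
checker, here is a minimal sketch. The table, the function name and the 
severity strings are all hypothetical and not taken from any real 
validator. The second entry covers the language="JavaScript" case 
discussed further down, and comparing that value without regard to case 
is an assumption.

```python
# Hypothetical table of (element, attribute) -> predicate on the value.
DOWNPLAYED = {
    ("img", "border"): lambda value: value == "0",
    # Assumption: "JavaScript" is compared ignoring case.
    ("script", "language"): lambda value: value.lower() == "javascript",
}


def severity(element, attribute, value):
    """Return "downplayed" for the tolerated cases, otherwise "error"."""
    is_tolerated = DOWNPLAYED.get((element, attribute))
    if is_tolerated is not None and is_tolerated(value):
        return "downplayed"
    return "error"
```

With this table, border="0" on <img> is reported at the lower severity 
while border="1" on <img>, or border="0" on <table>, remains a plain error.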


> 0.5547	Attribute “cellspacing” not allowed on element “table” at this point.
> 0.5469	Attribute “cellpadding” not allowed on element “table” at this point.
> 0.5315	Attribute “border” not allowed on element “table” at this point.
> 0.5079	Attribute “width” not allowed on element “table” at this point.
> 0.4676	Attribute “width” not allowed on element “td” at this point.
> 0.4172	Attribute “valign” not allowed on element “td” at this point.
> 0.4156	Attribute “align” not allowed on element “td” at this point.
> 0.3100	Attribute “height” not allowed on element “td” at this point.
> 0.2592	Attribute “align” not allowed on element “table” at this point.
> 0.2105	Attribute “bgcolor” not allowed on element “td” at this point.
> 0.2049	Attribute “height” not allowed on element “table” at this point.
> 0.1413	Attribute “background” not allowed on element “td” at this point.
> 0.1383	Attribute “bgcolor” not allowed on element “table” at this point.
> 0.1156	Attribute “valign” not allowed on element “tr” at this point.
> 0.0930	Attribute “nowrap” not allowed on element “td” at this point.
> 0.0753	Attribute “align” not allowed on element “tr” at this point.
> 0.0531	Attribute “bgcolor” not allowed on element “tr” at this point.
> 0.0516	Attribute “bordercolor” not allowed on element “table” at this point.
> 0.0472	Attribute “background” not allowed on element “table” at this point.
> 0.0374	Attribute “height” not allowed on element “tr” at this point.
> 0.0169	Attribute “valign” not allowed on element “table” at this point.
> 0.0126	Attribute “width” not allowed on element “th” at this point.
> 0.0124	Attribute “bordercolor” not allowed on element “td” at this point.
> 0.0110	Attribute “align” not allowed on element “th” at this point.

These attributes are the sign of a layout table (or, more rarely, actual 
presentational effects on a data table), so I don't think we should 
downplay these.


> 0.4428	Attribute “language” not allowed on element “script” at this point.

I've allowed validators to downplay this error if the value is 
"JavaScript".


> 0.4268 The internal character encoding declaration must be the first child of
> the “head” element.

See below.


> 0.4120	Text after “&” did not match an entity name. Probable cause: “&”
> should have been escaped as “&amp;”.

Not sure what to do about this. I've considered making a non-; delimited 
unknown string after & in href="" valid, but I don't know if that would be 
wise.
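
To make the case under consideration concrete, the following sketch lists 
the names in an href="" value that follow "&", are not terminated by ";", 
and are not known entity names. It is only an approximation: it checks 
exact names against Python's html.entities.html5 table and ignores the 
prefix matching a real parser performs. The function name is hypothetical.

```python
import re
from html.entities import html5  # keys include "amp;" and legacy "amp"

AMP_NAME = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def unterminated_unknown_names(href):
    """Return names after '&' that lack a ';' and are not known entities."""
    found = []
    for match in AMP_NAME.finditer(href):
        name, semicolon = match.groups()
        if not semicolon and name not in html5:
            found.append(name)
    return found
```

A value such as "search?q=x&page=2" yields ["page"], which is the kind of 
string that would become valid under the idea above, whereas "&copy=2" 
yields nothing because "copy" is a known name and would still be a problem.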


> 0.3415	The “font” element is obsolete.
> 0.3185	Attribute “align” not allowed on element “div” at this point.
> 0.2407	Attribute “color” not allowed on element “font” at this point.
> 0.2347	Attribute “size” not allowed on element “font” at this point.
> 0.2191	Attribute “bgcolor” not allowed on element “body” at this point.
> 0.1813	Attribute “align” not allowed on element “p” at this point.
> 0.1805	Attribute “align” not allowed on element “img” at this point.
> 0.1769	The “center” element is obsolete.
> 0.1713	Attribute “face” not allowed on element “font” at this point.
> 0.1650	Attribute “topmargin” not allowed on element “body” at this point.
> 0.1569	Attribute “leftmargin” not allowed on element “body” at this point.
> 0.1494	Attribute “frameborder” not allowed on element “iframe” at this point.
> 0.1200	Attribute “marginheight” not allowed on element “body” at this point.
> 0.1192	Attribute “marginwidth” not allowed on element “body” at this point.
> 0.1104	Attribute “marginwidth” not allowed on element “iframe” at this point.
> 0.1102	Attribute “marginheight” not allowed on element “iframe” at this
> point.
> 0.0979	Attribute “link” not allowed on element “body” at this point.
> 0.0961	Attribute “vlink” not allowed on element “body” at this point.
> 0.0913	Attribute “text” not allowed on element “body” at this point.
> 0.0778	Bad value (consolidated) for attribute “width” on element “iframe”:
> Bad positive integer: Expected a digit but saw “%” instead.
> 0.0766	Bad value (consolidated) for attribute “height” on element “iframe”:
> Bad positive integer: Expected a digit but saw “p” instead.
> 0.0688	Attribute “background” not allowed on element “body” at this point.
> 0.0671	Attribute “alink” not allowed on element “body” at this point.
> 0.0667	Attribute “hspace” not allowed on element “img” at this point.
> 0.0568	The “u” element is obsolete.
> 0.0558	Element “frameset” not allowed in this context. (The parent was
> element “html”.) Suppressing further errors from this subtree.
> 0.0538	Attribute “vspace” not allowed on element “img” at this point.
> 0.0369	Bad value (consolidated) for attribute “width” on element “img”: Bad
> positive integer: Expected a digit but saw “%” instead.
> 0.0361	Attribute “clear” not allowed on element “br” at this point.
> 0.0353	Attribute “rightmargin” not allowed on element “body” at this point.
> 0.0324	Attribute “size” not allowed on element “hr” at this point.
> 0.0275	Attribute “bottommargin” not allowed on element “body” at this point.
> 0.0262	Element “p” not allowed in this context. (The parent was element
> “font”.) Suppressing further errors from this subtree.
> 0.0209	Attribute “width” not allowed on element “hr” at this point.
> 0.0172	Attribute “color” not allowed on element “hr” at this point.
> 0.0171	Attribute “align” not allowed on element “object” at this point.
> 0.0151	Stray end tag “center”.
> 0.0142	Attribute “noshade” not allowed on element “hr” at this point.
> 0.0133	Attribute “align” not allowed on element “input” at this point.
> 0.0132	Attribute “hspace” not allowed on element “iframe” at this point.
> 0.0131	Attribute “vspace” not allowed on element “iframe” at this point.
> 0.0114	Attribute “border” not allowed on element “iframe” at this point.
> 0.0103	Attribute “width” not allowed on element “div” at this point.
> [...]

These are presentational, which I think we are better off leaving as not 
allowed for now. Let's see what happens if we don't have a "transitional" 
analogue for ten years...


> 0.2105	Attribute “size” not allowed on element “input” at this point.
> 0.2079	Attribute “name” not allowed on element “a” at this point.
> 0.1943	Bad value (consolidated) for attribute “http-equiv” on element “meta”.

See below.


> 0.1361	Attribute “scrolling” not allowed on element “iframe” at this point.
> 0.1061	Attribute “name” not allowed on element “img” at this point.
> 0.0794	Required attributes missing on element “object”.
> 0.0198	Attribute “allowtransparency” not allowed on element “iframe” at this
> point.
> 0.0167	Attribute “hidefocus” not allowed on element “a” at this point.
> 0.0164	Attribute “index” not allowed on element “div” at this point.

These may warrant further study. Do they represent a feature people want? 
It would be interesting to examine some cases of these; do you have sample 
URLs for these errors?


> 0.1226 Duplicate ID (consolidated).
> 0.1166 Unclosed elements.
> 0.0934	No “p” element in scope but a “p” end tag seen.
> 0.0910	Table column 2 established by element “td” has no cells beginning in
> it.
> 0.0851 No space between attributes.
> 0.0785	“body” start tag found but the “body” element is already open.
> 0.0738	Stray end tag “embed”.
> 0.0715	Element “style” not allowed in this context. (The parent was element
> “body”.) Suppressing further errors from this subtree.
> 0.0694	Stray end tag “head”.
> 0.0628	Element “div” not allowed in this context. (The parent was element
> “span”.) Suppressing further errors from this subtree.
> 0.0625	Stray end tag “form”.
> 0.0579 Entity reference was not terminated by a semicolon.
> 0.0558	Required children missing from element “html”.
> 0.0549	Stray end tag “div”.
> 0.0512	End tag “div” seen but there were unclosed elements.
> 0.0471	Start tag “form” seen in “table”.
> 0.0450	Stray end tag “td”.
> 0.0446	No element “a” to close.
> 0.0404	Stray end tag “span”.
> 0.0394	End tag for “body” seen but there were unclosed elements.
> 0.0373	Stray end tag “img”.
> 0.0360	Stray “html” start tag.
> 0.0354	No element “font” to close.
> 0.0351	Saw “<?”. Probable cause: Attempt to use an XML processing instruction
> in HTML. (XML processing instructions are not supported in HTML.)
> 0.0336	Stray “script” start tag.
> 0.0303	Table column 4 established by element “td” has no cells beginning in
> it.
> 0.0297	Stray end tag “tr”.
> 0.0292	An “a” start tag seen with already an active “a” element.
> 0.0270	End tag “a” violates nesting rules.
> 0.0250	“td” start tag in table body.
> 0.0247	Internal encoding declaration “iso-8859-1” disagrees with the actual
> encoding of the document (“utf-8”).
> 0.0243	End tag “font” violates nesting rules.
> 0.0236	Stray start tag “head”.
> 0.0227	Table columns in range 2…3 established by element “td” have no cells
> beginning in them.
> 0.0224	Saw “"” when expecting an attribute name. Probable cause: “=” missing
> immediately before.
> 0.0217	End tag for “p” seen, but there were unclosed elements.
> 0.0208	Quote “"” in attribute name. Probable cause: Matching quote missing
> somewhere earlier.
> 0.0198 End of file seen and there were open elements.
> 0.0175	Table column 3 established by element “td” has no cells beginning in
> it.
> 0.0173	Stray end tag “html”.
> 0.0169	Start tag for “head” seen when “head” was already open.
> 0.0168 Bogus comment.
> 0.0163 Table cell spans past the end of its row group established by a
> “tbody” element; clipped to the end of the row group.
> 0.0163	Stray end tag “table”.
> 0.0160	Row 1 of a row group established by a “tbody” element has no cells
> beginning on it.
> 0.0160	Element “div” not allowed in this context. (The parent was element
> “font”.) Suppressing further errors from this subtree.
> 0.0158	Stray end tag “body”.
> 0.0156	Element “title” not allowed in this context. (The parent was element
> “body”.) Suppressing further errors from this subtree.
> 0.0145	Row 3 of a row group established by a “tbody” element has no cells
> beginning on it.
> 0.0139	A slash was not immediate followed by “>”.
> 0.0134	End tag “b” violates nesting rules.
> 0.0133	Element “table” not allowed in this context. (The parent was element
> “span”.) Suppressing further errors from this subtree.
> 0.0130	No element “b” to close.
> 0.0125 Stray doctype.
> 0.0124	Element “table” not allowed in this context. (The parent was element
> “font”.) Suppressing further errors from this subtree.
> 0.0118	Row 2 of a row group established by a “tbody” element has no cells
> beginning on it.
> 0.0118	Element “p” not allowed in this context. (The parent was element
> “span”.) Suppressing further errors from this subtree.
> 0.0114	Stray end tag “input”.
> 0.0111	Attribute “"” not allowed on element “div” at this point.
> 0.0110	Attribute “sohu3” not allowed on element “div” at this point.
> 0.0110	Attribute “pdt” not allowed on element “div” at this point.
> 0.0100	“"” in an unquoted attribute value. Probable causes: Attributes
> running together or a URL query string in an unquoted attribute value.
> 0.0100	Bad value (consolidated) for attribute “width” on element “iframe”:
> Bad positive integer: Expected a digit but saw “p” instead.
> [...]

These are out-and-out errors as far as I can tell.


> 0.0714	Attribute “classid” not allowed on element “object” at this point.
> 0.0705	Attribute “codebase” not allowed on element “object” at this point.

This is probably ActiveX usage. It seems better to encourage use of the 
more open and more widely interoperable NPAPI mechanism, if we are to 
encourage such extensions at all.


> 0.0505	Required children missing from element “head”.

I'm considering making <title> optional for documents intended for 
<iframe>s. I welcome feedback on the matter. It'd be interesting to know 
if this was seen mostly on pages intended for top-level browsing contexts 
or mostly on pages intended for (i)frames.


> 0.0488	“=” in an unquoted attribute value. Probable causes: Attributes
> running together or a URL query string in an unquoted attribute value.

"="? I'm not sure what to make of that one.


> 0.0473	Element “link” not allowed in this context. (The parent was element
> “body”.) Suppressing further errors from this subtree.
> 0.0289	Element “link” not allowed in this context. (The parent was element
> “div”.) Suppressing further errors from this subtree.
> 0.0229	Element “style” not allowed in this context. (The parent was element
> “td”.) Suppressing further errors from this subtree.
> 0.0224	Element “style” not allowed in this context. (The parent was element
> “div”.) Suppressing further errors from this subtree.
> 0.0103	Element “link” not allowed in this context. (The parent was element
> “td”.) Suppressing further errors from this subtree.

Scoped style sheet imports? Scoped style sheets in <td> and <div>? Should 
we support these, maybe?


> 0.0340	Required attributes missing on element “area”.

Which attributes? Are we too strict here, or are these <area> elements 
that wouldn't work anyway?


> 0.0335	Attribute “accesskey” not allowed on element “a” at this point.

See below.


> 0.0329	Required children missing from element “dl”.

Presentational abuse?


> 0.0314	Attribute “border” not allowed on element “input” at this point.

Should we downplay this too?


> 0.0302	Attribute “name” not allowed on element “div” at this point.

Why would anyone do this?


> 0.0289	Consecutive hyphens did not terminate a comment. “--” is not permitted
> inside a comment, but e.g. “- -” is.

I'd be ok with allowing this, but it would break XML parity, and would 
fail to catch nested comments (which won't work, but right now will be 
caught by this rule).
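
A small sketch of why the rule catches nested comments: in 
"<!-- a <!-- b --> c -->" the comment ends at the first "-->", so its 
text is " a <!-- b ", which contains "--". The helper below is 
hypothetical and only scans the text of a single, already-parsed comment.

```python
def double_hyphen_offsets(comment_text):
    """Return the offset of every "--" inside the text of one comment."""
    offsets = []
    start = comment_text.find("--")
    while start != -1:
        offsets.append(start)
        # Step one character so overlapping runs such as "---" are found.
        start = comment_text.find("--", start + 1)
    return offsets
```

An empty result means the comment passes the rule; "- -" passes, while the 
text left over from a nested comment attempt does not.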


> 0.0279	Self-closing syntax (“/>”) used on a non-void HTML element. Ignoring
> the slash and treating as a start tag.

This is probably a good error to report.


> 0.0275	Attribute “profile” not allowed on element “head” at this point.

See below.


> 0.0265	Attribute “summary” not allowed on element “table” at this point.
> 0.0133	Attribute “abbr” not allowed on element “th” at this point.
> 0.0029	Attribute “longdesc” not allowed on element “img” at this point.

See recent thread for summary. I've downplayed these errors to help with 
the transition.


> 0.0262	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: WHITESPACE in QUERY.
> 0.0128	Bad value (consolidated) for attribute “src” on element “img”: Bad IRI
> reference: WHITESPACE in PATH.
> 0.0125	Bad value (consolidated) for attribute “src” on element “img”: Bad IRI
> reference: PORT_SHOULD_NOT_BE_WELL_KNOWN in PORT.
> 0.0121	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: WHITESPACE in PATH.
> 0.0115	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: ILLEGAL_CHARACTER in SCHEME.
> [...]

These messages could be improved, but I think we want to continue 
disallowing illegal IRIs, right? I mean, it's convenient to allow some of 
these cases, but it doesn't seem like the most useful thing to do for the 
long run, even for the author.


> 0.0253	“script” element between “head” and “body”.

I wish we could allow this, but the legacy behavior is confusing and so 
it's better to call this one out.


> 0.0239	Attribute “web:culture” not allowed on element “html” at this point.

XHTML with namespaced content leaking into text/html?


> 0.0239	Attribute “content” not allowed on element “input” at this point.

I wonder what this was used for. Do you have sample URLs?


> 0.0236	Bad value (consolidated) for attribute “disabled” on element “input”.

Are there values we should allow? It would be interesting to see the 
values here.


> 0.0208	Required attributes missing on element “style”.

This seems like a bug in the validator; there are no required 
attributes on <style> in HTML5 as far as I can tell.


> 0.0208	Required attributes missing on element “img”.

Why would src="" be omitted? How odd that it would be so common. Fake 
spacer gifs maybe?


> 0.0196	Bad value (consolidated) for attribute “width” on element “img”: Bad
> positive integer: Zero is not a positive integer.
> 0.0154	Bad value (consolidated) for attribute “height” on element “img”: Bad
> positive integer: Zero is not a positive integer.
> 0.0139	Bad value (consolidated) for attribute “height” on element “iframe”:
> Bad positive integer: Zero is not a positive integer.
> 0.0133	Bad value (consolidated) for attribute “width” on element “iframe”:
> Bad positive integer: Zero is not a positive integer.

Zero? Really? That seems like an out-and-out error.
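
For reference, the check behind these "Bad positive integer" messages can 
be approximated as follows. This is a sketch, not the validator's actual 
code: the function name is hypothetical, the empty-string message is 
invented for the example, and the other two messages imitate the wording 
quoted above.

```python
def positive_integer_error(value):
    """Return None if value is a valid positive integer, else a message."""
    if value == "":
        return "The empty string is not a valid positive integer."
    for ch in value:
        if ch not in "0123456789":
            return f"Expected a digit but saw “{ch}” instead."
    if int(value) == 0:
        return "Zero is not a positive integer."
    return None
```

Under this check width="100%" and height="10px" fail on the first 
non-digit, width="0" fails the final test, and width="640" passes.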


> 0.0194	Element “noframes” not allowed in this context. (The parent was
> element “html”.) Suppressing further errors from this subtree.

We don't support <noframes> in HTML5, since frame support is required 
(and simultaneously, we make them non-conforming). But I've downplayed 
this error.


> 0.0192	Element “meta” not allowed in this context. (The parent was element
> “body”.) Suppressing further errors from this subtree.

This seems worth reporting, though we could downplay it, as it's not such 
a big issue... Opinions?


> 0.0191	Attribute “width” not allowed on element “input” at this point.
> 0.0187	Attribute “height” not allowed on element “input” at this point.

I've allowed these.


> 0.0171	When the attribute “xml:lang” in no namespace is specified, the
> element must also have the attribute “lang” present with the same value.

As painful as this is, this seems like it should be reported. The message 
could be more helpful though. (e.g. "for this to have any effect...")
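
One possible shape for the check with a friendlier message, as a sketch 
only. The function name and the message wording are hypothetical, and 
comparing the two values without regard to case is an assumption.

```python
def xml_lang_message(attrs):
    """attrs maps attribute names to values for one text/html element."""
    if "xml:lang" not in attrs:
        return None
    lang = attrs.get("lang")
    # Assumption: the two values are compared ignoring case.
    if lang is None or lang.lower() != attrs["xml:lang"].lower():
        return ('For "xml:lang" to have any effect in text/html, '
                'also specify "lang" with the same value.')
    return None
```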


> 0.0150	Element “nobr” not allowed in this context. (The parent was element
> “div”.) Suppressing further errors from this subtree.
> 0.0107	Element “nobr” not allowed in this context. (The parent was element
> “td”.) Suppressing further errors from this subtree.
> [...]

Should we downplay these? Or maybe allow <nobr> in some way?


> 0.0126	Attribute “value” not allowed on element “input” at this point.

Is this always name=_charset_ cases? Why would people give value="" 
attributes in that case? Are they values we should be allowing?


> 0.0123	Text not allowed in element “script” in this context.

How common is this, and are any of these cases particularly useful? Some 
people have suggested allowing text in <script> elements with src="" 
attributes to help with documentation; is this what is going on, or are 
these cases of pages having real errors?


> 0.0120	Attribute “scheme” not allowed on element “meta” at this point.

This was allowed in HTML4, but reporting it as an error seems sensible 
since it had no effect and we should probably bring this to people's 
attention. (Or did it have an effect? Should we downplay it?)


> 0.0116 A numeric character reference expanded to the C1 controls range.

How much of a problem is this? Should we allow these and just define the 
hard-coded mappings to real characters as legal?
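
To illustrate what "hard-coded mappings" would mean in practice, the sketch 
below assumes the mappings are the Windows-1252 ones and uses Python's 
cp1252 codec as a stand-in for the table. The function name is 
hypothetical.

```python
def resolve_numeric_reference(code_point):
    """Return the character for a numeric reference under the assumed mapping."""
    if 0x80 <= code_point <= 0x9F:
        try:
            # Reinterpret the number as a Windows-1252 byte, e.g. 0x93 -> U+201C.
            return bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            # Bytes the codec leaves undefined stay as the C1 control itself.
            return chr(code_point)
    return chr(code_point)
```

So "&#147;" (0x93) would resolve to a left double quotation mark rather 
than to a C1 control character.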


> 0.0114	Attribute “alt” not allowed on element “input” at this point.

Over-enthusiastic accessibility?


> 0.0111	Required attributes missing on element “meta”.

What attributes _were_ present in these cases? Are people doing something 
bogus, or is there some great feature we're missing out on?


> 0.0110	Attribute “param” not allowed on element “a” at this point.

Are we missing something here? Why are people doing this?


> 0.0105	Attribute “themeid” not allowed on element “link” at this point.

Is this some well-known feature?


> 0.0103	Element “o:p” not allowed in this context. (The parent was element
> “span”.) Suppressing further errors from this subtree.

Word junk. Not worth downplaying, since those pages (a) won't be validated 
by anyone except those looking to fix the mess Word made, and (b) contain 
so many other errors anyway.


Below 0.0100, I've skipped the uninteresting errors altogether.

> 0.0098	The “for” attribute of the “label” element must refer to a form
> control.

What are people making it point to?


> 0.0094	Attribute “type” not allowed on element “textarea” at this point.

What values are people using?


> 0.0087 NEITHER ERRORS

Well that's sad. More than 99% of pages have a conformance error...


> 0.0077	Bad value (consolidated) for attribute “tabindex” on element “input”:
> Bad integer: The empty string is not a valid integer.

Do browsers do anything with the empty string that might be worth 
legitimising?


> 0.0055	Element “title” not allowed in this context. (The parent was element
> “head”.) Suppressing further errors from this subtree.

I assume that's a duplicate <title>; the message could be clearer.


> 0.0034	Element “map” not allowed in this context. (The parent was element
> “p”.) Suppressing further errors from this subtree.

Should <map> autoclose the <p>? Should we make it transparent and allow it 
where phrasing content is allowed?


> 0.0020	Attribute “charset” not allowed on element “meta” at this point.

This message could be clearer.


> 0.0018	The “acronym” element is obsolete.

This message should say "use <abbr> instead" or something.


> 0.0017	Attribute “rules” not allowed on element “table” at this point.

0.17%. And to think I spent hours of my life working on making test cases 
for this attribute.


> 0.0011	Bad value (consolidated) for attribute “lang” from namespace
> “http://www.w3.org/XML/1998/namespace” on element “script”: Bad language tag:
> Subtags must next exceed 8 characters in length.

Bug: "must next" is presumably a typo.


> 0.0011	A “charset” attribute on a “meta” element found after the first 512
> bytes.

Was this expected? I'd have thought if you were checking for this we'd 
have seen http-equiv="" cases of this error much more often than that.


> 0.0009	Attribute “autocomplete” not allowed on element “form” at this point.

That's very rare. I've removed the XXX in the spec saying we might add 
this. (IE supports it.)


> 0.0007	Saw a start tag “image”.

This message could say "use <img> instead". We could also downplay it, 
though at 0.07%, who cares. (My own studies in 2005 found it on 0.2% -- 
it's either dropping, or we have a lot of variance. Or both.)


> 0.0005	Bad value (consolidated) for attribute “target” on element “a”: Bad
> browsing context name or keyword: Reserved keyword “blank"” used.

Why is this an error?


> 0.0003	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: unterminated string literal (unnamed script#1)
> 0.0003	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: syntax error (unnamed script#1)
> 0.0001	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: invalid return (unnamed script#1)
> 0.0001	Bad value (consolidated) for attribute “href” on element “a”: Bad IRI
> reference: Unexpected end of file (unnamed script#1)

Validator bug?


> 0.0002	The hash-name reference in attribute “usemap” referred to “Map2”, but
> there is no “map” element with a “name” attribute with that value.

If there's only one <map> element in the document, it might be worth 
saying what its name="" is here, as a convenience.


> 0.0002 Text run starts with a composing character.

This message could definitely be nicer, though at 0.02% you might not 
really care!


> First, some philosophical assumptions underlying my conclusions:
>  1) Time is a precious resource for people. Therefore, wasting people's time
> is bad.
>  2) A validator should primarily be a tool that authors use to help themselves
> in their authoring task. The primary purpose of a validator is not imposing a
> particular code aesthetic onto other people.
> 
> With legacy language features it becomes problematic to help people not waste
> their time. If an author is writing a new HTML page, the author's time is
> wasted if we make a useless piece of syntax conforming and pundits convince
> the author into using the useless syntax. In this sense, conforming no-op
> syntax isn't harmless. On the other hand, if an author has existing HTML
> templates and then adds a new HTML5 feature (<video>) to his/her site and
> starts using an HTML5 validator as a quality assurance tool, since an HTML5
> validator recognizes the new feature, the author's time is wasted if the
> validator spews a lot of errors about legacy language features that are
> interoperably implemented and don't really cause harm beyond wasted bandwidth
> (and perhaps slightly lower maintainability).
> 
> More concretely, it wastes people's time if experts advise people to write
> <style type=text/css> instead of <style> but it also wastes people's time if a
> validator tells people to take type=text/css out when it already has been
> written.

Have you considered offering authors an option along the lines of "this is 
a legacy document that I'm updating" or some such? Or maybe an option to 
get a baseline from a document before updating it, so that only new errors 
are highlighted?
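
The baseline idea can be sketched as a multiset difference over error 
messages. This assumes the validator's output is available as a plain list 
of message strings; the function name is hypothetical.

```python
from collections import Counter


def new_errors(baseline_messages, updated_messages):
    """Return messages that occur more often after the update than before."""
    # Counter subtraction keeps only positive counts, so errors that were
    # already present in the baseline (or have since been fixed) drop out.
    return Counter(updated_messages) - Counter(baseline_messages)
```

For example, if the legacy page already had two "font is obsolete" errors 
and the updated page has three plus one new "Duplicate ID" error, only one 
"font" error and the "Duplicate ID" error would be highlighted.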


> (Things grouped together a bit below.)
> 
> > 0.1142 The internal character encoding declaration must be the first
> > child of the “head” element.
> 
> I think we should go back to requiring the declaration to occur within 
> the first 512 bytes. Whether it has non-ASCII before it doesn't matter 
> in that case even for streaming implementations that perform a prescan on 
> the first 512 bytes.
> 
> The old definition is theoretically ugly, but it seems to be more 
> practical for everyone except validator writers and for me as a 
> validator writer it's sunk cost already.

Done.
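
A crude sketch of the 512-byte requirement, for illustration only. It is 
not the spec's prescan algorithm: it simply looks for "charset" inside a 
<meta> tag in the first 512 bytes and would be fooled by, say, a commented 
out declaration. The names are hypothetical.

```python
import re

META_CHARSET = re.compile(rb"<meta[^>]*charset", re.IGNORECASE)


def declaration_within_prescan(document_bytes, limit=512):
    """True if a <meta ... charset appears within the first `limit` bytes."""
    return META_CHARSET.search(document_bytes[:limit]) is not None
```

Because the pattern only looks for the word "charset" inside the tag, it 
covers both the charset="" form and the http-equiv="" form.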


> > 0.1001	Attribute “border” not allowed on element “img” at this point.
> 
> It seems to me that Gecko's and Trident's default image border is extremely
> unpopular among authors, and making border=0 non-conforming is unhelpful, too.
> I reiterate my suggestion to make border=0 conforming.

I haven't made it conforming, but see earlier comments about toning down 
this error.


> > 0.1013	Attribute “cellspacing” not allowed on element “table” at this
> > point.
> > 0.0951	Attribute “cellpadding” not allowed on element “table” at this
> > point.
> > 0.0935	Attribute “border” not allowed on element “table” at this
> > point.
> > 0.0924	Attribute “width” not allowed on element “table” at this
> > point.
> > 0.0779	Attribute “valign” not allowed on element “td” at this point.
> > 0.0759	Attribute “width” not allowed on element “td” at this point.
> > 0.0451	Attribute “height” not allowed on element “td” at this point.
> > 0.0365	Attribute “align” not allowed on element “table” at this
> > point.
> > 0.0273	Attribute “height” not allowed on element “table” at this
> > point.
> 
> It's clear by now that the layout model offered by HTML tables is 
> something that authors find useful. Using layout tables in HTML and 
> using CSS is not an either-or choice. Since people who use CSS for some 
> things still use layout tables, this is an indication that the CSS 
> language or its incumbent implementations don't make it easy to make 
> the kinds of layouts that authors use tables for.
> 
> Realistically, it will take many years for CSS grid layout to be as 
> deployable by authors as HTML layout tables are today. Moreover, the 
> current installed base of browsers doesn't make CSS table layout a 
> viable alternative for HTML table layout. Chances are that this won't 
> change until the computers that came with Windows XP pre-installed have 
> been disposed of. Chances are that there will be demand for validating 
> HTML5 language features before then.
> 
> Considering the above, it seems unhelpful for HTML5 to take the position 
> that layout tables are not conforming.
>
> (Aside: The accessibility argument against layout tables is moot. Layout 
> tables are so abundant out there that accessibility technology must deal 
> with them anyhow.)

I think making layout tables conforming would be a significant blow to 
evangelisation efforts, and thus have not done this.


> > 0.0793	Attribute “language” not allowed on element “script” at this
> > point.
> 
> <script language=JavaScript> is as harmless and useless as <script 
> type=text/javascript>.

Downplayed.


> > 0.0638	Attribute “align” not allowed on element “td” at this point.
> 
> I think this one isn't like the other "presentational" table attributes. 
> The alignment of table cells is often tightly coupled with the kind of 
> content the cells have.
> 
> Moreover, if structure and presentation are truly separated, it should 
> be possible to write a style sheet ahead of time for a given set of 
> content features. Here a content feature can be something like 
> "multi-paragraph blockquotes" or "tables with both numbers and text in 
> them". However, intuitively, "tables with numbers in the fifth column" 
> is too specific to be a generic content feature that a style sheet is 
> written to support.
> 
> If you need to tweak your CSS and class attributes whenever you make a 
> table with a new column mix, structure and presentation are not really 
> being separated. Once you get there, why not encode the alignment in 
> HTML?

Yeah, this may be true. It's unclear though whether it wouldn't be better 
just to have some sort of attribute like datatype="number" and rely on a 
generic stylesheet rule, rather than align=""... it doesn't cost much 
more, and avoids any ambiguities in our messaging. Then again, you can 
already do this with class="" (or data-*).

All of these have pretty much the same back-compat story.


> > 0.0609	Attribute “size” not allowed on element “input” at this point.
> 
> This HTML feature doesn't have a convenient CSS alternative that is
> deployable today considering the existing installed base of browsers. I think
> we should just make this attribute conforming.

Done.


> > 0.0529	Attribute “align” not allowed on element “div” at this point.
> > 0.0282	Attribute “align” not allowed on element “p” at this point.
> > 0.0372	Attribute “align” not allowed on element “img” at this point.
> 
> Wow.
> 
> It would be interesting to examine the use cases for aligning divs and
> paragraphs. I'd be interested to know if the popularity of the align attribute
> has something to do with legacy RTL authoring habits.

I'm not really sure how to determine this. Do you have any sample URLs?


> > 0.0401	Bad value (consolidated) for attribute “http-equiv” on element
> > “meta”.
> 
> I don't know what values these are, but I hadn't implemented Content-Language
> yet.

You had separate research data on this before; it's in the spec. I'm not 
sure how many we should really allow. Note that this is now extensible.


> > 0.0386	Attribute “name” not allowed on element “a” at this point.
> 
> That one just refuses to go away. :-(

Downplayed this.


> > 0.0354	The “font” element is obsolete.
> > 0.0208	Attribute “color” not allowed on element “font” at this point.
> 
> <font color> is the simplest way to map color-coded text from a WYSIWYG 
> editor to HTML. Would <span style='color:red;'> be any better for 
> color-based emphasis or annotations? (Yeah, yeah, it's not good for 
> accessibility, but neither of those are. Is it realistic to kill color 
> UI in WYSIWYG editors?)

The same could be said of size="" and face="". I don't think allowing 
these or downplaying these is the way forward though. I'm not sure what to 
say or do here.


> > 0.0279	Attribute “accesskey” not allowed on element “a” at this
> > point.
> 
> The design of accesskey sucks, but the attribute seems relatively popular.

Yeah, we'll need to figure something out here. I'm not sure what though. 
I'm waiting for <command> implementation feedback first. I recommend 
making the validator silently accept accesskey="" for now, at least if its 
value is a single character (maybe with a downplayed warning about the 
HTML5 spec not having addressed this attribute for now).


> > 0.0236	Attribute “profile” not allowed on element “head” at this
> > point.
> 
> The profile instances are mostly due to WordPress. The scheme of picking at
> most one page per *hostname* still picked a lot of username.wordpress.com
> blogs. Also, there are a lot of other WP instances out there. These could be
> knocked out by a single WP version update.

Agreed. No change to the spec.


> > 0.0224	Attribute “size” not allowed on element “font” at this point.
> > 0.0202	Attribute “bgcolor” not allowed on element “td” at this point.
> 
> Presentationalism.

Doesn't seem to be value in allowing these.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 24 December 2008 10:44:12 UTC