- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Mon, 7 Aug 2006 16:03:19 +0300 (EEST)
- To: www-validator@w3.org
On Mon, 7 Aug 2006, Jon Ribbens wrote:

> An excerpt such as:
>
>   <input type="text" name="foo" size="12'">
>
> is "valid" according to the DTD (and hence the Validator), but it is
> not correct HTML.

That's correct, for all published HTML DTDs (for some odd reason - the
prose of the HTML specification restricts the values in size="..." to
unsigned integers, so I wonder why the attribute is declared as CDATA
and not as NUMBER - while the maxlength="..." attribute _is_ NUMBER).

> HTML Tidy will correctly identify this as a problem.

I'm not sure about this. The version of Tidy that has been built into
HTML-Kit does not seem to issue a warning. Using the online version of
Tidy at http://infohound.net/tidy/tidy.pl I get the message

  Warning: <input> attribute "size" has invalid value "12'";

This might be useful, but it's still wrong. If the message were correct,
it should be an error message, not a warning! The value "12'" is not
invalid, though it is incorrect.

Apparently Tidy internally checks that the size="..." attribute value is
a sequence of decimal digits. This could be useful, especially if it
were reported correctly (surely there are other English words that could
be used instead of abuse of the term "valid") and more informatively
(the message is now the same for size="12'" and size="foobar", with no
hint to what the syntax of the value _should_ be).

Compare the message with the validator's message in a case where "12'"
_is_ invalid, namely in the value of the maxlength attribute:

(start quote of an error message)
Error Line 12 column 20: character "'" is not allowed in the value of
attribute "MAXLENGTH".

  <input maxlength="12'">

It is possible that you violated the naming convention for this
attribute. For example, id and name attributes must begin with a letter,
not a digit.
(end quote)

That's fine. Not optimal, but fine and understandable.
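
The distinction drawn above, between a value the DTD allows ("valid")
and a value the prose of the specification allows ("correct"), can be
sketched in a few lines of Python. This is only an illustration, not how
Tidy or the W3C Validator is implemented: the function name, the two
lookup tables and the message wording are all hypothetical, and
whitespace handling in attribute values is deliberately left out.

```python
import re

# Declared value types for two <input> attributes, as described in the
# message above (size is CDATA, maxlength is NUMBER).
DECLARED_TYPES = {"size": "CDATA", "maxlength": "NUMBER"}

# Attributes whose values the prose of the specification restricts to
# unsigned integers. Illustrative list, assumed for this sketch.
PROSE_INTEGER_ATTRS = {"size", "maxlength"}

# A NUMBER token: one or more of the digits 0 through 9.
DIGITS = re.compile(r"[0-9]+")


def check_attribute(name, value):
    """Return (valid, correct, message) for a single attribute value.

    valid   -- the value matches the declared type in the DTD
    correct -- the value is valid and also satisfies the prose restriction
    """
    is_number = DIGITS.fullmatch(value) is not None
    # CDATA accepts any string; NUMBER accepts digits only.
    valid = DECLARED_TYPES[name] == "CDATA" or is_number
    correct = valid and (name not in PROSE_INTEGER_ATTRS or is_number)

    if not valid:
        message = (f'Error: the value of attribute "{name}" must contain '
                   f'only digits 0 through 9.')
    elif not correct:
        message = (f'Warning: "{value}" is allowed by the DTD, but the '
                   f'specification requires an unsigned integer for "{name}".')
    else:
        message = "OK"
    return valid, correct, message
```

Under these assumptions, check_attribute("size", "12'") reports the
value as valid but not correct, whereas the same value for maxlength is
reported as invalid, and the two cases get different messages.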

The obvious direction for improving the validator's message would
require rewriting major parts of the code so that the validator would
know what it has been doing (namely parsing a NUMBER value, so that
there would be no need for the excessively generic remark and the
message could simply say "The value of this attribute must contain only
digits 0 through 9.").

The online version of Tidy also seems to "clean up" valid markup, making
it invalid: if I use <input> as a direct subelement of <body>, Tidy
constructs a <form> element containing it. No <form> markup is required
by the DTD _or_ by the prose of the specification; of course there might
be good reasons not to use <input> outside a <form>, but that's a
different story. Anyway, the markup that Tidy adds has a <form> element
without an action="..." attribute, making it invalid. (A Tidy check of
the "Tidied" output causes a _warning_ about a missing action="..."
attribute.)

Regarding the W3C Validator, I wonder whether it _really_ has a bug in
processing of attribute values. If I use

  <input maxlength=" 12">

the validator reports no error. Yet, I cannot see anything in the SGML
standard that would allow the leading space, when the attribute is
declared as NUMBER.

> Both the W3C Validator and HTML Tidy are behaving correctly as
> designed, but you can see why people might get confused.

Whenever people refer to the W3C Validator as a guarantee of correctness
or as making pages work across browsers, a substantial contribution to
the confusion is made.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Monday, 7 August 2006 13:03:30 UTC