Re: Validation error frequencies from Henri Sivonen on 2008-02-04 (public-html@w3.org from February 2008)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Mon, 4 Feb 2008 11:08:32 +0200
To: Sam Ruby <rubys@us.ibm.com>
Cc: HTML Issue Tracking WG <public-html@w3.org>
Message-Id: <24AB80DD-C411-4D79-BC7A-7D203A706D1A@iki.fi>

On Feb 4, 2008, at 01:15, Sam Ruby wrote:

> Henri Sivonen wrote:
>> If an error is very common AND the way browsers react to the error  
>> is interoperable AND the way browsers react to the error makes  
>> intuitive sense to authors (i.e. the browser behavior is not crazy)  
>> AND the error has harmless consequences, I think we should  
>> seriously consider making the error into a non-error in order to  
>> cut noise from validation results to allow authors to focus on more  
>> important matters. (I suspected spaces in IRIs may be an example of  
>> this case but I'm not sure.)
>
> The keyword here is 'harmless consequences'.  And actually, the real  
> issue is false positives vs false negatives.

I agree. That's why I said I'm not sure whether spaces in IRIs are an  
example of this case. I thought it was initially, but this discussion  
is suggesting it isn't.

>> If an error is very common AND the way browsers react to the error  
>> is interoperable AND the way browsers react to the error makes  
>> intuitive sense to authors (i.e. the browser behavior is not crazy)  
>> AND the error has less harmful consequences than the obvious  
>> workaround, I think we should seriously consider making the error  
>> into a non-error in order to avoid the more harmful consequences.  
>> (I think target='_blank' is an example of this case, and I'm pretty  
>> sure.)
>
> A space in a URI has less harmful consequences than %20?

Sorry, that was bad communication on my part. That paragraph wasn't  
about IRIs but about what I was trying to accomplish more generally  
with posting the error stats. I was trying to point out that I had a  
slightly different motivation for suggesting (earlier in this thread)  
that some other things be made conforming.

>> So what I'm trying to accomplish is making HTML5 validation errors  
>> more useful from the author point of view.
>
> So, the question boils down to: which is worse, asking people to  
> replace spaces with %20, or not detecting or reporting on real  
> errors? Hopefully as I phrased this sentence, you can see my point  
> of view.

I do want validators to detect and report real errors. The data I was  
looking at initially just suggested that it wasn't an issue. But it is  
likely that HTML5 needs to be stricter here than XML system ids after  
all in order to avoid false negatives.
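
To make the space-versus-%20 question concrete, here is a minimal sketch in
Python of the kind of check under discussion. The function name
`check_iri_spaces` and its return shape are hypothetical, chosen only for
illustration; this is not how any particular validator implements the check.

```python
def check_iri_spaces(value):
    """Report whether an attribute value contains a literal space.

    Returns a pair (is_clean, suggestion). When the value contains a
    literal space, `suggestion` is the same value with each space
    replaced by %20, which is the fix an author would be asked to make.
    """
    if " " not in value:
        return True, value
    return False, value.replace(" ", "%20")


# A reference with a literal space is flagged and a fix is suggested.
print(check_iri_spaces("my file.html"))
# A reference without spaces passes unchanged.
print(check_iri_spaces("my%20file.html"))
```

The sketch only looks at literal U+0020 characters. A real check would have
to consider the rest of the IRI syntax as well.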

>> I'd be particularly interested in your opinion on <img border='0'>  
>> considering your previous opinions on /> and content models (and  
>> usage on Planet Intertwingly).
>
> There are two parts of my answer to that.
>
> For my usage on Planet Intertwingly, I would gladly change that one  
> line.  To be brutally honest, the origin of that line is from the  
> "classic_fancy" planet theme that I hastily converted from htmltmpl  
> to xslt, and never carefully reviewed.

On the Web scale, though, I think it is not productive to ask a large  
number of authors to make that small change on a large number of sites  
when migrating existing designs to HTML5. Moreover, I don't think it  
is nice that a brand-new design made from scratch would be rendered  
invalid by copying and pasting an HTML snippet for one of those badge  
graphics. Even if Gecko's defaults were changed and the change caused  
border='0' to slowly fade over time, border='0' is going to stay  
around for a long time in those embedding snippets that various sites  
offer for people to copy and paste into their HTML.

> The other part of my answer is that I personally believe all of the  
> social engineering in HTML5 that is attempting to steer people away  
> from well defined and interoperable behavior and towards CSS, while  
> well intentioned, is seriously misguided and counter-productive.

I tend to agree.

CSS is great and well-established. The remaining presentational HTML  
bits and CSS aren't mutually exclusive: sites that have presentational  
HTML bits use CSS as well. We don't need to fear that making a few  
interoperably implemented presentational attributes conforming would  
be a threat to CSS.

As for the usual "CSS is more maintainable" argument, I think authors  
should be free to opt for better maintainability but we shouldn't take  
a patronizing "this is for your own good" attitude and insist on it.  
Besides, the HTML attributes are easily overridden from an external  
style sheet whereas embedded style (which is the obvious lazy  
workaround to silence a validator) isn't.

> For two reasons.  The first is the shared value that you and I have  
> that people tend to view specs through the lens that is a  
> validator.  Specs that are written in a way that cause validators to  
> produce messages that aren't meaningful to authors tend to be ignored.

I agree.

> The second is more specific.  External style sheets don't syndicate.  
> Style attributes are more difficult to sanitize than separate  
> attributes, and so many consumers don't bother.

That's true, although I tend to think that dropping site-specific  
styles and unifying the appearance of syndicated content is a feature  
and not a bug.
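
The sanitizing point can be illustrated with a minimal sketch in Python,
assuming a feed consumer that filters attributes against an allowlist. The
names `SAFE_ATTRS` and `sanitize_attrs` and the value patterns are
hypothetical, not taken from any real aggregator. A separate attribute such
as border needs only a one-line pattern, whereas accepting style safely would
require parsing CSS declarations, so this sketch drops it.

```python
import re

# Hypothetical allowlist: attribute name -> pattern the whole value must match.
SAFE_ATTRS = {
    "border": re.compile(r"[0-9]+"),
    "width": re.compile(r"[0-9]+%?"),
    "height": re.compile(r"[0-9]+%?"),
}


def sanitize_attrs(attrs):
    """Keep only allowlisted attributes whose values match their pattern.

    Anything not on the allowlist (including style and event handlers)
    is dropped rather than inspected.
    """
    kept = {}
    for name, value in attrs.items():
        pattern = SAFE_ATTRS.get(name.lower())
        if pattern is not None and pattern.fullmatch(value):
            kept[name.lower()] = value
    return kept


# border='0' survives; the equivalent style attribute does not.
print(sanitize_attrs({"border": "0", "style": "border: 0", "onclick": "x()"}))
```

Under these assumptions, border='0' passes through syndication while the
inline style equivalent is lost.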

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 4 February 2008 09:08:48 UTC