Re: Issue 31c: Meta generator

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, 2012-05-19 16:44 +0200:

> since you are developing the experimental HTML5-compatible version of 
> Tidy (https://github.com/w3c/tidy-html5), have you considered whether 
> to remove Tidy's ability to insert a generator string, in order to 
> prevent the effect it has on validation of alternative text for images? 

No. I have no plans to make HTML5 Tidy quit emitting meta generator. Tidy
has always emitted meta generator by default, and there's no evidence to
indicate that Tidy users would like for that to change. (And for those who
want it, Tidy already has a user option for suppressing meta generator:
http://w3c.github.com/tidy-html5/quickref.html#tidy-mark)

I've only recently come to realize that the argument in favor of the meta
generator exception appears to assume that the current way that tools use
meta generator is -- due to this spec change -- going to become "legacy
use". That "legacy use" argument about meta generator seems to assume that
at least one of the following things is going to happen:

  - Tools which currently emit meta generator in a way that violates the
    current spec are going to be updated to quit emitting it from now on.

  - Users of tools which currently emit meta generator in a way that
    violates the current spec are going to switch to using whatever option
    the tool provides for suppressing meta generator (e.g., start using
    tidy-mark=no with Tidy).

  - Authors of what the current spec calls "hand authored" documents that
    contain meta generator (e.g., because it was put into the doc by some
    other tool before the author switched to "hand authoring" it) will take
    time to manually remove the meta generator instances from their
    documents (which assumes the authors are aware of this spec constraint
    to begin with -- that is, aware that the spec has redefined meta
    generator as a document-global switch, and retroactively assigned new
    magic semantics to all documents containing meta generator).

I think an assumption that any of those things is actually going to happen
in practice on any kind of scale is a fundamentally flawed assumption that
completely undermines the argument in favor of the meta generator exception.

For one thing, I think there are many widely used tools that currently emit
meta generator which are not actually actively maintained any longer. The
non-HTML5 version of Tidy is one example; there has not been a new release
of that since 2009, and there has been no active work on it since. Yet that
is still the main version that most users have, and the version that's
going to be the most widely used for a long time to come.

On top of that, I think it's clear that developers of actively maintained
tools that emit meta generator in a way that violates the new magic
semantics and constraints that the spec has retroactively assigned to it
are not going to stop having their tools emit it. They have no incentive to
do that. Consider the comments already made in this thread by Daniel
Glazman (for example), who actively develops an application that lets users
"hand author" documents. (At least as far as I interpret what the spec
means by the term "hand authoring". Unfortunately, though, the spec never
actually defines that term, so what it means by it is ambiguous.)

  http://lists.w3.org/Archives/Public/public-html/2012May/thread.html#msg104

Finally, I don't think authors are, on any kind of scale, going to start
manually removing meta generator instances from their documents. Why would
they? Without users having read the spec, we have no way of alerting them
to the fact their documents now may all contain a document-global switch
with new magic semantics that they're possibly now violating.

This has introduced a serious problem with respect to validator behavior.
The purpose of the validator is to help authors find and fix errors in
their documents that they otherwise would not be aware of. A key problem
here is that the way things stand now, we have no way of alerting authors
about this particular spec constraint. The spec has silently changed the
behavior of the validator in a way that most authors are never going to
realize and never be made aware of. The end result is that many users
are no longer being alerted to a very important class of errors in their
documents that they would otherwise be informed about and fix.

A spec constraint that validators have no way of ever informing users about
in any way is a really bad idea. As is a spec constraint ("meta generator
must not be used on hand-authored pages") that's neither possible to check
programmatically nor possible for a validator user to evaluate and assess
just given the document in isolation without any knowledge of how it might
have been authored.
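
To make that concrete, here is a minimal Python sketch of the only part a
checker can actually decide from the markup alone: whether a document
contains meta generator at all. This is not code from any validator, and the
names MetaGeneratorFinder and find_meta_generators are made up for
illustration. Note what is missing: nothing in the document can tell the
function whether the page was "hand-authored".

```python
from html.parser import HTMLParser


class MetaGeneratorFinder(HTMLParser):
    """Collects the content value of every <meta name="generator"> element."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.strip().lower() == "generator":
            self.generators.append(attributes.get("content") or "")


def find_meta_generators(markup):
    """Return the generator strings found in the markup (hypothetical helper).

    Presence is all that can be reported here. Whether the document was
    "hand-authored" is not recorded anywhere in the markup.
    """
    finder = MetaGeneratorFinder()
    finder.feed(markup)
    finder.close()
    return finder.generators
```

A checker built on something like this could at most report that meta
generator is present; whether its presence violates the spec constraint is
exactly the part it has no input for.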

> Have you considered warning Tidy's user about this effect?

No. That would be the wrong solution to the problem. If we were to emit a
warning anywhere, it would make much more sense to have the validator now
emit a warning for every single document that contains meta generator
("Warning: meta generator is a document-global switch that means you don't
want to be alerted about any missing alternative text in your documents" or
"Warning: You are possibly using meta generator in violation of the HTML
spec if the document you're checking has been 'hand-authored', but
unfortunately this constraint cannot ever be checked programmatically and
hey it's possible that even you have no way of determining whether the
document you're checking has been 'hand-authored' or not. So, good luck.")

  --Mike

-- 
Michael[tm] Smith http://people.w3.org/mike

Received on Thursday, 7 June 2012 02:43:24 UTC