Re: several messages from Jonas Sicking on 2009-06-02 (public-html@w3.org from June 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 2 Jun 2009 15:27:30 -0700
To: John Foliot <jfoliot@stanford.edu>
Cc: Simon Pieters <simonp@opera.com>, Ian Hickson <ian@hixie.ch>, public-html@w3.org
Message-ID: <63df84f0906021527i75665ca7odb9e0650263ace8f@mail.gmail.com>

On Tue, Jun 2, 2009 at 12:33 PM, John Foliot <jfoliot@stanford.edu> wrote:
> Jonas Sicking wrote:
>> >>
>> >> Additionally, given how easy it is to get unexpected results, I
>> >> think we should strongly discourage authors from not escaping
>> >> ampersands. And the best tool that we have for doing that is
>> >> making unescaped ampersands non-conformant.
>> >>
>> >> / Jonas
>> >
>> > Good Luck with that Jonas.  They will never use non-conformance* as a
>> > tool to modify author behavior, as they are completely unwilling to
>> > attach critical fail to non-conformance, and without that, what other
>> > penalty would you propose?  A "tsk tsk" from Henri's validator?  Ouch,
>> > that hurts...
>>
>> Is there a purpose to the above other than just trying to be
>> inflammatory? Are you requesting any changes to the spec?
>>
>> / Jonas
>
> Jonas,
>
> I pose a serious question: what is the real benefit of making unescaped
> ampersands non-conformant? (Of making anything "non-conformant"?)   What,
> in practical terms, will it achieve - how will it modify author behavior?

The answer will probably heavily depend on who you ask. All I can give
is my personal opinion:

Making something non-conformant has very little effect. Most web
developers test whether what they write works in a browser, and if it
does, they leave it at that. If we're lucky, people test in multiple
browsers. A terribly small number of people will run their page
through a validator and fix any errors.

So practically speaking it makes very little difference.

Generally I feel like people on this list are much *much* more
passionate about what is conformant and what isn't than the rest of
the world is. There has been much discussion about whether unquoted
attribute values in foreign elements (such as <svg>) should be
conformant or not.

As I said above, I'm sure there are going to be authors out there who
will ensure that they produce conformant HTML. And I'm sure there will
be HTML consumers that will only accept valid HTML. So I still think
we should define what is conformant and what is not.

What to make non-conformant and what not to is a separate question.
And again, I'm sure the answer will depend heavily on who you ask. The
following are a few rules that I personally think should apply.

* You should be able to parse valid HTML using a streaming parser with
a SAX-like interface. I.e. things like the foster parenting algorithm
shouldn't be required when parsing valid HTML.
* We should make non-conformant things that are very likely bugs, such
as misnested tags.
* We shouldn't make things non-conformant which are harmless and very
commonly used.

Of course, there are all sorts of subjective words in that, such as
"likely", "harmless", "commonly", and "SAX-like". And I'm sure there
are lots more rules that are appropriate; these were three I could
think of off the top of my head.

The reasons why I think these rules are appropriate:
The first is to allow HTML to be used as a data-interchange format.
The second is to help developers develop bug-free web pages.
The third is to avoid having people stop using validators out of
frustration.
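
To make the first two rules a bit more concrete, here is a rough
sketch of what a SAX-like consumer could look like, using Python's
standard html.parser module. The names NestingChecker, VOID and check
are made up for this illustration. It also assumes, as a
simplification, that every non-void element is closed explicitly; real
HTML allows optional end tags (for <p>, <li> and so on), which this
sketch would wrongly flag. The point is only that the handler sees one
event at a time and keeps nothing but a stack of open tags, so no tree
rebuilding is needed as long as the input is well nested.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """SAX-style handler (hypothetical name): one event at a time, one stack."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # currently open elements, innermost last
        self.errors = []     # human-readable nesting problems

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # e.g. the end half of <br/>; nothing was pushed
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            innermost = self.open_tags[-1] if self.open_tags else "nothing"
            self.errors.append(
                f"</{tag}> seen while innermost open element is {innermost}")


def check(chunks):
    """Feed the document piece by piece and return a list of problems."""
    parser = NestingChecker()
    for chunk in chunks:  # streaming: the whole document is never required
        parser.feed(chunk)
    parser.close()
    return parser.errors + [f"<{t}> never closed" for t in parser.open_tags]
```

With this, check(["<p><b>bo", "ld</b>", "</p>"]) returns an empty
list, while check(["<b><i>x</b></i>"]) reports the misnested </b>.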

/ Jonas

Received on Tuesday, 2 June 2009 22:28:26 UTC