
[whatwg] Thoughts on HTML 5

From: Giovanni Campagna <scampa.giovanni@gmail.com>
Date: Thu, 18 Dec 2008 18:15:51 +0100
Message-ID: <65307430812180915n77fa6d94la9558b3b65c8a83d@mail.gmail.com>
2008/12/18 Benjamin Hawkes-Lewis <bhawkeslewis at googlemail.com>

> Perhaps (got any actual evidence about author expectations in this case?),
> but that's not a problem for tokenizer performance. You're "shifting the
> goalposts".

My comment about tokenizer performance came later. By the way, authors should
not expect invalid markup to work in any particular way (in the past they
did, and wrote specific markup for specific implementations).

> Anyway, if we're talking authorial expectations, ordinary authors don't
> expect
> <a href="http://example.com?foobar&baz">
> to be an unrecoverable error, but it is in XHTML.

Authors didn't expect example.com?foobar&section=1 to become
example.com?foobar§ion=1 either (the unterminated "&sect" was read as the
section sign), but this happened in Netscape and IE quite long ago.
If they had got an error, at least they would have known that the syntax was
incorrect and should be avoided, since it could lead to different results in
different browsers.
(It is not valid HTML, btw.)
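
To make the "&section" case concrete, here is a minimal sketch in Python. It
uses the standard library's html.unescape, which applies the HTML5 rules for
character references in text; it illustrates the kind of decoding described
above and is not a reproduction of what Netscape or IE actually did. The
helper name decode_references is made up for this example.

```python
import html


def decode_references(url: str) -> str:
    """Decode character references using html.unescape (HTML5 text rules).

    Hypothetical helper for illustration only; it does not model any
    particular browser's attribute parsing.
    """
    return html.unescape(url)


# Unescaped ampersand: "&sect" is recognised even without a semicolon,
# so the query string is silently changed.
print(decode_references("example.com?foobar&section=1"))

# Escaped ampersand: the URL survives intact.
print(decode_references("example.com?foobar&amp;section=1"))

# "&baz" matches no known reference name, so it is left alone.
print(decode_references("http://example.com?foobar&baz"))
```

Writing "&amp;" in the attribute value is what keeps the result independent
of which names a given decoder happens to recognise.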

> It's not like either of these syntaxes make sense to ordinary people or were
> even intended to do so. The original authoring model for HTML was supposed
> to be "paragraph" and "anchor", mediated by some sort of vaguely WYSIWYG
> type editor, not angle-bracketed tags.

If you don't like less-than and greater-than signs (they are not actually the
Unicode angle brackets), publish your work in PDF or DOC. HTML stands for
HyperText Markup Language. The markup (i.e. the tags) can't be removed.

> A conforming browser will interpret the markup as specified by the
> specification, so there is no difference.

Yes, the fact is that the specification itself "guesses" what an average
author means when they write HTML.

> In practice, people find this very hard for XML and most web publishing
> systems (WordPress etc.) don't work like this even if they should.

Why do SQL injections or buffer overrun attacks happen? Because applications
don't check their input. The same goes for XML: if you check, you can be sure
nobody will take your site down that way. If you don't check, that's your fault.
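
As a rough sketch of what "checking" could look like for a fragment of
untrusted markup, the following Python uses the standard library's
xml.etree.ElementTree to test well-formedness before the fragment is
accepted. The function name is_well_formed is made up for this example, it
checks well-formedness only (not validity against any schema), and the
Python documentation warns that the standard XML modules are not hardened
against maliciously constructed input, so treat this as an illustration
rather than a complete defence.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a single well-formed XML element.

    Hypothetical helper for illustration; it performs no schema validation.
    """
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# The link quoted earlier in this thread: a bare "&" is rejected by an XML parser.
print(is_well_formed('<a href="http://example.com?foobar&baz">link</a>'))

# The same link with the ampersand escaped is accepted.
print(is_well_formed('<a href="http://example.com?foobar&amp;baz">link</a>'))
```

A publishing system could run such a check on submitted comments or included
fragments and refuse (or escape) anything that fails, instead of passing it
through to the page.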

> Also, much of the web is ad-supported. The ads ecosystem is based around
> including markup from trusted sources. Those including the markup are
> generally not able to exert much control over the included markup, even when
> they are some of the biggest publishers on the web. Getting ads to have
> user-friendly HTML (e.g. alt attributes for image links) is nigh impossible;
> trying to get conforming HTML is a wet dream; and trying to get ads in valid
> XML is a likely to be a complete non-starter. Why would an ad creator
> bother, when they could choose a different partner and use their old
> text/html ads?

If the ad buyer refuses to buy an ad that is not valid XML, the ad creator
will probably rewrite it.

> "Probably" - got any empirical evidence for that? I don't usually report
> errors in websites I visit (even _I_ usually have other things to do with my
> time).

If an error prevents someone from browsing a page correctly, they first
report it to the site owner, then to the browser creator.

> Indeed, they would be upset. And they might even try porting it. However,
> there's little incentive for browser makers to throw information bars over
> the majority of the existing web just to assuage your desire for people to
> switch to XML. In fact, there are disincentives for browser vendors to
> include such an information bar since:

> 1. Users will complain about error messages about sites that have always
> worked just fine. ("I'm switching back to IE8.")

> 2. Users will be trained to ignore error messages since sites work just fine
> even with a finger-wagging information bar slapped across the top, which is
> a security risk.

> Even persuading browser vendors to include an indication of whether a
> website is valid or not has been a non-starter for every browser except iCab
> - and even iCab has dropped that indication in the latest version.

If a user complains about a warning (not an error) indication, he can disable
it (but not security errors). On the other hand, some users will complain
to the site creator instead of the browser creator.

> Ian was effectively asking: "Why deprecate text/html?" You appear to be
> trying to answer: "How would we deprecate text/html?" which is a different
> question (and I've indicated some problems with your suggestion above).

Sorry, I didn't understand the question (it looked like "we want to deprecate
HTML but we don't have the means to", which didn't make much sense).

> Except on the ad-supported web?

1) use <iframe>
2) use <object>
3) use <embed>
4) use <img>
5) use well-formed XHTML
6) use JS + DOM
Do you think that is enough?

Giovanni Campagna
Received on Thursday, 18 December 2008 09:15:51 UTC
