Re: Publishing a new draft

From: Leif Halvard Silli <lhs@malform.no>
Date: Fri, 31 Jul 2009 17:33:20 +0200
Message-ID: <4A730EC0.2030208@malform.no>
To: Anne van Kesteren <annevk@opera.com>
CC: Simon Pieters <simonp@opera.com>, Sam Ruby <rubys@intertwingly.net>, Laura Carlson <laura.lee.carlson@gmail.com>, Ian Hickson <ian@hixie.ch>, HTML WG <public-html@w3.org>, "Michael(tm) Smith" <mike@w3.org>
Anne van Kesteren On 09-07-31 15.35:

> On Fri, 31 Jul 2009 00:42:33 +0200, Leif Halvard Silli
> <lhs@malform.no> wrote:
>> Anne,  seeing that Simon too agreed that it would be good if
>> the HTML 4/HTML 5 differences document gave more examples of
>> what it means by "esoteric SGML" features, perhaps you'll do
>> that?
> I added processing instructions as another example.

Very good. But I am not completely satisfied regarding PIs until 
you add an example of what that means. PHP is natural to mention.

You also have not removed the word "esoteric". That is not a word 
that adds to anyone's understanding; rather, it fuels certain 
prejudices and "I think I know"-ism.

Why not instead point to Appendix B in HTML 4, which itself lists 
some things it considers badly supported, and say which of those 
things are no longer supported?

>> It is not clear what "esoteric SGML features" means. Here are
>> some things that HTML 4 considers esoteric (Appendix B.3.3
>> and onwards of HTML 4) but not all of them are:
> I don't think being exhaustive here is important. SGML never
> existed on the Web and typical authors for who this document is
> intended will not care.

That SGML doesn't exist "on the Web" (or, rather, in the UAs) 
isn't news - I read the same message out of HTML 4 itself. 
So to say that it doesn't exist has the tone of saying that "we 
have pretended that it existed, but now we tell you the truth".

That said, if we separate our concerns, not only w.r.t. the spec 
we write for the future, but also w.r.t. what exists today, 
then SGML exists for most Web authors in the form of HTML 4 - just 
try a validator, or ask an author who has tried to validate.

>> * <?PI > syntax  (Not esoteric when considering UA support &
>> scripting languages implementation - see the parallel
>> thread.)
> If there was user agent support it would end up in the DOM as a
> ProcessingInstruction node.

Is this specified somewhere? In the DOM specifications, perhaps?

UA support has at least two levels. The first level is 
"consciously ignore unsupported things". The second level is "do 
something with it". A vendor cannot claim CSS compliance if their 
UA goes bananas over unknown CSS selectors. In that same sense, 
UAs support PIs. That not all of them expose PIs in the DOM does 
not change that. (So far none of them expose PIs in the DOM as 
"bogus comments", as HTML 5 says they should.)
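
To make the two levels concrete, here is a minimal sketch using 
Python's standard html.parser module. It is only an illustration: 
that module is neither a browser nor the HTML 5 parsing algorithm, 
and the class and function names (PIRecorder, tokenize) are made up 
for this example. It shows a tokenizer that recognizes a "<?php" PI 
as its own kind of token, rather than as a comment, and then simply 
carries on.

```python
from html.parser import HTMLParser


class PIRecorder(HTMLParser):
    """Record PIs, comments and text in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_pi(self, data):
        # Called for "<?...>"; data is everything between "<?" and ">".
        self.events.append(("pi", data))

    def handle_comment(self, data):
        # An HTML 5 parser would report the PI here, as a bogus comment.
        self.events.append(("comment", data))

    def handle_data(self, data):
        self.events.append(("text", data))


def tokenize(markup):
    parser = PIRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


events = tokenize("<p><?php echo 'hi'; ?></p>")
print(events)
```

With this parser the PI is reported through handle_pi and never 
through handle_comment, which is the "recognize it and move on" 
level of support rather than the bogus-comment behaviour.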

>> * Boolean attributes. (Should be well supported - new ones
>> are even introduced in HTML 5.)
> SGML did not have boolean attributes. What SGML had was known
> values. If you knew the value the parser could figure out which
> attribute you meant based on the DTD.

I think you are wrong if you consider that HTML 4 spoke about 
parsers that used a DTD.

> So if you specify <input
> readonly> in HTML4 you actually specify <input
> readonly="readonly"> whereas in HTML5 you specify <input
> readonly="">. I don't think it's worth bothering the world with
> this detail that was never implemented as such and never
> understood except by a rare few.

There is no point in saying that "by our own esoteric definition 
of 'boolean', HTML 4 did not have boolean attributes". So I 
support you in not mentioning this.

My point above was that support for boolean attributes is not 
esoteric, even if HTML 4 thought it was. HTML 5 accepts both 
'readonly' and 'readonly=readonly', in addition to 'readonly=""'. 
And I don't know how to measure which is more esoteric - the HTML 5 
way, or the HTML 4 way. Which shows that the notion of "esoteric" 
is not very useful.
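
The three spellings can be compared with a small sketch, again using 
Python's standard html.parser as a stand-in tokenizer (an assumption 
for illustration; it is not what any UA runs, and AttrCollector and 
readonly_values are hypothetical names). It reports what value the 
tokenizer sees for 'readonly' in each spelling.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect the attributes of every start tag as a dict."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when
        # the attribute is written without "=value".
        self.seen.append(dict(attrs))


def readonly_values(markup):
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return [attrs.get("readonly", "<absent>") for attrs in parser.seen]


values = readonly_values(
    '<input readonly><input readonly=readonly><input readonly="">'
)
print(values)
```

All three spellings yield an element that carries the attribute; 
they differ only in the value the tokenizer reports, which is the 
detail Anne describes and which authors rarely need to care about.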

[moved this paragraph for better context:]
 >> * "</" as "end-tag open delimiter". (Well known to anyone
 >> trying to validate a SCRIPT element with HTML code inside.
 >> Probably no tears for seeing this "feature" go.)
 > Right.

>> * Shorthand markup (Of which the currently mentioned NET
>> syntax is just one example. Another one is the above
>> mentioned "</" which - one could claim - is supported as it
>> produces a "bogus comment". While "<>" - empty start tag - is
>> not supported in any way.)
> I don't think one can claim </ is supported syntax on its own.

The point is that "</" should be mentioned as an example of a 
positive SGML disappearance. Reading the differences document, I 
would like to see how HTML 5 differs, and understand what that 
means to me. That it becomes easier to use <SCRIPT> is worth 
telling. The W3 validator even has a note about this problem [*].

[*] http://validator.w3.org/docs/help.html#faq-javascript
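
What the change means for a SCRIPT element can be sketched as below. 
Both rules are deliberately simplified assumptions: the HTML 4 rule 
is reduced to "the first '</' followed by a letter ends the content", 
and the HTML 5 rule to "only '</script' followed by whitespace, '/' 
or '>' ends it" (the "<!--" escape handling in the HTML 5 draft is 
left out). The names SGML_END, HTML5_END and script_content are 
invented for this example.

```python
import re

# Simplified HTML 4 / SGML rule: the content ends at the first "</"
# that is followed by a letter, whatever the tag name is.
SGML_END = re.compile(r"</[A-Za-z]")

# Simplified HTML 5 rule: only "</script" followed by whitespace, "/"
# or ">" ends the element.
HTML5_END = re.compile(r"</script[\s/>]", re.IGNORECASE)


def script_content(source, end_pattern):
    """Return the part of `source` that the given rule treats as script text."""
    match = end_pattern.search(source)
    return source if match is None else source[: match.start()]


# Text that follows a <SCRIPT> start tag in some document.
after_script_tag = 'document.write("<b>hi</b>");</script>'

html4_view = script_content(after_script_tag, SGML_END)
html5_view = script_content(after_script_tag, HTML5_END)
print(repr(html4_view))
print(repr(html5_view))
```

Under the simplified HTML 4 rule the script text is cut off at 
"</b", which is why validators complain; under the simplified 
HTML 5 rule the whole statement survives.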

>> In general, it (A) feels fruitless to lump all this together
>> as "esoteric SGML". And (B) many will not understand - it
>> will hamper review - unless you give examples.
> This document is a very high-level overview as to what has
> changed since HTML4. I would be open to the suggestion of not
> mentioning anything specific and just say that the syntax is no
> longer SGML-based,

However, the HTML 5 draft opens by saying that HTML is _inspired_ 
by SGML.

> which seems to suffice in most blog posts on
> the matter. I don't think making a big deal out of this will in
> anyway help anyone.

I don't think blog posts should be the focus - the PR should be 
the task of those who /present/ the Differences document. OTOH, 
if bloggers read that document, then that is a good reason to make 
it informative.

The HTML 5 draft itself lists which elements and attributes are 
no longer supported. But the Differences document seems a fitting 
place to give some tiny chunks of informative background info.

I think the focus of the Differences document should be on helping 
those who are evaluating whether HTML 5 is right for them, and what 
it means for them if/when HTML 5 becomes the de facto standard. And 
then, I maintain that being more concrete about what it means in 
practice that "</" is dropped, and whether PIs are dropped, would 
be helpful. Helpful for these people and helpful for the review of 
HTML 5.

Btw, I could certainly supply you with some text phrases, if that 
would help and if it can wait until next week.
leif halvard silli
Received on Friday, 31 July 2009 15:34:03 UTC
