[whatwg] WA1 - The Section Header Problem from James Graham on 2004-11-16 (public-whatwg-archive@w3.org from November 2004)

From: James Graham <jg307@cam.ac.uk>
Date: Tue, 16 Nov 2004 13:31:41 +0000
Message-ID: <419A013D.7060100@cam.ac.uk>
Matthew Raymond wrote:

>
> COUNTERARGUMENTS:
>
>    Some may have problems with the six rules above that go something 
> like this:
>
> 1) The <h1>-<h6> elements should not be deprecated. We should allow 
> both methods for greater flexibility.
>
>    Using two different section systems will only result in confusion 
> and bloated user agents. 

Define 'bloated'. As far as I can tell, the term 'bloated' when applied
to software means that it either consumes very large amounts of system
resources or has an over-complex feature set. Retaining <hn> won't
clearly lead to either of these criteria being fulfilled.

> The simple fact of the matter is that the <h#> elements are inferior

Well they don't allow for robust structuring. They are, however, 
excellent for creating semi-structured documents or documents where 
different sets of heading information are required. Note that many 
documents on the web could be well described as semi-structured.

> or otherwise we wouldn't be creating new markup, and therefore we 
> should move to discontinue them with "all due speed".
>
>
> 2) Why should <h#> elements have no semantic value when inside 
> <section>? The webmasters should be allowed to mix markup for the sake 
> of legacy support.
>
>    The <section> element is new markup. Therefore, if a page uses it, 
> then the contents are not legacy markup.

But it may be viewed by a legacy UA. You seem to have missed the point
about backward compatibility. Backward compatibility is not a measure of
how similar a document looks to a document in the previous version of a
language. Backward compatibility is about the UA's interpretation of a
document. If a document has a different meaning depending on which spec
you're reading, backward compatibility has been lost.

For that reason, it's OK to replace non-semantic elements with semantic 
ones (div -> section where appropriate) but not to change the semantics
of preexisting elements. If you can demonstrate that HTML4 headings are 
used so little or so badly that no practical difference will be noticed 
then this might be a reasonable idea.
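
To make the "different meaning depending on which spec" point concrete,
here is a minimal sketch in Python. It is not taken from any spec. It
assumes my reading of the quoted proposal (inside <section>, <h1>-<h6>
carry no heading semantics and <h> is the heading), and it assumes a
legacy UA simply ignores the unknown <section> and <h> tags. The names
HeadingView and headings_seen are hypothetical.

```python
from html.parser import HTMLParser

LEGACY_HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingView(HTMLParser):
    """Collect the text a UA would treat as headings under one rule set."""

    def __init__(self, new_rules):
        super().__init__()
        self.new_rules = new_rules   # True: proposed rules, False: legacy UA
        self.section_depth = 0       # only consulted when new_rules is True
        self.headings = []
        self._open = None            # tag name of the heading being read
        self._text = ""

    def _is_heading(self, tag):
        if tag in LEGACY_HEADINGS:
            # Assumed reading of the proposal: <hN> inside <section>
            # has no semantic value. A legacy UA always honours <hN>.
            return not (self.new_rules and self.section_depth > 0)
        # A legacy UA does not know <h> at all.
        return tag == "h" and self.new_rules

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.section_depth += 1
        elif self._open is None and self._is_heading(tag):
            self._open, self._text = tag, ""

    def handle_data(self, data):
        if self._open is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "section":
            self.section_depth = max(0, self.section_depth - 1)
        elif tag == self._open:
            self.headings.append(self._text.strip())
            self._open = None


def headings_seen(markup, new_rules):
    """Return the list of heading texts seen under the chosen rules."""
    view = HeadingView(new_rules)
    view.feed(markup)
    view.close()
    return view.headings
```

Feeding the same markup through both modes shows the problem: a <h2>
placed inside a <section> is a heading for the legacy UA but not under
the assumed new rules, so the two readers disagree about what the
document says.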

> If a webmaster wants to use sections, I see no reason for allowing 
> them to mix section-related markup when the newer markup is obviously 
> superior. The specification should not encourage poor markup, nor 
> allow markup combinations that may confuse 

It's not obviously superior. It is better for some tasks. Other tasks 
(writing a document editor) are easier with the old-style markup. There 
are reasons that <h1> through <h6> are back in XHTML2 and, as far as I 
can tell, it's not because the working group was enthusiastic about the 
idea.

>
>
> 3) The presentation of <h1>-<h6> should be retained inside <h> to 
> avoid removing the presentation that the webmaster wants to use.
>
>    If the webmaster doesn't like the styling of <h> for a specific 
> level, he/she should style the appropriate <h> elements. The use of 
> <h#> elements in <h> is intended solely for legacy user agents, so we 
> shouldn't encourage webmasters to use it for styling in WA1 clients.

Styling is irrelevant. I'm not sure why people keep discussing style 
when talking about document models.

>
>
> 4) Why not use the <h1> through <h6> tags for headers instead of <h>?
>
>    Because it only goes to six levels, and it encourages 
> mixed section markup. 

Why not use /any/ of <h1> to <h6> inside a section to denote a header of 
that section and any of <h1> to <h6> outside a section to denote a 
header? Why not take the HTML4 spec literally and have <h1> through <h6> 
denote the importance of a heading and give authors the flexibility of a 
two-component system for structuring and heading documents without 
trying to shoehorn all documents on the web into a formal-report style 
that they simply don't have? Why break backwards compatibility in a spec 
specifically designed to retain compatibility with existing UAs?
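
As a rough illustration of the two-component idea, the sketch below (in
Python, with hypothetical names) records two independent facts for each
heading: how deeply it is nested in <section> elements (structure) and
which of <h1> to <h6> was used (importance). It is only one possible
reading of what I am suggesting, not a proposed spec algorithm.

```python
from html.parser import HTMLParser


class TwoComponentHeadings(HTMLParser):
    """Record (section_depth, importance, text) for every <h1>..<h6>."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <section> nesting depth
        self.records = []     # finished (depth, importance, text) tuples
        self._current = None  # [depth, importance, text] being collected

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.depth += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            # Structure comes from the section depth, importance from N.
            self._current = [self.depth, int(tag[1]), ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if tag == "section":
            self.depth = max(0, self.depth - 1)
        elif self._current is not None and tag == "h%d" % self._current[1]:
            self.records.append(tuple(self._current))
            self._current = None


def two_component_view(markup):
    """Return the (section_depth, importance, text) tuples for markup."""
    parser = TwoComponentHeadings()
    parser.feed(markup)
    parser.close()
    return parser.records
```

Here a <h1> inside two nested sections is reported as deeply nested but
highly important, and a <h4> outside any section as top-level but minor.
Neither value has to be derived from the other.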

>
>
> 5-6) We could use repeated <h> elements to save markup instead of 
> having a <section> for every subsection on the same level.
>
>    By enforcing the single header rule, we create a situation where 
> the webmaster must create the document structure in markup rather than 
> relying on implied meanings. Why is this important? Well, for example, 
> let's say you have this markup:
>
> | <section>
> |  <h>Header 1</h>
> |  <p>...Content 1...</p>
> |  <h>Header 2</h>
> |  <p>...Content 2...</p>
> |  <h>Header 3</h>
> |  <p>...Content 3...</p>
> | </section>
>
>    Now suppose you want to put the individual sections in tabs? You 
> have to break up the one big section into three separate sections:
>
> | <tabbox>
> |  <section>
> |   <h>Header 1</h>
> |   <p>...Content 1...</p>
> |  </section>
> |  <section>
> |   <h>Header 2</h>
> |   <p>...Content 2...</p>
> |  </section>
> |  <section>
> |   <h>Header 3</h>
> |   <p>...Content 3...</p>
> |  </section>
> | </tabbox>
>
>    And what if the user has a special default CSS file to override 
> yours? They could be using CSS to put borders around each section, but 
> here that doesn't work. 

As was previously pointed out, this makes the markup very verbose (and
so likely to be ignored). My initial feeling is that this kind of markup
is also very difficult for editors to deal with, which makes it equally
unappealing to them. I think this is a case where the spec has to say
the obvious thing (a heading tag is a heading of some sort wherever it
appears) or the spec will be quickly ignored by authors and UA vendors
and we'll have the compatibility issues we've seen with HTML4 all over
again.

There has been much discussion about headings, but I think the following
are necessary criteria for the spec to be usable:

1) Backwards compatibility must be maintained. <h1> to <h6> must
   represent headings. Given the abuse of headings-as-structure on the
   existing web, there may be some leeway in (re)defining the way that
   the headings interact to give e.g. an outline/toc.
   (Given the above, introducing <h> to represent a generic heading
   seems a little pointless. UAs will have to deal with <h1> through
   <h6> anyway.)
2) The heading model has to be something that Dreamweaver-like
   applications can implement in a sensible way. This is much, much
   easier with <h1> through <h6> than with <section> and <h>.
3) Multiple headings per section will probably happen anyway, so we may
   as well allow them.
4) Many documents on the web do not have a formal structure of the sort
   that would be expected in a legal report. The heading model should be
   able to cope with that.
5) It has to be possible to get an unambiguous structure from the
   headings of a document. This means having an algorithm in the spec
   that UAs can implement that will give a 'tree view' of the document
   structure.
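
As an indication of what such an algorithm could look like, here is a
minimal sketch in Python. It assumes the simplest possible rule, which
is my own choice for illustration and not something any spec currently
says: each <h1>-<h6> is nested under the nearest preceding heading with
a smaller number. The names HeadingCollector, build_outline and
outline_of are hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (rank, text) pairs for <h1>..<h6> in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._rank = None   # rank of the heading currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._rank = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._rank is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._rank is not None:
            self.headings.append((self._rank, "".join(self._text).strip()))
            self._rank = None


def build_outline(headings):
    """Nest each heading under the nearest earlier heading of smaller rank."""
    root = {"rank": 0, "title": None, "children": []}
    stack = [root]  # path from the root to the most recent heading
    for rank, title in headings:
        # Close every open heading that is not strictly more important.
        while stack[-1]["rank"] >= rank:
            stack.pop()
        node = {"rank": rank, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


def outline_of(markup):
    """Parse markup and return its heading tree as nested dicts."""
    collector = HeadingCollector()
    collector.feed(markup)
    collector.close()
    return build_outline(collector.headings)
```

Note that this copes with skipped levels: a <h3> directly after a <h1>
simply becomes a child of that <h1>, so semi-structured documents still
produce a tree. Whether that is the right rule is exactly the kind of
thing the spec would need to pin down.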
Received on Tuesday, 16 November 2004 05:31:41 UTC