proposal to allow div, h1 .. h6 and section as children of list elements from Bruce Lawson on 2008-10-22 (public-html@w3.org from October 2008)

From: Bruce Lawson <bruce@brucelawson.co.uk>
Date: Wed, 22 Oct 2008 17:57:19 +0100
To: public-html@w3.org
Message-ID: <op.ujfqpt03u355il@bruce-pc>
Hi - here's a request/proposal to allow div, h1 .. h6 and section as  
children of list elements in HTML 5, with some real-world use cases. (This  
is filleted from a blog post  
http://www.brucelawson.co.uk/2008/proposals-changes-to-lists-in-html-5/)

The specs for both HTML 4 and 5 allow only li as a child of ul and ol,  
and only dt and dd as children of dl definition lists. I’d like to expand  
that to allow h1…h6, section and div.

HEADINGS (h1…h6) IN LISTS

Until recently, I worked for the Law Society and Solicitors Regulation  
Authority. In such a business, we spent a lot of time marking up rules,  
regulations and statutes.

In the UK, as in most (all?) other jurisdictions, laws and rules are written  
with numbered paragraphs. Within those lists are headings that introduce  
sections. The headings are not part of a list item, but group list items.  
Check out any of the thousands of examples at Office of Public Sector  
Information http://www.opsi.gov.uk/ or the UK Statute Law Database  
http://www.statutelaw.gov.uk/.

Here’s a small but nevertheless real-world example: take a quick look at  
the Solicitors’ Practising Certificate Regulations 1995 (PDF 34K)  
http://www.sra.org.uk/documents/regulatory-framework/practising-certificate-regulations-1995.pdf,  
which I naturally want to mark up like this:

<ol>…
<h2>Commencement</h2>
<li> These regulations replace the Practising Certificate Regulations 1976  
in relation to all practising certificates, and applications for  
practising certificates, for any period commencing on or after 1st  
November 1995.</li>
<h2>Requests for information</h2>
<li>In addition to information supplied on any prescribed form under these  
regulations, solicitors must supply to the Law Society such information as  
to their practice as solicitors as the Society shall from time to time  
reasonably require for the purpose of processing applications.</li>
<h2>Replacement date and conditions</h2>
<li>The replacement date for every practising certificate shall be the  
31st October following the issue of the applicant’s current practising  
certificate.</li>
<li>Every practising certificate shall specify its commencement date, its  
replacement date, and any conditions imposed by the Law Society</li>
…</ol>

You’ll notice that the heading "Replacement date and conditions" is not  
part of either of the following two items, so is not a child of either li.  
Instead, it groups (or introduces) them, and therefore, its semantically  
most appropriate location is as a child of the surrounding ol.

Another way to mark up this document is as a succession of headings and  
paragraphs, with each paragraph beginning with a hard-coded paragraph  
number, perhaps surrounded with a span that is styled with display:block;  
in order to make the number look like a list marker.

This spectacularly fails the Markup Duck Test, which states that if it  
looks like a duck, walks like a duck and quacks like a duck then it is a  
duck: a list of paragraphs, each beginning with a number indicating the  
order of the paragraphs is an ordered list, and needs to be marked up as  
one.
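
For concreteness, here is a rough sketch of the headings-and-paragraphs alternative described above. The class name para-number and the paragraph numbers are placeholders invented for illustration; they are not taken from the regulations or from any real stylesheet.

```html
<!-- Sketch only: "para-number" is a made-up class name, and the
     numbers 1 and 2 are placeholders typed in by hand. -->
<style>
  /* Put the hand-typed number on its own line so that it
     resembles a list marker. */
  .para-number { display: block; }
</style>

<h2>Commencement</h2>
<p><span class="para-number">1.</span> These regulations replace the
Practising Certificate Regulations 1976 …</p>

<h2>Requests for information</h2>
<p><span class="para-number">2.</span> In addition to information
supplied on any prescribed form …</p>
```

Nothing in this markup says that the paragraphs form an ordered sequence; the numbers are just text.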

Take a more complex example, Legal Services Act 2007, paragraphs 203-206  
http://www.opsi.gov.uk/acts/acts2007/ukpga_20070029_en_15#pt9-pb4. This  
legislation is a long list of numbered paragraphs, interspersed with  
headings to group the following paragraphs into sections. Being more  
complex, this legislation has nested (ordered) sublists, but the same  
logic and basic structure holds here too:


<ol>
…
<li><h5>
The giving of notices, directions and other documents in electronic  
form</h5>
<ol>
<li>[subparagraph 1]</li>
<li>[subparagraph 2]</li> …
<li>[subparagraph 8]</li>
</ol>
</li>
<h4>Orders, rules etc</h4>
<li><h5>Orders, regulations and rules</h5>
… lots of subparagraphs …

</li>
<li><h5>Consultation requirements for rules</h5></li>
<li><h5>Parliamentary control of orders and regulations</h5></li>
<h4>Interpretation</h4>
…
</ol>

A counter-argument is that the whole piece of legislation is an ordered  
list of sections, each containing a sublist of the paragraphs within that  
section. That is a legitimate way to look at it, except that the actual  
numbered paragraphs would no longer have the correct paragraph numbers  
auto-generated, as they’d be split into sublists.

Playing with CSS counters wouldn’t help, as different lists are treated as  
separate entities, so numbering in one list can’t follow on from numbering  
in another list. To avoid the paragraph immediately below a section  
heading (the h4 in my code example above) going back to 1, you would have  
to give the ol a start attribute (or the li a value attribute) and  
hard-code the paragraph number, making a mockery of the idea of  
automatically generating numbers in  
ordered lists. Even if it could be faked with CSS counters or hardcoding  
the start attribute, it shouldn’t be because that fails the Duck Test, too.
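
A minimal sketch of that hard-coded workaround, assuming the regulations are split into one ol per heading. The start values are placeholders chosen for illustration, not the regulations' real paragraph numbers.

```html
<!-- Sketch only: the start values below are illustrative and
     must be maintained by hand. -->
<h2>Commencement</h2>
<ol>
  <li>These regulations replace the Practising Certificate
  Regulations 1976 …</li>
</ol>

<h2>Requests for information</h2>
<!-- Hard-coded so that numbering does not restart at 1 -->
<ol start="2">
  <li>In addition to information supplied on any prescribed
  form …</li>
</ol>

<h2>Replacement date and conditions</h2>
<ol start="3">
  <li>The replacement date for every practising certificate …</li>
  <li>Every practising certificate shall specify …</li>
</ol>
```

Inserting or removing a paragraph in an earlier list means editing every later start value by hand.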

HTML 5 SECTIONS

For HTML 5, it would be ideal if the spec allowed the new section element  
to be a child of a list. This means that content could be pulled from a  
CMS into different pages with different heading hierarchies, and the  
headings would automatically be the correct level within that context.  
This is an idea from the XHTML 2 spec, which has an unnumbered h element:

Structured headings use the single h element, in combination with the  
section element to indicate the structure of the document, and the nesting  
of the sections indicates the importance of the heading. The heading for  
the section is the one that is a child of the section element.

In HTML 5 this is complicated by backwards compatibility, so any heading  
element from h1…h6 can be chosen, and the headings and sections algorithm  
determines what “level” it actually is.
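
As a rough illustration of that behaviour, assuming the draft headings-and-sections algorithm works as described above, both headings below are written as h1, and the nesting of the section elements is what ranks the inner heading beneath the outer one:

```html
<!-- Sketch only: relies on the draft HTML 5 outline behaviour
     described in the text, not on the heading numbers. -->
<section>
  <h1>Solicitors’ Practising Certificate Regulations 1995</h1>
  <section>
    <!-- Also an h1, but one level down because of the nesting -->
    <h1>Commencement</h1>
    <p>…</p>
  </section>
</section>
```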

I’ve marked up the Practising Certificate example as HTML 5  
http://www.brucelawson.co.uk/tests/html5-sections.html and styled the  
various different levels of h1s using CSS so you can see a practical  
example of the usefulness of allowing headings and section to be children  
of a list.

HEADINGS IN DEFINITION LISTS

An example in a definition list would be similar. Here’s a real-world  
glossary marked up as a definition list (which is the best way to mark  
them up, in my opinion, although some favour tables):  
http://www.sra.org.uk/sra/consultations/339.article

A really long alphabetical glossary would be enhanced by dividing it up  
with headers for each letter of the alphabet, for reasons of scannability,  
or so an on-the-fly table of contents generator could make a linked table  
of contents above the glossary.

That could be done by the following (illegal code):

<h1>Glossary</h1>
<dl>
<section>
<h1>A</h1>
<dt>Aardvark</dt>
<dd>Never hurt anybody</dd>
<dt>Allegro</dt>
<dd>The lower limbs of people standing side-by-side</dd>
<dd>The finest car known to man</dd>
</section>
<section>
<h1>B</h1>
<dt>Bee porn</dt>
<dd>definition here</dd>
</section>
</dl>

You might say that each letter of the alphabet should have its own dl. I  
contend that a glossary is a single entity, not twenty-six different lists.

DIV AS A CHILD OF A LIST

While we’re talking of rules and specifications, I’d like to know why I  
can’t use div inside a list.

Mostly I’d like to do this so that I could properly style definition lists  
to look like tables.

You can’t reliably style definition lists at the moment, but you can if  
you wrap a dt and its associated dds in a div. This is illegal, but  
works cross-browser already.
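
A minimal sketch of the div-wrapping idea, reusing the glossary entries from the earlier example. The class name glossary and the CSS are assumptions made for illustration (they are not the stylesheet referred to above), and the rules rely on a browser that supports the CSS table display values.

```html
<!-- Sketch only: "glossary" is a made-up class name. The div
     children of dl are not allowed by the specs discussed here. -->
<style>
  dl.glossary     { display: table; }
  /* Each div groups one term with its definitions, like a row */
  dl.glossary div { display: table-row; }
  dl.glossary dt,
  dl.glossary dd  { display: table-cell; padding: 0.25em 1em; }
</style>

<dl class="glossary">
  <div>
    <dt>Aardvark</dt>
    <dd>Never hurt anybody</dd>
  </div>
  <div>
    <dt>Allegro</dt>
    <dd>The lower limbs of people standing side-by-side</dd>
    <dd>The finest car known to man</dd>
  </div>
</dl>
```

Without the wrapping div there is no element that corresponds to a "row", which is what makes the table-like layout hard to achieve with dt and dd alone.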

I agree with the HTML 5 gang when they refuse a new grouping di element  
(presumably "definition item"), saying "This is a styling problem and  
should be fixed in CSS. There’s no reason to add a grouping element to  
HTML, as the semantics are already unambiguous."

Yes, there is no reason for a new definition grouping element; we already  
have a generic grouping element called div. And, yes, it’s true that it’s  
a problem for CSS, but with all the other stuff on the CSS Working Group’s  
agenda, they’re unlikely to get round to it soon.

It must be a common problem (the HTML 5 crew cite it as a "frequently  
asked question") and it can be easily solved using the interoperable,  
backwardly-compatible method I outlined above.

It also raises a philosophical question: I can understand why there are  
restrictions on where some elements can go (for example, it would make no  
semantic sense to allow a list inside an image), but why restrict where an  
author can put an element that has absolutely no meaning ("The div element  
represents nothing at all")?

CONCLUSION

I see the argument against over-complicating a specification, but I think  
that if a new spec can’t accommodate real-world examples of content then  
the specification is not in danger of over-complication—rather, it’s  
currently over-simplistic. HTML 5 has been bravely making itself  
backwards-compatible and thereby becoming more complicated in some areas  
(such as the algorithm for working out the importance of headings in  
sections), so slight extra complication to help developers can also help  
its adoption.


-- 


Bruce Lawson
www.brucelawson.co.uk
Received on Wednesday, 22 October 2008 16:58:11 UTC