Re: dl elements and non-dt, non-dd content

T.J. Crowder wrote:

> To my non-spec-reading eyes, the current spec disallows any content
> other than dt and dd elements within dl elements (although there's
> some slight ambiguity; more below, that's not my main point/question).

I don't see any ambiguity here:

"Content model:
    Zero or more groups each consisting of one or more
    dt elements followed by one or more dd elements."

"Content model" specifies everything that is allowed (and/or required) 
inside an element.

The requirement is stricter than in HTML 4 in one respect and looser in 
another. HTML 4 allows any mixture of dt and dd elements, even one beginning 
with dd (which does not make much sense semantically), but requires at least 
one of them. HTML5 requires the grouped order quoted above but allows the 
content to be empty (apparently so that <dl></dl> can be used as a 
placeholder to be filled using a script).
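
To make the difference concrete, here is a rough sketch (Python, purely for 
illustration) that treats each content model as a regular expression over the 
names of the child elements. The function name and the list-of-names input 
are my own assumptions; the HTML 4 pattern is modeled on the DTD declaration 
(DT|DD)+, and the HTML5 pattern on the content model quoted above.

```python
import re

# HTML5 draft rule quoted above: zero or more groups, each consisting of
# one or more dt elements followed by one or more dd elements.
HTML5_DL = re.compile(r"(?:(?:dt )+(?:dd )+)*")

# HTML 4 DTD rule (DT|DD)+: any non-empty mixture of dt and dd.
HTML4_DL = re.compile(r"(?:(?:dt|dd) )+")

def matches(pattern, children):
    """Return True if the child element names satisfy the content model."""
    # A trailing space after each name makes every name a whole token.
    text = "".join(name + " " for name in children)
    return pattern.fullmatch(text) is not None

print(matches(HTML5_DL, []))                  # empty dl under the HTML5 rule
print(matches(HTML5_DL, ["dt", "dt", "dd"]))  # two terms, one description
print(matches(HTML5_DL, ["dd", "dt"]))        # begins with dd
print(matches(HTML4_DL, ["dd", "dt"]))        # same children, HTML 4 rule
print(matches(HTML4_DL, []))                  # empty dl under the HTML 4 rule
```

This checks element children only; it says nothing about text nodes or about 
what is allowed inside dt and dd themselves.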

> In some documents, one needs to label each term definition with a
> reference number or similar

That's understandable, but numbering of items is generally not supported in 
HTML as marked-up _elements_. Even in an "ordered" (read: numbered) list, 
i.e. an ol element, the numbering is implied, though you can set the numbers 
explicitly - but in attributes, e.g. <li value="42">, not as elements. 
Moreover, you cannot use the numbers directly in links - you need to assign 
id attributes to the <li> elements, or use scripting, or some special way of 
referring to a specific <li> element.

I guess the need for labeling dt elements, though real, is not common enough 
to justify added complexity, especially since it would be rather illogical 
to add it for dl but not ul, ol, and menu.

> The number isn't a term, so
> it doesn't make sense to make it a dt, but it's not a definition
> either, so it's not a dd.

Logically and semantically, you are quite right. In practice, I suppose you 
could just make it part of the dt element contents _or_ use something like

<dt id="42">

together with, say,

dt[id]:before { content: attr(id) " "; }

in CSS (though this won't work in IE 7 and earlier, which don't support 
generated content). You may need to use different id values (as id values 
must be unique in a document), but basically this would seem to solve most 
of the problem, with no added HTML features.

> The header for dl says:
>
> Content model:

That's normative.

> Quite clear, but then the text below says:
>
> If a dl element contains non-whitespace text nodes, or elements other
> than dt and dd, then those elements or text nodes do not form part of
> any groups in that dl.
>
> ...which softens that a bit.

Well, not really. It does not change the rules (for authors and documents). 
It just adds some rules (for browsers) on dealing with documents that 
violate the rules.
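
As a rough sketch of what such a processing rule amounts to (Python, for 
illustration only): the function below walks the children of a dl and forms 
groups, skipping anything that is not a dt or dd. The function name and the 
(tag, text) input format are my assumptions, and this covers only the 
sentence quoted above, not every detail a real user agent has to handle.

```python
def dl_groups(children):
    """Split a dl's children into (names, values) groups.

    children is a list of (tag, text) pairs; anything other than dt or dd
    is skipped and so does not form part of any group.
    """
    groups = []
    names, values = [], []
    for tag, text in children:
        if tag not in ("dt", "dd"):
            continue  # stray content: ignored for grouping purposes
        if tag == "dt":
            if values:
                # A dt following one or more dd elements starts a new group.
                groups.append((names, values))
                names, values = [], []
            names.append(text)
        else:
            values.append(text)
    if names or values:
        groups.append((names, values))
    return groups

children = [
    ("dt", "HTML"), ("p", "stray paragraph"), ("dd", "a markup language"),
    ("dt", "CSS"), ("dd", "a style sheet language"),
]
print(dl_groups(children))
```

The stray p element makes the document invalid, but the grouping of the 
remaining dt and dd elements is still well defined.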

Today I noticed a somewhat similar issue with the title element: only text 
content is allowed (no markup), but the rule for this is followed by a 
definition of the IDL attribute named text that picks up just the text 
content of nodes - as if non-text nodes were allowed.
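
A small sketch of that behavior as I read it (Python, illustration only): 
only Text nodes that are direct children of the element contribute, while 
elements and comments are skipped. The helper name title_text is my own, and 
xml.dom.minidom is an XML parser used here merely as a convenient stand-in 
for a browser's DOM.

```python
from xml.dom import minidom

def title_text(title_element):
    """Concatenate the data of Text nodes that are direct children."""
    return "".join(
        node.data
        for node in title_element.childNodes
        if node.nodeType == node.TEXT_NODE
    )

# Invalid content for a title element: a b element and a comment.
doc = minidom.parseString("<title>A <b>bold</b> claim<!-- note --></title>")
print(repr(title_text(doc.documentElement)))
```

The text inside the b element is dropped, so the result differs from what a 
full text-content extraction of the same element would give.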

> The validator seems to agree with the
> former, disallowing (say) a p or span.

In general, the validator(s) for HTML5 do not correspond to current HTML5 
drafts in every detail. This is rather understandable, as HTML5 is a moving 
target. But in this case, as in most cases, the validator(s) reflect(s) the 
rules.

> Is the latter simply giving an
> indication of how invalid content should be treated?

Rather, how invalid content of a certain type _must_ be treated - I gather it 
is a requirement on user agents, not just a recommendation.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/ 

Received on Friday, 8 April 2011 15:56:56 UTC