[whatwg] [html5] Semantic elements and spec complexity

On 10 Nov, 2004, at 3:48 AM, Ian Hickson wrote:
> ...
> The whiteboard in my office currently has a list of elements under the 
> heading "HTML5 BLOCK LEVEL ELEMENTS", and I'm trying to work out how 
> to make them work well (the elements in question are currently 
> mentioned in the draft, but the draft doesn't handle headers at all 
> well). I haven't looked at inline markup yet, but that's on the cards 
> too.

I believe the past 15 years of semantic markup have shown these three 
things to be true:

1.  Most authors Just Don't Care about semantic markup. They'll only use
     it if it's the easiest way of getting the visual effect or behavior
     they want in their own favorite browser, or if they can use it to
     game search engines. (That's why authors use <ul> and <li>, for
     example, but not <address>.)

2.  Those authors who do care about semantic markup often do so
     overzealously, using it even when it's not appropriate. For example,
     they use <em> whenever they want italics or <strong> whenever they
     want bold.

3.  The more complex a markup language, the fewer people understand it,
     the less conformant the average page will be, so the less useful
     the Web's semantics will be. Current HTML authors may clamour for
     new features, but they have forgotten what it was like to be a new
     HTML author; and new authors are neither subscribed to this list nor
     employed by browser vendors, so it is easy to forget about them.

So if <section>, <navigation>, <header>, <footer>, <article>, and 
<sidebar> are introduced, with the default presentation currently 
suggested {display: block; margin: 0;}, I predict the following:

*   The A-list of Web developers will begin using all the elements
     correctly on their Weblogs, and they will feel good about it.

*   A greater number of Web developers will never use most of these
     elements, but they will replace all occurrences of <div> on their
     pages with <section> because it's more "semantic" (just like they
     did with <em> for <i> and <strong> for <b>), and they will feel good
     about it.

*   The vast majority of article producers (Weblogs and online
     newspapers) will never use <article>, because there's no visual or
     behavioral benefit from doing so. So <article> will never become a
     reliable way of dissecting or aggregating pages.

*   The number of knowledgeable HTML authors, the proportion of HTML
     pages that are valid, and therefore the overall usefulness of the
     Web, will be less than it otherwise would have been because of
     HTML's increased complexity.

One way of improving this situation would be to reduce the number of 
new elements -- forget about <article> and <footer>, for example.

Another way would be to recommend more distinct default presentation 
for each of the elements -- for example, default <article> to having a 
drop cap, default <sidebar> to floating right, default <header>, 
<footer>, and <navigation> to having a slightly darker background than 
their parent element, and default <header>...<li> and <footer>...<li> 
to inline presentation. This would make authors more likely to choose 
the appropriate element.
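
As a rough sketch only, those defaults might be written as style rules 
like the ones below. The selectors use the element names above; every 
specific value (the drop-cap size, the sidebar width, the amount of 
background tint) is an illustrative assumption, not something taken from 
the draft.

```css
/* Hypothetical default rules sketching the suggestion above.
   All numeric values and colours are assumptions for illustration. */

article, sidebar, header, footer, navigation {
  display: block;
  margin: 0;
}

/* Drop cap: enlarge and float the first letter of each article. */
article:first-letter {
  float: left;
  font-size: 3em;      /* assumed size */
  line-height: 1;
}

/* Sidebar floats to the right of the surrounding content. */
sidebar {
  float: right;
  width: 30%;          /* assumed width */
}

/* A translucent black tint darkens whatever the parent's background is. */
header, footer, navigation {
  background-color: rgba(0, 0, 0, 0.05);   /* assumed tint */
}

/* List items anywhere inside a header or footer run inline. */
header li, footer li {
  display: inline;
}
```

The translucent tint is one way to express "slightly darker than the 
parent" without knowing the parent's colour; it relies on rgba() colour 
values, which CSS2 does not define.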

A complementary long-term approach would be to deprecate the most 
redundant and/or least effectual elements and attributes from HTML 4.01 
-- for example, <acronym>, <big>, <small>, <q>, <var>, accesskey=, 
cite=, longdesc=, and name= -- in preparation for removing them. This 
would eventually help reduce the complexity of the spec.

Matthew Thomas

Received on Wednesday, 10 November 2004 05:57:07 UTC