Re: Intuitiveness and documentation of XHTML from Bjoern Hoehrmann on 2004-12-16 (public-evangelist@w3.org from December 2004)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Thu, 16 Dec 2004 18:32:55 +0100
To: "John McLaren" <fieldlab@yahoo.com>
Cc: <public-evangelist@w3.org>
Message-ID: <41c3bee7.299838515@smtp.bjoern.hoehrmann.de>

* John McLaren wrote:
>Look at all the differences in the syntax requirements of XHTML.
>They have added all sorts of arbitrary closing keystrokes and
>required characters here and there, and ridiculous "/" requirements
>that any intelligent parser could automatically recognize or at
>least recognize MULTIPLE REPLACEMENTS for. But they have chosen
>a completely rigid, inflexible framework. HTML just gets less
>intuitive and less "human" all the time.

Suppose you are new to web authoring and you see these rules:

  In HTML, you must close the elements

    a, abbr, acronym, address, applet, b, bdo, big, blockquote,
    button, caption, center, cite, code, del, dfn, dir, div, dl,
    em, fieldset, font, form, frameset, h1, h2, h3, h4, h5, h6,
    i, iframe, ins, kbd, label, legend, map, menu, noframes,
    noscript, object, ol, optgroup, pre, q, s, samp, script,
    select, small, span, strike, strong, style, sub, sup, table,
    textarea, title, tt, u, ul, var

  and you may close

    body, colgroup, dd, dt, head, html, li, option, p, tbody,
    td, tfoot, th, thead, tr

  and you must not close

    area, base, basefont, br, col, frame, hr, img, input, isindex,
    link, meta, param

and

  In XHTML, you must close all elements.
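
To make the contrast concrete, here is a minimal sketch of the two rule sets as lookups. The element lists are the ones given above; the names REQUIRED, OPTIONAL, FORBIDDEN, html_end_tag_rule and xhtml_end_tag_rule are made up for this illustration and do not come from any library.

```python
# End-tag rules for HTML 4.01, copied from the three lists above.
# All names here are hypothetical helpers for illustration only.
REQUIRED = set("""
a abbr acronym address applet b bdo big blockquote
button caption center cite code del dfn dir div dl
em fieldset font form frameset h1 h2 h3 h4 h5 h6
i iframe ins kbd label legend map menu noframes
noscript object ol optgroup pre q s samp script
select small span strike strong style sub sup table
textarea title tt u ul var
""".split())

OPTIONAL = set("""
body colgroup dd dt head html li option p tbody
td tfoot th thead tr
""".split())

FORBIDDEN = set("""
area base basefont br col frame hr img input isindex
link meta param
""".split())


def html_end_tag_rule(name):
    """Return 'required', 'optional' or 'forbidden' for an HTML 4.01 element."""
    name = name.lower()  # HTML element names are case-insensitive
    if name in REQUIRED:
        return "required"
    if name in OPTIONAL:
        return "optional"
    if name in FORBIDDEN:
        return "forbidden"
    raise KeyError("not an HTML 4.01 element: " + name)


def xhtml_end_tag_rule(name):
    """In XHTML every element is closed, so no table is needed."""
    return "required"
```

The HTML function needs three tables to answer the question; the XHTML function needs none.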

Many people would argue that the latter is much simpler. Of course,
the comparison is a bit unfair; you could as well create meta-rules
like

  In HTML, you must close all elements that may have content,
  and must not close elements that must not have content.

but that would be wrong (it ignores the elements whose end-tag is
optional) and difficult to apply, since you might not always be sure
whether an element may have content. For example, you might think it
does not make sense for

  <script src="...">

to have content, and so leave the end-tag off, which is not
correct... The same goes for attribute values,

  In HTML, you must quote attribute values except if the
  attribute value contains only a-z, A-Z, 0-9, '-', '_',
  '.', and ':'.

versus

  In XHTML, you must quote all attribute values.

You might again simplify the rules to

  In HTML, you must quote attribute values except if they
  contain only letters, digits, dashes and dots.

Then,

  ... foo=Björn ...

might be considered correct since it consists only of letters,
even though it is not. A better simplification is probably "you
must quote all attribute values" -- that's XHTML...
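
As a rough sketch of why the simplified rule misleads, the two readings can be compared in code. The function names are hypothetical, and the "naive" check is only my assumption about how someone might read "letters, digits, dashes and dots".

```python
import re

# The HTML rule as stated above: only a-z, A-Z, 0-9, '-', '_', '.' and ':'
# may appear in an unquoted attribute value.
UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_be_unquoted(value):
    """True if the whole value uses only the characters the HTML rule allows."""
    return UNQUOTED_OK.fullmatch(value) is not None


def naive_simplified_rule(value):
    """The simplified rule read literally: any letter or digit, dash or dot.

    str.isalnum() accepts non-ASCII letters such as 'ö', which is exactly
    the trap described above.
    """
    return value != "" and all(ch.isalnum() or ch in "-." for ch in value)


for candidate in ["text", "en-US", "Björn", "two words"]:
    print(candidate, may_be_unquoted(candidate), naive_simplified_rule(candidate))
```

The two checks disagree on "Björn"; the XHTML rule avoids the question by requiring quotes everywhere.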

Another thing about HTML is that the tags of certain elements do not
need to be written out (which is assumed to be more intuitive). For
example, <html>, <head>, and <body> may be omitted, so you can have

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
  <title>...</title>
  <p>...

as a minimal document. This is possible because all HTML documents
start with a <html> element and most elements allowed inside <body>
are not allowed inside <head>. This is quite "intuitive", yet it
confuses many people (who report bugs in the Markup Validator for
not catching the "missing" elements as errors). In XHTML, all
elements need to be specified.
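
The inference described above can be sketched very roughly as follows. This is a simplification and not how an SGML parser or the Markup Validator actually works: it only looks at element names, ignores text and nesting, and the names HEAD_CONTENT and assign_sections are invented for the example. The set lists the elements the HTML 4.01 DTD allows as head content.

```python
# Elements that HTML 4.01 allows inside <head>.
HEAD_CONTENT = {"title", "isindex", "base", "script", "style",
                "meta", "link", "object"}


def assign_sections(tags):
    """Guess whether each top-level element belongs to head or body.

    `tags` is the sequence of element names in document order, with the
    <html>, <head> and <body> tags themselves omitted. The implied <body>
    starts at the first element that is not allowed inside <head>.
    """
    section = "head"
    result = []
    for tag in tags:
        if section == "head" and tag.lower() not in HEAD_CONTENT:
            section = "body"  # the implied </head><body> goes here
        result.append((section, tag))
    return result


# The minimal document above: <title> lands in head, <p> opens the body.
print(assign_sections(["title", "p"]))
```

In XHTML no such guessing is involved, since the html, head and body elements have to be written out.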

So, with XHTML syntax, there are far fewer rules to bother with, at
the expense of some shortcuts, indeed. But does it really matter
much?
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Thursday, 16 December 2004 17:33:11 UTC