Re: HTML5 script start tag should select appropriate content model according to src

From: Philip Taylor (Webmaster) <P.Taylor@Rhul.Ac.Uk>
Date: Mon, 23 Apr 2007 18:03:28 +0100
Message-ID: <462CE6E0.4090109@Rhul.Ac.Uk>
To: Henri Sivonen <hsivonen@iki.fi>
CC: www-html@w3.org

Henri Sivonen wrote:

> OK. Do you believe that semantic markup is important for its own sake? Why?

May I answer that question (on my own behalf, not on behalf
of Patrick or others) ?

Yes.  Because only semantic markup can truly indicate what
I am trying to say, as opposed to how I am trying to say
it.  If I write "<Linnaean-binomial>Felis silvestris</>",
I am unambiguously indicating the /nature/ of the phrase
"Felis silvestris".  An arbitrary and unspecified document
processing system can then make arbitrary and unspecified
use of that meta-information.  A web browser might, for
example, render it in italics, as might a typesetting
engine; a speech synthesiser might choose to add a verbal
cue to indicate to the listener that what is about to
follow is the scientific name of something, rather than
merely being two Latin (or pig-latin) words.  And a
data-mining application might choose to add the phrase
to the set of Linnaean binomials found in the current
document.  Now ask these same systems to process the
markup  "<i>Felis silvestris</i>" : all they can "know"
is that it was the author's intention that this particular
phrase be rendered in italics.

There are, of course, alternative markups that might serve
the same purpose : <i class="Linnaean-binomial">
Felis silvestris</i>, <em class="Linnaean-binomial">
Felis silvestris</em> and even <span class="Linnaean-binomial">
Felis silvestris</span>.  I have nothing against these,
and -- working within the constraints of HTML 4.01 --
I use one or other of the latter forms frequently.
But the <i> variant pre-supposes that there is universal
agreement that Linnaean binomials be italicised (which,
fortunately, is the case).  Whether this is also the case
for (e.g.,) the names of ships is moot.  And the second
example from WA1 is /really/ dubious: "<p>The <i>block-level
elements</i>  are defined above.</p>" : here the need
for a classed <span> is clearly indicated.
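
To make the data-mining case concrete, here is a minimal
sketch of how a processor might collect every phrase marked
with a given class.  It assumes Python and its standard
html.parser module; the names BinomialCollector and
collect_binomials are hypothetical, chosen for illustration
only, and the sketch assumes well-formed markup with no
unclosed elements inside the classed element.

```python
from html.parser import HTMLParser


class BinomialCollector(HTMLParser):
    """Collect the text of each element carrying a target class.

    Hypothetical helper, not part of any standard.  Assumes
    well-formed markup and no void elements (such as <br>)
    inside the classed element.
    """

    def __init__(self, target_class="Linnaean-binomial"):
        super().__init__()
        self.target_class = target_class
        self.found = []     # phrases collected so far, in document order
        self._depth = 0     # nesting depth inside the current target element
        self._buffer = []   # text fragments seen inside the target element

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Already inside a target element: track nested elements.
            self._depth += 1
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Normalise internal whitespace, including line breaks.
                self.found.append(" ".join("".join(self._buffer).split()))
                self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def collect_binomials(html, target_class="Linnaean-binomial"):
    """Return the phrases marked with target_class, in document order."""
    collector = BinomialCollector(target_class)
    collector.feed(html)
    collector.close()
    return collector.found
```

Given '<i class="Linnaean-binomial">Felis silvestris</i>' the
function returns ["Felis silvestris"]; given the bare
'<i>Felis silvestris</i>' it returns an empty list, since
nothing in that markup says what the phrase is.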

So, what do I see as the building blocks of a semantically
rich markup language ?  Probably just two components.

1) A semantically neutral set of core elements.
2) A mechanism for defining additional elements
    in such a way that (a) they are derivable from
    the core elements, and (b) their semantics
    can be unambiguously and deterministically
    specified.
In practice, (2) could be accomplished by making the
vocabulary extensible (which will result in less verbose
markup but at the expense of additional complexity
in the browser/renderer/w-h-y), or by a mechanism
for "registering" class names (which would lead to more
verbose documents, but to an elegant simplicity in
the rendering engine).  And "registering" in this sense
does not imply a formal registration with an ICANN-like
central authority, but rather a formalised mechanism
whereby the semantics of a given class name can be
unambiguously specified, either in the document itself
or in a <link>ed document.
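
The class-name "registration" idea might be sketched as
follows, again in Python.  Everything here is an assumption
made for illustration: the ClassSemantics record, its field
names, the REGISTRY table and the present() function are
hypothetical, and no such mechanism exists in HTML 4.01.
The table stands in for declarations that would be read from
the document itself or from a linked document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassSemantics:
    """Hypothetical record giving the meaning of one class name."""
    base_element: str   # semantically neutral core element it refines
    meaning: str        # human-readable statement of the semantics
    visual_style: str   # hint for a visual renderer (CSS font-style value)
    spoken_cue: str     # hint for a speech synthesiser


# Hypothetical registry; in practice it would be loaded from the
# document or from a linked document rather than hard-coded.
REGISTRY = {
    "Linnaean-binomial": ClassSemantics(
        base_element="span",
        meaning="scientific (Linnaean) name of an organism",
        visual_style="italic",
        spoken_cue="scientific name:",
    ),
}


def present(class_name, text, medium):
    """Present text according to the registered semantics of class_name.

    Unregistered class names fall back to the plain text, so a
    processor never has to guess at an unknown meaning.
    """
    semantics = REGISTRY.get(class_name)
    if semantics is None:
        return text
    if medium == "visual":
        return (f'<{semantics.base_element} '
                f'style="font-style: {semantics.visual_style}">'
                f'{text}</{semantics.base_element}>')
    if medium == "speech":
        return f"{semantics.spoken_cue} {text}"
    raise ValueError(f"unknown medium: {medium}")
```

The point of the sketch is that the rendering engine stays
simple: it needs only the core elements plus a table lookup,
while each consumer (visual, spoken, or other) decides for
itself what to do with the declared meaning.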

My two penn'orth, but as far as I am concerned,
semantic markup is the /only/ avenue worth pursuing.

Philip Taylor
Received on Monday, 23 April 2007 17:03:38 UTC
