- From: Philip Taylor (Webmaster) <P.Taylor@Rhul.Ac.Uk>
- Date: Mon, 23 Apr 2007 18:03:28 +0100
- To: Henri Sivonen <hsivonen@iki.fi>
- CC: www-html@w3.org
Henri Sivonen wrote:

> OK. Do you believe that semantic markup is important for its own sake? Why?

May I answer that question (on my own behalf, not on behalf of Patrick or others)?

Yes. Because only semantic markup can truly indicate what I am trying to say, as opposed to how I am trying to say it. If I write "<Linnaean-binomial>Felis silvestris</>", I am unambiguously indicating the /nature/ of the phrase "Felis silvestris". An arbitrary and unspecified document processing system can then make arbitrary and unspecified use of that meta-information. A web browser might, for example, render it in italics, as might a typesetting engine; a speech synthesiser might choose to add a verbal cue to indicate to the listener that what is about to follow is the scientific name of something, rather than merely being two Latin (or pig-latin) words. And a data-mining application might choose to add the phrase to the set of Linnaean binomials found in the current document.

Now ask these same systems to process the markup "<i>Felis silvestris</i>": all they can "know" is that it was the author's intention that this particular phrase be rendered in italics.

There are, of course, alternative markups that might serve the same purpose: <i class="Linnaean-binomial">Felis silvestris</i>, <em class="Linnaean-binomial">Felis silvestris</em> and even <span class="Linnaean-binomial">Felis silvestris</span>. I have nothing against these, and -- working within the constraints of HTML 4.01 -- I use one or other of the latter forms frequently. But the <i> variant pre-supposes that there is universal agreement that Linnaean binomials be italicised (which, fortunately, is the case). Whether this is also the case for (e.g.) the names of ships is moot. And the second example from WA1 is /really/ dubious: "<p>The <i>block-level elements</i> are defined above.</p>". Here the need for a classed <span> is clearly indicated.
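To illustrate the data-mining case described above, here is a minimal sketch in Python. It is not anything proposed in this thread: the names BinomialCollector, find_binomials and SAMPLE are hypothetical, and it assumes well-formed input in which every element is explicitly closed. It uses only the standard library's html.parser.

```python
from html.parser import HTMLParser


class BinomialCollector(HTMLParser):
    """Collect the text of every element carrying a given class name.

    Hypothetical helper for illustration; assumes every element in the
    input has an explicit end tag (no unclosed <br>, <img>, etc.).
    """

    def __init__(self, class_name="Linnaean-binomial"):
        super().__init__()
        self.class_name = class_name
        self.found = []    # completed phrases, in document order
        self._stack = []   # one bool per open element: does it carry the class?
        self._buffer = []  # text fragments of the phrase being collected

    def handle_starttag(self, tag, attrs):
        # The class attribute is a space-separated list of names.
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(self.class_name in classes)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        was_match = self._stack.pop()
        # Emit the phrase once the outermost matching element closes.
        if was_match and not any(self._stack):
            self.found.append(" ".join("".join(self._buffer).split()))
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside at least one matching element.
        if any(self._stack):
            self._buffer.append(data)


def find_binomials(html, class_name="Linnaean-binomial"):
    """Return the phrases marked with class_name, whitespace-normalised."""
    collector = BinomialCollector(class_name)
    collector.feed(html)
    collector.close()
    return collector.found


SAMPLE = (
    '<p>The wildcat, <i class="Linnaean-binomial"> Felis silvestris</i>, '
    'is not a <span class="Linnaean-binomial">Felis catus</span>. '
    'The <i>block-level elements</i> are defined above.</p>'
)

if __name__ == "__main__":
    print(find_binomials(SAMPLE))
```

The collector ignores the element name entirely and looks only at the class, so the <i>, <em> and <span> forms are treated alike, while the unclassed <i>block-level elements</i> contributes nothing: the processor has no way to tell what that italic phrase is.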
So, what do I see as the building blocks of a semantically rich markup language? Probably just two components.

1) A semantically neutral set of core elements.
2) A mechanism for defining additional elements in such a way that (a) they are derivable from the core elements, and (b) their semantics can be unambiguously and deterministically ascertained.

In practice, (2) could be accomplished by making the vocabulary extensible (which will result in less verbose markup, but at the expense of additional complexity in the browser/renderer/w-h-y), or by a mechanism for "registering" class names (which would lead to more verbose documents, but to an elegant simplicity in the rendering engine). And "registering" in this sense does not imply a formal registration with an ICANN-like central authority, but rather a formalised mechanism whereby the semantics of a given class name can be unambiguously specified, either in the document itself or in a <link>ed document.

My two penn'orth, but as far as I am concerned, semantic markup is the /only/ avenue worth pursuing.

Philip Taylor
Received on Monday, 23 April 2007 17:03:38 UTC