Re: Separating core elements from domain-specific (was : code, samp, kbd, var)

From: Philip Taylor (Webmaster) <P.Taylor@Rhul.Ac.Uk>
Date: Wed, 16 May 2007 13:05:23 +0100
Message-ID: <464AF383.2010205@Rhul.Ac.Uk>
To: www-html@w3.org

David Woolley wrote:
> Philip Taylor (Webmaster) wrote:
>> But "dropping them" in this context would cost nothing; "dropping them"
>> from a putative HTML 5 is not the same as "dropping them" in real life.
> I thought you were taking a structuralist point of view, but the way
> you are going no longer seems consistent with that.

Not sure if that comment applies to the citation preceding
it or not : what I am arguing is that "dropping {code, var, ...}"
from a specification to which they were specifically /added/
(we are assured that WHATWG HTML5 started life as a blank
sheet) is not at all the same as dropping them from something
that is overtly derived from earlier work.  But I don't think this
is at the core of any apparent disagreement, so let me
address your more substantive points :

>> pure source code level).  Imagine an HTML that consisted only of
>> those elements that we can be certain are required by virtually
>> all classes of document : <html>, <head>, <title>, <script>, <style>,
>> <meta>, <link>, <body>, <p>, <h$n$>, <ol>, <ul>, <li>, <a>, <img>,
>> <object>, <table>, <div> and <span> (there may be others, but this
> That's minimal neither in the sense that every class of document needs 
> them nor in the sense that it provides a set of constructs which matches 
> what most authors want.  Unfortunately, I've only got a few minutes to write
> this before I head for the day job, so I don't have time to expand on 
> that theme.

As I said, my list wasn't meant to be exhaustive or definitive,
but rather to reflect a set of elements about which I am
confident there would be near-universal agreement that they
are /necessary/ (as opposed to sufficient).

> Congratulations :-(.  You have just *re*-invented DOCTYPE, in particular 
> internal subsets.  Specifying the language within the main part of the 
> document is just wrong!  (This also has similarities to xmlns=....)

Only if the extension can affect the <head> region; if (as I believe
would be borne out in real life) the extensions would affect only
the body region, then I do not think there is a /prima facie/ case
against introducing them in the <head> region.  Internal subsets
do indeed address the same issue, but few were ever capable of
writing and using them.  My hope is that we /might/ be able to
specify an extension mechanism that is sufficiently simple [1]
to be capable of being used by the man on the Clapham Common omnibus ... .

>> Then, if a particular author needed <var>, <code>, <samp> and <kbd>,
>> and if these were provided by (say)
>>     http://www.whatwg.org/html/dialects/informatics
> <var> is a much wider concept than informatics.  Its origins are in 
> mathematics, but it is useful in a much wider field.  "informatics" 
> implies a niche for <samp> and <kbd>, when these are actually concepts 
> that anyone who writes HTML, uses a mobile phone, or almost any other 
> piece of modern technology, are familiar with.

"Familiarity with" has nothing to do with "needing to tag as";
I really can't imagine the average hoody-wearing chav creating
a web page that reads "press <kbd>*#06#</kbd> to see your IMEI",
much as I would like to !  But again, my choice of "informatics"
as the domain was illustrative rather than central to my argument.
Rather more central would be the fact that "HTML-Dialect : Music"
and "HTML-Dialect : Cricket" would both introduce the element
<score> but with very different semantics and possibly with a
different syntax.
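
To make the <score> clash concrete, here is a minimal sketch in Python.
Everything in it is an assumption made for illustration: the registry
layout, the base elements, the meanings attached to <score>, and the
function name lookup() are all hypothetical, and nothing of the kind is
defined by any HTML draft.

```python
# Hypothetical dialect registry: dialect name -> {element: (base element, meaning)}.
# The entries are invented for illustration only.
DIALECTS = {
    "Music": {"score": ("div", "musical notation")},
    "Cricket": {"score": ("span", "runs and wickets")},
}


def lookup(element, declared_dialects):
    """Resolve `element` against the dialects a document declares.

    Returns (dialect, base element, meaning), or None if no declared
    dialect defines the element. Raises ValueError if more than one
    declared dialect defines it, since the meaning would be ambiguous.
    """
    matches = [
        (name,) + DIALECTS[name][element]
        for name in declared_dialects
        if element in DIALECTS.get(name, {})
    ]
    if len(matches) > 1:
        owners = ", ".join(name for name, _, _ in matches)
        raise ValueError(f"<{element}> is defined by several dialects: {owners}")
    return matches[0] if matches else None
```

A document declaring only "Music" resolves <score> without difficulty;
one declaring both "Music" and "Cricket" is rejected in this sketch,
which is one possible policy among several (a per-dialect prefix would
be another).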

> You would get resistance from me,

That genuinely saddens me.

>  and, as this is basically the W3C 
> model for XHTML (with XHTML 1.0 as an aberration), I think the WHATWG 
> will object, because one of their objections to W3C standards seems to 
> be the use of modular standards.

No comment.


[1] An extension mechanism simple enough for use by the man on
     the Clapham Common omnibus.

     I do not think that anyone would argue that

	<Linnaean-binomial>Lagopus hyperboreus</>

     and

	<span class = "Linnaean-binomial">Lagopus hyperboreus</>

     are fundamentally different in any way when considering the
     semantics (/qua/ semantics) of the document.  In fact, many
     of the elements about which there is currently disagreement
     (<code>, <kbd>, <samp>, <var>, ...) fall into exactly the same
     category : they are, to all intents and purposes, merely
     shorthands for the more verbose

	<span class="..."> or <div class="...">

			or even

	<span role="..."> or <div role="...">

     constructs, although historically they have been afforded special
     treatment by some browsers.

     Since (as again I am confident there would be near-universal agreement)
     there is an unbounded possible set of such tags, it would be ludicrous
     to attempt to incorporate all of them into a formal definition of a markup
     language, and I therefore argue that it is pointless to incorporate /any/
     of them; we should instead provide a simple mechanism by which the element
     set can be dynamically extended to meet the demands of the current document.

     Obviously such a mechanism can work only where a new element can be sub-
     classed from an existing one; if the new element has a unique syntax,
     then "internal subsets" is probably the best solution.  But I genuinely
     believe that a simple extension mechanism such as that outlined above
     would meet the needs of the vast majority of web authors, and would
     allow the specification of a "lean, clean, mean" HTML 5 whilst at
     the same time allowing web authors complete freedom of expression in
     terms of their markup.
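
The sub-classing idea in this footnote can be sketched as a simple
rewriting step. The Python below is an illustration under stated
assumptions, not a proposal for how a browser would do it: the table
EXTENSIONS and the function expand() are hypothetical names, element
names are matched case-sensitively, only attribute-less tags with full
end tags are handled, and the SGML short end tag </> is left untouched.

```python
import re

# Hypothetical extension table: new element name -> existing element it
# is sub-classed from. The entries are examples, not a proposed registry.
EXTENSIONS = {
    "Linnaean-binomial": "span",
    "kbd": "span",
    "var": "span",
}

# Matches attribute-less start tags and full end tags, e.g. <kbd> or </kbd>.
_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9-]*)>")


def expand(markup, extensions=EXTENSIONS):
    """Rewrite each extension element as its base element plus a class."""

    def rewrite(match):
        closing, name = match.group(1), match.group(2)
        base = extensions.get(name)
        if base is None:
            return match.group(0)  # not an extension element; keep as written
        return f"</{base}>" if closing else f'<{base} class="{name}">'

    return _TAG.sub(rewrite, markup)
```

Under these assumptions,
expand("<Linnaean-binomial>Lagopus hyperboreus</Linnaean-binomial>")
returns '<span class="Linnaean-binomial">Lagopus hyperboreus</span>',
and markup that uses only core elements passes through unchanged.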

Philip Taylor
Received on Wednesday, 16 May 2007 12:05:17 UTC
