Re: Should we Publish a Language Specification?

From: Maciej Stachowiak <mjs@apple.com>
Date: Mon, 24 Nov 2008 16:27:17 -0800
Message-Id: <1703F5F3-FEDC-4186-BB38-2C350E2BC822@apple.com>
Cc: Ian Hickson <ian@hixie.ch>, "public-html@w3.org" <public-html@w3.org>
To: Julian Reschke <julian.reschke@gmx.de>

On Nov 24, 2008, at 3:49 PM, Julian Reschke wrote:

> Maciej Stachowiak wrote:
>> A document can be nonconforming and yet still use only features  
>> that work the same served over file: with scripting disabled as  
>> served over http: with scripting enabled. So this certainly  
>> doesn't follow from your definition.
>> So let's amend your formulation to "conforming document that uses  
>> only features which work the same served over file: with scripting  
>> disabled as served over http: with scripting enabled". Or  
>> "conforming non-application HTML documents" for short.
> Yes, agreed. Thanks for pointing it out.
>> The real problem is that this subset of HTML documents is a small and  
>> uninteresting subset of the actual content on the Web. For  
>> example, out of the Alexa Top 100 Sites, zero fall into this  
>> subset (I checked them all, only took a few minutes). On the lists  
>> of Google PageRank 9 and 10 pages I found, none of the sites I  
>> checked fell into this category (I only did a random sampling as  
>> there were many lists and they were long). In fact, I was not able  
>> to find a Web page that meets these criteria at all in about half  
>> an hour of searching through sites I visit regularly and links  
>> from them.
> You could have visited my company's page and would have found  
> something :-).
> Anyway, I agree that most "important" sites on the web will always  
> use scripting or simply be invalid. But this fact doesn't make the  
> subset mentioned above uninteresting. To you, maybe, but not to me.
> Anecdote: in the IETF we recently discussed moving away from RFCs  
> published as text/plain, using US-ASCII. One proposal was  
> text/plain, using UTF-8. Another, much more ambitious proposal was to  
> use a well-defined profile of text/html. Guess what the feedback  
> was? "unstable", "moving target", "feature bloat"...
> So yes, I'll stick to my position that  
> HTML-as-a-simple-document-markup-language is an interesting use  
> case, and just the fact that  
> it's not used on the top web sites doesn't change the fact.
>> Why should we make a special spec for the kind of HTML Web content  
>> that apparently no one wants to create and no one wants to  
>> consume? Even more so, why should we do so when it will make it  
>> harder to correctly and precisely spec the real but theoretically  
>> impure content that real people care about?
> I disagree that nobody wants to create it. There are lots of  
> communities who are interested in long-term stability for document  
> formats (see, for instance PDF/A).

All right, conceded. The "conforming non-application HTML document"  
subset of HTML content is of interest to some people, for reasons  
that are at least arguably valid. This may be a small minority of  
people and content, but it does exist.

Certainly, I am not against the existence of subset descriptions for  
communities that are interested only in subsets of the features of  
HTML. However, you have asked for a strongly privileged status for  
the subset you care about, even though overall it is only of niche  
interest.

In particular, if I understand your requests correctly you want:

- A normative specification which defines the aspects of HTML  
relevant to your subset, and no others.
- All other features to be defined in one or more separate  
specifications which normatively reference the aforementioned base spec.

This would place your restricted dialect at a strongly privileged  
position over more expansive HTML subsets, up to and including the  
full language. And this would be at significant cost to those doing  
the editing, and to implementors or authors interested in broader  
subsets.

So I ask again, what justifies such special treatment for the subset  
you are interested in, when in practice the most popular content and  
the most important sites do not use it? To treat a subset as favored  
in this way, I think we would need objective reasons, not just the  
personal preferences of some Working Group members.

Received on Tuesday, 25 November 2008 00:27:52 UTC
