Re: Should we Publish a Language Specification?

Maciej Stachowiak wrote:
> A document can be nonconforming and yet still use only features that 
> work the same served over file: with scripting disabled as served over 
> http: with scripting enabled. So this certainly doesn't follow from your 
> definition.
> 
> So let's amend your formulation to "conforming document that uses only 
> features which work the same served over file: with scripting disabled 
> as served over http: with scripting enabled". Or "conforming 
> non-application HTML documents" for short.

Yes, agreed. Thanks for pointing it out.

> The real problem with this subset of HTML documents is that it is a small and 
> uninteresting subset of the actual content on the Web. For example, out 
> of the Alexa Top 100 Sites, zero fall into this subset (I checked them 
> all, only took a few minutes). On the lists of Google PageRank 9 and 10 
> pages I found, none of the sites I checked fell into this category (I 
> only did a random sampling as there were many lists and they were long). 
> In fact, I was not able to find a Web page that meets these criteria at 
> all in about half an hour of searching through sites I visit regularly 
> and links from them.

You could have visited my company's page and would have found something :-).

Anyway, I agree that most "important" sites on the web will always use 
scripting or simply be invalid. But this fact doesn't make the subset 
mentioned above uninteresting. To you, maybe, but not to me.

Anecdote: in the IETF we recently discussed moving away from RFCs 
published as text/plain, using US-ASCII. One proposal was text/plain, 
using UTF-8. Another, much more ambitious proposal was to use a 
well-defined profile of text/html. Guess what the feedback was? 
"unstable", "moving target", "feature bloat"...

So yes, I'll stick to my position that 
HTML-as-a-simple-document-markup-language is an interesting use case, and 
the mere fact that it's not used on the top web sites doesn't change 
that.

> Why should we make a special spec for the kind of HTML Web content that 
> apparently no one wants to create and no one wants to consume? Even more 
> so, why should we do so when it will make it harder to correctly and 
> precisely spec the real but theoretically impure content that real 
> people care about?

I disagree that nobody wants to create it. There are lots of communities 
who are interested in long-term stability for document formats (see, for 
instance, PDF/A).


BR, Julian

Received on Monday, 24 November 2008 23:50:10 UTC