Re: Formal definition of HTML5 (was Re: Version information)

On Sat, 14 Apr 2007, Henrik Dvergsdal wrote:
> 
> However, as a developer I don't like the idea of being dependent on 
> proprietary black boxes for validation.

They don't have to be proprietary or black boxes; you could easily use an 
open source tool. If a schema is used, it can be shipped with a tool just 
as easily as it can be published as part of the specification.


> Today I can just pick up an XML parser, the XHTML DTD and then validate 
> XHTML markup just as I validate any other XML document. (I haven't tried 
> it with HTML, but I guess there are also some SGML parsers around.)

The XHTML DTD is highly inadequate for conformance checking of HTML, 
though. For example it wouldn't spot any mistakes in this document:

   <!DOCTYPE li PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
   <li id="a">
    <a href="invalid uri" tabindex="none">
     <label for="a">
      <a href="" name="a">Test</a>
     </label>
    </a>
   </li>

...even though I count at least seven separate conformance errors in that 
document. In fact, the W3C validator claims that the above document is 
valid XHTML 1.0 Strict:

   http://validator.w3.org/check?uri=http%3A%2F%2Fjunkyard.damowmow.com%2F280
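
To make the gap concrete, here is a rough Python sketch of a few checks 
that a DTD content model cannot express but a small program can. It is 
only an illustration: the function name find_problems, the FORM_CONTROLS 
set and the four rules chosen are assumptions made for this example, not 
an existing tool and not a complete list of the errors in the sample. The 
DOCTYPE is left out of the string so that only the element tree is 
examined.

```python
import xml.etree.ElementTree as ET

# The sample document from above, without its DOCTYPE line.
SAMPLE = """\
<li id="a">
 <a href="invalid uri" tabindex="none">
  <label for="a">
   <a href="" name="a">Test</a>
  </label>
 </a>
</li>"""

# Hypothetical list of elements a label may point at (assumption).
FORM_CONTROLS = {"input", "select", "textarea", "button"}


def find_problems(xml_text):
    """Return a list of problems that DTD validation would not report."""
    root = ET.fromstring(xml_text)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    problems = []

    def walk(el, inside_a):
        if el.tag == "a":
            # Exclusion at any depth: an <a> must not contain another <a>.
            if inside_a:
                problems.append("nested <a>")
            # Attribute syntax: a URI cannot contain raw whitespace.
            href = el.get("href")
            if href is not None and any(c.isspace() for c in href):
                problems.append("href contains whitespace")
            # Attribute syntax: tabindex must consist of digits.
            tabindex = el.get("tabindex")
            if tabindex is not None and not tabindex.isdigit():
                problems.append("tabindex is not a number")
        if el.tag == "label" and el.get("for") is not None:
            # Cross-reference: 'for' must name a form control.
            target = by_id.get(el.get("for"))
            if target is None or target.tag not in FORM_CONTROLS:
                problems.append("label 'for' does not name a form control")
        for child in el:
            walk(child, inside_a or el.tag == "a")

    walk(root, False)
    return problems


if __name__ == "__main__":
    for problem in find_problems(SAMPLE):
        print(problem)
```

Each rule here is a line or two of code, yet none of them can be written 
down in DTD syntax, which only describes direct parent/child content 
models and a handful of attribute types.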


> The same goes for authoring tools. I think it would be good for the web 
> to have schema-driven authoring tools that could always refer to the 
> latest version of a normative HTML5 schema - even if they miss out some 
> aspects of the language.

Well, there are efforts underway to create schemas for HTML5; but I don't 
see any advantage to making them official.


> In my view, you should really have some compelling reasons for not 
> defining HTML5 by means of a schema. Could you be a little more specific 
> on this?

Putting forward a formal schema causes people to claim that things that 
are not caught by the schema are allowed, even when this contradicts other 
claims in the specification. See, for instance, the W3C validator 
referenced above -- despite it being incomplete (as shown by the sample 
above) it is still claimed by many to be authoritative. If the DTDs 
weren't part of the spec, then it would be much easier to just say "no, 
that's a bug in the validator" without being rebuffed with "but the 
validator just uses the spec's DTD!".


> So far I've heard that this is about failure of XML schemas to represent 
> complex attribute syntax.
> 
> Is this really a big problem?
> Would it be possible to fix this in XML?

There are always likely to be things that are much (_much_) harder to 
express in a formal language than in English. What's the _advantage_ of 
having an official formal schema? Note that I'm not at all opposed to 
making unofficial formal grammars publicly available.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 16 April 2007 01:50:33 UTC