Re: Formal definition of HTML5 (was Re: Version information)

On Apr 14, 2007, at 11:20, Henrik Dvergsdal wrote:

> However, as a developer I don't like the idea of being dependent on  
> proprietary black boxes for validation.

The implementation I am offering is neither proprietary nor a black  
box. You may use it as a black box, though, if you don't want to look  

> Today I can just pick up an xml parser, the xhtml DTD and then  
> validate XHTML markup just as I validate any other XML document.

Yes, you can, but all you have accomplished is that you've checked  
against a subset of machine-checkable criteria and gained a false  
feeling of having done a full check. DTDs are far less expressive  
than generally thought.
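
To make the point concrete, here is a minimal sketch of two checks that an XML DTD cannot express: that an "a" element has no "a" ancestor at any depth, and that a colspan value is a positive integer rather than arbitrary CDATA. The function name extra_checks and the choice of these two rules are illustrative assumptions, not taken from any particular checker.

```python
import re
import xml.etree.ElementTree as ET

def extra_checks(xml_text):
    """Return problems that DTD validation alone would not report.

    Hypothetical helper for illustration; the two rules are examples only.
    """
    problems = []
    root = ET.fromstring(xml_text)

    def walk(elem, inside_a):
        tag = elem.tag.split("}")[-1]  # drop any namespace prefix
        if tag == "a":
            if inside_a:
                # A DTD content model cannot forbid descendants at any depth.
                problems.append("nested a element")
            inside_a = True
        colspan = elem.get("colspan")
        if colspan is not None:
            # A DTD can only declare this attribute as CDATA.
            if not re.fullmatch(r"[0-9]+", colspan) or int(colspan) == 0:
                problems.append("bad colspan: %r" % colspan)
        for child in elem:
            walk(child, inside_a)

    walk(root, False)
    return problems
```

A document can pass DTD validation and still fail both of these checks, which is the false sense of completeness described above.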

> The same goes for authoring tools. I think it would be good for the  
> web to have schema driven authoring tools that could always refer  
> to the latest version of a normative HTML5 schema - even if they  
> miss out some aspects of the language.

As far as functionality goes, a non-normative schema or off-the-shelf  
checker is as good as normative.

> In my view, you should really have some compelling reasons for  
> not defining HTML5 by means of a schema. Could you be a little more  
> specific on this?

Experience suggests that designating a normative schema causes people  
to use it and to ignore the machine-checkable conformance criteria  
that the schema does not embody. (Case study: Any normative DTD and  
any DTD-based validation service as used in practice.) Moreover,  
experience suggests that when one schema is normative, better  
implementations face an uphill battle of explanation. (Case study: Need  
to explain the legitimacy of Relaxed just about every time it comes  
up in a discussion with people who trust normative DTDs.)

> So far I've heard that this is about failure of XML schemas to  
> represent complex attribute syntax.

The W3C XML Schema datatypes are useless for HTML5 conformance  
checking with the exception of the regular expression facet of the  
string type. RELAX NG has an escape hatch that allows custom  
datatypes to be implemented in a Turing-complete programming  
language. But that comes back to the issue of having to use a Turing- 
complete language.
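
As a rough sketch of the difference between a pattern facet and a programmatic datatype, consider a date-like attribute value. The YYYY-MM-DD format and the function names below are assumptions made for illustration only; they are not taken from the HTML5 draft or from the datatype library discussed here.

```python
import re
from datetime import date

# Roughly what a regular expression facet can express: the shape of the value.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def matches_pattern(value):
    """Shape-only check, comparable to a pattern facet on a string type."""
    return DATE_SHAPE.fullmatch(value) is not None

def is_valid_date(value):
    """Custom datatype check: the shape plus actual calendar validity."""
    if not matches_pattern(value):
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        date(year, month, day)  # raises ValueError for dates such as 30 February
    except ValueError:
        return False
    return True
```

Here "2007-02-30" satisfies the pattern but is rejected by is_valid_date, which is the kind of gap that pushes datatype checking into a general-purpose language.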

In the case of my HTML5 datatype library, the plan is to write a spec  
that is precise enough to allow independent interoperable  
implementations without inspecting the source of my implementation.

Henri Sivonen

Received on Saturday, 14 April 2007 11:35:01 UTC