
Re: Web Components Suggestion

From: Michael[tm] Smith <mike@w3.org>
Date: Mon, 13 Aug 2012 09:56:07 +0900
To: "Tab Atkins Jr." <jackalmage@gmail.com>
Cc: Florian Bösch <pyalot@gmail.com>, Dave Geddes <davidcgeddes@gmail.com>, public-webapps@w3.org
Message-ID: <20120813005606.GA76226@sideshowbarker>
"Tab Atkins Jr." <jackalmage@gmail.com>, 2012-08-12 15:43 -0700:

> What Dimitri said, but to address your comment directly, DTD-based
> validation is long-dead, at least when applied to HTML.  A DTD can't
> capture the validity requirements that the HTML spec already imposes,
> so it's irrelevant if it also can't validate a document containing
> custom elements.  The current validator used by the W3C is a
> combination of (iirc) constraints expressed in Schematron and custom
> Java code.

The core of the backend for the W3C Nu Markup Validator
(http://validator.w3.org/nu/) and validator.nu is James Clark's Jing, a
Relax NG implementation. The backend doesn't actually use Schematron, for
performance reasons. Instead it has some Java code to perform the
equivalent of the assertion-based checking that Schematron provides but
that can't be done with grammar-based checking alone (whether with Relax NG
or anything else). No grammar-based schema language is capable of
expressing all the constraints in the HTML spec. Things like checking the
data types (microsyntaxes) of attribute values require custom code --
especially if you want to report useful messages for errors (something
regexp-based checking is totally useless for). Also, more to the point
here, things like the fact that arbitrary attribute names prefixed with
"data-" are valid -- grammar-based checkers can't handle that at all. So
the validator.nu backend has some custom code that Henri wrote that drops
those data-* attributes -- basically, filters them out -- before the Jing
part of the toolchain even sees them.
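
To make that filtering step concrete, here is a rough sketch in Python.
It is not Henri's code and not how the validator.nu backend is actually
written (that is Java); the function name strip_data_attributes and the
sample markup are made up for illustration. It only shows the general
idea: walk the tree and drop every data-* attribute, so that a
grammar-based checker run afterwards never sees an attribute name it has
no rule for.

```python
import xml.etree.ElementTree as ET

DATA_PREFIX = "data-"


def strip_data_attributes(root):
    """Remove data-* attributes from root and its descendants, in place.

    Hypothetical helper for illustration only. Returns the number of
    attributes removed.
    """
    removed = 0
    for element in root.iter():
        # Copy the names first: we must not delete while iterating the dict.
        for name in list(element.attrib):
            # Require at least one character after the "data-" prefix.
            if name.startswith(DATA_PREFIX) and len(name) > len(DATA_PREFIX):
                del element.attrib[name]
                removed += 1
    return removed


# Made-up sample document with two data-* attributes.
doc = ET.fromstring(
    '<div data-user-id="42" class="card"><p data-x="1">hi</p></div>'
)
count = strip_data_attributes(doc)
filtered = ET.tostring(doc, encoding="unicode")
```

After this runs, the tree in doc holds only the attributes a schema
could reasonably enumerate (here, class), and that is what would be
handed to the grammar-based stage. A real implementation would also have
to check the rest of the attribute name (for instance, that it contains
no uppercase ASCII letters), which this sketch leaves out.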


Michael[tm] Smith http://people.w3.org/mike
Received on Monday, 13 August 2012 00:56:12 UTC
