Re: Schemas and validation

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 2 Mar 2010 09:48:40 +0200
Cc: "Maciej Stachowiak" <mjs@apple.com>, "Leonard Rosenthol" <lrosenth@adobe.com>, "Anne van Kesteren" <annevk@opera.com>, "Larry Masinter" <LMM@acm.org>, "'Toby Inkster'" <tai@g5n.co.uk>, "'Adam Barth'" <w3c@adambarth.com>, "'HTML WG'" <public-html@w3.org>
Message-Id: <A7DD475E-8467-46DD-80BA-84A938EB762B@iki.fi>
To: Joe D Williams <joedwil@earthlink.net>

On Mar 2, 2010, at 01:10, Joe D Williams wrote:

> Maciej Sent: Monday, March 01, 2010 12:22 PM
>> On Mar 1, 2010, at 11:57 AM, Joe D Williams wrote:
>> 
>>>> no schema language can capture all the conformance requirements of XHTML5.
>>> 
>>> maybe so, because some requirements are runtime.
>> 
>> Henri's not talking about runtime requirements. The static machine-checkable syntax conformance requirements of HTML5 cannot be fully and correctly expressed in any of the existing popular schema languages.
> 
> OK, like the URI example?

That's one example. Another (my favorite) is table integrity (http://hsivonen.iki.fi/thesis/html5-conformance-checker#table-integrity). There are others.
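
To illustrate why a check of this kind tends to end up in hand-written code, here is a minimal Python sketch. It assumes, for illustration only, that one table integrity problem is two cells claiming the same grid slot through rowspan/colspan. The function name and the (colspan, rowspan) input format are hypothetical, and this is not Validator.nu's actual code. The point is that the answer depends on arithmetic over attribute values across rows, which a grammar over the element tree does not track.

```python
def find_overlapping_cells(rows):
    """Return grid slots claimed by more than one cell.

    rows: list of table rows; each row is a list of (colspan, rowspan)
    tuples, one per cell, in document order. (Hypothetical input format.)
    """
    occupied = set()   # (row, col) slots already claimed by some cell
    overlaps = []
    for r, row in enumerate(rows):
        col = 0
        for colspan, rowspan in row:
            # A cell starts at the first free slot in its row; slots taken
            # by cells spanning down from earlier rows are skipped.
            while (r, col) in occupied:
                col += 1
            # Claim every slot the cell covers and record collisions.
            for dr in range(rowspan):
                for dc in range(colspan):
                    slot = (r + dr, col + dc)
                    if slot in occupied:
                        overlaps.append(slot)
                    occupied.add(slot)
            col += colspan
    return overlaps


# Row 0: a 1x1 cell, then a cell spanning two rows in column 1.
# Row 1: a cell spanning two columns, which runs into the rowspan cell.
bad_table = [[(1, 1), (1, 2)], [(2, 1)]]
good_table = [[(1, 2), (1, 1)], [(1, 1)]]
```

Under these assumptions, `find_overlapping_cells(bad_table)` reports the slot at row 1, column 1, while `good_table` yields no overlaps.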

>>> If we can't produce a valid (highly informative) XML Schema that can accurately represent the authortime syntax and structure requirements, then there will be no firm standards-track crosscheck between authortime content structures, the intent of the standard, and the runtime of the operating browser.
>> 
>> The crosscheck would be to use the validator.
> 
> Aren't we seeing some success with schema-driven validators?

I believe that users who prefer Validator.nu's features over the DTD-based features of the W3C Validator tend to like features that aren't attributable to RELAX NG but to the parts hand-written in Java.

Furthermore, there's a constant pressure to implement more and more of Validator.nu in Java instead of schema languages in order to improve the user experience even when the different implementation approach wouldn't change the set of documents recognized as (in)valid.

> With content exceptions present in html5 I could expect that some hand-tooling would be required to accept all html5 code, but I also would believe it should be possible to construct a schema that could validate a target "correct" or recommended form that could tell us if elements are not structured as intended by the spec, and some other details.

Like I said, you can get approximations using schemas, yes.

In practice, an approximation is available. It isn't blessed by this WG. I think the WG shouldn't bless a particular approximation, because doing so would discourage the development of better approximations.

>> My understanding is that DTDs and XML Schema are both significantly weaker than Relax NG and can represent even fewer of the requirements accurately.
> 
> DTD for sure no way, I think. I am looking for a schema example. What element/attribute structures, attribute values, and content element features in html5 can't be shown via XML Schema?

(Disclaimer: I didn't double-check that this exceeds the capabilities of XSD 1.0.)
The <video> element allows <source> children only if it doesn't have the src attribute. This can be represented in RELAX NG, but, IIRC, this can't be represented in XSD 1.0.
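
The co-occurrence rule above can also be checked procedurally. The following is a minimal Python sketch, assuming an XHTML5 document parsed with the standard library's xml.etree.ElementTree. The function name and error message are hypothetical, and only the single rule described above is checked.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def video_source_errors(xhtml_text):
    """Report <video> elements that have both a src attribute and
    <source> children. (Hypothetical helper, for illustration only.)"""
    root = ET.fromstring(xhtml_text)
    errors = []
    # iter() visits the root itself as well as all of its descendants.
    for video in root.iter(f"{{{XHTML_NS}}}video"):
        has_src = "src" in video.attrib
        # find() with a plain tag name looks at direct children only.
        has_source_child = video.find(f"{{{XHTML_NS}}}source") is not None
        if has_src and has_source_child:
            errors.append("video with src attribute must not have source children")
    return errors


bad = ('<video xmlns="http://www.w3.org/1999/xhtml" src="a.ogv">'
       '<source src="b.ogv"/></video>')
ok_children = ('<video xmlns="http://www.w3.org/1999/xhtml">'
               '<source src="b.ogv"/></video>')
ok_attr = '<video xmlns="http://www.w3.org/1999/xhtml" src="a.ogv"/>'
```

Here `video_source_errors(bad)` returns one error, and the other two documents return an empty list. The content model depends on whether an attribute is present, which is the kind of dependency the message says RELAX NG can state but XSD 1.0 probably cannot.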

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 2 March 2010 07:49:21 UTC
