QA and current state of XML schemas for XHTML?

Dear Quality Assurance Dev group,
As you may know, the deadline for the call for comments on XHTML Modularization 1.1 and XHTML Basic 1.1 is coming soon
(4 August).

As the author of a very small validation prototype, I have long been waiting for an update of XHTML Modularization that would
correct some long-standing errors and issues in the XML Schemas for the various versions of XHTML.

The deadline is approaching, but the 'www-html-editor@w3.org' and 'www-html@w3.org' lists are still silent about those reported
problems.

Therefore, I am writing to ask your group to _assure_ that the _quality_ of the updated XML Schemas will be optimal.

Indeed, I believe that DTD-based validators are not good enough, and while XML Schema may not (yet?) be perfect, it is already
a clear improvement. I therefore cannot really understand why those schemas, which in my opinion should be the second most
important deliverable after the specifications themselves, are almost completely forgotten. If a team does not like XML Schema
for any reason, that is fine as long as an alternative is provided (RELAX NG, Schematron, etc.), as the SVG team does, for example.

Currently, the only "acceptable" XML Schema for XHTML is the one for XHTML 1.0 Strict, and it contains some parts that are, in my
opinion, lazily defined. A short extract:

http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd
 <!-- a character encoding, as per [RFC2045] -->
 <xs:simpleType name="Charset">
   <xs:restriction base="xs:string"/>
 </xs:simpleType>
 <!-- comma-separated list of media types, as per [RFC2045] -->
 <xs:simpleType name="ContentTypes">
   <xs:restriction base="xs:string"/>
 </xs:simpleType>

Unless one builds a validator that is able to understand natural language in XML comments and then to fetch and parse the
RFCs, that type of definition is useless: an unconstrained restriction of xs:string accepts any string at all.
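
For illustration, here is a minimal sketch of how such types could carry a real constraint through the xs:pattern facet. The
patterns are assumptions made for this example, loosely following the "token" production of RFC 2045; they are not taken from
any W3C schema, and the ContentTypes pattern deliberately ignores media type parameters such as ";charset=...":

```xml
<!-- Sketch only: the patterns below are illustrative assumptions, not official definitions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- a character encoding: a single RFC 2045 "token" (no spaces, controls or tspecials) -->
  <xs:simpleType name="Charset">
    <xs:restriction base="xs:string">
      <xs:pattern value="[!#$%&amp;'*+\-.^_`{|}~0-9A-Za-z]+"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- comma-separated list of media types, each written as type/subtype (parameters not handled) -->
  <xs:simpleType name="ContentTypes">
    <xs:restriction base="xs:string">
      <xs:pattern value="[!#$%&amp;'*+\-.^_`{|}~0-9A-Za-z]+/[!#$%&amp;'*+\-.^_`{|}~0-9A-Za-z]+(\s*,\s*[!#$%&amp;'*+\-.^_`{|}~0-9A-Za-z]+/[!#$%&amp;'*+\-.^_`{|}~0-9A-Za-z]+)*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With definitions of this kind, a value such as "text/html, application/xhtml+xml" would be accepted for ContentTypes, while an
empty string or a value containing no "/" would be rejected by the schema itself rather than by a human reading the comment.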

So far, there is no official XML Schema for XHTML 1.1, XHTML Basic 1.0 or XHTML Basic 1.1, and the official schemas for XHTML 1.0
Transitional and XHTML 1.0 Frameset contain some bugs. (I am aware that XHTML 1.0 has nothing to do with XHTML Modularization,
but it is on the same topic.)


Here is a summary of the current status of XML Schema definitions for various versions of XHTML, with issues I have found and
reported after running some tests on a very small pool of XHTML documents:

+ XHTML 1.1:
  - Best available version (2006-07-05, not official)
    [http://www.w3.org/TR/xhtml-modularization/SCHEMA/xhtml11.xsd]
    # Error: does not allow event attributes such as "onmouseover",
      in contradiction with the DTD and the specification
      [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0002.html]
      [http://lists.w3.org/Archives/Public/www-html/2006Jun/0029.html]
      [http://lists.w3.org/Archives/Public/www-html-editor/2006AprJun/0026.html]
    # xs:import problem (no schemaLocation for XHTML datatypes namespace in the driver)
      [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0002.html]
      [http://lists.w3.org/Archives/Public/www-html-editor/2006AprJun/0014.html]

+ XHTML Basic 1.0:
  - Best available version
    [http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd]
  - Latest version (2006-07-05, not official)
    [http://www.w3.org/TR/xhtml-modularization/SCHEMA/xhtml-basic10.xsd]
    # Error: "html" element not recognised...
      [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0002.html]
      [http://lists.w3.org/Archives/Public/www-html/2006Jun/0029.html]
      [http://lists.w3.org/Archives/Public/www-html/2006May/0009.html]

+ XHTML Basic 1.1:
  - None available

+ XHTML Modularization 1.1:
  - Best available version (2006-07-05)
    # Datatypes for "Charset", "ContentType", "MultiLengths" lazily defined
      (as strings, without any additional constraint)
      [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0015.html]
      [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0022.html]

+ XHTML Modularization 1.0:
  - Best available version (2006-02-13)
    # Same issues as XHTML Modularization 1.1

+ XHTML 1.0 Strict:
  - Best available version (2002-07-02)
    [http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd]
    # Same type of issues as XHTML Modularization for lazily defined datatypes

+ XHTML 1.0 Transitional:
  - Best available version (2002-07-02)
    [http://www.w3.org/2002/08/xhtml/xhtml1-transitional.xsd]
    # Error: "name" attribute missing for the "form" element
    # Same type of issues as XHTML Modularization for lazily defined datatypes

+ XHTML 1.0 Frameset:
  - Best available version (2002-07-02)
    [http://www.w3.org/2002/08/xhtml/xhtml1-frameset.xsd]
    # Major error in MultiLengths datatype
    # Same type of issues as XHTML Modularization for lazily defined datatypes
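
Regarding the xs:import problem listed above for the XHTML 1.1 driver, the usual remedy is to give the import a schemaLocation
hint. A minimal sketch follows; the namespace URI and the relative file name are assumptions based on the naming of the XHTML
Modularization schema modules and should be checked against the actual driver:

```xml
<!-- Sketch only: namespace URI and file name are assumed, not copied from the published driver -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3.org/1999/xhtml"
           xmlns="http://www.w3.org/1999/xhtml">
  <!-- import the XHTML datatypes namespace and tell the processor where to find it -->
  <xs:import namespace="http://www.w3.org/1999/xhtml/datatypes/"
             schemaLocation="xhtml-datatypes-1.xsd"/>
</xs:schema>
```

In XML Schema, schemaLocation is only a hint, but in practice many processors rely on it to resolve an imported namespace.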


I take this opportunity to ask for a reasonable bug tracking system, easily accessible to the public through a direct link from
the recommendations, as well as a way to submit patches that would be reviewed and applied within a reasonable time.
[http://www.w3.org/Bugs/Public/] and [http://htmlwg.mn.aptest.com/voyager-issues/] are good, but they are hard to find and to
browse, and could be more up to date.

Here is just one final example of this frustration
[http://lists.w3.org/Archives/Public/www-html-editor/2001OctDec/1246.html]: it took more than 4 (four) years for a major known
error in the official XHTML Basic 1.0 DTD to be addressed (the problematic module "XHTML Base Architecture" was then simply
removed...). In the meantime, this error was not reported in any public bug tracking system or official document, such as the
"known errors" section of the specification
[http://www.w3.org/2000/12/REC-xhtml-basic-20001219-errata], which wrongly stated "Known errors: None at this time". Similarly,
nothing was published to report that the error had finally been "corrected".

So, please have a look at the reported issues regarding XHTML Basic 1.1 and XHTML Modularization 1.1 before they are published.

Sincerely yours,
Alexandre
http://alexandre.alapetite.net

Received on Sunday, 30 July 2006 01:09:17 UTC