Re: What problem is this task force trying to solve and why?

From: James Clark <jjc@jclark.com>
Date: Mon, 20 Dec 2010 10:56:19 +0700
Message-ID: <AANLkTikuxgw7O81Kb4vdY2BhwWAgmgJdUBT==1ohED_W@mail.gmail.com>
To: public-html-xml@w3.org
On Sat, Dec 18, 2010 at 7:49 PM, Henri Sivonen <hsivonen@iki.fi> wrote:

> What problem is this task force trying to solve and why?

I understand our goal is "convergence" of HTML and XML.  What would constitute
convergence? I would suggest that the best one could hope for is something
like this:

- a subset of XML (and maybe XML namespaces); for the sake of discussion,
call this "convergently well-formed XML"
- some tweaks to the HTML syntax of HTML5
- a subset of the tweaked HTML syntax of HTML5; call this "convergently
valid HTML5"
- a subset of the XML Infoset; call this the "convergent XML infoset"

such that:

- Any convergently valid HTML5 document is convergently well-formed XML
- Convergent HTML validity of a document can be defined as a combination of
convergent XML well-formedness and constraints on the convergent XML infoset
of the document
- Any valid HTML5 document can be transformed into a convergently valid
HTML5 document (possibly referencing new resources)

The idea is to make polyglot documents a solid, reliable, workable approach.
HTML5 in the HTML syntax could be processed by XML tools like a normal XML
vocabulary, provided only that the XML tools know about the extra
constraints of convergent well-formedness.
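
A minimal sketch of what "processed by XML tools" could look like, assuming Python's standard xml.etree.ElementTree as the XML tool. The sample document and the helper name element_names are hypothetical; the document is written to be acceptable both as HTML5 in the HTML syntax and as well-formed XML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical polyglot document: DOCTYPE first, explicit end-tags,
# and empty-element syntax used only on a void element (<br/>).
DOC = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Polyglot example</title></head>
<body><p>First line<br/>second line</p></body>
</html>"""

def element_names(markup):
    """Parse markup as XML and return local element names in document order."""
    root = ET.fromstring(markup)
    prefix = "{" + XHTML_NS + "}"
    return [el.tag[len(prefix):] if el.tag.startswith(prefix) else el.tag
            for el in root.iter()]

names = element_names(DOC)
```

An ordinary XML parser reads this document without any HTML-specific knowledge; the extra constraints only matter when deciding whether an HTML parser would build the same tree.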

From this perspective, the following areas of the HTML syntax of HTML5
deserve discussion:

   1. End-tags. Valid HTML5 does not allow end-tags for "void" (always
   empty) element types. HTML5 parsers will ignore such end-tags except in one
   case (<br>).
   2. Empty-element syntax. Valid HTML5 allows empty-element syntax (<foo/>)
   only for "void" element types. If you use empty-element syntax for a
   non-"void" element type, it will be treated like a normal start-tag.
   3. Comments.  HTML5 imposes restrictions on comments beyond those in
   HTML4 or XML (the comment text must not start with ">" or "->").
   4. DOCTYPE declaration. HTML5 documents have to start with a DOCTYPE.
   5. xlink:href on the SVG <a> element. I believe this is the only case
   where critical functionality in HTML5 requires the use of namespace syntax.
   Note that HTML5 already performs certain adjustments to the names of
   attributes on SVG elements.
   6. script/style elements (I can't think of any solution better than
   relying on external script/style, but it's worth brainstorming)

In addition, I think we should look at whether XHTML would benefit from
changes to XML 1.0's error handling requirements, in particular this one:

> Once a fatal error is detected, however, the processor must not continue
> normal processing (i.e., it must not continue to pass character data and
> information about the document's logical structure to the application in the
> normal way).
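
To illustrate how this requirement shows up in practice, here is a small sketch assuming Python's standard xml.etree.ElementTree (built on Expat) as the XML processor. The helper name parse_or_report is hypothetical. The first input is ordinary HTML-style markup with an unclosed <br>, which is a well-formedness error in XML.

```python
import xml.etree.ElementTree as ET

def parse_or_report(markup):
    """Parse markup as XML; report the first fatal error instead of a tree."""
    try:
        ET.fromstring(markup)
        return "well-formed"
    except ET.ParseError as err:
        # err.position is a (line, column) tuple. No partial tree is
        # returned: the caller gets nothing past the point of the error.
        return "fatal error at line %d, column %d" % err.position

html_style = parse_or_report("<p>one<br>two</p>")
xml_style = parse_or_report("<p>one<br/>two</p>")
```

The HTML-style input yields only an error report, while the version with <br/> parses normally. An HTML5 parser would accept both and build the same tree.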

Also, I think we should look at the HTML5 distributed extensibility issue.


Received on Monday, 20 December 2010 03:56:52 UTC