- From: scott lewis <sfl@scotfl.ca>
- Date: Thu, 5 Jul 2007 16:33:51 -0600
- To: HTML Working Group <public-html@w3.org>
On 5 Jul 2007, at 1602, Thomas Broyer wrote:

> 2007/7/5, scott lewis:
>
>> HTML5 is a language with two serializations (I'll call them): HTML/xml
>> and HTML5/html. These are both representations of the same
>> document. Both serializations of a document must parse identically,
>> otherwise they aren't serializations of the same language. There is a
>> simple test to ensure that: take a document in one serialization,
>> parse it, generate the other serialization from it, then parse the
>> other serialization and require the parsed documents are identical.
>>
>
> ...with the exception of <tbody>'s in <table>'s (are there others?).
>
> Converting this XHTML fragment:
> <table><tr><td>Cell</td></tr></table>
> to HTML and then back to XHTML will produce:
> <table><tbody><tr><td>Cell</td><tr></tbody></table>
> except if your converter is able to omit the <tbody> in the XHTML
> re-serialization because it's the only child of the <table> (it means
> that you're not just parsing and serializing a DOM tree).
>

I think you're confusing the serialized bytestream with the HTML5
document. You must compare the output of your parser (which may be a
DOM tree or some intermediary form -- it's entirely an implementation
detail) not the serialized form. There are a number of variations in
the serialized form which are normalized by the parser.

scott.
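
The distinction between the bytestream and the parsed document can be illustrated with a rough sketch. It uses only Python's standard XML parser as a stand-in, so it covers the XML side alone and says nothing about how an HTML5 parser behaves. The helper names `tree_signature` and `same_document` are made up for this illustration. Two byte-different serializations are parsed and the resulting trees are compared, then one of them is re-serialized and parsed again.

```python
import xml.etree.ElementTree as ET


def tree_signature(element):
    """Describe a parsed element as nested tuples: tag, attributes, text, children."""
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        element.text or "",
        # Each child is paired with the text that follows it (its "tail").
        tuple((tree_signature(child), child.tail or "") for child in element),
    )


def same_document(serialization_a, serialization_b):
    """Compare parser output, not the serialized bytes (hypothetical helper)."""
    tree_a = ET.fromstring(serialization_a)
    tree_b = ET.fromstring(serialization_b)
    return tree_signature(tree_a) == tree_signature(tree_b)


# Different quoting, whitespace inside a tag, and empty-element syntax.
a = '<table><tr><td class="x">Cell</td><td/></tr></table>'
b = "<table ><tr><td class='x'>Cell</td><td></td></tr></table>"

# Round trip: parse, serialize again, and parse the result.
round_tripped = ET.tostring(ET.fromstring(a), encoding="unicode")

print(a == b)                           # the bytes differ
print(same_document(a, b))              # the parsed trees are compared instead
print(same_document(a, round_tripped))  # round-trip check on the parsed form
```

Comparing a structural signature instead of the strings is the point of the sketch: quoting style and empty-element syntax are absorbed by the parser before the comparison takes place.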
Received on Thursday, 5 July 2007 22:34:11 UTC