- From: Anne van Kesteren <annevk@opera.com>
- Date: Wed, 17 Jun 2009 13:51:45 +0200
- To: "Jonathan Rees" <jar@creativecommons.org>
- Cc: "Dan Connolly" <connolly@w3.org>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, www-archive@w3.org
On Wed, 17 Jun 2009 13:47:12 +0200, Jonathan Rees <jar@creativecommons.org> wrote:
> I don't see how your answer or the linked documents bear on my
> question, so let me amplify.
>
> The ideal situation: you can take any HTML5 document, convert it to
> some XML-based language designed for the purpose (not necessarily
> XHTML), convert it back, and get a semantically equivalent HTML5
> document.

The parser of the HTML syntax is Turing-complete so that will not work. (You can inject characters into the tokenizer.)

> The problem I'm worried about is the lack of interoperability between
> HTML5 and XML processors. (It has nothing to do with browsers.) Other
> specs such as OWL 2 and XQuery have addressed this problem by
> providing XML syntax as an alternative. But this only achieves the
> intended effect if semantics-preserving round trips work.
>
> For comparison, 'tidy' provides conversion from HTML4 to XHTML (I
> think), and the resulting XHTML is in a subset (I think) of HTML4, so
> the round trip property holds. I assume this approach doesn't work for
> HTML5, which is why I do not necessarily have XHTML in mind as the
> representation.

If 'tidy' is good enough and you consider it working I do not see why that would not work for HTML5.

--
Anne van Kesteren
http://annevankesteren.nl/
Received on Wednesday, 17 June 2009 11:52:27 UTC