
Re: question about XML and HTML5

From: Jonathan Rees <jar@creativecommons.org>
Date: Wed, 17 Jun 2009 07:47:12 -0400
Message-ID: <760bcb2a0906170447g415e54abk82439d8c6874d615@mail.gmail.com>
To: Anne van Kesteren <annevk@opera.com>
Cc: Dan Connolly <connolly@w3.org>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, www-archive@w3.org

I don't see how your answer or the linked documents bear on my
question, so let me amplify.

The ideal situation: you can take any HTML5 document, convert it to
some XML-based language designed for the purpose (not necessarily
XHTML), convert it back, and get a semantically equivalent HTML5
document.

The problem I'm worried about is the lack of interoperability between
HTML5 and XML processors. (It has nothing to do with browsers.) Other
specs such as OWL 2 and XQuery have addressed this problem by
providing an XML syntax as an alternative. But this only achieves the
intended effect if semantics-preserving round trips work.
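
To make the round-trip property concrete, here is a minimal sketch in
Python. It is an illustration only, under loud assumptions: it uses the
standard library's html.parser, which is NOT the HTML5 parsing
algorithm, and it handles only simple, well-nested markup. The names
TreeBuilder, html_to_xml, xml_to_html and the "root" wrapper element
are hypothetical, and "semantically equivalent" is approximated here
as "re-parsing yields the same tree".

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take an end tag in HTML syntax.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Hypothetical helper: builds a tree from simple, well-nested HTML."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")  # artificial wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "controls") become empty strings.
        el = ET.SubElement(self.stack[-1], tag,
                           {k: v if v is not None else "" for k, v in attrs})
        if tag not in VOID:
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Only pop when the end tag matches the innermost open element.
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text after a child goes into that child's tail
            parent[-1].tail = (parent[-1].tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html: str) -> str:
    """HTML text -> well-formed XML text (wrapped in <root>)."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


def xml_to_html(xml: str) -> str:
    """XML text produced above -> HTML text (wrapper removed)."""
    root = ET.fromstring(xml)
    wrapped = ET.tostring(root, encoding="unicode", method="html")
    return wrapped[len("<root>"):-len("</root>")]


sample = '<p class="a">Hello<br>world <video controls></video></p>'

# Round trip: HTML -> XML -> HTML -> XML should give the same XML.
assert html_to_xml(xml_to_html(html_to_xml(sample))) == html_to_xml(sample)
```

The check compares trees rather than bytes, since details such as
"controls" versus controls="" differ textually without changing the
document. Whether such a property can hold for arbitrary HTML5 input,
with a real HTML5 parser, is exactly the question.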

For comparison, 'tidy' converts HTML4 to XHTML (I think), and the
resulting XHTML is, as far as I know, within a subset of HTML4, so
the round-trip property holds. I assume this approach doesn't work for
HTML5, which is why I do not necessarily have XHTML in mind as the
representation.

Jonathan

On Wed, Jun 17, 2009 at 6:57 AM, Anne van Kesteren <annevk@opera.com> wrote:
> On Wed, 17 Jun 2009 12:51:05 +0200, Jonathan Rees <jar@creativecommons.org> wrote:
>> This question sounds so stupid that I didn't want to ask it in public.
>>
>> Many web-related languages that have idiosyncratic syntax also provide
>> an XML surface syntax. Examples are Turtle (RDF/XML), XQuery, OWL 2
>> (OWL/XML). To ensure that HTML5 can participate in XML pipelines in a
>> standard way, wouldn't it be a good idea to have a standard XML
>> surface syntax for HTML5, with semantics preserved over round trips?
>> Perhaps this even could be done using a set of extensions to XHTML.
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-xhtml-syntax.html
>
> Already works fine in modern browsers for new elements such as <canvas>, <video>, etc.
>
>
> There is also the following section for how an HTML byte stream maps to an infoset:
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#coercing-an-html-dom-into-an-infoset
>
> which I believe is implemented by the Validator.nu software.
>
>
> --
> Anne van Kesteren
> http://annevankesteren.nl/
>
Received on Wednesday, 17 June 2009 11:47:56 GMT
