Re: The non-polyglot elephant in the room

Leonard Rosenthol <lrosenth@adobe.com>, 2013-01-21 05:31 -0800:

> On 1/21/13 8:13 AM, "Anne van Kesteren" <annevk@annevk.nl> wrote:
> >The web developer community went through this exercise once before
> >(around 2002-2006) to see what it took to use XHTML. That was so hard
> >that almost all those participating are now using HTML (again).
> 
> Yes, but (X)HTML is used by many more environments than just the web.
> 
> The best example of this, of course, would be EPUB, which is based
> entirely on XHTML.  If you remove XHTML handling from the HTML
> specification, then you would make it unusable by EPUB.

I don't think anybody's arguing that the definition of what an XHTML
document is, and how to parse one, should be removed from the HTML spec.

As far as I can see Anne at least was referring to the separate Polyglot
specification, which doesn't define XHTML handling at all; it instead just
describes some rules for documents that conform at the same time to both
the HTML (text/html) rules and the XHTML rules that are already in the
HTML spec.

Anyway, about EPUB in particular, it's worth noting that there's nothing
inherent in the technology of EPUB that necessitates the use of well-formed
XML/XHTML in it rather than not-necessarily-well-formed text/html.

The reason EPUB requires XHTML is that the EPUB working group made an
explicit choice to require it. They could have chosen to allow text/html
EPUB books but they chose not to. And I think some of the people who
advocated for requiring XHTML didn't understand that existing XML-based
toolchains could be made to handle text/html content just by putting an
HTML parser in front of them.
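
To make the "HTML parser in front of them" idea concrete, here is a rough
sketch in Python using only the standard library. It is an illustration
under stated assumptions, not how any actual EPUB toolchain works: the
names TreeBuilder and html_to_xml and the <root> wrapper element are made
up for this example, and html.parser is a lenient tokenizer rather than an
implementation of the HTML spec's parsing algorithm, so the error recovery
here is deliberately simplistic. A real front end would use a parser that
follows the spec.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take an end tag in text/html.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from lenient HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # made-up wrapper element
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. <img alt>) get their name as value.
        attrib = {k: (v if v is not None else k) for k, v in attrs}
        el = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, implicitly closing
        # anything left open inside it; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html_text):
    """Return a well-formed XML serialization of the given HTML text."""
    builder = TreeBuilder()
    builder.feed(html_text)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


# Not well-formed as XML: unclosed <br> and <b>, unquoted and valueless
# attributes. After conversion, an ordinary XML parser accepts it.
xml_text = html_to_xml("<p>One<br>two <b>bold</p><img src=a.png alt>")
doc = ET.fromstring(xml_text)
```

Everything downstream of html_to_xml sees only well-formed XML, so the
existing XML-based tools would not need to change.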

So HTML could be made usable in EPUB books simply by having the EPUB spec
state that HTML is usable in EPUB books -- instead of having it impose a
technically unnecessary requirement that they must be XHTML.

  --Mike

-- 
Michael[tm] Smith http://people.w3.org/mike

Received on Monday, 21 January 2013 14:25:19 UTC