Re: The non-polyglot elephant in the room from Daniel Glazman on 2013-01-22 (public-html@w3.org from January 2013)

From: Daniel Glazman <daniel.glazman@disruptive-innovations.com>
Date: Tue, 22 Jan 2013 09:59:26 +0100
To: 'Henri Sivonen' <hsivonen@iki.fi>
CC: Michael Smith <mike@w3.org>, lrosenth@adobe.com, Anne van Kesteren <annevk@annevk.nl>, HTML WG <public-html@w3.org>, TAG List <www-tag@w3.org>
Message-ID: <50FE54EE.5080308@disruptive-innovations.com>

> I think requiring XHTML is the least of EPUB’s problems when it comes
> to author ergonomics.

Henri, with all due respect, you have probably read the EPUB specs
carefully, but I'm not sure you are well aware of what the software
production chain in the publishing industry looks like...
Using an XML model is far more important than you think because it gives
EPUB authors the ability to do a first trivial validation of the
elements contained in a document instance w/o having to call an
expensive (in terms of time) validator. All IDEs out there are able to
tell you if you are missing an end tag or if your attributes are
malformed, and that's already something, trivially implemented.
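
To make the "trivial validation" point concrete, here is a minimal
sketch in Python (my choice for illustration only). It uses the standard
library XML parser as a stand-in for what an IDE does; the function name
and the sample markup are hypothetical. It checks well-formedness only,
not validity against the EPUB schemas.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(markup):
    """Return None if markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None

# Hypothetical XHTML fragments: the second one is missing </p>.
GOOD = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>'
BAD = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</body></html>'

if __name__ == "__main__":
    for name, doc in (("good", GOOD), ("bad", BAD)):
        error = well_formedness_error(doc)
        print(name, "->", "well-formed" if error is None else error)
```

No schema and no network access are involved, which is why this kind of
check is cheap enough to run on every keystroke.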

> The main annoyances are needless indirection (Why do you need to be
> able to locate the OPF wherever you want and have a pointer to it in a
> well-known location? Why don't you just put the OPF in the well-known

Because EPUB is based on the assumption that a reading system may have
no knowledge and even no visibility of a filesystem, mimetypes, and
such. The OPF and its manifest then serve as a central declaration point
for all things found in the package. Yeah, that can be seen as painful;
it can also be seen as a savior by low-end devices.
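
As a rough sketch of that indirection (Python, standard library only):
the well-known location is META-INF/container.xml, whose rootfile
element points at the OPF, and the OPF manifest then declares every
resource with its media type. The sample documents, file names and
function names below are hypothetical; only the namespaces and attribute
names come from the EPUB specs.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical content of META-INF/container.xml
SAMPLE_CONTAINER = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/book.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

# Hypothetical content of the OPF the container points at
SAMPLE_PACKAGE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
</package>"""

def opf_path(container_xml):
    """Return the package document path declared in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")

def manifest(opf_xml):
    """Map each manifest item id to its (href, media-type) pair."""
    root = ET.fromstring(opf_xml)
    return {
        item.get("id"): (item.get("href"), item.get("media-type"))
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }

if __name__ == "__main__":
    print(opf_path(SAMPLE_CONTAINER))
    for item_id, (href, media_type) in manifest(SAMPLE_PACKAGE).items():
        print(item_id, href, media_type)
```

A device never has to list a directory or sniff a file type: two small
XML reads give it the full inventory of the package.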

> location?), the dependency on an XML vocabulary even worse than OPML
> (NCX), the requirement to declare various things that the reading

NCX is obsoleted in EPUB 3.

> system could easily inspect itself and cache for later use (e.g.
> whether a given file uses scripting or has MathML) and reinventing

Because parsing a potentially VERY long document just to know if it
contains script is too expensive an operation when the user is waiting
for the UI to present something. And if a given ebook reader does not
implement JS, you need to know it right away to let the user know
he/she may miss something in the document...
I agree this is not ideal. The world is not ideal. I agree the EPUB 3
solution for the scripted and mathml properties is a compromise,
probably needed by some ebook readers.
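
A small sketch of what those declarations buy a reading system (Python,
standard library only; the sample manifest and the function name are
hypothetical): it reads the space-separated properties attribute on
manifest items and never opens the content documents themselves.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical EPUB 3 manifest; chapter2 declares both properties.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="c2" href="chapter2.xhtml" media-type="application/xhtml+xml" properties="scripted mathml"/>
  </manifest>
</package>"""

def items_with_property(opf_xml, prop):
    """Return hrefs of manifest items whose properties list contains prop."""
    root = ET.fromstring(opf_xml)
    return [
        item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
        if prop in item.get("properties", "").split()
    ]

if __name__ == "__main__":
    # A reader without JS support can warn the user before showing anything.
    scripted = items_with_property(SAMPLE_OPF, "scripted")
    if scripted:
        print("Scripting not supported; affected files:", ", ".join(scripted))
```

The cost is proportional to the size of the manifest, not to the size of
the book.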

> ways to express many things that HTML can already express (stating
> book title and authorship without XHTML <title> and <meta
> name=author>, declaring the order of XHTML files using <spine> instead
> of <link rel=next> in the files themselves).

I am probably the only one in the world who implemented EPUB 3 metadata
authoring _fully_. EPUB 3 metadata are incredibly powerful, probably
too powerful, but what I do know is that their power is currently
impossible to express in HTML. Same thing for the relationships between
document instances in the package.
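
One example of that power, as a hedged sketch: EPUB 3 lets a meta
element refine another metadata element by id, for instance to attach a
role and a sort name to a given creator. The Python below (standard
library only) collects such refinements; the sample package, the ids and
the function name are hypothetical, and this covers only a small corner
of the metadata model.

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical EPUB 3 package metadata with two refinements of one creator.
SAMPLE_METADATA = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:creator id="creator01">Jane Doe</dc:creator>
    <meta refines="#creator01" property="role" scheme="marc:relators">aut</meta>
    <meta refines="#creator01" property="file-as">Doe, Jane</meta>
  </metadata>
</package>"""

def refinements(opf_xml):
    """Group refining meta elements by the id of the element they refine."""
    root = ET.fromstring(opf_xml)
    result = {}
    for meta in root.findall("opf:metadata/opf:meta[@refines]", NS):
        target = meta.get("refines").lstrip("#")
        value = (meta.text or "").strip()
        result.setdefault(target, {})[meta.get("property")] = value
    return result

if __name__ == "__main__":
    for target, props in refinements(SAMPLE_METADATA).items():
        print(target, props)
```

A plain <meta name=author> has no way to say that this particular
creator is the author rather than the illustrator, or how the name
should be sorted.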

> The annoyances mentioned in the previous paragraph make EPUB authoring
> by hand terrible enough that you need a tool, and once you have a
> tool you might as well throw HTML to XHTML conversion into the tool.

Please define "tool" here. Do you mean a specialized app with knowledge
of the XML dialects used by EPUB, or a generic editing tool with XML
knowledge, for instance emacs or eclipse?

</Daniel>

Received on Tuesday, 22 January 2013 08:59:57 UTC