- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Tue, 22 Jan 2013 09:52:47 +0200
- To: "Michael[tm] Smith" <mike@w3.org>
- Cc: www-archive <www-archive@w3.org>
(www-archive so that the HTML WG Chairs don’t need to remind us to stay on topic.)

On Tue, Jan 22, 2013 at 7:18 AM, Michael[tm] Smith <mike@w3.org> wrote:
> Henri Sivonen <hsivonen@iki.fi>, 2013-01-21 16:48 +0200:
>
>> On Mon, Jan 21, 2013 at 4:24 PM, Michael[tm] Smith <mike@w3.org> wrote:
>> > Anyway, about EPUB in particular, it's worth noting that there's nothing
>> > inherent in the technology of EPUB that necessitates the use of well-formed
>> > XML/XHTML in it rather than not-necessarily-well-formed text/html.
>>
>> Nothing inherent, sure, but in practice, backwards-compatibility with
>> existing Reading Systems is a pretty big deal,
>
> Really?

Yes. Adobe’s Reader Mobile SDK enforces well-formedness and is embedded in pretty much every non-Amazon E Ink-based e-reading device. Also, those devices are within their useful life, but it’s virtually certain that many of them will never see another software update.

> To be clear, I was referring to the current EPUB spec, EPUB3. And as far as
> I understand it at least, EPUB3 documents are not backward-compatible with
> prior EPUB2 Reading Systems.

EPUB3 books that use fixed layout or scripting get their layout messed up and their scripting rendered non-operational in EPUB2-only Reading Systems. However, if you have an actual book-like thing (no video, audio or scripting) that is a pure-EPUB3 file, you can read the book on an EPUB2 Reading System with the following caveats:
1) fixed layout gets messed up (but you can still read the text)
2) MathML gets messed up
3) the table of contents appears empty.

And in practice, today as well as in the future, there will be plenty of publications that don't use math or fixed layout. Also, I expect that EPUB3 publications will include an NCX table of contents for compatibility with legacy Reading Systems for a long time, since generating the NCX can be automated and publishing workflows written for EPUB2 already have that automation.
So even if one of the selling points of EPUB3 is that you no longer need to deal with NCX, in practice you will deal with it anyway for compatibility.

> And I think EPUB2 documents are not
> necessarily forward-compatible with EPUB3 systems, because EPUB2 documents
> may have some markup features that are not supported in existing browsers.

In practice, EPUB2 books out there use XHTML, not DTB, so if you meant DTB, the issue is only theoretical.

> I mean, EPUB3 systems are mostly just UI wrappers around an existing
> browser engine, right?

If you read the spec carefully, it's clear that it's been designed in such a way that EPUB3 Reading Systems can decide on a per-XHTML-file basis whether to render the file in a pagination-capable, non-browser-engine renderer originally developed for EPUB2 or in a hastily-ported bolt-on copy of WebKit that doesn't even support pagination. (Of course, serious WebKit-based Reading Systems already support pagination in WebKit.)

> It's not clear to me that you'd actually lose anything at all, since as I
> said I don't think there's actually an expectation for backwards-
> compatibility of EPUB3 documents with existing EPUB2 systems. They're
> already known to be incompatible in other ways.

There's a huge difference between losing the table of contents and being unable to read the book content at all.

>> > The reason EPUB requires XHTML is that the EPUB working group made an
>> > explicit choice to require it. They could have chosen to allow text/html
>> > EPUB books but they chose not to. And I think some of the people who
>> > advocated for requiring XHTML didn't understand that existing XML-based
>> > toolchains could be made to handle text/html content just by putting an
>> > HTML parser in front of them.
>>
>> In fairness, when the decision was made for EPUB, text/html parsing
>> had not been defined.
>
> What version of EPUB do you mean?
I mean EPUB in general, because you don’t get to throw away compatibility when you increment the spec version number.

> When the decision was made for EPUB3, the
> HTML (HTML5) spec already included a definition for text/html parsing. I
> think in fact at that time you had even already implemented it for the
> validator and maybe even landed it in Gecko too.

Sure, but they have legacy, too. Reader Mobile SDK is to EPUB what IE on XP is to the Web.

> Among other
> problems that gives vendors very little incentive to compete on the quality
> of their Reading Systems.

Yet Reader Mobile SDK, which is sold on DRM entanglements rather than on implementation quality, is better at intra-paragraph typography than e.g. Kobo’s own WebKit-based EPUB engine. And Kobo ships both. So you get better typography for books bought from the Kobo store by exporting them as Adobe DRM and loading them onto your Kobo device via ADE, so that Adobe DRM forces the device to use the Adobe engine instead of the Kobo engine!

>> The main annoyances are needless indirection (Why do you need to be
>> able to locate the OPF
>
> OK so yeah I see you are talking about EPUB2. I think things are much
> better for EPUB3 authors.

No, they aren’t really. The only improvement is the lack of NCX. Except you still need NCX for compat with old Reading Systems. So now you need NCX and then something else, too!

>> wherever you want and have a pointer to it in a
>> well-known location? Why don't you just put the OPF in the well-known
>> location?),
>
>> reinventing ways to express many things that HTML can already express
>> (stating book title and authorship without XHTML <title> and <meta
>> name=author>, declaring the order of XHTML files using <spine> instead of
>> <link rel=next> in the files themselves).
>
> I expect those particular discrepancies no longer exist in EPUB3.

They do.
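To make the indirection concrete: the well-known location is META-INF/container.xml, and its only job is to point at the OPF, which can live anywhere in the archive. Below is a minimal Python sketch of that lookup, using only the standard library. The function names find_opf_path and build_demo_epub and the path OEBPS/content.opf are made up for illustration; a real Reading System would do considerably more validation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def find_opf_path(epub_file):
    """Return the OPF path that META-INF/container.xml points to."""
    with zipfile.ZipFile(epub_file) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find(
        f"{{{CONTAINER_NS}}}rootfiles/{{{CONTAINER_NS}}}rootfile"
    )
    if rootfile is None:
        raise ValueError("container.xml declares no rootfile")
    return rootfile.get("full-path")


def build_demo_epub():
    """Build a tiny in-memory archive (hypothetical content, demo only)."""
    container_xml = (
        '<?xml version="1.0"?>\n'
        f'<container version="1.0" xmlns="{CONTAINER_NS}">\n'
        "  <rootfiles>\n"
        '    <rootfile full-path="OEBPS/content.opf"\n'
        '              media-type="application/oebps-package+xml"/>\n'
        "  </rootfiles>\n"
        "</container>\n"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/container.xml", container_xml)
        zf.writestr("OEBPS/content.opf", "<package/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(find_opf_path(build_demo_epub()))
```

Every consumer has to take this extra hop before it can read the title, the author, or the spine, which is the cost being complained about.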
>> The annoyances mentioned in the previous paragraph make EPUB authoring
>> by hand terrible enough that you need a tool, and once you have a
>> tool you might as well throw HTML to XHTML conversion into the tool.
>
> All of which is true for EPUB2 I guess. But my point was that there is no
> strong technical reason to perpetuate the XHTML-only requirement further
> into EPUB3 -- and (to repeat something for emphasis) that text/html HTML
> could be made usable in EPUB3 books simply by having the EPUB3 spec state
> that HTML is usable in EPUB3 books.

Considering what I said in this email and in the one you replied to, I think it made sense to limit EPUB3 to XHTML-only, too.

--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 22 January 2013 07:53:15 UTC