Re: HTML in EPUB 3.4 (was: Publishing Maintenance WG Teleconference - April 24 2025) from Eric Hellman on 2025-04-24 (public-pm-wg@w3.org from April 2025)

From: Eric Hellman <eric@hellman.net>
Date: Thu, 24 Apr 2025 09:48:05 -0400
To: Laurent Le Meur <laurent.lemeur@edrlab.org>
Cc: "public-pm-wg@w3.org" <public-pm-wg@w3.org>
Message-Id: <7F0E5AF8-5954-4B14-AAA9-D23C6DB39561@hellman.net>

A few points, based on using HTML5 files as source format for 75,000 different ebooks.

1. My "tooling" would easily switch to HTML5-inside-EPUB.

2. Communicating the change to users would be impossible unless it was called "EPUB4". Also, forget EPUB4, it should be EPUB5.

3. Publishers wanting to switch will discover that their converted XHTML files don't validate as HTML5, because the HTML5 validator finds more errors than the DTD/schema-based XML validators. In particular, the requirement that tables have the right number of cells in every row, while present in the early standards, was never checked by those validators, but IS checked by HTML5 validators.
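
To make point 3 concrete, here is a minimal Python sketch of the kind of row-width check I mean. It is only an approximation: the names (RowWidthChecker, uneven_tables) are made up for illustration and come from no validator, it honors colspan but ignores rowspan and implied tags, and it is not how an HTML5 validator actually implements the table model.

```python
from html.parser import HTMLParser


def _colspan(attrs):
    """Return the cell's colspan as a positive int, defaulting to 1."""
    value = dict(attrs).get("colspan")
    return max(int(value), 1) if value and value.isdigit() else 1


class RowWidthChecker(HTMLParser):
    """Record the column width of every row of every table (rowspan ignored)."""

    def __init__(self):
        super().__init__()
        self.tables = []  # one list of row widths per closed table
        self._open = []   # stack of row-width lists, to cope with nested tables

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._open.append([])
        elif tag == "tr" and self._open:
            self._open[-1].append(0)
        elif tag in ("td", "th") and self._open and self._open[-1]:
            # Add this cell's span to the current row of the innermost table.
            self._open[-1][-1] += _colspan(attrs)

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self.tables.append(self._open.pop())


def uneven_tables(markup):
    """Return the row widths of each table whose rows differ in width."""
    checker = RowWidthChecker()
    checker.feed(markup)
    checker.close()
    return [widths for widths in checker.tables if len(set(widths)) > 1]


ragged = "<table><tr><td>a</td><td>b</td></tr><tr><td>c</td></tr></table>"
print(uneven_tables(ragged))
```

A table like the one in `ragged` passes a DTD or schema check, since each row is individually well-formed, but its second row is one cell short of the first.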

Eric

> On Apr 24, 2025, at 3:31 AM, Laurent Le Meur <laurent.lemeur@edrlab.org> wrote:
> 
> Just a warning: I asked the developer of FBReader - one of our members - for his opinion on this evolution. His answer is in brief: 
> 
> "That's a major change that will require significant additional effort on our end. So we will not be happy. However, I think it's not an absolute nightmare for us." 
> 
> Note: FBReader does not use a Web view for rendering EPUB. It uses an XML format internally, not HTML. It also provides basic support for plain HTML files through a separate mechanism. Therefore, it must port some modern features from the XML parser to the HTML parser.
> 
> Conclusion: It is a logical move, but we must communicate extensively and in advance (at least 1 year) so that reading system developers can prepare for that evolution. 
> 
> Best regards
> Laurent LE MEUR / EDRLab
> 
> << Attend the Digital Publishing Summit, 16-17 June 2025, Dublin - https://www.edrlab.org/events/digital-publishing-summit-2025/ >>
> 
> 
>> On 24 Apr 2025, at 09:15, Gregorio Pellegrino - Fondazione LIA <gregorio.pellegrino@fondazionelia.org> wrote:
>> 
>> Very interesting. I did some quick tests with the HTML-EPUB. It seems that the reading solutions read it without problems, while the supply chain tools (EPUBCheck, Ace, etc.) generate blocking errors.
>>  
>> This confirms what was discussed earlier: this change to the specification impacts the supply chain (where many tools based on XML technologies are present) more than the reading solutions, which in many cases are able to read HTML without problems.
>>  
>> In the meantime, I send my regrets for today: I am at the IAAP Europe conference in Brno and during the meeting time I have to moderate a panel discussion 😊
>>  
>> Gregorio
>>  
>> From: Toshiaki Koike <koike@voyager.co.jp>
>> Date: Thursday, 24 April 2025 at 05:34
>> To: public-pm-wg@w3.org <public-pm-wg@w3.org>
>> Subject: Re: [AGENDA] Publishing Maintenance WG Teleconference - April 24 2025
>> 
>> Hi all,
>>  
>> I have created a script to experimentally convert existing XHTML-based EPUB 3 files to HTML-based EPUB 3. Using this script, I converted a Japanese EPUB 3 sample into an HTML-based sample.
>> As expected, EPUBCheck v5.2.1 reports errors for this file.
>>  
>> https://github.com/toshiakikoike/html-based-epub-experimental
>>  
>>  
>

Received on Thursday, 24 April 2025 13:48:21 UTC