Re: HTML in EPUB 3.4 (was: Publishing Maintenance WG Teleconference - April 24 2025)

Ivan,

Is it contemplated that the EPUB 3.4 validator will use a proper HTML5 validator for HTML content files, so that there are two different validators for two different flavors of EPUB 3.4 content files?

In other words, will EPUB 3.4 require HTML5 validation of HTML5 content files?
Will EPUB 3.4 allow only a subset of HTML5 features? If so, how will it validate compliance?

Or perhaps, will EPUB 3.4 unify a valid DOM across both XML and HTML content files?

Because "publishers may use HTML content documents in future EPUB publications" is not exactly a plan.

Eric

> On Apr 29, 2025, at 1:33 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> Hi Eric 
> 
> (this comment is with my staff contact hat put down…)
> 
> I have the impression, based on your comments below, that one factor is not clear (and we should be very careful about the messaging on that aspect). The plan is not to replace XHTML with HTML. We could not and should not do that; we have a strong constraint in our charter whereby we should keep backward compatibility. Any valid EPUB 3.3 document must remain a valid EPUB 3.4 document. What we propose is that publishers may use HTML content documents in future EPUB publications. In other words, for example, publishers are not expected to "convert" their XHTML content files to HTML (which would definitely not be obvious, just as you say).
> 
> That is also why the WG has decided, during the charter discussion, to keep to the EPUB 3.4 denomination, b.t.w.
> 
> Cheers
> 
> Ivan 
> 
>> On 24 Apr 2025, at 15:48, Eric Hellman <Eric@hellman.net> wrote:
>> 
>> A few points, based on using HTML5 files as source format for 75,000 different ebooks.
>> 
>> 1. My "tooling" would easily switch to html5-inside EPUB.
>> 
>> 2. Communicating the change to users would be impossible unless it was called "EPUB4". Also, forget EPUB4, it should be EPUB5.
>> 
>> 3. Publishers wanting to switch will discover that their converted XHTML files don't validate as HTML5, because the HTML5 validator is able to find more errors than the DTD/schema-based XML validators. In particular, the requirement that tables should have the right number of cells in every row, while present in the early standards, was never checked by validators, and IS checked by HTML5 validators.
>> 
>> Eric
>> 
>>> On Apr 24, 2025, at 3:31 AM, Laurent Le Meur <laurent.lemeur@edrlab.org> wrote:
>>> 
>>> Just a warning: I asked the developer of FBReader - one of our members - for his opinion on this evolution. His answer, in brief:
>>> 
>>> "That's a major change that will require significant additional effort on our end. So we will not be happy. However, I think it's not an absolute nightmare for us." 
>>> 
>>> Note: FBReader does not use a Web view for rendering EPUB. It uses an XML format internally, not HTML. It also provides basic support for plain HTML files through a separate mechanism. Therefore, it must port some modern features from the XML parser to the HTML parser.
>>> 
>>> Conclusion: It is a logical move, but we must communicate extensively and in advance (at least 1 year) so that reading system developers can prepare for that evolution. 
>>> 
>>> Best regards
>>> Laurent LE MEUR / EDRLab
>>> 
>>> << Attend the Digital Publishing Summit, 16-17 June 2025, Dublin - https://www.edrlab.org/events/digital-publishing-summit-2025/ >>
>>> 
>>> 
>>>> On 24 Apr 2025, at 09:15, Gregorio Pellegrino - Fondazione LIA <gregorio.pellegrino@fondazionelia.org> wrote:
>>>> 
>>>> Very interesting. I did some quick tests with the HTML-EPUB. It seems that the reading solutions read it without problems, while the supply chain tools (EPUBCheck, Ace, etc.) generate blocking errors.
>>>>  
>>>> This confirms what was discussed earlier: this change to the specification impacts the supply chain (where many tools based on XML technologies are present) more than the reading solutions, which in many cases are able to read HTML without problems.
>>>>  
>>>> In the meantime, I send my regrets for today: I am at the IAAP Europe conference in Brno, and during the meeting time I have to moderate a panel discussion 😊
>>>>  
>>>> Gregorio
>>>>  
>>>> From: Toshiaki Koike <koike@voyager.co.jp>
>>>> Date: Thursday, 24 April 2025 at 05:34
>>>> To: public-pm-wg@w3.org <public-pm-wg@w3.org>
>>>> Subject: Re: [AGENDA] Publishing Maintenance WG Teleconference - April 24 2025
>>>> 
>>>> Hi all,
>>>>  
>>>> I have created a script to experimentally convert existing XHTML-based EPUB 3 files to HTML-based EPUB 3. Using this script, I converted a Japanese EPUB 3 sample into an HTML-based sample.
>>>> As expected, EPUBCheck v5.2.1 reports errors for this file.
>>>>  
>>>> https://github.com/toshiakikoike/html-based-epub-experimental
>>>>  
>>>>  
>>> 
>> 
> 
> 
> ----
> Ivan Herman, W3C 
> Home: http://www.w3.org/People/Ivan/
> mobile: +33 6 52 46 00 43
> 
> 
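
A note on point 3 of the 24 April message quoted above (table rows with differing cell counts, which the DTD/schema-based XML validators did not check): the following is a minimal Python sketch of that kind of check. It is not the algorithm used by any actual HTML5 validator or by EPUBCheck. The names RowWidthChecker and ragged_tables are hypothetical, only the standard-library html.parser module is used, and rowspan is deliberately ignored, so a real checker would need more work.

```python
from html.parser import HTMLParser


def _span(attrs):
    """Return the colspan of a cell, falling back to 1 if missing or invalid."""
    value = dict(attrs).get("colspan") or "1"
    return int(value) if value.isdigit() and int(value) > 0 else 1


class RowWidthChecker(HTMLParser):
    """Hypothetical helper: records the width of every row, table by table."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of row widths per finished table
        self._stack = []   # tables still open (a stack, so nesting works)

    def _close_row(self):
        table = self._stack[-1]
        if table["current"] is not None:
            table["rows"].append(table["current"])
            table["current"] = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._stack.append({"rows": [], "current": None})
        elif not self._stack:
            return  # markup outside any table is ignored
        elif tag == "tr":
            self._close_row()  # HTML allows </tr> to be omitted
            self._stack[-1]["current"] = 0
        elif tag in ("td", "th") and self._stack[-1]["current"] is not None:
            self._stack[-1]["current"] += _span(attrs)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        if tag == "tr":
            self._close_row()
        elif tag == "table":
            self._close_row()
            self.tables.append(self._stack.pop()["rows"])


def ragged_tables(markup):
    """Return the row widths of each table whose rows differ in width."""
    checker = RowWidthChecker()
    checker.feed(markup)
    checker.close()
    return [rows for rows in checker.tables if len(set(rows)) > 1]


sample = "<table><tr><td>a</td><td>b</td></tr><tr><td>c</td></tr></table>"
print(ragged_tables(sample))  # [[2, 1]]
```

Running something like this over converted content files before submitting them to a validator would give a rough idea of how many tables are affected. It only flags differing row widths (with colspan counted) and says nothing about other HTML5 conformance requirements.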

Received on Tuesday, 29 April 2025 13:19:27 UTC