Re: EPUB and XML [was: The non-polyglot elephant in the room]

From: Bill McCoy <whmccoy@gmail.com>
Date: Sun, 27 Jan 2013 07:25:51 -0800
Message-ID: <CAJ0DDbAdMo9AtCUH=cKzfaqw1ftPXKqP8cj1qsjkHUEvq-FDjg@mail.gmail.com>
To: Daniel Glazman <daniel.glazman@disruptive-innovations.com>
Cc: public-html@w3.org

Hi Daniel,

I agree with you that for those using XML of whatever flavor for their
core content, generating XHTML when creating EPUB is a good fit, since
by definition they will have XML-oriented toolchains in place.

I also agree that the Press (esp. with a capital "P") has been widely
using domain-specific XML formats - e.g. NewsML - and I didn't mean to
imply that these were in any imminent danger of going away.

But as Web/EPUB has become a more central output for many content
publishers, and with HTML5 having more semantic elements and means for
microdata / semantic inflection, there's been something of a trend
towards certain book publishers (at least) looking at (X)HTML as an
option for the core content structure, not just as a generated output
format. This has been helped along by popular blogging and CMS
platforms like WordPress and Drupal having (X)HTML as their internal
article storage format, with online WYSIWYG editors like CKEditor
integrated into these systems, and no doubt by CSS-based printing
systems like Prince and standalone native HTML/EPUB editing apps like
your own BlueGriffon. I'm not arguing that this trend will make XML
formats obsolete. I was only suggesting that whether the trend, however
widespread it becomes, ends up centered on XHTML or "tag soup" HTML
might help inform future decisions.

--Bill

On Sun, Jan 27, 2013 at 1:44 AM, Daniel Glazman
<daniel.glazman@disruptive-innovations.com> wrote:
> On 26/01/13 18:44, Bill McCoy wrote:
>
>> situation will evolve over time. There's an increasing number of CMS
>> systems that are based on HTML as content rather than custom XML
>> formats like DITA or DocBook. If 2 years from now these systems
>> prevalently support "tag soup" for articles and other content
>> fragments then I think the answer will be clear. If 2 years from now
>> these systems prevalently store XHTML because it has led to other
>> benefits, that might be another story.
>
>
> Let me mitigate a bit the above: the Press is not using html internally;
> they use XML formats for very good technical reasons. Having discussed
> recently with a major european actor of that domain, they're absolutely
> not ready to switch from XML to a flavor of html for their sources or
> repositories. If they're deeply interested in EPUB to distribute content
> and daily or weekly press reviews, a xml serialization of html is for
> them a better fit at this time.
>
> </Daniel>
>
Received on Sunday, 27 January 2013 15:26:18 GMT