Re: Support for XHTML5 from Robin Berjon on 2015-12-03 (public-scholarlyhtml@w3.org from December 2015)

From: Robin Berjon <robin@berjon.com>
Date: Thu, 3 Dec 2015 15:20:51 -0500
To: Sebastian Heath <sebastian.heath@gmail.com>, W3C <public-scholarlyhtml@w3.org>
Message-ID: <5660A423.50302@berjon.com>

Hi Sebastian,

thanks for taking the time to follow up from our Twitter discussion here.

On 03/12/2015 13:00 , Sebastian Heath wrote:
>  I of course welcome the development of a W3C standard for Scholarly
> HTML. For existing publications such as ISAW Papers support for the
> XHTML concrete syntax[3] of the abstract HTML language is important.

I hate to be pedantic, but this is standards work, so we kind of have
to be. The word "support" is too vague, certainly without reference to
a conformance class. Here are several meanings it could have:

  A) Processors must accept XHTML documents.
  B) Documents must use the XHTML syntax.
  C) SH must be compatible with the ecosystem of tools that consume
XHTML today.

If the idea is (A), then that's by and large a given. Over HTTP, use the
right media type (application/xhtml+xml) and you'll be fine; in other
contexts, make sure your XHTML follows the Appendix C compatibility
guidelines of XHTML 1.0 (which is probably a good idea to start with)
and you'll be good too, even with processors that expect HTML.

If the idea is (B), then I would have to disagree. XHTML is a legacy
format; I can't think of an area still maintained that relies on it but
isn't actively moving away. People can certainly use it as a transition
technology, but locking a new format into it would be a strange move.

And if the idea is (C), then it probably depends on the ecosystem but in
general there is no reason why you can't just put an HTML parser in
front of an XML pipeline.

>  For me the need for XHTML is practical. I'm looking for a robust,
> widely recognized standard that can serve the end-to-end goals of
> scholarly publication that include straightforward creation, open source
> tools for our editorial work-flow, accessible publication in the
> immediate and medium term, and long-term archival storage and the
> expectation of readability far into the future.

Very much agreed; and those are all reasons why HTML is a practical
choice :)

>  As in, I use a lot of XSLT and I want rigorous validation;  I look to
> align with epub[4] and other efforts; etc.

The tendency in EPUB is more away from XHTML than towards it. Were we to
use XHTML for EPUB 3 alignment then we should also use epub:type instead
of role, and I think that would be problematic (and also against the
notion of long-term archival).

XSLT and validation are DOM-level operations (or Infoset, or XDM); they
don't apply to syntax. Are you doing anything specific with the syntax
that prevents you from just fronting your XSLT/validation pipeline with
an HTML parser?
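
To make "fronting the pipeline with an HTML parser" concrete, here is a
rough sketch in Python using only the standard library. It is an
illustration under assumptions, not a recommendation of specific tools:
TreeBuilder and html_to_tree are hypothetical names, html.parser is not
a conformant HTML5 parser (it does none of the spec's error recovery or
implied-tag handling), the XHTML namespace is left out, and a plain
ElementTree query stands in for the XSLT/validation step.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML void elements never take an end tag, so they must not stay open.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Hypothetical helper: build an ElementTree from HTML syntax."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = None
        self.stack = []

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "hidden") arrive as None; XML needs a string.
        attrib = {k: (v if v is not None else "") for k, v in attrs}
        if self.stack:
            el = ET.SubElement(self.stack[-1], tag, attrib)
        else:
            el = ET.Element(tag, attrib)
            if self.root is None:
                self.root = el
        if tag not in VOID:
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if not self.stack:
            return  # text outside the root element is dropped in this sketch
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_tree(source):
    """Parse HTML syntax and return the root Element (or None)."""
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


# HTML syntax that is not well-formed XML: unquoted attributes, void tags.
doc = ("<!DOCTYPE html><html><head><meta charset=utf-8><title>T</title>"
       "</head><body><p class=lead>Hello<br>world &amp; all</p></body></html>")

tree = html_to_tree(doc)
# Well-formed XML serialization, ready to hand to any XML tool.
xml = ET.tostring(tree, encoding="unicode")
# Stand-in for the XML pipeline: re-parse as XML and query the tree.
reparsed = ET.fromstring(xml)
```

The point is only that everything after html_to_tree sees a tree, not a
syntax; a real setup would swap in a spec-conformant HTML parser at that
one step and leave the rest of the pipeline untouched.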

-- 
• Robin Berjon - http://berjon.com/ - @robinberjon
• http://science.ai/ — intelligent science publishing

Received on Thursday, 3 December 2015 20:21:18 UTC