XHTML 2.0: Where Is It Going?

I have great concerns about the future of XHTML, as I have had for a
long time. The current public status of XHTML 2.0 is given in the HTML
WG roadmap:-

[[[
XHTML 2.0 is a next generation markup language. In this version, the
functionality is expected to remain similar to (or a superset of) that
of XHTML 1.1. However, the markup language may be altered semantically
and syntactically to conform to the requirements of related XML
standards such as XML Linking and XML Schema. The objective of these
changes is to ensure that XHTML 2.0 can be readily supported by XML
browsers that have no arcane knowledge of XHTML semantics such as
linking, image maps, forms, etc. The development of XHTML 2.0 will
likely require the development of new XHTML modules or revisions to
existing XHTML modules.
]]] - http://www.w3.org/MarkUp/xhtml-roadmap/

For a start, this has been the "current" status of XHTML 2.0 for over a
year, as far as I can remember. Are we ever going to see materials
pertaining to XHTML 2.0, or are the Working Group still actively
engaged in debating the relative merits of XHTML 1.0/Basic/1.1/m12n? I
thought those were recommendations now?

The main problem, as the subject of this email hints, is that the
target market for XHTML 2.0 is rarely discussed by the W3C, and I
think it would be a good idea to do so before it's "too late".

If (as the roadmap implies) XHTML 2.0 is just a redrafting of XHTML to
make it "pure XML", I think that would be a mistake. People are *not*
going to author XHTML 2.0 if they want it to display on all browsers,
because only a couple of the latest ones support XML technologies to
any usable degree. Backwards compatibility is everything, and Web
users are generally very conservative (with a small "c"). We shall be
seeing V4 browsers for a while longer, along with browsers such as
Lynx and IE3, which have no capacity for the "semantics" of the XML
range of technologies. If the HTML WG were to pursue this line for
XHTML 2.0, I would be shocked.

There is an alternative route that XHTML 2.0 could take, one that has
been backed up with conversations that I've had with various people.
It is clear from the development community that people are resorting
to server side XML databases, and then transforming them using XSLT on
the fly for output. Hopefully, this method of delivery can be
intertwingled with the CC/PP and other profiling technologies that the
W3C are developing. What this points to is that XHTML 2.0 should be a
semantically rich language that can be transformed into other formats,
including XHTML 1.0 etc.!
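
As a rough illustration of this route, here is a minimal sketch of
turning a semantically rich source document into XHTML 1.0 on the
server. It makes several assumptions: the source vocabulary (<note>,
<heading>, <para>, <emphasis>) is hypothetical and is not XNote or any
real language, and because Python's standard library has no XSLT
processor, the mapping is done with xml.etree.ElementTree instead of
an actual XSLT stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical "semantically rich" source document (invented vocabulary).
SOURCE = """<note>
  <heading>Where is XHTML going?</heading>
  <para>Backwards compatibility is <emphasis>everything</emphasis>.</para>
</note>"""

# Assumed mapping from the hypothetical elements to XHTML 1.0 elements.
TO_XHTML = {"heading": "h1", "para": "p", "emphasis": "em"}


def convert(elem):
    """Recursively copy an element, renaming it via TO_XHTML."""
    # Unknown elements fall back to the generic <span>.
    out = ET.Element(TO_XHTML.get(elem.tag, "span"))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(convert(child))
    return out


def to_xhtml(source_xml):
    """Build an XHTML 1.0 document string from the source document."""
    src = ET.fromstring(source_xml)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    title = ET.SubElement(head, "title")
    title.text = src.findtext("heading", default="Untitled")
    body = ET.SubElement(html, "body")
    for child in src:
        body.append(convert(child))
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(to_xhtml(SOURCE))
```

A second mapping table for another target (WML, say) could sit beside
TO_XHTML, with a device profile deciding which one is applied. In
practice an XSLT stylesheet per target would play the role of these
tables.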

XHTML 2.0 should be something that will introduce paradigm shifts in
the way that people think about User Interface technologies. Jason
White put it excellently on the (public) wai-tech-comments list:-

[[[
I recognize, of course, that HTML can hardly capture the richness and
variability, both structurally and semantically, of the myriad
document types which are made available via the web. All that can be
achieved in the core of XHTML, is to provide markup conventions that
are likely to be used frequently across a variety of document types,
leaving it to the extension mechanism to permit the definition of more
specific and semantically precise constructs as the occasion demands.
Thus, one needs to be very careful in deciding which structures merit
inclusion in the predefined XHTML 2.0 modules, and in determining
which semantic distinctions genuinely need to be preserved, such that
the structures under consideration can not be adequately represented
by more generic elements.
]]] -
http://lists.w3.org/Archives/Public/wai-tech-comments/2000Oct/0005

It is very difficult to develop an XML language that is accessible and
interoperable, and doubly so if it is based upon XHTML 1.0/1.1/m12n,
because they are so inherently awful that it's difficult to break out
of the pattern.

The XML Accessibility Guidelines [1] under development by the WAI
Protocols and Formats Working Group contain an excellent set of
axioms, and comments that can guide people when defining new XML
languages such as XHTML 2.0. AFAICT, XHTML 1.0 is in violation of 11
guidelines:-

1.2 Define flexible associations, where a given kind of relationship
can link to or from objects of varying types. (<img/> violates this)
2.1 Ensure all semantics are captured in markup in a repurposeable
form. (<hr/> violates this, amongst others)
2.2 Separate presentation properties using stylesheet
technology/styling mechanisms. (many elements violate this, including
<b>, <font>, <hr/>, <big>, <small>, etc.)
2.3 Use the standard XML linking and pointing mechanisms (XLink and
XPointer). (<a> and <link> etc. violate this)
2.7 Provide a mechanism for identifying summary / abstract / title.
(there is a <title> element and attribute, but nothing for summaries,
or abstracts etc.)
2.10 Allow association of metadata with distinct elements and groups
of elements. (XHTML 1.0 is very poor in this respect)
3.2 Define navigable structures that allow discrete, sequential,
structured, and search navigation functionalities. (XHTML doesn't
define navigational elements)
3.4 Use a device-independent interaction and events model / module.
(XHTML 1.0 still allows "onclick" etc.)
4.3 Provide explicit human readable definitions for markup semantics.
(see the definition of <address> in the HTML 4.01 specification...
terrible!)
4.4 Use schema (in preference to DTD) to provide explicit
documentation/annotation of element/attribute/etc semantics.
4.5 Provide semantic relationships to other schemata where appropriate
and possible. (this one is, admittedly, difficult)

That's a lot of accessibility problems that need to be addressed, and
I can't help thinking that if (as the XHTML roadmap implies) XHTML 2.0
is built on top of XHTML 1.0 et al., these accessibility problems will
be carried over.
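
Some of the violations listed above can be detected mechanically. The
following is a minimal sketch, not a conformance checker: it only
looks for the presentational elements named under 2.2 and the
"onclick" attribute named under 3.4. The function names and the sample
document are invented for illustration, and the input is assumed to be
well-formed XML.

```python
import xml.etree.ElementTree as ET

# Elements cited in the list above as violating guideline 2.2.
PRESENTATIONAL = {"b", "font", "hr", "big", "small"}
# Device-dependent event attribute cited under guideline 3.4.
EVENT_ATTRS = {"onclick"}


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix, if present."""
    return tag.rsplit("}", 1)[-1]


def find_violations(xhtml):
    """Return (guideline, location) pairs in document order."""
    root = ET.fromstring(xhtml)
    problems = []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name in PRESENTATIONAL:
            problems.append(("2.2", name))
        for attr in elem.attrib:
            if attr in EVENT_ATTRS:
                problems.append(("3.4", f"{name}/@{attr}"))
    return problems


# Invented sample document for demonstration only.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p onclick="go()">Some <b>bold</b> text</p>
    <hr/>
  </body>
</html>"""

if __name__ == "__main__":
    for guideline, where in find_violations(SAMPLE):
        print(guideline, where)
```

Guidelines such as 2.1 or 4.3 concern the design of the language
itself, so a check of this kind cannot cover them.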

People want benefits, and new technologies have to have lots of
benefits before people will give up their old ways. No matter what
happens, XHTML 2.0 is going to have a slim target market. The aim is
to make it as usable as possible. The only benefit that generic XML
content languages offer over generic SGML content languages is their
transformability using XSLT, and to a limited extent, styling using
CSS (implementations of this are buggy). N.B. I have been personally
experimenting with generic server side XML languages for a while now,
and am currently developing a language called "XNote", a reference for
which is available upon request.

In summary, if XHTML 2.0 is a dismal failure (no matter what route is
taken), then we should not be surprised. HTML has been a runaway
success, but people are only interested in what they see as immediate
benefits, and "accessibility and interoperability" aren't usually seen
as being something that people can benefit from. Of course, we know
better. If XHTML 2.0 were to be a very carefully built language, it
should be possible to transform it into WML, XHTML 1.1, XHTML Basic,
DocBook and so on; this is the aim I have with XNote.

But if XHTML 2.0 is at least well built, it will serve as a good
beacon for others designing generic XML content languages, and who
knows, it might even be used a couple of times :-)

[1] http://www.w3.org/WAI/PF/XML/gl-20010807

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .

Received on Sunday, 12 August 2001 10:27:07 UTC