an XHTML dialect for specs? from Dan Connolly on 1999-12-13 (spec-prod@w3.org from October to December 1999)

From: Dan Connolly <connolly@w3.org>
Date: Mon, 13 Dec 1999 14:38:23 -0600
To: spec-prod@w3.org
Message-ID: <3855593F.987B83E9@w3.org>

The XML spec DTD
	http://www.w3.org/XML/#xmlspec
is pretty cool in a lot of ways, but it seems
to me that it's arbitrarily different from HTML in expensive ways:
you *have* to use a batch process to convert from source form
to preview/display/delivery form.

Back in 1995, I wrote:

	Ideal Solution 

	Source format: HTML dialect 
	     use a strict HTML dialect with: tables, class=abstract, possibly math. 
	Document Manipulation API: java interface 

	-- http://www.w3.org/MarkUp/SGML/spec-mgmt

and in May 1997, I did a little hacking
	http://www.w3.org/XML/9705/hacking
on a (pseudo-)XML parser and some tools to convert an HTML dialect to
lout for typesetting.

The example I used was:
	http://www.w3.org/Architecture/NOTE-ioh-arch

i.e. source:
	http://www.w3.org/Architecture/NOTE-ioh-arch.html
output of lout conversion:
	http://www.w3.org/Architecture/NOTE-ioh-arch.lt
PostScript output from lout:
	http://www.w3.org/Architecture/NOTE-ioh-arch.ps

along with
	http://www.w3.org/MarkUp/9705/report.dtd
which "refines" HTML:
	http://www.w3.org/MarkUp/9705/html.dtd
to add report titlepage stuff (abstract, ...), a section/subsection
structure, bibliography stuff, etc.
(note that I used an extension to SGML marked section syntax for modules)

Now that we have XHTML (nearly) and DOM and XSLT, I hope to revisit
this idea and finish the code, but I haven't managed to find time,
so I'm sending this message to see if anybody else is interested/motivated.
For example, my conversion program was rules-based, and I hope
that it converts to XSLT straightforwardly:
	http://www.w3.org/XML/9705/report2lout.py

The idea is to use one XHTML dialect for editing *and* delivery.
It has some redundancy that is (or at least: could be) managed by
machine; for example, for quotations:

	<q><c>"</c>Hyperdocument<c>"</c></q>

If you're using lynx or Mosaic 2.0 or something, you just get
	"Hyperdocument"
but you can also do a stylesheet that adds real printer's quotes
and suppresses c as a child of q. If you don't want to add
these <c>"</c> things by hand, you can do it automatically
using DOM scripts or XSLT.
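
As a rough illustration of the DOM-script route, here is a sketch using
Python's xml.dom.minidom. The function names (add_quote_marks, _is_c,
_make_c) are made up for this example, and it assumes the document
parses as well-formed XML. It skips any q that already starts and ends
with a c child, so it can be rerun on its own output.

```python
from xml.dom import minidom

def _is_c(node):
    """True if node is a <c> element."""
    return (node is not None
            and node.nodeType == node.ELEMENT_NODE
            and node.tagName == "c")

def _make_c(doc, mark):
    """Build a <c> element holding the fallback quote character."""
    c = doc.createElement("c")
    c.appendChild(doc.createTextNode(mark))
    return c

def add_quote_marks(doc, mark='"'):
    """Give every <q> a leading and trailing <c>mark</c> child."""
    for q in doc.getElementsByTagName("q"):
        first, last = q.firstChild, q.lastChild
        # Already marked up (by hand or by an earlier run): leave it alone.
        if _is_c(first) and _is_c(last) and first is not last:
            continue
        q.insertBefore(_make_c(doc, mark), q.firstChild)
        q.appendChild(_make_c(doc, mark))
    return doc

if __name__ == "__main__":
    doc = minidom.parseString("<p>A <q>Hyperdocument</q> system</p>")
    add_quote_marks(doc)
    print(doc.documentElement.toxml())
```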

Same goes for all sorts of generated text: tables of contents,
cross references, indexes, ... . Not to mention specialized
notations like grammars and such. The idea is: you generate
those by machine, but you don't treat the results as junk
to be thrown away; you fold it back into the source.
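
One way to picture "fold it back into the source" for a table of
contents, again as a sketch with Python's xml.dom.minidom. The
conventions here are assumptions made for the example, not part of any
existing dialect: the generated list lives in a div with class="toc",
sections are headed by h2, and missing ids are filled in as sec-N.
Running it again replaces the old list rather than adding a second one.

```python
from xml.dom import minidom

def text_of(node):
    """Concatenate all text found beneath node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)

def refresh_toc(doc):
    """Regenerate the content of <div class="toc"> from the <h2> elements."""
    toc = next((d for d in doc.getElementsByTagName("div")
                if d.getAttribute("class") == "toc"), None)
    if toc is None:
        return doc  # no placeholder in the source, nothing to do
    # Drop whatever an earlier run (or an author) left in the placeholder.
    while toc.firstChild:
        toc.removeChild(toc.firstChild)
    ul = doc.createElement("ul")
    for n, h in enumerate(doc.getElementsByTagName("h2"), start=1):
        # Assumed id scheme; a real tool would also check for collisions.
        if not h.hasAttribute("id"):
            h.setAttribute("id", "sec-%d" % n)
        a = doc.createElement("a")
        a.setAttribute("href", "#" + h.getAttribute("id"))
        a.appendChild(doc.createTextNode(text_of(h)))
        li = doc.createElement("li")
        li.appendChild(a)
        ul.appendChild(li)
    toc.appendChild(ul)
    return doc

if __name__ == "__main__":
    src = ('<body><div class="toc"/>'
           '<h2>Introduction</h2><h2 id="refs">References</h2></body>')
    print(refresh_toc(minidom.parseString(src)).documentElement.toxml())
```

The result is still an ordinary document in the same dialect, so it can
be edited, previewed, and delivered as-is.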

-- 
Dan Connolly
http://www.w3.org/People/Connolly/

Received on Monday, 13 December 1999 15:38:31 UTC