Re: another possible use case for invisible XML from Liam R. E. Quin on 2023-04-19 (public-ixml@w3.org from April 2023)

From: Liam R. E. Quin <liam@fromoldbooks.org>
Date: Wed, 19 Apr 2023 15:29:03 -0400
To: Steven Pemberton <steven.pemberton@cwi.nl>, LdBeth <andpuke@foxmail.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, ixml <public-ixml@w3.org>
Message-ID: <299039d5ec0f290b6615c1ca59cd32f6788cb176.camel@fromoldbooks.org>

On Wed, 2023-04-19 at 17:28 +0000, Steven Pemberton wrote:
> 
> I did get a reply from Lambert Meertens (one of the two editors still
> available for comment) and he said that the typesetting was done by
> Barry Mailloux, who probably used an existing system, although it
> wasn't beyond his capabilities to create a specific format.

I read that as Barry Manilow at first.

There were vast numbers of typesetting formats in the 1970s. This has
been described as one of the motivations for GML and later SGML - that
there were ten thousand different typesetting formats in use by US
government suppliers.

Some of those formats lived on for a long time.

It should be noted that, since the only validation was running the
typesetting program and looking at the output, there are likely errors
in the source files.

I tend to use a text processing language - Perl often, but outside the
expressions i try and keep it very simple and plain - and write a bunch
of scripts that each do one or two things... use intermediate files for
inspection... and a Makefile to represent the dependencies, so that
only necessary steps are re-run... not so much for saving time as for
avoiding surprises.

Each step should produce secondary output that's a report - e.g.
find-footnote-refs: found 37 footnote references
and then a later script in the pipeline might say
find-footnotes: found 38 footnotes
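
As a rough sketch of that kind of reporting step (in Python here rather
than Perl): the "[fn:N]" reference marker and the ".FN N" footnote line
are made-up syntax for illustration, not taken from any real typesetting
format, and the function names are hypothetical too.

```python
import re
import sys

# Hypothetical markers, for illustration only: "[fn:12]" is a reference
# in running text, and a line starting ".FN 12" begins footnote 12.
REF_RE = re.compile(r"\[fn:(\d+)\]")
DEF_RE = re.compile(r"^\.FN (\d+)\b", re.MULTILINE)

def footnote_report(text):
    """Return (refs, defs): footnote numbers referenced and defined."""
    refs = REF_RE.findall(text)
    defs = DEF_RE.findall(text)
    return refs, defs

def report(text, out=sys.stderr):
    """Print the two count lines and return (orphans, dangling)."""
    refs, defs = footnote_report(text)
    print(f"find-footnote-refs: found {len(refs)} footnote references", file=out)
    print(f"find-footnotes: found {len(defs)} footnotes", file=out)
    # Footnotes that are never referenced, and references with no footnote.
    orphans = sorted(set(defs) - set(refs), key=int)
    dangling = sorted(set(refs) - set(defs), key=int)
    return orphans, dangling
```

A mismatch such as 37 references against 38 footnotes then shows up in
the report, and the returned lists say which numbers to go and look at.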

Each script generates islands of XML; as soon as possible in the
process the intermediate files are well-formed XML even if they are
being processed with a Perl script... and eventually i get to XSLT or
XQuery.

The reason intermediate scripts might use Perl is if they are
considering tags rather than elements, or they are handling overlap, to
turn
[i]italic [b]bold[/i] text[/b]
into
<i>italic <b>bold</b></i><b> text</b>
inside an already-well-formed <p>...</p> element.
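
One way to do that overlap fix-up, sketched in Python rather than Perl:
keep a stack of open tags, and when a close tag arrives for something
that is not on top, close everything above it and re-open those tags
afterwards. The function name is hypothetical, and restricting tag names
to lowercase letters is an assumption for the example.

```python
import re
from xml.sax.saxutils import escape

# Bracket tags such as [i] and [/i]; lowercase names only (an assumption).
TAG_RE = re.compile(r"\[(/?)([a-z]+)\]")

def resolve_overlap(src):
    """Turn overlapping bracket tags into properly nested XML elements."""
    out = []
    stack = []          # names of currently open elements
    pos = 0
    for m in TAG_RE.finditer(src):
        out.append(escape(src[pos:m.start()]))   # text before this tag
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if not closing:
            stack.append(name)
            out.append(f"<{name}>")
        elif name in stack:
            reopen = []
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
                reopen.append(top)               # closed early, re-open below
            for n in reversed(reopen):
                stack.append(n)
                out.append(f"<{n}>")
        # A close tag with no matching open is dropped here; a real
        # script would want to count and report these.
    out.append(escape(src[pos:]))
    while stack:                                 # close anything left open
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(resolve_overlap("[i]italic [b]bold[/i] text[/b]"))
# prints: <i>italic <b>bold</b></i><b> text</b>
```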

File size and line count checks are also useful.
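
A small sketch of such a check, comparing the input and output of one
pipeline step. The function names and the 50% shrink threshold are
arbitrary choices for illustration, and the line-count comparison only
makes sense for steps that are supposed to keep one output line per
input line.

```python
import os

def file_stats(path):
    """Return (size_in_bytes, line_count) for a file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        lines = sum(1 for _ in f)
    return size, lines

def check_step(before, after, max_shrink=0.5):
    """Return a list of warnings comparing a step's input and output."""
    size_in, lines_in = file_stats(before)
    size_out, lines_out = file_stats(after)
    warnings = []
    if size_out == 0:
        warnings.append(f"{after}: output is empty")
    elif size_out < size_in * max_shrink:
        warnings.append(f"{after}: shrank from {size_in} to {size_out} bytes")
    if lines_out != lines_in:
        warnings.append(f"{after}: {lines_in} lines in, {lines_out} lines out")
    return warnings
```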

For the example that started this thread, i’d probably look at handing
each line to ixml, but it may be too hard to write a grammar that
doesn't fail somewhere. Then it's possible either to make a script that
changes the input (e.g. fixing a presumed error) or to work on a hand-
edited copy of the file (i don't recommend that, although i've done it...
and usually regretted it).
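
A sketch of the script-that-changes-the-input approach: a named list of
fixes applied to the text, each reporting how many changes it made, with
the original file left untouched. Both fixes, and the "[i]" and ".FN"
syntax they target, are invented for illustration.

```python
import re

# Hypothetical fixes for presumed errors; each is (name, pattern, replacement).
FIXES = [
    # An [i] that is never closed before the end of its line.
    ("unclosed-italic",
     re.compile(r"\[i\]([^\[\]\n]*)$", re.MULTILINE), r"[i]\1[/i]"),
    # A doubled leading period on a footnote command line.
    ("doubled-period", re.compile(r"^\.\.FN ", re.MULTILINE), ".FN "),
]

def apply_fixes(text, fixes=FIXES):
    """Apply each fix in order; return (fixed_text, report_lines)."""
    report = []
    for name, pattern, replacement in fixes:
        text, n = pattern.subn(replacement, text)
        report.append(f"fix-input: {name}: {n} change(s)")
    return text, report
```

Because the fixes live in a script, they are re-applied from scratch on
every run and the report shows when one of them stops matching.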

liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org

Received on Wednesday, 19 April 2023 19:29:34 UTC