- From: Liam R. E. Quin <liam@fromoldbooks.org>
- Date: Wed, 19 Apr 2023 15:29:03 -0400
- To: Steven Pemberton <steven.pemberton@cwi.nl>, LdBeth <andpuke@foxmail.com>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, ixml <public-ixml@w3.org>
On Wed, 2023-04-19 at 17:28 +0000, Steven Pemberton wrote:
>
> I did get a reply from Lambert Meertens (one of the two editors still
> available for comment) and he said that the typesetting was done by
> Barry Mailloux, who probably used an existing system, although it
> wasn't beyond his capabilities to create a specific format.

I read that as Barry Manilow at first.

There were vast numbers of typesetting formats in the 1970s. This has been described as one of the motivations for GML and later SGML - that there were ten thousand different typesetting formats in use by US government suppliers. Some of those formats lived on for a long time.

It should be noted that, since the only validation was running the typesetting program and looking at the output, there are likely errors in the source files.

I tend to use a text processing language - Perl often, but outside the expressions i try to keep it very simple and plain - and write a bunch of scripts that each do one or two things... use intermediate files for inspection... and a Makefile to represent the dependencies, so that only necessary steps are re-run... not so much for saving time as for avoiding surprises.

Each step should produce secondary output that's a report - e.g.

    find-footnote-refs: found 37 footnote references

and then a later script in the pipeline might say

    find-footnotes: found 38 footnotes

Each script generates islands of XML; as soon as possible in the process the intermediate files are well-formed XML even if they are being processed with a Perl script... and eventually i get to XSLT or XQuery.

The reason intermediate scripts might use Perl is if they are considering tags rather than elements, or they are handling overlap, to turn

    [i]italic [b]bold[/i] text[/b]

into

    <i>italic <b>bold</b></i><b> text</b>

inside an already-well-formed <p>...</p> element.

File size and line count checks are also useful.
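As an illustration of the overlap case above, here is a minimal sketch of one way such a step could work. It is in Python rather than Perl purely for the example, and it is not taken from any actual script: the names fix_overlap and INLINE_TAGS are hypothetical, and it assumes that the only markers are [i] and [b], that a stray closing marker can simply be dropped, and that anything left open is closed at the end of the string. The second return value is there so the calling script can print a report line.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical set of inline markers; a real source would have its own list.
INLINE_TAGS = ("i", "b")
TAG = re.compile(r"\[(/?)(%s)\]" % "|".join(INLINE_TAGS))


def fix_overlap(text):
    """Turn overlapping [x]...[/x] markers into properly nested XML.

    Returns (xml_fragment, repairs), where repairs counts the closing
    markers that forced inner elements to be closed and reopened.
    """
    out = []       # pieces of the output fragment
    stack = []     # names of the currently open elements, outermost first
    repairs = 0
    pos = 0
    for m in TAG.finditer(text):
        # Plain text between markers is escaped so the result stays well-formed.
        out.append(escape(text[pos:m.start()]))
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if not closing:
            stack.append(name)
            out.append("<%s>" % name)
        elif name in stack:
            # Close anything opened after `name`, remembering it for reopening.
            reopened = []
            while stack[-1] != name:
                inner = stack.pop()
                out.append("</%s>" % inner)
                reopened.append(inner)
            stack.pop()
            out.append("</%s>" % name)
            if reopened:
                repairs += 1
            # Reopen the interrupted elements, outermost first.
            for inner in reversed(reopened):
                stack.append(inner)
                out.append("<%s>" % inner)
        # else: a closing marker with no matching open one is dropped (assumption).
    out.append(escape(text[pos:]))
    # Close whatever is still open at the end of the string (assumption).
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out), repairs
```

Called on the string [i]italic [b]bold[/i] text[/b], this returns the fragment <i>italic <b>bold</b></i><b> text</b> together with a repair count of 1, which the step could write to its report alongside the other counts.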
For the example that started this thread, i’d probably look at handing each line to ixml, but it may be too hard to write a grammar that doesn't fail somewhere. Then it's possible either to make a script that changes the input (e.g. fixing a presumed error) or to work on a hand-edited copy of the file (i don't recommend that although i've done it... and usually regretted it).

liam

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org
Received on Wednesday, 19 April 2023 19:29:34 UTC